Linux (desktop or server) with md RAID-10

Current setup on Ubuntu 9.10, installed via the alternate CD on x86_64.

Contrary to most posts online that say you can only boot from a partition on RAID1, not RAID10, I was able to boot just fine after setting up both partitions on RAID10. There may be a difference between RAID10 in an n2 (2 near-copies) layout and an f2 (2 far-copies) layout, since the far copies produce a much more complex stripe arrangement across all drives that grub may not be able to handle. I haven’t tested this, and have simply set up raid10 in n2 (near, 2 copies) on the /boot partition, since that partition doesn’t need much extra read speed anyway.

I’ve set up the root / partition as f2 (far, 2 copies), which yields close to 4x read speed when using 4 disks, which is quite impressive.  Write speed is about 2x, versus the 2-3x you get with an n2 setup.  The only downside to f2 is that if you lose a disk, your write speeds can suffer badly, but disks are cheap and there’s no point in planning to run a system on a degraded array for long!

Caveat:  This was written for a setup with sda, sdb, sdc, sdd – but my setup was slightly different since I had two drives on-board and two drives on a SATA controller card, which gave me out-of-order device names.  So sda, b, d, e is roughly equivalent to a, b, c, d.  Once you set up your raid10 array, mdraid is smart enough to know which drive is which even if you move them around between controllers or add other disks, so this ‘order’ doesn’t much matter in any case.  Enough confusion, please continue.
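mdraid can do this because each member partition carries a persistent superblock with the array’s UUID, so you can always check which array a given partition belongs to, regardless of what the kernel currently calls the disk.  For example (any member device will do):

mdadm --examine /dev/sda1
mdadm --examine --scan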

Setup:

This setup uses four 500GB drives.

Boot with the Ubuntu 9.10 Alternate CD – you need this CD to get RAID support in any fashion.  The issue here is that raid10 is not supported by the installer, so you’ll have to do some of these steps yourself.  You can create the partitions and arrays from a console, as described below.

When you get to the partitioning screen (or prior to this – you’ll have to back out of it and re-enter after you do the partitioning so that the installer sees it) you can press ctrl-alt-F1 to get into a console.  Alternatively, you can create the partitions in the installer’s partitioner and just change their type to Linux raid autodetect from the console.

From the console you’ll want to create your partitions as follows:

fdisk on each drive:  sda, sdb, sdc, sdd:

Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1          66      530113+  fd  Linux raid autodetect
/dev/sdc2              67         328     2104515   82  Linux swap / Solaris
/dev/sdc3             329       60801   485749372+  fd  Linux raid autodetect
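If you’re doing this by hand, the interactive fdisk session for each drive is roughly (a sketch of the keystrokes, not a transcript – adjust sizes to your disks): n to create each of the three partitions, t to set the type of partitions 1 and 3 to fd (Linux raid autodetect) and partition 2 to 82 (swap), a to toggle the boot flag on partition 1, and w to write and exit.  Once the first drive is partitioned you can copy its layout to the others instead of repeating the keystrokes:

sfdisk -d /dev/sda | sfdisk /dev/sdb
sfdisk -d /dev/sda | sfdisk /dev/sdc
sfdisk -d /dev/sda | sfdisk /dev/sdd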

Now to set up the raid 10 arrays on all four disks:

/boot partition: raid10 (near – 2 copy, 64k chunk size)

mdadm --create /dev/md0 --chunk=64 -R -l 10 -n 4 -p n2 /dev/sd[abcd]1

/ partition: raid10 (far – 2 copy, 256k chunk size)

mdadm --create /dev/md1 --chunk=256 -R -l 10 -n 4 -p f2 /dev/sd[abcd]3
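Both arrays start syncing in the background as soon as they’re created; you can check progress from the console at any time (purely informational – you don’t need to wait for the sync to finish before continuing with the install):

cat /proc/mdstat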

At this point, you should be able to go back to the installer and see the md0 and md1 devices ready for action.  If not, back out of the partitioner and re-enter it; it should re-read the labels and other partition information.

You can now create your filesystems on md0 and md1 as follows:

/dev/md0  – partition /boot, filesystem ext4

/dev/md1 – partition /, filesystem ext4

/dev/sd[abcd]2 as your swap partitions (the system will use these as striped swap; there’s no real need for raid here unless this is a server)
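The installer will create these filesystems for you, but if you’d rather do it from the console, the equivalent commands are roughly (a sketch, assuming the tools are available in the install environment):

mkfs.ext4 /dev/md0
mkfs.ext4 /dev/md1
mkswap /dev/sda2
mkswap /dev/sdb2
mkswap /dev/sdc2
mkswap /dev/sdd2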

If you’re concerned about system availability, you should set up your swap space the same way as the /boot and / partitions.  Keep in mind that you’ll want raid10, and you can go with n2 or f2 depending on what you expect your read/write mix to be.  If you expect swap to end up mostly doing reads, f2 may be your best bet.  If you never expect to use swap but want decent performance in case an application goes crazy and you start swapping heavily, use a large chunk size with n2 so your reads and writes won’t suffer.  Raid10 for swap is *expensive* (in CPU and I/O time) if you’re going to use it a lot, so try to avoid relying on it.
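If you do decide on redundant swap, the creation step mirrors what we did for /boot and / – a sketch, assuming /dev/md2 is free and the sd[abcd]2 partitions are typed fd instead of 82:

mdadm --create /dev/md2 --chunk=256 -R -l 10 -n 4 -p f2 /dev/sd[abcd]2
mkswap /dev/md2
swapon /dev/md2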

Grub setup

Grub should be installed on /dev/sda – at the end of the install it will usually install itself to the MBR of every drive.  If it doesn’t, you can install it manually from the command line (upon first boot) as follows:

grub-install /dev/sda

(repeat this, replacing /dev/sda with each of the other drives)

Make sure the BIOS is set to boot /dev/sda as the first choice.  You’ll want grub on all the drives in case you lose the first one, and have the BIOS set up to try several drives so that you can still boot even if /dev/sda has failed.
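If you’d rather not run grub-install by hand four times, a one-liner works just as well (the device names here match this particular box; adjust for yours):

for d in /dev/sda /dev/sdb /dev/sdd /dev/sde; do grub-install $d; done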

Setup as seen in system:

md0 – /boot/
md1 – /

[~]# cat /proc/mdstat
Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid10 sdb3[1] sda3[0] sdd3[2] sde3[3]
971498496 blocks 64K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid10 sda1[0] sdb1[1] sdd1[2] sde1[3]
1060096 blocks 64K chunks 2 near-copies [4/4] [UUUU]

mdadm --misc --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Fri Feb 5 11:19:54 2010
Raid Level : raid10
Array Size : 1060096 (1035.42 MiB 1085.54 MB)
Used Dev Size : 530048 (517.71 MiB 542.77 MB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Fri Feb 5 14:31:54 2010
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : near=2, far=1
Chunk Size : 64K

UUID : 535fbdb4:65c76745:5bf5b0bf:9aa6b5c2
Events : 0.18

Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 49 2 active sync /dev/sdd1
3 8 65 3 active sync /dev/sde1

Acoustic Management turned on:

hdparm -M 128 /dev/sda
hdparm -M 128 /dev/sdb
hdparm -M 128 /dev/sdd
hdparm -M 128 /dev/sde
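hdparm settings don’t persist across reboots on their own.  One simple approach (just one option – Ubuntu also ships /etc/hdparm.conf if you prefer a config-file route) is to add the same commands to /etc/rc.local, before the exit 0 line:

hdparm -M 128 /dev/sda
hdparm -M 128 /dev/sdb
hdparm -M 128 /dev/sdd
hdparm -M 128 /dev/sde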

Benchmarks

Seek times, using Seeker v2.0, 2007-01-15, http://www.linuxinsight.com/how_fast_is_your_disk.html

Single drive

Laptop 320GB Seagate, 5400RPM
[~]# ./seeker /dev/sda
Benchmarking /dev/sda [305245MB], wait 30 seconds…………………………
Results: 43 seeks/second, 23.06 ms random access time

mdraid RAID10 n2

[~]# ./seekerNat /dev/md0
Seeker v2.0(Nat1), 2007-12-18, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/md0 [2120192 blocks, 1085538304 bytes, 1 GiB], wait 30 seconds
…………………………
Results: 154 seeks/second, 6.48 ms random access time (25553 < offsets < 1085219284)

avg-cpu: %user %nice %system %iowait %steal %idle
6.47 0.00 3.46 45.96 0.00 44.11

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 95.00 1160.00 0.00 2320 0
sdb 22.50 716.00 0.00 1432 0
sdd 94.50 1144.00 0.00 2288 0
sde 21.50 716.00 0.00 1432 0

[~]# ./seeker /dev/md0
Benchmarking /dev/md0 [1035MB], wait 30 seconds…………………………
Results: 163 seeks/second, 6.12 ms random access time

avg-cpu: %user %nice %system %iowait %steal %idle
6.13 0.00 1.72 45.83 0.00 46.32

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 34.50 276.00 0.00 552 0
sdb 5.50 44.00 0.00 88 0
sdd 26.00 208.00 0.00 416 0
sde 2.00 16.00 0.00 32 0

[~]# ./seekerNat /dev/md1
Seeker v2.0(Nat1), 2007-12-18, http://www.linuxinsight.com/how_fast_is_your_disk.html
Benchmarking /dev/md1 [1942996992 blocks, 994814459904 bytes, 994 GiB], wait 30 seconds
…………………………
Results: 58 seeks/second, 17.02 ms random access time (1844908697 < offsets < 994352688762)

avg-cpu: %user %nice %system %iowait %steal %idle
9.86 0.00 2.54 43.19 0.00 44.41

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 25.60 1.20 0.10 0.00 8.00 0.48 17.76 17.84 47.80
sdb 0.00 0.00 2.80 1.20 0.01 0.00 8.00 0.07 18.00 18.00 7.20
sdd 0.00 0.60 32.00 0.80 0.1
