RAID - mdadm (Software RAID)

(Re)build RAID

Replace a failing or failed disk

  1. Check the RAID was created
    cat /proc/mdstat
  2. Check disk status
    smartctl -H /dev/sdX
  3. Find serial number
    smartctl -i /dev/sdX
  4. If the new disk contains partitions
    1. Stop any RAID arrays that use its partitions with
      mdadm --stop /dev/md1
    2. Remove the superblocks
      mdadm --zero-superblock /dev/sdX1
    3. Remove existing partitions with gdisk /dev/sdX
      Command (m for help): d
    4. Create new RAID partition (if asked remove the existing signature)
      Command (m for help): d
      Command (m for help): n
      Command (m for help): t,fd00
  5. Add the new drive to the RAID
    mdadm /dev/md0 --add /dev/sdc1
  6. If the system does not need to use the disks during the resync, you may want to (temporarily) raise the maximum sync speed. Note the current value first so that you can restore it afterwards:
    cat /proc/sys/dev/raid/speed_limit_max

    Then raise the limit:

    echo 1000000 > /proc/sys/dev/raid/speed_limit_max
  7. If the RAID is incomplete (degraded), rebuilding (resyncing) starts immediately. If the RAID is complete, including the bad drive, and you have just added a spare drive, you can replace the bad drive in place (requires mdadm 3.3+ and a 3.2+ kernel):
    mdadm /dev/md0 --replace /dev/sdX1 --with /dev/sdc1

    sdX1 is the device you want to replace; sdc1 is the preferred replacement and must already be a spare in the array. The --with option is optional: if it is not specified, any available spare will be used. After the resync has finished, the replaced drive is marked as failed.

  8. Remove the replaced disk which is marked as failed after resyncing has completed
    mdadm --manage /dev/md0 --remove /dev/sdX1
  9. Compare the output of mdadm -Es to the contents of /etc/mdadm/mdadm.conf and, if the array definition is missing, append it
    mdadm -Es >> /etc/mdadm/mdadm.conf

Check whether all volumes get mounted during system boot
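Steps 1 and 7 both rely on reading /proc/mdstat by eye. As a minimal sketch (the function name list_degraded_arrays is made up for this example, and it assumes the usual member map notation such as [UU] or [U_]), the following prints the arrays that currently have a missing member:

```shell
# Print the name of every md array that /proc/mdstat reports as degraded.
# A member map such as [U_] marks a missing member with "_".
# $1: file to read (defaults to /proc/mdstat; a saved copy works too)
list_degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }                       # remember the current array
        /\[[U_]*_[U_]*\]/ && name != "" { print name }   # member map contains "_"
    ' "${1:-/proc/mdstat}"
}

# Example call, only attempted where /proc/mdstat is readable:
if [ -r /proc/mdstat ]; then
    list_degraded_arrays
fi
```

An empty result after step 8 suggests that every array has its full set of members again.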

Resync

Most Debian and Debian-derived distributions install a cron job in /etc/cron.d/mdadm which starts an array check at 01:06 on the first Sunday of each month. This task appears as a resync in /proc/mdstat and syslog. So if you suddenly see the RAID resyncing for no apparent reason, this is the first place to look.
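Cron has no field for "first Sunday of the month", so such jobs are commonly written as a weekly Sunday job that exits early unless the day of the month is 7 or lower. The sketch below illustrates that test only; it is not the content of /etc/cron.d/mdadm, the function name is_first_sunday is hypothetical, and it assumes GNU date (for the -d option).

```shell
# Succeed if the given date is the first Sunday of its month.
# $1: any date string GNU date understands (defaults to today)
is_first_sunday() {
    d=${1:-today}
    dow=$(date -d "$d" +%u)    # day of week, 1..7 with 7 = Sunday
    dom=$(date -d "$d" +%d)    # day of month, 01..31
    # ${dom#0} strips a leading zero so the value is compared as plain decimal
    [ "$dow" -eq 7 ] && [ "${dom#0}" -le 7 ]
}

if is_first_sunday; then
    echo "a scheduled array check may run today"
fi
```

To see what an array is doing right now, cat /sys/block/md0/md/sync_action reports values such as idle, check, resync or recover.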

Normally the kernel throttles the resync activity (cf. nice) to avoid impacting the performance of the RAID device.

However, it is a good idea to manage the resync parameters to get optimal performance.
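One parameter already used on this page is /proc/sys/dev/raid/speed_limit_max. As a sketch of raising it only for the duration of one command and then restoring the previous value (the helper name with_sync_speed is an assumption of this example, and writing to the real file requires root):

```shell
# Run a command with a temporarily changed limit, then restore the old value.
# $1: limit file, e.g. /proc/sys/dev/raid/speed_limit_max
# $2: temporary value
# remaining arguments: the command to run while the new value is in place
with_sync_speed() {
    limit_file=$1
    new_limit=$2
    shift 2
    old_limit=$(cat "$limit_file") || return 1     # remember the current value
    echo "$new_limit" > "$limit_file" || return 1  # apply the temporary value
    "$@"
    status=$?
    echo "$old_limit" > "$limit_file"              # restore the previous value
    return "$status"
}

# Possible use (as root), waiting for the resync of md0 to finish:
#   with_sync_speed /proc/sys/dev/raid/speed_limit_max 1000000 mdadm --wait /dev/md0
```

Passing the file as an argument keeps the helper usable for speed_limit_min as well.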

RAID 1, 5, 6

Rebuild speed

read-ahead

Disable NCQ

RAID 5, 6 only

stripe_cache_size

stripe_cache_size records the size (in pages per device) of the stripe cache, which is used for synchronising all write operations to the array and all read operations if the array is degraded. The default is 256, which equals 3 MB of memory for, e.g., three disks with 4 KiB pages. Valid values are 17 to 32768. Make sure your system has enough memory available: memory_consumed = system_page_size * nr_disks * stripe_cache_size.
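The formula can be checked with shell arithmetic before changing the value. A minimal sketch; the function name stripe_cache_bytes is made up for this example, and the 4 KiB page size and three disks are assumed inputs:

```shell
# memory_consumed = system_page_size * nr_disks * stripe_cache_size
# $1: page size in bytes, $2: number of disks, $3: stripe_cache_size
stripe_cache_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Default cache size on an assumed 3-disk array with 4 KiB pages:
stripe_cache_bytes 4096 3 256      # prints 3145728 (3 MiB)

# On a live system the inputs could be read like this (md0 is a placeholder):
#   stripe_cache_bytes "$(getconf PAGESIZE)" 3 "$(cat /sys/block/md0/md/stripe_cache_size)"
```

With the maximum value of 32768, the same three disks would need 4096 * 3 * 32768 bytes, i.e. 384 MiB.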

Prepare RAID with single disk

Prepare new disk

  1. If the new disk contains partitions
    1. Stop any RAID arrays that use its partitions with
      mdadm --stop /dev/md1
      mdadm --remove /dev/md1
    2. Remove the superblocks
      mdadm --zero-superblock /dev/sdX1
    3. Remove existing partitions with fdisk /dev/sdX
  2. Create a new partition utilizing the full disk space. When asked, remove the existing signature. Change partition type to Linux RAID
    sudo fdisk /dev/sdX
    Command (m for help): d
    Command (m for help): n
    Command (m for help): t,29
  3. Create the RAID
    mdadm --create /dev/mdX --level=raid1 --raid-devices=2 /dev/sdX1 missing
  4. Check the RAID was created
    cat /proc/mdstat
    ls /dev/md*
  5. Add a second disk (sdY1 stands for the RAID partition on the second disk)
    mdadm --manage /dev/mdX --add /dev/sdY1

Move RAID to a new machine

  1. Scan for the old raid disks
    sudo mdadm --assemble --scan
  2. Mount the raid manually to confirm
    blkid
    sudo mount /dev/md0 /mnt
  3. Append info to mdadm.conf
    mdadm --detail --scan >> /etc/mdadm/mdadm.conf
  4. Update initramfs
    update-initramfs -u
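Because step 3 appends with >>, running it twice leaves duplicate ARRAY lines in mdadm.conf. A hedged sketch of one way to avoid that: save the scan output to a file first and append only the lines whose UUID is not yet present. The function name append_missing_arrays is hypothetical, and the code assumes each ARRAY line carries a UUID=... token and that grep supports -o.

```shell
# Append ARRAY lines from a saved scan to a config file, skipping any array
# whose UUID already appears in the config.
# $1: file holding the output of "mdadm --detail --scan"
# $2: config file, e.g. /etc/mdadm/mdadm.conf
append_missing_arrays() {
    scan_file=$1
    conf_file=$2
    while IFS= read -r line; do
        # Pull the UUID=... token out of the line; skip lines without one
        uuid=$(printf '%s\n' "$line" | grep -o 'UUID=[0-9a-fA-F:]*')
        [ -n "$uuid" ] || continue
        grep -qF -- "$uuid" "$conf_file" 2>/dev/null ||
            printf '%s\n' "$line" >> "$conf_file"
    done < "$scan_file"
}

# Possible use (as root):
#   mdadm --detail --scan > /tmp/md-scan
#   append_missing_arrays /tmp/md-scan /etc/mdadm/mdadm.conf
```

Matching on the UUID rather than the whole line tolerates differences in the name= or device fields between an old entry and a fresh scan.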

Troubleshooting

Links

Increase drive capacity in a RAID -> LVM -> CRYPT setup

Replace drives in RAID

  1. Follow "Replace a failing or failed disk" at the top of this page for each drive
  2. Resize the array to the maximum supported by the underlying partitions
    mdadm --grow /dev/md5 -z max
  3. Follow the progress with
    watch -d cat /proc/mdstat
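For scripts, the progress shown by watch in step 3 can be reduced to one line per busy array. This is a sketch under the assumption that the progress line has the usual "recovery = 12.6%" shape; the function name sync_progress is made up for this example.

```shell
# Print "<array> <action> = <percent>" for every array that is currently
# resyncing, recovering, reshaping or being checked.
# $1: file to read (defaults to /proc/mdstat)
sync_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }     # remember the current array
        match($0, /(resync|recovery|reshape|check) *= *[0-9.]+%/) {
            print name, substr($0, RSTART, RLENGTH)
        }
    ' "${1:-/proc/mdstat}"
}

# Example call, only attempted where /proc/mdstat is readable:
if [ -r /proc/mdstat ]; then
    sync_progress
fi
```

No output means that no array is busy, so the LVM steps can follow.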

Increase LVM

  1. Check size of physical volume with
    pvdisplay
  2. Increase physical volume to utilize all available space
    pvresize /dev/mdX
  3. Increase logical volume to utilize all available space
    lvextend -l +100%FREE /dev/<volume_group>/<logical_volume>

Increase LUKS

  1. Tell LUKS to use all available space. You need the backup key to do this
    cryptsetup resize /dev/mapper/<volume_group>/<logical_volume>_crypt

Increase file system

  1. Resize the file system online
    resize2fs -p /dev/mapper/<volume_group>/<logical_volume>_crypt

Links