Friday, November 10, 2017

Raspberry pi RAID array 3: health check & building a level 0 array to use for additional backup

I wanted to build a level-0 array out of two 1 TB drives (total size should be 2 TB) to act as a backup for my existing level-1 array (size: 2 TB), but I first discovered a problem with the level-1 array - it was only listing one of its constituent drives as actually being present, and its state was "degraded".  Here's how I fixed that problem, built the level-0 array, and started my backup.

Repairing the existing array

Everything seemed to be working fine as far as normal operations on the raspberry pi were concerned - I routinely use the disk to store temperature sensor data, serve a web page displaying that data, and store music and video.  But when I went to check on the array, it appeared to be using only one of the drives:
mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Mon Mar  9 01:48:54 2015
     Raid Level : raid1
     Array Size : 1953349440 (1862.86 GiB 2000.23 GB)
  Used Dev Size : 1953349440 (1862.86 GiB 2000.23 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent
    Update Time : Fri Nov 10 10:37:43 2017
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
           Name : raspberrypi:2
           UUID : f71c500d:1aad4778:f8b156b4:036268a5
         Events : 10835391
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed
sdb1 was no longer part of the array, and the state was listed as degraded.  Googling for "raid array degraded" took me to this forum discussion (https://ubuntuforums.org/showthread.php?t=2298605).  I first tried re-adding the second drive, as suggested there.
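I didn't save that attempt, but it would have looked something like this (a sketch from memory, not a transcript):
mdadm /dev/md2 --re-add /dev/sdb1
When the re-add didn't work, I fell back to the plain add command, which did: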
mdadm /dev/md2 --add /dev/sdb1
mdadm: added /dev/sdb1
The status then changed to:
/dev/md2:
        Version : 1.2
  Creation Time : Mon Mar  9 01:48:54 2015
     Raid Level : raid1
     Array Size : 1953349440 (1862.86 GiB 2000.23 GB)
  Used Dev Size : 1953349440 (1862.86 GiB 2000.23 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent
    Update Time : Fri Nov 10 10:58:15 2017
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1
 Rebuild Status : 0% complete
           Name : raspberrypi:2
           UUID : f71c500d:1aad4778:f8b156b4:036268a5
         Events : 10835767
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       2       8       17        1      spare rebuilding   /dev/sdb1
Checking /proc/mdstat indicates the rebuild will take ~31 hours:
cat /proc/mdstat  
Personalities : [raid1]
md2 : active raid1 sdb1[2] sda1[0]
      1953349440 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.6% (11755456/1953349440) finish=1880.9min speed=17203K/sec
unused devices: <none>
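Rather than re-running cat by hand, something like this gives a self-updating view of the rebuild (an aside, not part of my original session):
watch -n 60 cat /proc/mdstat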

Making a new level 0 array to serve as a backup of the level 1 array

Decided to use the two 1-TB drives to make a 2-TB raid level 0 array, which can then serve as a backup for the 2-TB level 1 array.  Followed the procedure from my previous blog post:
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdc1 /dev/sdd1  
mdadm: /dev/sdc1 appears to contain an ext2fs file system
    size=976728064K  mtime=Wed Dec 31 19:00:00 1969
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Sun Aug 17 13:14:28 2014
mdadm: /dev/sdd1 appears to contain an ext2fs file system
    size=976728064K  mtime=Wed Dec 31 19:00:00 1969
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Sun Aug 17 13:14:28 2014
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
mdadm noticed that the 1-TB drives had previously been part of an array, but I had already decided to scrap that and start from scratch.  Checking on the new array indicated things were OK:
cat /proc/mdstat
Personalities : [raid1] [raid0]
md0 : active raid0 sdd1[1] sdc1[0]
      1953455104 blocks super 1.2 512k chunks
   
md2 : active raid1 sdb1[2] sda1[0]
      1953349440 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  1.7% (33407104/1953349440) finish=1842.4min speed=17367K/sec
   
unused devices: <none>
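As an aside, the "appears to be part of a raid array" warnings during --create could have been avoided by wiping the stale superblocks first - not something I actually ran, since --create overwrites them anyway:
mdadm --zero-superblock /dev/sdc1 /dev/sdd1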
Used fdisk to partition the new array, formatted the partition as ext4, and was able to mount and use it.
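Roughly what that looked like - I didn't keep the exact session, so I'm assuming the partition came up as /dev/md0p1 and that the mount point matches the backup path used below:
sudo fdisk /dev/md0            # create a single partition spanning the array
sudo mkfs.ext4 /dev/md0p1      # format the new partition as ext4
sudo mkdir -p /media/backup_raid
sudo mount /dev/md0p1 /media/backup_raid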

Wrote a script to back up the existing array to the new array; the core of it is this rsync command (a sketch of the full script follows the option list below):
rsync -av /media/raid/pictures /media/backup_raid/backup/ --delete
Explanation of options:
  • -a:  archive mode - recurses into directories and preserves ownership, permissions, timestamps, symlinks, devices, etc.  Needs to be run as root to preserve ownership and devices, though!
  • -v:  verbose mode.  Need I say more?
  • --delete:   delete any files that are present in the destination (/media/backup_raid/backup) that are not present in the source
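The full script is little more than that command plus some logging; a minimal sketch (the log path is just my choice, and it assumes it's run as root):
#!/bin/bash
# Mirror the level-1 array onto the level-0 backup array.
LOG=/var/log/raid_backup.log
echo "backup started $(date)" >> "$LOG"
rsync -av --delete /media/raid/pictures /media/backup_raid/backup/ >> "$LOG" 2>&1
status=$?
echo "backup finished $(date), rsync exit status $status" >> "$LOG"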
Next up:  Setting up the remote raspberry pi and rsync'ing over the internets.

Edit:  raspberry pi became unresponsive, had to do a hard reboot and assemble the raids

After the raspberry pi had spent some time attempting to do both the rebuild and the backup at once, it became unresponsive to ssh.  After a hard reboot, both raid arrays appeared to be completely gone!  From the beginning of this post, I used:
sudo mdadm --assemble --scan
which found and restored the first raid (raid level 1, two 2-TB drives).  However, this did not work for the raid 0 array, so, following this page, I used examine to see whether it would be safe to force the assemble:
mdadm --examine /dev/sd[cd]1
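I no longer have the output, but the field to compare is the Events counter that --examine reports for each member, a line along the lines of:
         Events : 0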
This indicated there were 0 events for both drives, and hence no discrepancy, so it was safe to then use:
mdadm --assemble --force /dev/md0 /dev/sdc1 /dev/sdd1
And with that, both arrays were back in action.  This time I let the rebuild of the raid1 finish before starting the rsync backup to the raid0, and it worked.
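One follow-up worth doing so the arrays come back on their own after future reboots: make sure both are listed in /etc/mdadm/mdadm.conf.  The usual recipe is something like:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf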
