This week, it finally happened. I think it’s the first time in 20 years that a hard drive has died on me without warning. And it was also the first time I was using an NVMe drive, but that could be a coincidence.

The drive was still under warranty (barely a year and a half old), and I even had a spare lying around. But the true cost of restoration is, of course, my own labor. My planning had not been perfect, since I had judged such an event to be remote. Still, recovery was easy enough. I installed NixOS from a USB stick and downloaded my configuration from the backup on my NAS (daily rsync jobs to the rescue), along with all the important files for my home directory. Then it was a matter of adjusting a few things in the configuration file, rebuilding the system, and voilà. Well, except for a few things that didn’t work quite right for some reason and had to be fixed manually, but nothing major.

However, next time I want this to be even easier. It’s probably overkill to install a RAID controller and run multiple drives in RAID1 or RAID5, but the restoration process is still too much manual work. I was thinking of regularly backing up my main drive at the block-device level, so that I would just have to swap out the drive and restore from the backup. I’m not quite sure whether that’s feasible or a good idea. For my personal system, I have to balance the effort of preparing for a disaster against the likelihood and impact of such an event. This seems like a good trade-off, but I would be curious to hear how other people prepare for drive failure.
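
To make the block-level idea a bit more concrete, here is a minimal sketch in Python of what "only copy the delta" could look like. It is an illustration under assumptions, not a tested backup tool: the function name `sync_changed_blocks`, the 4 MiB block size, and the paths in the usage note are all made up, and it assumes the source is not being written to while it is read (for example an unmounted partition or a snapshot).

```python
import os
import stat

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB per block; an arbitrary, hypothetical choice


def sync_changed_blocks(source_path, backup_path, block_size=BLOCK_SIZE):
    """Copy only the blocks that differ from source to backup.

    Returns the number of blocks that had to be rewritten.
    """
    # Create an empty backup image on the first run so "r+b" can open it.
    if not os.path.exists(backup_path):
        open(backup_path, "wb").close()

    changed = 0
    with open(source_path, "rb") as src, open(backup_path, "r+b") as dst:
        offset = 0
        while True:
            src_block = src.read(block_size)
            if not src_block:
                break  # end of the source reached
            dst.seek(offset)
            dst_block = dst.read(len(src_block))
            if src_block != dst_block:
                # Only blocks that differ are written to the backup.
                dst.seek(offset)
                dst.write(src_block)
                changed += 1
            offset += len(src_block)
        # If the backup is a regular file, cut off leftovers from a larger,
        # older image. Block devices cannot be truncated, so skip them.
        if stat.S_ISREG(os.fstat(dst.fileno()).st_mode):
            dst.truncate(offset)
    return changed
```

Something like `sync_changed_blocks("/dev/nvme0n1p2", "/mnt/nas/root.img")` (hypothetical paths, run as root) would refresh the image, and swapping the two arguments would write it back onto a replacement drive. Comparing blocks locally still reads the whole device every time; over a network one would rather compare per-block hashes so that only changed blocks cross the wire.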

  • HelloRoot@lemy.lol
    6 days ago

    I have btrfs snapshots with snapper on my desktop. It keeps the last 20 snapshots. Sending them to a second drive would require as much space as they take up on the main drive, which is ~850 GB full out of 1 TB.

    But the borg backup of the same data takes only ~450 GB and also keeps the last 20 versions. Because of the smaller size, sending the backup over the network is also quicker than with btrfs.

    So I use the btrfs snapshots to roll back file changes (for example after a bad system update).

    With Borg it is easier to set up a central server for all my devices, because the backups take much less space. I run https://github.com/Ravinou/borgwarehouse . So that is what I use in case a drive fails. To restore, I set up the same partition layout as before and then throw the borg backup at it. It has been easy enough so far.