# Considering my backup systems

With the recent news that Crashplan were doing away with their “Home” offering, I had reason to reconsider my choice of online backup backup provider. Since I haven’t written anything here lately and the results of my exploration (plus description of everything else I do to ensure data longevity) might be of interest to others looking to set up backup systems for their own data, a version of my notes from that process follows.

## The status quo

I run a Linux-based home server for all of my long-term storage, currently 15 terabytes of raw storage with btrfs RAID on top. The choice of btrfs and RAID allows me some degree of robustness against local disk failures and accidental damage to data.

If a disk fails I can replace it without losing data, and using btrfs’ RAID support it’s possible to use heterogenous disks, meaning when I need more capacity it’s possible to remove one disk (putting the volume into a degraded state) and add a new (larger) one and rebalance onto the new disk.

btrfs’ ability to take copy-on-write snapshots of subvolumes at any time makes it reasonable to take regular snapshots of everything, providing a first line of defense against accidental damage to data. I use Snapper to automatically create rolling snapshots of each of the major subvolumes:

• Synchronized files (mounted to other machines over the network) have 8 hourly, 7 daily, 4 weekly and 3 monthly snapshots available at any time.
• Staging items (for sorting into other locations) have a snapshot for each of the last two hours only, because those items change frequently and are of low value until considered further.
• Everything else keeps one snapshot from the last hour and each of the last 3 days.

This configuration strikes a balance according to my needs for accident recovery and storage demands plus performance. The frequently-changed items (synchronized with other machines and containing active projects) have a lot of snapshots because most individual files are small but may change frequently, so a large number of snapshots will tend to have modest storage needs. In addition, the chances of accidental data destruction are highest there. The other subvolumes are either more static or lower-value, so I feel little need to keep many snapshots of them.

# Back on topic

Returning from that digression, the point of switching this site back to a WordPress backend is to get data out to the public more reliably and faster, in order to preserve the information more permanently.  What finally pushed me back was a sudden realization that there’s nothing stopping me from customizing WordPress in a similar fashion to what I did on the Hyde-based site- it simply requires a bit of experience with the backend code.  While PHP is one language I tend to loathe, the immediate utility of a working system is more valuable than the potential utility of a system I need to program myself.

There’s another lesson I can derive from this experience, too: building a flexible system is good, but you should distribute it ready-to-go for a common use case.  Reducing the barrier to entry for a tool can make or break it, and tools that go unused are of no use- getting people using a new creation is the primary barrier to progress