Tag Archives: btrfs

Considering my backup systems

With the recent news that Crashplan was doing away with their “Home” offering, I had reason to reconsider my choice of online backup provider. Since I haven’t written anything here lately and the results of my exploration (plus a description of everything else I do to ensure data longevity) might be of interest to others looking to set up backup systems for their own data, a version of my notes from that process follows.

The status quo

I run a Linux-based home server for all of my long-term storage, currently 15 terabytes of raw storage with btrfs RAID on top. The choice of btrfs and RAID allows me some degree of robustness against local disk failures and accidental damage to data.

If a disk fails I can replace it without losing data, and btrfs’ RAID support allows heterogeneous disks, meaning that when I need more capacity I can remove one disk (putting the volume into a degraded state), add a new (larger) one, and rebalance onto the new disk.
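
In case it’s useful to anyone attempting the same, the swap boils down to a handful of btrfs commands. This is only a sketch with made-up device names and mount point, not a transcript of what I actually ran:

$ # after physically swapping the old disk for the new one, mount degraded
$ mount -o degraded /dev/sdb /srv/storage
$ # add the new disk, drop the record of the missing one, then rebalance
$ btrfs device add /dev/sdc /srv/storage
$ btrfs device remove missing /srv/storage
$ btrfs balance start /srv/storage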

btrfs’ ability to take copy-on-write snapshots of subvolumes at any time makes it reasonable to take regular snapshots of everything, providing a first line of defense against accidental damage to data. I use Snapper to automatically create rolling snapshots of each of the major subvolumes:

  • Synchronized files (mounted to other machines over the network) have 8 hourly, 7 daily, 4 weekly and 3 monthly snapshots available at any time.
  • Staging items (for sorting into other locations) have a snapshot for each of the last two hours only, because those items change frequently and are of low value until considered further.
  • Everything else keeps one snapshot from the last hour and each of the last 3 days.

This configuration strikes a balance between my needs for accident recovery and the storage and performance costs of keeping snapshots. The frequently-changed items (synchronized with other machines and containing active projects) have a lot of snapshots because most individual files are small but may change frequently, so a large number of snapshots tends to have modest storage needs. In addition, the chances of accidental data destruction are highest there. The other subvolumes are either more static or lower-value, so I feel little need to keep many snapshots of them.
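
For reference, the retention policy for the synchronized-files subvolume corresponds to a Snapper timeline configuration along these lines (the subvolume path here is made up; the real settings live in a file under /etc/snapper/configs/):

SUBVOLUME="/srv/storage/sync"
TIMELINE_CREATE="yes"
TIMELINE_CLEANUP="yes"
TIMELINE_LIMIT_HOURLY="8"
TIMELINE_LIMIT_DAILY="7"
TIMELINE_LIMIT_WEEKLY="4"
TIMELINE_LIMIT_MONTHLY="3"
TIMELINE_LIMIT_YEARLY="0"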

I use Crashplan to back up the entire system to their “cloud” service for $5 per month. The rate at which I add data to the system is usually lower than the rate at which it can be uploaded to Crashplan as a backup, so in most cases new data is backed up remotely within hours of being created.

Finally, I have a large USB-connected external hard drive as a local offline backup. It is formatted with btrfs like the server (but with the entire disk encrypted), so I can use btrfs send to ship incremental backups to this external disk, even without the ability to send information from the external disk back. In practice, this means I can store the external disk somewhere else entirely (possibly without an Internet connection) and occasionally shuttle diffs to it to bring it up to a more recent version. I always unplug this disk from power and its host computer when not being updated, so it should only be vulnerable to physical damage and not accidental modification of its contents.
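
One round of updating the offline disk looks roughly like this; the snapshot names and mount points here are illustrative rather than my actual layout:

$ # take a new read-only snapshot (send requires read-only snapshots)
$ btrfs subvolume snapshot -r /srv/storage /srv/storage/.offline-new
$ # send only the difference against the snapshot the external disk already has
$ btrfs send -p /srv/storage/.offline-old /srv/storage/.offline-new | btrfs receive /mnt/external
$ sync
$ umount /mnt/external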

Synchronization and remotes

For synchronizing current projects between my home server (which I treat as the canonical repository for everything) and my other machines, the tools vary according to the constraints of the remote system. I mount volumes over NFS or SMB from systems that rarely or never leave my network. For portable devices (laptop computers), Syncthing (running on the server and the portable device) makes bidirectional synchronization easy without requiring that both machines always be on the same network.

I keep very little data on portable devices that is not synchronized back to the server, but because it is (or, was) easy to set up, I used Crashplan’s peer-to-peer backup feature to back up my portable computers to the server. Because the Crashplan application is rather heavyweight (it’s implemented in Java!) and it refuses to include peer-to-peer backups in uploads to their storage service (reasonably so; I can’t really complain about that policy), my remote servers back up to my home server with Borg.
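
The Borg side is unremarkable: each remote machine pushes encrypted, deduplicated archives to a repository on my server over SSH. After a one-time borg init, something like the following (with an illustrative repository path and retention policy, not my exact setup) runs from cron:

$ borg init --encryption=repokey ssh://backup@home-server/srv/backups/web1
$ borg create --stats --compression lz4 \
      ssh://backup@home-server/srv/backups/web1::'{hostname}-{now}' \
      /etc /home /var
$ borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
      ssh://backup@home-server/srv/backups/web1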

I also have several Android devices that aren’t always on my home network- these aren’t covered very well by backups, unfortunately. I use FolderSync to automatically upload things like photos to my server which covers the extent of most data I create on those devices, but it seems difficult to make a backup of an Android device that includes things like preferences and per-app data without rooting the device (which I don’t wish to do for various reasons).

Summarizing the status quo

  • btrfs snapshots offer quick access to recent versions of files.
  • btrfs RAID provides resilience against single-disk failures and easy growth of total storage in my server.
  • Remote systems synchronize or back up most of their state to the server.
  • Everything on the server is continuously backed up to Crashplan’s remote servers.
  • A local offline backup can be easily moved and is rarely even connected to a computer so it should be robust against even catastrophic failures.

Evaluating alternatives

Now that we know how things were, we can consider alternative approaches to solve the problem of Crashplan’s $5-per-month service no longer being available. The primary factors for me are cost and storage capacity. Because most of my data changes rarely but none of it is strictly immutable, I want a system that makes it possible to do incremental backups. This will of course also depend on software support, but it means that I will tend to prefer services with straightforward pricing because it is difficult to estimate how many operations (read or write) are necessary to complete an incremental backup.

Commonly-known services like Dropbox or Google Drive might be appropriate for some users, but I won’t consider them. As consumer-oriented services positioned for the use case of “make these files available whenever I have Internet access,” they’re optimized for applications very different from the needs of my backups and tend to be significantly more expensive at the volumes I need.

So, the contenders:

  • Crashplan for Small Business: just like Crashplan Home (which was going away), but costs $10/mo for unlimited storage and doesn’t support peer-to-peer backup. Can migrate existing Crashplan Home backup archives to Small Business as long as they are smaller than 5 terabytes.
  • Backblaze: $50 per year for unlimited storage, but their client only runs on Mac and Windows.
  • Google Cloud Storage: four flavors available, where the interesting ones for backups are Nearline and Coldline. Low cost per gigabyte stored, but costs are incurred for each operation and transfer of data out.
  • Backblaze B2: very low cost per gigabyte, but incurs costs for download.
  • Online.net C14: very low cost per gigabyte, no cost for operations or data transfer in the “intensive” flavor.
  • AWS Glacier: lowest cost for storage, but very high latency and cost for data retrieval.

The pricing is difficult to consume in this form, so I’ll make some estimates with an 8 terabyte backup archive. This somewhat exceeds my current needs, so should be a useful if not strictly accurate guide. The following table summarizes expected monthly costs for storage, addition of new data and the hypothetical cost of recovering everything from a backup stored with that service.

Service        Storage cost   Recovery cost   Notes
Crashplan      $10            $0              “Unlimited” storage, flat fee.
Backblaze      $4.17          $0              “Unlimited” storage, flat fee.
GCS Nearline   $80            ~$80            Negligible but nonzero cost per operation. Download $0.08 to $0.23 per gigabyte depending on total monthly volume and destination.
GCS Coldline   $56            ~$80            Higher but still negligible cost per operation. All items must be stored for at least 90 days (kind of).
B2             $40            $80             Per-gigabyte fees for both storage and download.
C14            €40            $0              “Intensive” flavor. Other flavors incur per-operation costs.
Glacier        $32            $740            Per-gigabyte retrieval fees plus Internet egress. Reads may take up to 12 hours for data to become available. Negligible cost per operation. Minimum storage 90 days (like Coldline).

Note that for Google Cloud and AWS I’ve used the pricing quoted for the cheapest regions: Iowa on GCP and US East on AWS.
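
For anyone who wants to scale these estimates to a different archive size, the per-gigabyte rates implied by the table work out to roughly:

B2 storage:       $40 / 8,000 GB  = $0.005 per GB-month
B2 recovery:      $80 / 8,000 GB  = $0.01 per GB downloaded
GCS Nearline:     $80 / 8,000 GB  = $0.01 per GB-month
GCS Coldline:     $56 / 8,000 GB  = $0.007 per GB-month
Glacier storage:  $32 / 8,000 GB  = $0.004 per GB-month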

Analysis

Backblaze is easily the most attractive option, but their client (which is required to use the service) only runs on Windows and Mac, making it difficult to use here. It may be possible to run a Windows virtual machine on my Linux server to make it work, but that sounds like a lot of effort for something that may not be reliable. Backblaze is out.

AWS Glacier is inexpensive for storage, but extremely expensive and slow when retrieving data. The pricing structure is complex enough that I’m not comfortable depending on this rough estimate for the costs, since actual costs for incremental backups would depend strongly on the details of how they were implemented (the service incurs charges for reads and writes). The extremely high latency on bulk retrievals (up to 12 hours) and the higher cost of lower-latency reads make it questionable whether incremental backups on Glacier are even reasonable. Not Glacier.

C14 is attractively priced, but because they are not widely known I expect backup packages will not (yet?) support it as a destination for data. Unfortunately, that means C14 won’t do.

Google Cloud is fairly reasonably priced, but Coldline’s storage pricing is confusing in the same ways that Glacier’s is. Either flavor beats Glacier on pricing simply because the recovery cost is so much lower, but there are still better choices than GCS.

B2’s pricing for storage is competitive and download rates are reasonable (unlike Glacier!). It’s worth considering, but Crashplan still wins on cost. Plus, I’m already familiar with the software for doing incremental backups to Crashplan (their own client!) and wouldn’t need to re-upload everything to a new service.

Fallout

I conclude that the removal of Crashplan’s “Home” service effectively means a doubling of the monthly cost to me, but little else. There are a few extra things to consider, however.

First, my backup archive at Crashplan was larger than 5 terabytes so could not be migrated to their “Business” version. I worked around that by removing some data from my backup set and waiting a while for those changes to translate to “data is actually gone from the server including old versions,” then migrating to the new service and adding the removed data back to the backup set. This means I probably lost a few old versions of the items I removed and re-added, but I don’t expect to ever need any of them.

Second, and more concerning in general, is the newfound inability to do peer-to-peer backups from my portable (and other) computers to my own server. For Linux machines that are always Internet-connected, Borg continues to do the job, but I needed a new package that works on Windows. I eventually chose Duplicati, which can connect to my server the same way Borg does (over SSH/SFTP) and will in general work over arbitrarily-restricted Internet connections in the same way that Crashplan did.

Concluding

I’m still using Crashplan, but converting to their more expensive service was not quite trivial. It’s still much cheaper to back up to their service than to the alternatives, which means they retain significant freedom to raise prices before I would consider some other way to back up my data remotely.

As something of a corollary, it’s pretty clear that my high storage use on Crashplan is subsidized by other customers who store much less on the service; this is just something they must recognize when deciding how to price the service!

High-availability /home revisited

About a month ago, I wrote about my experiments in ways to keep my home directory consistently available. I ended up concluding that DRBD is a neat solution for true high-availability systems, but it’s not really worth the trouble for what I want to do, which is keeping my home directory available and in-sync across several systems.

Considering the problem more, I determined that I really value a simple setup. Specifically, I want something that uses very common software and is resistant to network failures. My local network going down is an extremely rare occurrence, but it’s possible that my primary workstation will become a portable machine at some point in the future- if that happens, anything that depends on a constant network connection becomes hard to work with.

With an always-online option out of the question, I need a solution that can handle concurrent modification (which DRBD can do, but only by running OCFS2 on top of it, making that solution a no-go).

Rsync

rsync is many users’ first choice for moving files between computers, and for good reason: it’s efficient and easy to use.  The downside in this case is that rsync tends to be destructive: because the source of a copy operation is taken to be the canonical version, any modifications made at the destination will be wiped out.  I already have regular cron jobs running incremental backups of my entire /home, so the risk of rsync permanently destroying valuable data is low.  However, being forced to recover from backup after an accidental deletion is a hassle, and it increases the danger of actual data loss.
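
To make that risk concrete, a boot-time pull from the NAS would look something like this (using the same paths as the Unison profile further down); the --delete flag is exactly what makes it destructive:

$ # make the local copy match the NAS, removing anything deleted on the NAS side
$ rsync -a --delete /media/Caring/sync/tari/ /home/tari/
$ # the "never delete" variant: only add and update files, never remove them
$ rsync -a /media/Caring/sync/tari/ /home/tari/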

In that light, a dumb rsync from the NAS at boot-time and back to it at shutdown could make sense, but carries undesirable risk.  It would be possible to instruct rsync to never delete files, but the convenience factor is reduced, since any file deletions would have to be done manually after boot-up.  What else is there?

Unison

I eventually decided to just use Unison, another well-known file synchronization utility.  Unison is able to handle non-conflicting changes between destinations as well as intelligently detect which end of a transfer has been modified.  Put simply, it solves the problems of rsync, although there are still situations where it requires manual intervention.  Those are handled with reasonable grace, however, with prompting for which copy to take, or the ability to preserve both and manually resolve the conflict.

Knowing Unison can do what I want and with acceptable amounts of automation (mostly only requiring intervention on conflicting changes), it became a simple matter of configuration.  Observing that all the important files in my home directory which are not already covered by some other synchronization scheme (such as configuration files managed with Mercurial) are only in a few subdirectories, I quickly arrived at the following profile:

root = /home/tari
root = /media/Caring/sync/tari

path = incoming
path = pictures
path = projects
path = wallpapers

The function here is fairly obvious: the two sync roots are /home/tari (my home directory) and /media/Caring/sync/tari (the NAS is mounted via NFS at /media/Caring), and only the four listed directories will be synchronized. An easy and robust solution.

I have yet to configure the system for automatic synchronization, but I’ll probably end up simply installing a few scripts to run unison at boot and when shutting down, observing that other copies of the data are unlikely to change while my workstation is active.  Some additional hooks may be desired, but I don’t expect the configuration to be difficult.  If it ends up being more complex, I’ll just have to post another update on how I did it.

Update Jan. 30: I ended up adding a line to my rc.local and rc.shutdown scripts that invokes unison:

su tari -c "unison -auto home"

Note that the Unison profile above is stored as ~/.unison/home.prf, so this handles syncing everything I listed above.

Btrfs

I recently converted the root filesystem on my netbook, a now rather old Acer Aspire One with an incredibly slow 1.8″ Flash SSD, from the ext3 I had been using for quite a while to the shiny new btrfs, which becomes more stable every time the Linux kernel gets updated. As I don’t keep any data of particular importance on there, I had no problem with running an experimental filesystem on it.

Not only was the conversion relatively painless, but the system now performs better than it ever did with ext3/4.

Conversion

Btrfs supports a nearly painless conversion from ext2/3/4 due to its flexible design. Because btrfs has almost no fixed locations for metadata on the disc, it is actually possible to allocate btrfs metadata inside the free space in an ext filesystem. Given that, all that’s required to convert a filesystem is to run btrfs-convert on it- the only requirement is that the filesystem not be mounted.
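
The command itself is as simple as it sounds; a sketch using the root partition from later in this post, run from a live environment with the filesystem unmounted:

$ # convert in place; the original ext filesystem is preserved as a rollback
$ # image (in an ext2_saved subvolume) until it is explicitly deleted
$ btrfs-convert /dev/sda2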

As the test subject of this experiment was just my netbook, this was easy, since I keep a rather simple partition layout on that machine. In fact, before the conversion, I had a single 8GB ext4 partition on the system’s rather pathetic SSD, and that was the extent of available storage. After backing up the contents of my home directory to another machine, I proceeded to decimate it, dropping the amount of storage in use from about 6GB to more like 3GB, a healthy gain.

Linux kernel

To run a system on Btrfs, there must, of course, be support for it in the kernel. Because I customarily build my own kernels on my netbook, it was a simple matter of enabling Btrfs support and rebuilding my kernel image. Most distribution kernels probably won’t have such support enabled since the filesystem is still under rather heavy development, so it was fortunate that my setup made it so easy.
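
For anyone building their own kernel to do the same, the relevant option is just the btrfs one, built in rather than as a module so the root filesystem can be mounted without an initramfs:

# File systems  --->  <*> Btrfs filesystem support (in menuconfig)
CONFIG_BTRFS_FS=y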

GRUB

The system under consideration runs GRUB 2, currently version 1.97, which has no native btrfs support. That’s a problem, as I was hoping to have only a single partition. With a little research, it was easy to find that no version of GRUB currently supports booting from btrfs, although there is an experimental patchset which provides basic btrfs support in a module. Unfortunately, to load a module, GRUB needs to be able to read the partition in which the module resides; if my /boot is on btrfs, that’s a bit troublesome. Thus, the only option is for me to create a separate partition for /boot, containing GRUB’s files and my Linux kernel image, formatted with some other file system. The obvious choice was the tried-and-true ext3.

This presents a small problem, in that I need to resize my existing root partition to make room on the disc for a small /boot partition. Easily remedied, however, with application of the Ultimate Boot CD, which includes the wonderful Parted Magic. GParted, included in Parted Magic, made short work of resizing the existing partition and its filesystem, as well as moving that partition to the end of the disc, which eventually left me with a shiny new ext3 partition filling the first 64MB of the disc.

Repartitioning

After creating my new /boot partition, it was a simple matter of copying the contents of /boot on the old partition to the new one, adjusting the fstab, and changing my kernel command line in the GRUB config file to mount /dev/sda2 as root rather than sda1.

Move the contents of /boot:

$ mkdir -p /mnt/boot
$ mount /dev/sda1 /mnt/boot
$ cp -a /boot/. /mnt/boot/
$ rm -rf /boot/*

Updated fstab:

/dev/sda1       /boot   ext3    defaults    0 1
/dev/sda2       /       btrfs   defaults    0 1

Finishing up

Finally, it was time to actually run btrfs-convert. I booted the system into the Arch Linux installer (mostly an arbitrary choice, since I had that image laying around) and installed the btrfs utilities package (btrfs-progs-unstable) in the live environment. Then it was a simple matter of running btrfs-convert on /dev/sda2 and waiting about 15 minutes, during which time the disc was being hit pretty hard. Finally, a reboot.

…following which the system failed to come back up, with GRUB complaining loudly about being unable to find its files. I booted the system from the Arch installer once again and ran grub-install on sda1 in order to reconfigure GRUB to handle the changed disc layout. With another reboot, everything was fine.

With my new file system in place, I took some time to tweak the mount options for the new partition. Btrfs is able to tune itself for solid-state devices, and will set those options automatically. From the Btrfs FAQ:

There are some optimizations for SSD drives, and you can enable them by mounting with -o ssd. As of 2.6.31-rc1, this mount option will be enabled if Btrfs is able to detect non-rotating storage.

However, there’s also a ssd_spread option:

Mount -o ssd_spread is more strict about finding a large unused region of the disk for new allocations, which tends to fragment the free space more over time. Mount -o ssd_spread is often faster on the less expensive SSD devices

That sounds exactly like my situation- a less expensive SSD device which is very slow when doing extensive writes to ext3/4. In addition to ssd_spread, I turned on the noatime option for the filesystem, which cuts down on writes at the expense of not recording access times for files and directories on the file system. As I’m seldom, if ever, concerned with access times, and especially so on my netbook, I lose nothing from such a change and gain (hopefully) increased performance.

Thus, my final (optimized) fstab line for the root filesystem:

/dev/sda2       /       btrfs   defaults,noatime,ssd_spread    0 1

Results

After running with the new setup for about a week and working on normal tasks with it, I can safely say that on my AA1, Btrfs with ssd_spread is significantly more responsive than ext4 ever was. Under ext4, the system would sometimes stop responding to input entirely while Firefox, for example, was hitting the disc fairly hard.

With Btrfs, I no longer have any such problem- everything remains responsive even under fairly high I/O load (such as while Firefox is downloading data from Firefox Sync, or when I’m applying updates).