bup 0.09: git-based backup system for really huge datasets

Hi all,

bup is a file backup tool based on the git packfile format. If you’re
interested in git, you might find bup interesting because:

  • It can handle really massive datasets (hundreds of gigabytes)
    without melting down.

  • It can handle huge individual files (hundreds of gigabytes), such as
    virtual machine images or giant textual database dumps, while neither
    wasting disk space nor bogging down in xdelta.

  • It can back up files directly to a remote server, without creating
    git objects on the local system first.

  • It uses a different format for its index file (.bup/bupindex) that
    allows you to search and iterate non-linearly. Thus if you have a
    filesystem with a million files and only one of them is marked dirty,
    bup can back it up near-instantly.

  • Like git, it separates the concept of indexing the filesystem from
    the concept of actually making new commits. Thus it would be easy to
    plug in an inotify-like system eventually, avoiding the slow filesystem
    iteration every time you want to make a backup.

  • It introduces a “multi-index” file (midx) that has a sorted list of
    the objects from multiple .pack files, so that checking for a
    nonexistent object only needs to swap in two pages at most. (This is
    unimportant in git, but critical when most of your work is ingesting
    huge files whose sha1sums haven’t been seen before.)

  • It provides a FUSE-based filesystem so that you can easily browse
    your backup history, including exporting it via samba if you want.
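
To make the multi-index idea from the list above more concrete, here is a minimal Python sketch. It is not bup's actual midx code or on-disk format; the function names and the in-memory lists are assumptions made purely for illustration. It only shows the principle: merge the sorted object ids of several packs into one sorted list, so that a "have we seen this object?" check is a single binary search rather than one search per pack.

```python
import bisect
import hashlib
import heapq


def build_multi_index(pack_indexes):
    """Merge several already-sorted lists of object ids into one sorted list.

    Hypothetical helper: a real midx is a binary file, not a Python list.
    """
    merged = []
    for oid in heapq.merge(*pack_indexes):
        if not merged or merged[-1] != oid:  # drop ids present in several packs
            merged.append(oid)
    return merged


def contains(multi_index, oid):
    """One binary search over the merged list instead of one search per pack."""
    pos = bisect.bisect_left(multi_index, oid)
    return pos < len(multi_index) and multi_index[pos] == oid


def sha1(data):
    """Return the 20-byte SHA-1 digest used here as an object id."""
    return hashlib.sha1(data).digest()


# Three made-up packs, each with its own sorted index of object ids.
packs = [
    sorted(sha1(b"pack%d-object%d" % (p, i)) for i in range(1000))
    for p in range(3)
]
midx = build_multi_index(packs)

print(contains(midx, packs[1][42]))            # an id stored in the second pack
print(contains(midx, sha1(b"never stored")))   # an id stored in no pack
```

The sketch keeps everything in memory and does not model paging, so it does not reproduce the "two pages at most" property; it only illustrates why one merged, sorted list makes the nonexistent-object check cheap when most incoming objects are new.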

bup doesn’t yet back up extra file metadata (beyond what git already
tracks). Obviously this will be needed relatively soon.

bup is still pretty experimental, but it’s already a useful tool for
backing up your files, even if that means millions of files and
hundreds of gigs of VM images.

You can find the source code (and README) on GitHub:

http://github.com/apenwarr/bup

To subscribe to the bup mailing list, send an email to:

bup-list+subscribe@googlegroups.com

Looking forward to everyone’s feedback.

Have fun,

Avery
