
separable repository format #370

@jonaballe


i originally intended to ask this on the mailing list, but i can't post or subscribe, so i'm just going to go ahead and post it here as a feature request.

the following paragraphs are part of what i intended to post to the mailing list (skip below them if you don't care about the motivation):


hello!

i'm in need of a new backup solution, and i've been playing with zbackup for a while. i like that it delegates in a true unix fashion.

what's nice about this is that zbackup does not implement transfer of the archive off-site itself. rather, its repository format cleanly separates data chunks from indexes. data chunks are never modified, only new ones added, and they are only read for restoring an archive. so you can use whatever you want to transfer the data chunks, and then just delete them. only the index files remain on the system, so new backups can still be created efficiently, even if the data chunks are now missing from the original system.
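to make the idea concrete, here is a minimal sketch of why the index alone is enough to deduplicate new backups. this is my own illustration, not zbackup's code or on-disk format: the fixed-size chunks, the sha256 ids, and the names `backup`, `index` and `chunk_store` are all assumptions.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only


def backup(data, index, chunk_store):
    """Store the chunks of `data` that `index` does not list yet.

    Returns the ordered list of chunk ids needed to restore `data`.
    """
    recipe = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in index:  # only the index is consulted, never old chunks
            chunk_store[chunk_id] = chunk
            index.add(chunk_id)
        recipe.append(chunk_id)
    return recipe


index = set()      # stays on the local system
chunk_store = {}   # hypothetical stand-in for the local chunk files

first = backup(b"aaaabbbbcccc", index, chunk_store)
offsite = dict(chunk_store)  # stand-in for transferring the chunks off-site
chunk_store.clear()          # delete the local chunks, keep the index

second = backup(b"aaaabbbbdddd", index, chunk_store)
# only the chunk for b"dddd" had to be written by the second run
```

the second run never touches the deleted chunks. it only looks ids up in the index, which is why the chunks can live elsewhere until a restore is needed.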

this is nice because in my setup, it makes a lot of sense to create a backup archive at one time and transfer it off-site with a slight delay. also, i do not have full control over the off-site storage (i can't install arbitrary software). so being able to use whatever software/script i want to transfer the data is a huge benefit.

the downside is that zbackup doesn't even read the input files itself. rather, it takes an input stream (from tar, or any other archive program) and deduplicates it. this is where it gets inefficient: even though the stored data is deduplicated, every backup run generates a lot of disk i/o, because tar still has to read each file in full.
....


these are the two features i'm curious about:

  • since attic reads the input files itself, it appears that it could be smarter than zbackup about deduplication (if a file's metadata indicates it hasn't changed, do not even attempt to read it, like in rsync).

it seems that this is the case, at least judging from some experiments i did. can you confirm that?

  • is attic's repository structure similar to zbackup's, i.e., can i create an attic repository locally, synchronize all its files off-site by my own means, and then safely delete the data chunks without affecting the effectiveness of future backups?

this seems not to be the case. how easy would it be to implement this?
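the metadata check i have in mind in the first question can be sketched like this. it is only an illustration of the rsync-style idea, not attic's actual implementation: the function `needs_read`, the in-memory `cache` dict, and the choice of size plus mtime as the signature are assumptions.

```python
import os
import tempfile


def needs_read(path, cache):
    """Return True if the file's size or mtime differs from the cached values.

    `cache` maps a path to the (size, mtime_ns) pair seen on the previous run.
    """
    st = os.stat(path)
    signature = (st.st_size, st.st_mtime_ns)
    if cache.get(path) == signature:
        return False  # metadata unchanged: skip reading the file contents
    cache[path] = signature
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("hello")

    cache = {}
    first_run = needs_read(path, cache)   # file never seen before
    second_run = needs_read(path, cache)  # metadata unchanged since last call

    with open(path, "a") as f:
        f.write(" world")                 # the size changes
    third_run = needs_read(path, cache)
```

a file whose contents change without changing size or mtime would be missed by a check like this, which is the usual trade-off of skipping the read.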

thanks,
johannes
