so like silly question but if I have two large binary files which have a lot of common data in them, and I want to store just the base file and then any differences as efficiently as possible, is there a way I can do that?

ok bonus complication: can I do this in a way that I can transparently access the data of a snapshot as though it weren’t simply a diff?

(it looks like MAYBE one could do some turducken nonsense and make a block-level deduplicated btrfs disk image, but it’s also not automatic, or simple, at all, hmm)

@ticky patch utilities like bsdiff, xdelta, or bps should work
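For a feel of the "base file plus differences" idea, here is a rough Python sketch. It is not how bsdiff or xdelta work internally; those tools can match data that has moved to a different offset, while this toy only compares fixed-size blocks at the same position. The names `make_delta` and `apply_delta` and the 4096-byte block size are made up for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size, chosen arbitrarily for this sketch


def make_delta(base: bytes, target: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Record only the blocks of `target` that differ from `base`."""
    changed = {}
    for offset in range(0, len(target), block_size):
        block = target[offset:offset + block_size]
        # Compare against the block at the same position in the base file.
        if block != base[offset:offset + block_size]:
            changed[offset // block_size] = block
    return {"length": len(target), "block_size": block_size, "changed": changed}


def apply_delta(base: bytes, delta: dict) -> bytes:
    """Rebuild the target from the base plus the stored blocks."""
    size = delta["block_size"]
    out = bytearray()
    for offset in range(0, delta["length"], size):
        block = delta["changed"].get(offset // size)
        if block is None:
            # Unchanged block: take it straight from the base file.
            block = base[offset:offset + size]
        out += block
    return bytes(out[:delta["length"]])
```

If the second file mostly has bytes inserted or removed rather than overwritten in place, every block after the change shifts and this approach stores nearly everything, which is where the real patch utilities earn their keep.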

@ticky there are many variations on fusefs out there that may do similar things
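The "transparent access" part roughly comes down to answering reads at arbitrary offsets from the base file, except where a block has been replaced. A minimal Python sketch of that lookup, assuming the snapshot is stored as a map of block index to replacement bytes; `SnapshotView` is a hypothetical name, and an actual FUSE filesystem would need a lot more around it:

```python
class SnapshotView:
    """Read-only view of a snapshot stored as a base plus changed blocks."""

    def __init__(self, base: bytes, changed: dict, length: int, block_size: int = 4096):
        self.base = base
        self.changed = changed        # block index -> replacement bytes
        self.length = length          # total size of the snapshot
        self.block_size = block_size  # assumed fixed block size

    def read(self, offset: int, size: int) -> bytes:
        """Return up to `size` bytes starting at `offset`, as if the file were whole."""
        end = min(offset + size, self.length)
        out = bytearray()
        while offset < end:
            index, within = divmod(offset, self.block_size)
            start = index * self.block_size
            block = self.changed.get(index)
            if block is None:
                # Block was never modified, so serve it from the base file.
                block = self.base[start:start + self.block_size]
            take = min(self.block_size - within, end - offset)
            out += block[within:within + take]
            offset += take
        return bytes(out)
```

The caller never sees the diff; it just asks for a byte range, which is the same shape as the read callback a filesystem layer would have to implement.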

@dch I’m not clear in the case of these large files on whether git is going to actually store diffs here or just n versions separately?

@ticky at that point you'd need filesystem-level fuckery

@vikxin oh totally, absolutely, but I thought this was (in part) what btrfs was for, but that’ll learn me lmao

@ticky I can't speak for btrfs, but something sort of similar can be done on zfs. Though it's a little different; you take snapshots of volumes, basically, and you can refer back to them. btrfs probably has something similar but I've never used it.
