Opened 16 years ago
Closed 13 years ago
#348 closed enhancement (fixed)
Compress replication changesets
Reported by: | Olly Betts | Owned by: | Dan |
---|---|---|---|
Priority: | normal | Milestone: | 1.3.1 |
Component: | Backend-Brass | Version: | SVN trunk |
Severity: | normal | Keywords: | |
Cc: | dcolish@… | Blocked By: | |
Blocking: | | Operating System: | All |
Description
We should probably compress replication changesets with zlib as we store them to disk, and uncompress them after they've been sent over the network.
Advantages are that replication would need much less disk space and network bandwidth. Disadvantage is a bit more CPU needed.
My feeling is that it's probably best to always do it though (at least until we have a backend which produces less compressible blocks) as replication is likely to be I/O bound. We can always use the equivalent of gzip -1 to trade speed/space.
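As a rough illustration of the speed/space trade-off mentioned above, here is a minimal Python sketch using the standard zlib module. It is not the Xapian implementation; the function names and the sample data are made up for the example. Level 1 corresponds to the fastest setting (the equivalent of gzip -1), and level 9 to the smallest output.

```python
import zlib


def compress_changeset(raw: bytes, level: int = 1) -> bytes:
    """Compress changeset bytes before they are stored (hypothetical helper)."""
    # Level 1 favours speed over size, like `gzip -1`.
    return zlib.compress(raw, level)


def decompress_changeset(stored: bytes) -> bytes:
    """Recover the original changeset bytes on the receiving side."""
    return zlib.decompress(stored)


if __name__ == "__main__":
    # Made-up, repetitive data standing in for changed database blocks.
    raw = (b"term\x00posting-list-data" * 100) * 8

    fast = compress_changeset(raw, level=1)
    small = compress_changeset(raw, level=9)

    # Both settings must round-trip to the same bytes.
    assert decompress_changeset(fast) == raw
    assert decompress_changeset(small) == raw

    print("raw:", len(raw), "level 1:", len(fast), "level 9:", len(small))
```

How much is saved depends entirely on how compressible the real blocks are, so the sizes printed here say nothing about actual changesets.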
Setting milestone:1.1.1 for now - I'm currently intending to mark replication as "experimental" for 1.1.0.
Attachments (2)
Change History (12)
comment:1 by , 16 years ago
Milestone: | 1.1.1 → 1.1.2 |
---|
comment:2 by , 15 years ago
Priority: | normal → high |
---|
comment:4 by , 15 years ago
I'm not fully convinced that it's sensible to always compress the changesets, though the argument about keeping the number of codepaths to a minimum seems quite compelling, and I suspect it would be an improvement in most cases. CPU doesn't seem likely to be the limiting factor when applying changesets on the clients, but I've not enough evidence to be sure about that either way.
Changesets already begin with a magic string to make the file easy to identify, followed by an integer version number (currently 1), so there would be no need to change the format on disk in a backwards incompatible manner to support both compressed and non-compressed changesets here: we'd just use version number 2 for compressed changesets. [The version numbers were partly implemented with this in mind, and partly so that we could support other schemes in future, such as storing the sequence of key-tag operations involved in a changeset, rather than the block level changes.]
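To make the versioning idea concrete, here is a small Python sketch of a reader that dispatches on the version number following the magic string. The magic value, the single-byte version encoding, and the function names are all assumptions for illustration; the real changeset format may encode these differently.

```python
import zlib

# Hypothetical values: the real magic string and integer encoding may differ.
MAGIC = b"ExampleChanges"
VERSION_PLAIN = 1  # uncompressed changeset body
VERSION_ZLIB = 2   # zlib-compressed changeset body


def write_changeset(payload: bytes, compress: bool) -> bytes:
    """Build a changeset: magic string, one version byte, then the body."""
    version = VERSION_ZLIB if compress else VERSION_PLAIN
    body = zlib.compress(payload, 1) if compress else payload
    return MAGIC + bytes([version]) + body


def read_changeset(data: bytes) -> bytes:
    """Return the uncompressed body, whichever version was written."""
    if not data.startswith(MAGIC):
        raise ValueError("not a changeset")
    if len(data) <= len(MAGIC):
        raise ValueError("truncated changeset header")
    version = data[len(MAGIC)]
    body = data[len(MAGIC) + 1:]
    if version == VERSION_PLAIN:
        return body
    if version == VERSION_ZLIB:
        return zlib.decompress(body)
    raise ValueError(f"unsupported changeset version {version}")


if __name__ == "__main__":
    payload = b"block-level changes " * 50
    assert read_changeset(write_changeset(payload, compress=False)) == payload
    assert read_changeset(write_changeset(payload, compress=True)) == payload
```

A reader written this way accepts both versions, while an older reader that only knows version 1 would reject version 2 files outright rather than misinterpret them.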
If we're willing to support a second codepath for non-compressed changesets, we can punt this to 1.2.x, though we might need to leave compression off by default until 1.3.x if we did this, so that any 1.2.x client could read changesets from any 1.2.y server.
I'd be happy to live with this, in the interests of getting 1.2 out the door.
comment:5 by , 15 years ago
Component: | Backend-Chert → Backend-Brass |
---|---|
Milestone: | 1.1.4 → 1.2.x |
Priority: | high → normal |
Type: | defect → enhancement |
And we can just drop uncompressed support in 1.3.x for brass if compressed is always a win (or close enough to one not to worry about it).
I suspect I may have to essentially reimplement replication for brass anyway - the current implementation is sadly rather mixed up with code we can't relicense, and brass is probably not going to be completely block based.
Marking for 1.2.x for now, but maybe 1.3.x is more realistic.
comment:6 by , 14 years ago
Cc: | added |
---|
comment:7 by , 14 years ago
Owner: | changed from | to
---|
comment:8 by , 13 years ago
Status: | new → assigned |
---|
comment:9 by , 13 years ago
Milestone: | 1.2.x → 1.3.0 |
---|
Dan says this is getting close, so pencilling in for 1.3.0 for now.
by , 13 years ago
Attachment: | compressed_cc.diff added |
---|
by , 13 years ago
Attachment: | compressed-changesets.diff added |
---|
comment:10 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
It seems this shouldn't require a lot of work, and it would be simpler just to have a single format for changesets rather than having the alternative of uncompressed which gives a second code path to test and maintain. So high priority.