Opened 15 years ago

Closed 12 years ago

#348 closed enhancement (fixed)

Compress replication changesets

Reported by: Olly Betts
Owned by: Dan
Priority: normal
Milestone: 1.3.1
Component: Backend-Brass
Version: SVN trunk
Severity: normal
Keywords:
Cc: dcolish@…
Blocked By:
Blocking:
Operating System: All

Description

We should probably compress replication changesets with zlib as we store them to disk, and uncompress them after they've been sent over the network.

The advantages are that replication would need much less disk space and network bandwidth. The disadvantage is that a bit more CPU is needed.

My feeling is that it's probably best to always do it though (at least until we have a backend which produces less compressible blocks), as replication is likely to be I/O bound. We can always use the equivalent of gzip -1 to trade compression ratio for speed.
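To make the speed/space trade-off concrete, here is a minimal sketch using Python's standard zlib module, where level 1 corresponds to gzip -1. The helper names and the sample payload are hypothetical stand-ins for illustration; this is not Xapian code and the payload is not real changeset data.

```python
import zlib


def compress_changeset(data: bytes, level: int = 1) -> bytes:
    """Compress raw changeset bytes; level 1 favours speed over size."""
    return zlib.compress(data, level)


def decompress_changeset(data: bytes) -> bytes:
    """Restore the original changeset bytes on the receiving side."""
    return zlib.decompress(data)


# Hypothetical stand-in for block data: repetitive key/value records.
sample = b"".join(b"key%05d:value\x00" % i for i in range(2000))

fast = compress_changeset(sample, level=1)   # gzip -1 equivalent
small = compress_changeset(sample, level=9)  # gzip -9 equivalent

# Both levels round-trip to the same bytes; only size and CPU cost differ.
assert decompress_changeset(fast) == sample
assert decompress_changeset(small) == sample
print(len(sample), len(fast), len(small))
```

Comparing the printed sizes (and timing the two calls on real changesets) would show whether the lower level is good enough for an I/O-bound workload.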

Setting milestone:1.1.1 for now - I'm currently intending to mark replication as "experimental" for 1.1.0.

Attachments (2)

compressed_cc.diff (16.8 KB) - added by Dan 12 years ago.
compressed-changesets.diff (26.2 KB) - added by Dan 12 years ago.


Change History (12)

comment:1 by Olly Betts, 15 years ago

Milestone: changed from 1.1.1 to 1.1.2

comment:2 by Olly Betts, 15 years ago

Priority: changed from normal to high

It seems this shouldn't require a lot of work, and it would be simpler just to have a single format for changesets rather than having the alternative of uncompressed which gives a second code path to test and maintain. So high priority.

comment:3 by Olly Betts, 14 years ago

Milestone: changed from 1.1.3 to 1.1.4

Bumping milestone.

comment:4 by Richard Boulton, 14 years ago

I'm not fully convinced that it's sensible to always compress the changesets, though the argument about keeping the number of codepaths to a minimum seems quite compelling, and I suspect it would be an improvement in most cases. CPU doesn't seem likely to be the limiting factor when applying changesets on the clients, but I've not enough evidence to be sure about that either way.

Changesets already begin with a magic string to make the file easy to identify, followed by an integer version number (currently 1), so there would be no need to change the format on disk in a backwards incompatible manner to support both compressed and non-compressed changesets here: we'd just use version number 2 for compressed changesets. [The version numbers were partly implemented with this in mind, and partly so that we could support other schemes in future, such as storing the sequence of key-tag operations involved in a changeset, rather than the block level changes.]
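The version-number dispatch described above could look roughly like the following sketch. The magic string, the one-byte version encoding, and the function names are assumptions made up for illustration; the actual on-disk changeset header is not specified in this ticket.

```python
import zlib

# Hypothetical header layout: magic bytes, then a single version byte.
MAGIC = b"EXAMPLE-CHANGESET"
VERSION_RAW = 1   # existing uncompressed format
VERSION_ZLIB = 2  # proposed zlib-compressed format


def write_changeset(body: bytes, compress: bool) -> bytes:
    """Serialise a changeset, picking the version from the compress flag."""
    if compress:
        return MAGIC + bytes([VERSION_ZLIB]) + zlib.compress(body, 1)
    return MAGIC + bytes([VERSION_RAW]) + body


def read_changeset(blob: bytes) -> bytes:
    """Return the uncompressed body, dispatching on the version byte."""
    if not blob.startswith(MAGIC) or len(blob) <= len(MAGIC):
        raise ValueError("not a changeset")
    version = blob[len(MAGIC)]
    payload = blob[len(MAGIC) + 1:]
    if version == VERSION_RAW:
        return payload
    if version == VERSION_ZLIB:
        return zlib.decompress(payload)
    raise ValueError(f"unsupported changeset version {version}")
```

A reader written this way accepts both versions, while an older reader that only knows version 1 would reject version 2 cleanly instead of misparsing it.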

If we're willing to support a second codepath for non-compressed changesets, we can punt this to 1.2.x. We might need to leave compression off by default until 1.3.x if we did this though, so that any 1.2.x client could read changesets from any 1.2.y server.

I'd be happy to live with this, in the interests of getting 1.2 out the door.

comment:5 by Olly Betts, 14 years ago

Component: changed from Backend-Chert to Backend-Brass
Milestone: changed from 1.1.4 to 1.2.x
Priority: changed from high to normal
Type: changed from defect to enhancement

And we can just drop uncompressed support in 1.3.x for brass if compressed is always a win (or close enough to one not to worry about it).

I suspect I may have to essentially reimplement replication for brass anyway - the current implementation is sadly rather mixed up with code we can't relicense, and brass is probably not going to be completely block based.

Marking for 1.2.x for now, but maybe 1.3.x is more realistic.

comment:6 by Dan, 13 years ago

Cc: dcolish@… added

comment:7 by Dan, 13 years ago

Owner: changed from Olly Betts to Dan

comment:8 by Dan, 13 years ago

Status: changed from new to assigned

comment:9 by Olly Betts, 12 years ago

Milestone: changed from 1.2.x to 1.3.0

Dan says this is getting close, so pencilling in for 1.3.0 for now.

by Dan, 12 years ago

Attachment: compressed_cc.diff added

by Dan, 12 years ago

Attachment: compressed-changesets.diff added

comment:10 by Dan, 12 years ago

Resolution: fixed
Status: changed from assigned to closed