Opened 19 years ago

Last modified 5 months ago

#59 assigned enhancement

Compress postlist changes buffered in memory

Reported by: Olly Betts Owned by: Olly Betts
Priority: low Milestone: 2.0.0
Component: Backend-Glass Version: SVN trunk
Severity: minor Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description (last modified by Olly Betts)

If we could somehow reduce the memory used by the postlist changes which chert buffers, we could buffer more and/or let the OS have more spare memory for buffering disk blocks. That should allow indexing to run faster. However, we need to compress in such a way that we can still implement Xapian::Database methods including the effects of the buffered changes.

Attachments (1)

xapian-faster-flint-add-document.patch (5.7 KB) - added by Olly Betts 8 years ago.
Externally hosted patch referred to in an earlier comment


Change History (23)

comment:1 by Olly Betts, 19 years ago

Status: new → assigned

comment:2 by Olly Betts, 19 years ago

Summary: Compress quartz postlist changes buffered in memory → Compress flint postlist changes buffered in memory

This is really a candidate for flint, not quartz now.

We *can* fairly easily compress in the common (and speed critical) case of appending documents by simply storing a sorted list of entries for each term (probably in the same format we use on disk) as we're always appending to it.

If we keep the existing map, we can handle the pure modification case, and also a mixture of modifications and updates.
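A minimal sketch of that idea, assuming a per-term buffer which keeps appended entries as delta-encoded varints and falls back to a map for anything out of order. The names `TermChanges`, `append_varint` and `read_varint` are hypothetical, the encoding is only illustrative (it is not the actual on-disk format), and deletions are not modelled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using docid = std::uint32_t;
using termcount = std::uint32_t;

// Append v to out as a base-128 varint (illustrative encoding only).
inline void append_varint(std::string& out, std::uint32_t v) {
    while (v >= 0x80) {
        out += static_cast<char>((v & 0x7f) | 0x80);
        v >>= 7;
    }
    out += static_cast<char>(v);
}

// Decode a varint at p and advance p (input must be well formed).
inline std::uint32_t read_varint(const char*& p) {
    std::uint32_t v = 0;
    int shift = 0;
    for (;;) {
        unsigned char ch = static_cast<unsigned char>(*p++);
        v |= static_cast<std::uint32_t>(ch & 0x7f) << shift;
        if (!(ch & 0x80)) return v;
        shift += 7;
    }
}

// Hypothetical buffer of pending postlist changes for one term.
class TermChanges {
    std::string appended;                 // (docid gap - 1, wdf) pairs
    docid last_did = 0;                   // highest appended docid, 0 = none
    std::map<docid, termcount> modified;  // out-of-order changes

  public:
    void add(docid did, termcount wdf) {
        assert(did != 0);
        if (did > last_did) {
            // Common case: the docid is higher than any seen so far.
            append_varint(appended, did - last_did - 1);
            append_varint(appended, wdf);
            last_did = did;
        } else {
            // Modification or insertion: keep using the map.
            modified[did] = wdf;
        }
    }

    // Merged view in docid order; map entries override appended ones.
    std::vector<std::pair<docid, termcount>> entries() const {
        std::vector<std::pair<docid, termcount>> result;
        auto m = modified.begin();
        const char* p = appended.data();
        const char* end = p + appended.size();
        docid did = 0;
        while (p != end) {
            did += read_varint(p) + 1;
            termcount wdf = read_varint(p);
            for (; m != modified.end() && m->first < did; ++m)
                result.emplace_back(m->first, m->second);
            if (m != modified.end() && m->first == did) {
                wdf = m->second;
                ++m;
            }
            result.emplace_back(did, wdf);
        }
        for (; m != modified.end(); ++m)
            result.emplace_back(m->first, m->second);
        return result;
    }

    std::size_t compressed_bytes() const { return appended.size(); }
};
```

In this sketch a pure-append workload never touches the map, so each buffered entry typically costs a couple of bytes rather than a map node.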

comment:3 by Olly Betts, 19 years ago

Component: other → Backend-Flint
op_sys: other → All
rep_platform: Other → All
Severity: normal → enhancement
Version: other → CVS HEAD

comment:4 by Olly Betts, 17 years ago

Priority: high → normal

comment:5 by Olly Betts, 17 years ago

This is a patch which adds a quick hack implementation:

http://www.oligarchy.co.uk/xapian/patches/xapian-faster-flint-add-document.patch

It probably won't apply cleanly to SVN HEAD, it disables replace_document() and delete_document(), and it could store the appended changes in a much more compact way, but even this crude approach was measurably faster (I don't recall the exact figures, but something like 10-15% I think).

comment:6 by Olly Betts, 17 years ago

Blocking: 120 added
Operating System: All

This doesn't require API or ABI changes, so can go in 1.0.x.

comment:8 by Richard Boulton, 16 years ago

Description: modified (diff)
Milestone: 1.1

comment:9 by Richard Boulton, 16 years ago

Blocking: 120 removed

(In #120) Remove the unfixed dependencies so we can close this bug - they're all marked for the 1.1.0 milestone.

comment:10 by Olly Betts, 16 years ago

Component: Backend-Flint → Backend-Chert
Description: modified (diff)
Summary: Compress flint postlist changes buffered in memory → Compress chert postlist changes buffered in memory

Update to refer to chert rather than flint or quartz.

comment:11 by Olly Betts, 15 years ago

Milestone: 1.1.0 → 1.1.1

API and ABI compatible so bumping to 1.1.1.

comment:12 by Olly Betts, 15 years ago

Milestone: 1.1.1 → 1.1.7

Triaging milestone:1.1.1 bugs.

comment:13 by Olly Betts, 15 years ago

Should require API or ABI changes, but it would be really good to have this done. Leaving as "minor" for now.

comment:14 by Olly Betts, 15 years ago

Priority: normal → low

Should *not* require I mean. Set priority.

comment:15 by Olly Betts, 15 years ago

Milestone: 1.1.7 → 1.2.0

Shouldn't need API or ABI changes (comment:13 is missing a "not") so bumping to stay on track for 1.2.0.

comment:16 by Olly Betts, 14 years ago

Component: Backend-Chert → Backend-Brass

Mark for brass (though once working we could consider backporting).

comment:17 by Olly Betts, 14 years ago

Summary: Compress chert postlist changes buffered in memory → Compress postlist changes buffered in memory

Brass already uses less memory than chert does, but there's scope for further improvement.

In some cases, we can just flush the data we want to read, then read it from disk as we would if it hadn't been modified at all. Brass allows flushing changes for a single term, which makes this more feasible.
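As a rough illustration of the flush-then-read approach, here is a toy model, not the Brass implementation. `ToyPostlistStore` and all of its members are hypothetical, and two in-memory maps stand in for the on-disk table and the change buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using docid = std::uint32_t;

// Toy stand-in for a backend; it does not mirror real Xapian classes.
class ToyPostlistStore {
    std::map<std::string, std::vector<docid>> on_disk;   // flushed data
    std::map<std::string, std::vector<docid>> buffered;  // pending appends

  public:
    // Buffer an append; assumes docids for a term arrive in increasing order.
    void append(const std::string& term, docid did) {
        std::vector<docid>& v = buffered[term];
        assert(v.empty() || did > v.back());
        v.push_back(did);
    }

    // Flush the pending changes for just one term.
    void flush_term(const std::string& term) {
        auto it = buffered.find(term);
        if (it == buffered.end()) return;
        std::vector<docid>& dst = on_disk[term];
        dst.insert(dst.end(), it->second.begin(), it->second.end());
        buffered.erase(it);
    }

    // Read path: flush this term first, then read as if nothing was buffered.
    const std::vector<docid>& read(const std::string& term) {
        flush_term(term);
        return on_disk[term];
    }

    std::size_t buffered_terms() const { return buffered.size(); }
};
```

The point of the sketch is that the read path needs no code for merging buffered and stored data, at the price of an early write for the term being read.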

comment:18 by Olly Betts, 11 years ago

Milestone: 1.2.x → 1.3.x

This isn't 1.2.x material now.

comment:19 by Olly Betts, 10 years ago

Milestone: 1.3.x → 1.3.3

comment:20 by Olly Betts, 9 years ago

Component: Backend-Brass → Backend-Glass

comment:21 by Olly Betts, 9 years ago

Milestone: 1.3.3 → 1.3.4

by Olly Betts, 8 years ago

Externally hosted patch referred to in an earlier comment

comment:22 by Olly Betts, 8 years ago

Milestone: 1.3.4 → 1.4.x

An updated version of that patch, enhanced so that appended changes are stored as in the patch while inserted changes are stored as now, would probably be a nice gain for the effort. But this doesn't affect the ABI, so not a 1.4.0 blocker.

comment:23 by Olly Betts, 5 months ago

Milestone: 1.4.x → 2.0.0

Postponing. Could potentially be backported to a stable release as it shouldn't need incompatible ABI changes.
