Opened 20 years ago
Last modified 12 months ago
#59 assigned enhancement
Compress postlist changes buffered in memory
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | low | Milestone: | 2.0.0 |
Component: | Backend-Glass | Version: | SVN trunk |
Severity: | minor | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description (last modified by )
If we could somehow reduce the memory chert uses to buffer postlist changes, we could buffer more changes and/or leave the OS more spare memory for caching disk blocks. That should allow indexing to run faster. However, we need to compress in such a way that we can still implement the Xapian::Database methods so that they include the effects of the buffered changes.
Attachments (1)
Change History (23)
comment:1 by , 20 years ago
Status: | new → assigned |
---|
comment:2 by , 19 years ago
Summary: | Compress quartz postlist changes buffered in memory → Compress flint postlist changes buffered in memory |
---|
comment:3 by , 19 years ago
Component: | other → Backend-Flint |
---|---|
op_sys: | other → All |
rep_platform: | Other → All |
Severity: | normal → enhancement |
Version: | other → CVS HEAD |
comment:4 by , 18 years ago
Priority: | high → normal |
---|
comment:5 by , 17 years ago
This is a patch which adds a quick hack implementation:
http://www.oligarchy.co.uk/xapian/patches/xapian-faster-flint-add-document.patch
It probably won't apply cleanly to SVN HEAD, and it disables replace_document() and delete_document(). It could also store the appended changes in a much more compact way, but even this crude approach was measurably faster (I don't recall the exact figures, but something like 10-15% I think).
comment:6 by , 17 years ago
Blocking: | 120 added |
---|---|
Operating System: | → All |
This doesn't require API or ABI changes, so can go in 1.0.x.
comment:8 by , 17 years ago
Description: | modified (diff) |
---|---|
Milestone: | → 1.1 |
comment:9 by , 17 years ago
Blocking: | 120 removed |
---|
(In #120) Remove the unfixed dependencies so we can close this bug - they're all marked for the 1.1.0 milestone.
comment:10 by , 16 years ago
Component: | Backend-Flint → Backend-Chert |
---|---|
Description: | modified (diff) |
Summary: | Compress flint postlist changes buffered in memory → Compress chert postlist changes buffered in memory |
Update to refer to chert rather than flint or quartz.
comment:13 by , 15 years ago
Should require API or ABI changes, but it would be really good to have this done. Leaving as "minor" for now.
comment:15 by , 15 years ago
Milestone: | 1.1.7 → 1.2.0 |
---|
Shouldn't need API or ABI changes (comment:13 is missing a "not") so bumping to stay on track for 1.2.0.
comment:16 by , 15 years ago
Component: | Backend-Chert → Backend-Brass |
---|
Mark for brass (though once working we could consider backporting).
comment:17 by , 15 years ago
Summary: | Compress chert postlist changes buffered in memory → Compress postlist changes buffered in memory |
---|
Brass already uses less memory than chert does, but there's scope for further improvement.
In some cases, we can just flush the data we want to read, then read it from disk as we would if it hadn't been modified at all. Brass allows flushing changes for a single term, which makes this more feasible.
comment:19 by , 11 years ago
Milestone: | 1.3.x → 1.3.3 |
---|
comment:20 by , 10 years ago
Component: | Backend-Brass → Backend-Glass |
---|
comment:21 by , 10 years ago
Milestone: | 1.3.3 → 1.3.4 |
---|
by , 9 years ago
Attachment: | xapian-faster-flint-add-document.patch added |
---|
Externally hosted patch referred to in an earlier comment
comment:22 by , 9 years ago
Milestone: | 1.3.4 → 1.4.x |
---|
An updated version of that patch, enhanced so that appended changes are stored as in the patch while inserted changes are stored as now, would probably be a nice gain for the effort. But this doesn't affect the ABI, so not a 1.4.0 blocker.
comment:23 by , 12 months ago
Milestone: | 1.4.x → 2.0.0 |
---|
Postponing. Could potentially be backported to a stable release as it shouldn't need incompatible ABI changes.
This is really a candidate for flint, not quartz now.
We *can* fairly easily compress in the common (and speed-critical) case of appending documents by simply storing a sorted list of entries for each term (probably in the same format we use on disk), as we're always appending to it.
If we keep the existing map, we can handle the pure modification case, and also a mixture of modifications and updates.
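To make the idea concrete, here is a rough sketch of what a per-term buffer along these lines might look like. It is not taken from the patch or from any Xapian backend: the class name `BufferedPostings`, the helpers `pack_uint()` and `unpack_uint()`, and the byte layout (docid delta then wdf, each as a base-128 variable-length integer) are all assumptions made for illustration, and the real on-disk format may well differ. Appends go into the packed string; anything out of order falls back to a map, as now. Deletions are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types and names throughout; this is not Xapian's actual code.
using docid = std::uint32_t;
using termcount = std::uint32_t;

// Append v as a base-128 variable-length integer, low 7 bits first.
static void pack_uint(std::string& out, std::uint32_t v) {
    while (v >= 0x80) {
        out += static_cast<char>((v & 0x7f) | 0x80);
        v >>= 7;
    }
    out += static_cast<char>(v);
}

// Decode one variable-length integer at pos, advancing pos past it.
static std::uint32_t unpack_uint(const std::string& in, std::size_t& pos) {
    assert(pos < in.size());
    std::uint32_t v = 0;
    int shift = 0;
    while (true) {
        unsigned char byte = static_cast<unsigned char>(in[pos++]);
        v |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
        if (!(byte & 0x80)) break;
        shift += 7;
    }
    return v;
}

// Buffered changes for a single term.
class BufferedPostings {
    std::string packed;                   // (docid delta, wdf) pairs for appends
    docid last_did = 0;                   // assumes docids start at 1
    std::map<docid, termcount> modified;  // out-of-order changes, kept in a map

  public:
    void add(docid did, termcount wdf) {
        if (did > last_did) {
            // Common case: appending, so only the gap needs storing.
            pack_uint(packed, did - last_did);
            pack_uint(packed, wdf);
            last_did = did;
        } else {
            // Modification or insertion before the end of the list.
            modified[did] = wdf;
        }
    }

    std::size_t packed_size() const { return packed.size(); }

    // Decode the appended entries and overlay the map, giving the
    // combined view a reader of the buffered changes would need.
    std::vector<std::pair<docid, termcount>> entries() const {
        std::map<docid, termcount> merged;
        std::size_t pos = 0;
        docid did = 0;
        while (pos < packed.size()) {
            did += unpack_uint(packed, pos);
            termcount wdf = unpack_uint(packed, pos);
            merged[did] = wdf;
        }
        for (const auto& kv : modified) merged[kv.first] = kv.second;
        return std::vector<std::pair<docid, termcount>>(merged.begin(),
                                                        merged.end());
    }
};
```

With this layout, appending docids 1, 5 and 300 costs a handful of bytes in total rather than one map node per entry, while a later change to docid 5 or an insertion at docid 3 still works via the map. A real implementation would presumably iterate the two structures in step rather than building a merged map on every read.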