Opened 14 years ago

Closed 14 years ago

#427 closed defect (fixed)

xapian-compact results in corrupt postlist table (test data included)

Reported by: Henry
Owned by: Olly Betts
Priority: normal
Milestone: 1.1.4
Component: Backend-Chert
Version: SVN trunk
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: Linux

Description

Either xapian-compact is corrupting the data, or the data is corrupt to begin with (even though xapian-check reports source indexes are ok).

How to reproduce:

Extract tgz file (creates two folders: index1 & temp.index1).

Check sources:

# xapian-check-1.1 index1/

# xapian-check-1.1 temp.index1/

(both should test OK).

# mkdir dst

Test1 - compacts OK:

# xapian-compact-1.1 temp.index1/ dst

# xapian-check-1.1 dst

(should check out OK).

Test2 - compact reports OK, but check fails:

# rm dst/*

# xapian-compact-1.1 index1/ dst

# xapian-check-1.1 dst

(reports errors in postlist).

Needless to say, compacting both index1 and temp.index1 into dst will compact OK, but the check will fail.

Attachments (3)

testidxfolders.tgz (37.3 KB ) - added by Henry 14 years ago.
xapian-compact-fix1.patch (1.4 KB ) - added by Richard Boulton 14 years ago.
Partial fix
compacttest.patch (1.4 KB ) - added by Richard Boulton 14 years ago.
Patch to testsuite to exhibit this bug


Change History (11)

by Henry, 14 years ago

Attachment: testidxfolders.tgz added

comment:1 by Henry, 14 years ago

Let me know if you need the individual source indexes (100 of them, split into two). The ones I used for the two test indexes were individually checked with xapian-check as well.

comment:2 by Henry, 14 years ago

Severity: normal → major

Further tests confirm the following:

# merges ok, xapian-check-1.1 ok; composite_index can be searched on:
xapian-compact-1.1 src1 src2 src3 srcN... composite_index

# merges ok, xapian-check-1.1 fails; big_composite_index cannot be searched on:
xapian-compact-1.1 composite_index1 composite_indexN... big_composite_index

help!

comment:3 by Olly Betts, 14 years ago

Component: Other → Backend-Chert
Milestone: 1.1.4
Severity: major → normal
Status: new → assigned

I found a more minimal example - the spelling and position tables aren't relevant, and neither is temp.index1. This reproduces the issue:

$ rm -rf index1/spelling.*
$ rm -rf index1/position.*
$ (rm -rf dst;../bin/xapian-compact index1 dst && ../bin/xapian-check dst) 2>&1|more

Running delve on index1 reports 611 distinct terms, while for dst it reports 710, so it appears there's a bug in xapian-compact here as that statistic shouldn't be changed by compaction. The alternative seems to be that index1 is malformed in a way which xapian-check doesn't detect.

This appears to be trunk-only (the chert format has changed incompatibly since 1.1.3) so lowering the priority, and marking for 1.1.4.

comment:4 by Richard Boulton, 14 years ago

Thanks for the details - I've reproduced the error following the steps in your last comment. xapian-check passes on index1, but fails on the output of xapian-compact, so I'm pretty sure the error is in the trunk version of xapian-compact.

by Richard Boulton, 14 years ago

Attachment: xapian-compact-fix1.patch added

Partial fix

comment:5 by Richard Boulton, 14 years ago

The patch I've just applied partially fixes this: the databases produced with this fix appear to be nearly valid, but return a value of 0 for get_lastdocid() - I think there's a secondary problem causing this which will need a different fix.

My patch addresses the following issue: there was an off-by-one error in the truncation of the key of follow-on chunks of postlists in PostlistCursor which was meant to make the key into the equivalent key for an initial chunk. My fix needs a testcase (which will in turn need a database in which the postlist for a term is split into more than one chunk), but seems to help, and makes sense to me. The problem is hidden in flint because all keys have a trailing '\0' byte, so the key for the first chunk in a postlist had the '\0' byte trailing when returned from PostlistCursor, but still matched the key for the next chunk. For Chert and Brass, the first chunk's key doesn't have a trailing '\0', so didn't match the following keys after they had been (insufficiently) truncated.
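The masking effect described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Xapian's actual code or key encoding: the helper names, the 4-byte big-endian docid suffix, and the assumption that terms contain no '\0' bytes are all invented for illustration. It shows why a truncation that leaves the separator byte in place still matches the first chunk's key when that key carries a trailing '\0', and stops matching when it does not.

```python
NUL = b"\x00"

def initial_key(term: bytes, trailing_nul: bool) -> bytes:
    """Key of the first chunk of a term's postlist (hypothetical model).

    trailing_nul=True models a flint-style key that ends in '\\0';
    trailing_nul=False models a chert/brass-style key that does not.
    """
    return term + NUL if trailing_nul else term

def followon_key(term: bytes, first_docid: int) -> bytes:
    """Key of a follow-on chunk: term, '\\0' separator, then a docid.

    The 4-byte big-endian docid is an assumption made for this sketch.
    """
    return term + NUL + first_docid.to_bytes(4, "big")

def truncate_keep_separator(key: bytes) -> bytes:
    """Insufficient truncation: cuts after the separator (one byte too long)."""
    return key[: key.index(NUL) + 1]

def truncate_drop_separator(key: bytes) -> bytes:
    """Truncation that also removes the separator byte."""
    return key[: key.index(NUL)]

term = b"apple"
chunk2 = followon_key(term, 1000)

# With a trailing '\0' on the first chunk's key, the off-by-one is invisible.
flint_like_match = truncate_keep_separator(chunk2) == initial_key(term, True)

# Without the trailing '\0', the same truncation no longer matches ...
chert_like_match = truncate_keep_separator(chunk2) == initial_key(term, False)

# ... and dropping the separator as well restores the match.
chert_like_fixed = truncate_drop_separator(chunk2) == initial_key(term, False)

print(flint_like_match, chert_like_match, chert_like_fixed)
```

In this model, a mismatch between the truncated follow-on key and the first chunk's key means the follow-on chunk would be treated as belonging to a different term, which would be consistent with the inflated distinct-term count reported in comment:3.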

by Richard Boulton, 14 years ago

Attachment: compacttest.patch added

Patch to testsuite to exhibit this bug

comment:6 by Richard Boulton, 14 years ago

Just written a testcase which exhibits this problem. It currently fails for chert and brass, but passes for flint.

comment:7 by Richard Boulton, 14 years ago

Fixed in trunk in revisions r13857, r13858 and r13859, with a test in r13860 (sorry for the rather unclean patch history). The off-by-one fix in r13857 must not be applied to the keys for doclen chunks, since they don't use a \0 byte separator.

comment:8 by Richard Boulton, 14 years ago

Resolution: fixed
Status: assigned → closed

I've just gone back and rechecked with the original test files supplied, and these changes now produce a database which passes xapian-check happily. The problem only existed in chert and brass which are not in the 1.0 branch, and only in trunk (not in the 1.1.3 release), so marking this as closed: no need to backport.
