Opened 14 years ago

Closed 14 years ago

#455 closed defect (fixed)

"Value in posting list too large" error with 1.1.4 (chert and brass, not flint)

Reported by: tack Owned by: Olly Betts
Priority: normal Milestone: 1.0.19
Component: Backend-Chert Version: 1.1.4
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

I am able to trigger "Value in posting list too large" in Enquire.get_mset() with 1.1.4 using chert and brass (but not flint) on certain document sets (which I unfortunately cannot share) using the following general approach:

  1. Set XAPIAN_FLUSH_THRESHOLD=1000
  2. Open a (new) database for write
  3. Index a few thousand documents
  4. Periodically query the database

If I set XAPIAN_FLUSH_THRESHOLD to a value greater than the number of documents indexed (nor do I ever explicitly flush), the problem goes away.

The query performed in step 4 depends on the contents of the document being indexed, so it varies.

xapian-check indicates the database is consistent so it doesn't appear to be an on-disk corruption problem.

After setting a breakpoint at backends/chert/chert_postlist.cc:153, here is a backtrace:

#0  report_read_error (position=0xa7b8175 "�\001\a�") at backends/chert/chert_postlist.cc:153
#1  0xb7691677 in ChertPostList::move_to_chunk_containing (this=0xa0f37d8, desired_did=<value optimized out>)
    at backends/chert/chert_postlist.cc:876
#2  0xb76916ff in ChertPostList::jump_to (this=0xa0f37d8, desired_did=475) at backends/chert/chert_postlist.cc:971
#3  0xb76918ea in ChertPostListTable::get_doclength (this=0x8c2ca98, did=475, db=@0xbf901fa8)
    at backends/chert/chert_postlist.cc:68
#4  0xb7678515 in ChertDatabase::get_doclength (this=0x8c2ca70, did=475) at backends/chert/chert_database.cc:797
#5  0xb7678f4f in ChertWritableDatabase::get_doclength (this=0x8c2ca70, did=475)
    at backends/chert/chert_database.cc:1501
#6  0xb768ed2b in ChertPostList::get_doclength (this=0xa7166c0) at backends/chert/chert_postlist.cc:726
#7  0xb768cdb8 in ChertModifiedPostList::get_doclength (this=0xa7166c0)
    at backends/chert/chert_modifiedpostlist.cc:68
#8  0xb75ecb6e in LeafPostList::get_weight (this=0xa7166c0) at api/leafpostlist.cc:77
#9  0xb772fdfb in MultiAndPostList::get_weight (this=0xa7b8188) at matcher/multiandpostlist.cc:165
#10 0xb772af34 in SelectPostList::get_weight (this=0xa7b8270) at matcher/selectpostlist.h:52
#11 0xb7732da8 in MultiMatch::get_mset (this=0xbf9025dc, first=0, maxitems=5, check_at_least=5, mset=@0xbf902730, 
    stats=@0xbf902644, mdecider=0x0, matchspy_legacy=0x0, sorter=0x0) at matcher/multimatch.cc:574
#12 0xb7600bba in Xapian::Enquire::Internal::get_mset (this=0xa7004c0, first=0, maxitems=5, check_at_least=0, 
    rset=0x0, mdecider=0x0, matchspy_legacy=0x0) at api/omenquire.cc:670
#13 0xb76010f9 in Xapian::Enquire::get_mset (this=0xa3fafe8, first=0, maxitems=5, check_at_least=0, rset=0x0, 
    mdecider=0x0) at api/omenquire.cc:1034
#14 0xb7818257 in _wrap_Enquire_get_mset__SWIG_4 (nobjs=3, swig_obj=0xbf902798) at modern/xapian_wrap.cc:22653
#15 0xb7819719 in _wrap_Enquire_get_mset (self=0x0, args=0x8e4cc34) at modern/xapian_wrap.cc:22798
#16 0x0805cb97 in PyObject_Call ()
[...]

(All frames over 16 are Python as I am using the Python bindings, so I've omitted them for brevity.)

Attachments (1)

bug-455-fix.patch (4.6 KB ) - added by Olly Betts 14 years ago.
candidate fix

Download all attachments as: .zip

Change History (11)

comment:1 by tack, 14 years ago

The (Python) code that generates the query from step 4 looks like:

filtered = [re.sub(r"[^\w_']+", '', word) for word in phrase]
qs = [xapian.Query(word, 1, pos+1) for pos, word in enumerate(filtered) if word]
query = xapian.Query(xapian.Query.OP_PHRASE, qs) 

enquire = xapian.Enquire(xapdb)
enquire.set_query(query)
enquire.set_collapse_key(1)
enquire.set_sort_by_relevance()
matches = enquire.get_mset(0, 5)

If I comment out enquire.set_collapse_key(1), the problem goes away.

Value 1 exists in each document added to the database and is an integer serialized with xapian.sortable_serialise().

comment:2 by tack, 14 years ago

The backtrace originally reported isn't quite right. I was using 1.1.4 built with -O2, whereas I thought I was using r14141 with -O0 -g. Correct backtrace follows:

#0  report_read_error (position=0x9f3c485 "�\001\005�") at backends/chert/chert_postlist.cc:153
#1  0xb7638234 in ChertPostList::move_to_chunk_containing (this=0x9aa7d80, desired_did=475)
    at backends/chert/chert_postlist.cc:876
#2  0xb7638325 in ChertPostList::jump_to (this=0x9aa7d80, desired_did=475) at backends/chert/chert_postlist.cc:971
#3  0xb7638cec in ChertPostListTable::get_doclength (this=0x8b76098, did=475, db=@0xbfeeba98)
    at backends/chert/chert_postlist.cc:68
#4  0xb761e801 in ChertDatabase::get_doclength (this=0x8b76070, did=475) at backends/chert/chert_database.cc:797
#5  0xb761eb41 in ChertWritableDatabase::get_doclength (this=0x8b76070, did=475)
    at backends/chert/chert_database.cc:1483
#6  0xb7633722 in ChertPostList::get_doclength (this=0x9f12a10) at backends/chert/chert_postlist.cc:726
#7  0xb7632474 in ChertModifiedPostList::get_doclength (this=0x9f12a10)
    at backends/chert/chert_modifiedpostlist.cc:68
#8  0xb758a48e in LeafPostList::get_weight (this=0x9f12a10) at api/leafpostlist.cc:77
#9  0xb76db5bc in MultiAndPostList::get_weight (this=0x9f3c498) at matcher/multiandpostlist.cc:165
#10 0xb76d5458 in SelectPostList::get_weight (this=0x9f3c580) at matcher/selectpostlist.h:52
#11 0xb76dd161 in MultiMatch::get_mset (this=0xbfeec068, first=0, maxitems=5, check_at_least=5, mset=@0xbfeec1d0, 
    stats=@0xbfeec0d0, mdecider=0x0, matchspy_legacy=0x0, sorter=0x0) at matcher/multimatch.cc:574
#12 0xb75a2e0f in Xapian::Enquire::Internal::get_mset (this=0x9e4dd60, first=0, maxitems=5, check_at_least=0, 
    rset=0x0, mdecider=0x0, matchspy_legacy=0x0) at api/omenquire.cc:670
#13 0xb75a2f20 in Xapian::Enquire::get_mset (this=0x8bec350, first=0, maxitems=5, check_at_least=0, rset=0x0, 
    mdecider=0x0) at api/omenquire.cc:1034
#14 0xb77f1257 in _wrap_Enquire_get_mset__SWIG_4 (nobjs=3, swig_obj=0xbfeec238) at modern/xapian_wrap.cc:22653
#15 0xb77f2719 in _wrap_Enquire_get_mset (self=0x0, args=0x8ac25a4) at modern/xapian_wrap.cc:22798
#16 0x0805cb97 in PyObject_Call ()
[...]

comment:3 by tack, 14 years ago

Trying to reduce my program to generate a testcase and now Enquire.get_mset() is segfaulting:

#0  0xb75d8a8b in getint2 (p=0x2000200 <Address 0x2000200 out of bounds>, c=9) at ./common/unaligned.h:50
#1  0xb763abb8 in ChertTable::find_in_block (p=0x2000200 <Address 0x2000200 out of bounds>, key=
      {p = 0x8aac772 "\a"}, leaf=false, c=105) at backends/chert/chert_table.cc:475
#2  0xb7641bea in ChertTable::find (this=0x8aabee8, C_=0x8b88148) at backends/chert/chert_table.cc:509
#3  0xb76168d0 in ChertCursor::find_entry (this=0x8b87fc0, key=@0xbf81f75c) at backends/chert/chert_cursor.cc:175
#4  0xb76330c7 in ChertPostList::move_to_chunk_containing (this=0x8b88170, desired_did=1001)
    at backends/chert/chert_postlist.cc:846
#5  0xb7633325 in ChertPostList::jump_to (this=0x8b88170, desired_did=1001) at backends/chert/chert_postlist.cc:971
#6  0xb7633cec in ChertPostListTable::get_doclength (this=0x8aabee8, did=1001, db=@0xbf81f828)
    at backends/chert/chert_postlist.cc:68
#7  0xb7619801 in ChertDatabase::get_doclength (this=0x8aabec0, did=1001) at backends/chert/chert_database.cc:797
#8  0xb7619b41 in ChertWritableDatabase::get_doclength (this=0x8aabec0, did=1001)
    at backends/chert/chert_database.cc:1483
#9  0xb762e722 in ChertPostList::get_doclength (this=0x95bc9b0) at backends/chert/chert_postlist.cc:726
#10 0xb762d474 in ChertModifiedPostList::get_doclength (this=0x95bc9b0)
    at backends/chert/chert_modifiedpostlist.cc:68
#11 0xb758548e in LeafPostList::get_weight (this=0x95bc9b0) at api/leafpostlist.cc:77
#12 0xb76d65bc in MultiAndPostList::get_weight (this=0x95c8580) at matcher/multiandpostlist.cc:165
#13 0xb76d0458 in SelectPostList::get_weight (this=0x95c86b8) at matcher/selectpostlist.h:52
#14 0xb76d8161 in MultiMatch::get_mset (this=0xbf81fdf8, first=0, maxitems=5, check_at_least=5, mset=@0xbf81ff60, 
    stats=@0xbf81fe60, mdecider=0x0, matchspy_legacy=0x0, sorter=0x0) at matcher/multimatch.cc:574
#15 0xb759de0f in Xapian::Enquire::Internal::get_mset (this=0x95a3890, first=0, maxitems=5, check_at_least=0, 
    rset=0x0, mdecider=0x0, matchspy_legacy=0x0) at api/omenquire.cc:670
#16 0xb759df20 in Xapian::Enquire::get_mset (this=0x9234eb8, first=0, maxitems=5, check_at_least=0, rset=0x0, 
    mdecider=0x0) at api/omenquire.cc:1034
#17 0xb77ec257 in _wrap_Enquire_get_mset__SWIG_4 (nobjs=3, swig_obj=0xbf81ffc8) at modern/xapian_wrap.cc:22653
#18 0xb77ed719 in _wrap_Enquire_get_mset (self=0x0, args=0x89f98c4) at modern/xapian_wrap.cc:22798
#19 0x0805cb97 in PyObject_Call ()
[...]

comment:4 by Olly Betts, 14 years ago

Milestone: 1.1.5
Status: newassigned

Working on this...

by Olly Betts, 14 years ago

Attachment: bug-455-fix.patch added

candidate fix

comment:5 by Olly Betts, 14 years ago

I've attached a candidate fix, which allows the reduced testcase you provided (which you didn't want made public).

Can you test this with your full system?

If it fixes that too, I'll clean up the patch and apply it.

comment:6 by tack, 14 years ago

I applied the patch to a snapshot of r14208 and unfortunately even the reduced testcase is not only segfaulting, but segfaulting earlier.

Without the patch, r14208 behaves as before, with the assertion error "backends/chert/chert_cursor.cc:160: B->level <= level"

The backtrace with the patch is strange because it makes its way inside the brass backend:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7f21ad0 (LWP 28178)]
0xb783b37a in getint2 (p=0x0, c=9) at ./common/unaligned.h:50
50	    return p[c] << 8 | p[c + 1];
Current language:  auto; currently c++
(gdb) bt
#0  0xb783b37a in getint2 (p=0x0, c=9) at ./common/unaligned.h:50
#1  0xb786ac98 in BrassTable::find_in_block (p=0x0, key={p = 0x9d924d2 "\b"}, leaf=true, c=-1)
    at backends/brass/brass_table.cc:476
#2  0xb7878362 in BrassTable::find (this=0x9dab340, C_=0x9e2deb8) at backends/brass/brass_table.cc:519
#3  0xb783a18d in BrassCursor::find_entry (this=0x9e2d9a0, key=@0xbf9209d0) at backends/brass/brass_cursor.cc:177
#4  0xb7886f1b in ValueUpdater::update (this=0xbf920a44, did=1, value=@0x9e2e6b4) at backends/chert/chert_values.cc:274
#5  0xb78cffc8 in ChertValueManager::merge_changes (this=0x9dab6d4) at backends/chert/chert_values.cc:346
#6  0xb7891c95 in ChertWritableDatabase::open_value_list (this=0x9dab318, slot=1) at backends/chert/chert_database.cc:1576
#7  0xb79a16c2 in ValueStreamDocument::do_get_value (this=0xbf920f68, slot=1) at matcher/valuestreamdocument.cc:67
#8  0xb77e536b in Xapian::Document::Internal::get_value (this=0xbf920f68, valueid=1) at api/omdocument.cc:284
#9  0xb7975d53 in Collapser::process (this=0xbf920fe0, item=@0xbf921028, postlist=0xa3854c0, vsdoc=@0xbf920f68, mcmp=@0xbf921134)
    at matcher/collapser.cc:82
#10 0xb7984fbd in MultiMatch::get_mset (this=0xbf9213ec, first=0, maxitems=5, check_at_least=5, mset=@0xbf921580, 
    stats=@0xbf921454, mdecider=0x0, matchspy_legacy=0x0, sorter=0x0) at matcher/multimatch.cc:664
#11 0xb77f2a4d in Xapian::Enquire::Internal::get_mset (this=0xa37b5e8, first=0, maxitems=5, check_at_least=0, rset=0x0, 
    mdecider=0x0, matchspy_legacy=0x0) at api/omenquire.cc:670
#12 0xb77f2e02 in Xapian::Enquire::get_mset (this=0xa040e88, first=0, maxitems=5, check_at_least=0, rset=0x0, mdecider=0x0)
    at api/omenquire.cc:1034
#13 0xb7dca4af in _wrap_Enquire_get_mset__SWIG_4 (nobjs=3, swig_obj=0xbf9215e4) at modern/xapian_wrap.cc:22653
#14 0xb7dcc0b0 in _wrap_Enquire_get_mset (self=0x0, args=0xb7eb939c) at modern/xapian_wrap.cc:22798
#15 0x0805cb97 in PyObject_Call ()
#16 0x080c7aa7 in PyEval_EvalFrameEx ()
#17 0x080c96e5 in PyEval_EvalFrameEx ()
#18 0x080cb1f7 in PyEval_EvalCodeEx ()
#19 0x080cb347 in PyEval_EvalCode ()
#20 0x080ea818 in PyRun_FileExFlags ()
#21 0x080eaab9 in PyRun_SimpleFileExFlags ()
#22 0x08059335 in Py_Main ()
#23 0x080587f2 in main ()

I'm compiling with CXXFLAGS='-O0 -g' and --enable-assertions. I also tried building with gcc 4.1 and it made no difference.

Is the patch you attached complete? Perhaps you forgot to diff one or more files?

comment:7 by Olly Betts, 14 years ago

That patch is complete.

It looks like we have issues with having classes called ValueUpdater in both brass and chert, which violates the one definition rule. I've fixed this already on my brass-btree branch where I hit similar issues. Not sure why we seem to usually get away with this on trunk (I guess the classes are pretty much identical aside from brass vs chert, so perhaps that's why), but I've cherry-picked that change and committed it to trunk as r14209.

Does my patch applied to r14209 work any better for you?

comment:8 by tack, 14 years ago

Yes, it does. The patch applied to r14214 now properly (well, properly inasmuch as there are no assertion failures, exceptions, or segfaults) executes the testcase I provided as well as the larger application and original dataset where I first discovered the problem.

comment:9 by Olly Betts, 14 years ago

Committed a cleaned up version of the patch to brass in r14267, and to chert in r14268.

The issue is also relevant to flint (though not demonstrated by the supplied testcase as values are stored differently in flint) so needs backporting to flint too, and then backporting to 1.0.x.

comment:10 by Olly Betts, 14 years ago

Milestone: 1.1.51.0.19
Resolution: fixed
Status: assignedclosed

Backported to flint in r14269, and that to 1.0 in r14270 (along with an unrelated change to tests/api_none.cc which I managed to accidentally commit at the same time).

This also affects quartz and backports pretty cleanly, so fixed there in r14272.

Note: See TracTickets for help on using tickets.