Opened 4 years ago

Closed 4 years ago

#803 closed defect (fixed)

Snippet generator crashes on empty query

Reported by: Robert Stepanek Owned by: Olly Betts
Priority: normal Milestone: 1.4.16
Component: Library API Version: 1.4.15
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

Latest master crashes on my machine for the following test case:

DEFINE_TESTCASE(snippet_crasher, backend) {
    Xapian::Enquire enquire(get_database("apitest_simpledata"));
    enquire.set_query(Xapian::Query());
    Xapian::MSet mset = enquire.get_mset(0, 0);
    TEST_STRINGS_EQUAL(mset.snippet("foo", 3), "foo");
}

I tried to debug this but did not come to a conclusion what triggers this crash. The gdb trace of the crasher is

(gdb) run -b inmemory -v snippet_crasher
Starting program: /home/rost/work/cyruslibs/xapian/xapian-core/tests/apitest -b inmemory -v snippet_crasher
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running tests with backend "inmemory"...
Running test: snippet_crasher...
Program received signal SIGSEGV, Segmentation fault.
Xapian::Query::get_num_subqueries (this=this@entry=0x18) at api/query.cc:285
285         return internal.get() ? internal->get_num_subqueries() : 0;
(gdb) bt
#0  Xapian::Query::get_num_subqueries (this=this@entry=0x18) at api/query.cc:285
#1  0x00007ffff7f364ca in Xapian::check_query (query=..., exact_phrases=empty std::__cxx11::list, loose_terms=std::unordered_map with 0 elements, wildcards=empty std::__cxx11::list, fuzzies=empty std::__cxx11::list, 
    longest_phrase=@0x7fffffffb6c0: 0) at queryparser/termgenerator_internal.cc:723
#2  0x00007ffff7f377ec in Xapian::MSet::Internal::snippet (this=0x5555558bf630, text="foo", length=<optimized out>, length@entry=3, stemmer=..., flags=<optimized out>, flags@entry=3, hi_start="<b>", hi_end="</b>", omit="...")
    at queryparser/termgenerator_internal.cc:837
#3  0x00007ffff7de1b8c in Xapian::MSet::snippet (this=this@entry=0x7fffffffbc68, text="foo", length=length@entry=3, stemmer=..., flags=flags@entry=3, hi_start="<b>", hi_end="</b>", omit="...") at api/mset.cc:193
#4  0x0000555555756167 in test_snippet_crasher () at /usr/include/c++/8/bits/basic_string.h:252
#5  0x00005555557e4007 in test_driver::runtest (this=0x7fffffffd230, test=0x55555589f4f0 <ApiTestRunner::run() const::tests+2896>) at harness/testsuite.cc:344
#6  0x00005555557e5cc5 in test_driver::do_run_tests (this=0x7fffffffd230, b=..., e=...) at harness/testsuite.cc:699
#7  0x00005555557e6347 in test_driver::run_tests (e=..., b=..., this=0x7fffffffd230) at harness/testsuite.cc:661
#8  test_driver::run (tests=tests@entry=0x55555589e9a0 <ApiTestRunner::run() const::tests>) at harness/testsuite.cc:927
#9  0x00005555555e4ea3 in ApiTestRunner::run (this=0x7fffffffd6c0) at api_collated.h:372
#10 0x00005555557e221c in TestRunner::do_tests_for_backend_ (this=0x7fffffffd6c0, manager=0x7fffffffd5f0) at harness/testrunner.cc:123
#11 0x00005555557e25dd in TestRunner::do_tests_for_backend (manager=..., this=0x7fffffffd6c0) at harness/testrunner.h:64
#12 TestRunner::run_tests (this=this@entry=0x7fffffffd6c0, argc=<optimized out>, argv=<optimized out>) at harness/testrunner.cc:152
#13 0x00005555555e2ef1 in main (argc=<optimized out>, argv=<optimized out>) at apitest.cc:172

More specifically, when Query::get_num_subqueries is run, the Query::internal member points to the bogus address 0x18. Most likely that address will differ in other builds or environments. Valgrind also reports an invalid read of size 8 at this 0x18 address.

I could not determine where this data corruption occurs. The internal member is initialized to 0 in the constructor.

Change History (6)

comment:1 by Olly Betts, 4 years ago

Status: newassigned

ubsan says:

queryparser/termgenerator_internal.cc:837:26: runtime error: member access within null pointer of type 'const Xapian::Enquire::Internal'

So looks like the MSet ends up with a NULL Enquire object internal in this situation. Should be an easy fix.

Did you test 1.4.x?

comment:2 by Olly Betts, 4 years ago

Candidate fix:

--- a/xapian-core/queryparser/termgenerator_internal.cc
+++ b/xapian-core/queryparser/termgenerator_internal.cc
@@ -827,6 +827,10 @@ MSet::Internal::snippet(const string & text,
        max_tw *= 1.015625;
     }
 
+    Xapian::Query query;
+    if (enquire.get()) {
+       query = enquire->query;
+    }
     SnipPipe snip(length);
 
     list<vector<string>> exact_phrases;
@@ -834,7 +838,7 @@ MSet::Internal::snippet(const string & text,
     list<const Xapian::Internal::QueryWildcard*> wildcards;
     list<const Xapian::Internal::QueryEditDistance*> fuzzies;
     size_t longest_phrase = 0;
-    check_query(enquire->query, exact_phrases, loose_terms,
+    check_query(query, exact_phrases, loose_terms,
                wildcards, fuzzies, longest_phrase);
 
     vector<double> exact_phrases_relevance;

comment:3 by Robert Stepanek, 4 years ago

Did you test 1.4.x?

No, just on latest master at the time of submitting this issue.

comment:4 by Olly Betts, 4 years ago

Component: OtherLibrary API
Version: git master

Fixed applied to git master as e3ac2b9c0955b3b66084c538d3a4d1fbef50d82d.

This doesn't manifest in 1.4.15 - there the internal enquire member inside the returned MSet always gets set by Enquire::get_mset(). I'll check and see if this can be triggered in other ways there as it isn't an invasive fix so worth backporting if there's a case it helps.

comment:5 by Olly Betts, 4 years ago

Milestone: 1.4.16
Version: git master1.4.15

This variant of your testcase segfaults in the 1.4 branch:

    Xapian::MSet mset;
    TEST_STRINGS_EQUAL(mset.snippet("foo", 3), "foo");
Last edited 4 years ago by Olly Betts (previous) (diff)

comment:6 by Olly Betts, 4 years ago

Resolution: fixed
Status: assignedclosed

Added the variant testcase to master and backported for 1.4.16 as e3ac2b9c0955b3b66084c538d3a4d1fbef50d82d and 78114e2043182b49cf35a47a2116a242e7e34f68.

Note: See TracTickets for help on using tickets.