#484 closed defect (fixed)
QueryParser does not expand wildcarded terms in some cases
Reported by: | Daniel Ménard | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.21 |
Component: | QueryParser | Version: | 1.0.2 |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
Hi,
It seems that the query parser does not expand wildcards if the query contains at least 3 terms and the wildcarded term is in the middle of the query.
Examples:
OK: test* xapian user -> testable OR tester OR xapian OR user
OK: xapian user test* -> xapian OR user OR testable OR tester
NOT OK: xapian test* user -> xapian OR test OR user
Attached is a small PHP script that reproduces the problem.
On my machine (Windows XP, Xapian php-bindings 1.2.0 binaries from flax.co.uk, php 5.2.13), this script produces the following output:
PHP version 5.2.13, Xapian version 1.2.0
Creating a new db containing one document with terms : xapian, tester, testable, user, query,

query: test* xapian
result: Xapian::Query(((testable:(pos=1) SYNONYM tester:(pos=1)) OR xapian:(pos=2)))
OK.

query: xapian test*
result: Xapian::Query((xapian:(pos=1) OR (testable:(pos=2) SYNONYM tester:(pos=2))))
OK.

query: test* xapian user
result: Xapian::Query(((testable:(pos=1) SYNONYM tester:(pos=1)) OR xapian:(pos=2) OR user:(pos=3)))
OK.

query: xapian user test*
result: Xapian::Query((xapian:(pos=1) OR user:(pos=2) OR (testable:(pos=3) SYNONYM tester:(pos=3))))
OK.

query: xapian test* user
result: Xapian::Query((xapian:(pos=1) OR test:(pos=2) OR user:(pos=3)))
expect: Xapian::Query((xapian:(pos=1) OR (testable:(pos=2) SYNONYM tester:(pos=2)) OR user:(pos=3)))
FAILS.

query: xapian query test* user
result: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR test:(pos=3) OR user:(pos=4)))
expect: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR (testable:(pos=3) SYNONYM tester:(pos=3)) OR user:(pos=4)))
FAILS.

query: xapian que* test* user
result: Xapian::Query((xapian:(pos=1) OR que:(pos=2) OR test:(pos=3) OR user:(pos=4)))
expect: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR (testable:(pos=3) SYNONYM tester:(pos=3)) OR user:(pos=4)))
FAILS.

query: xapian que* user test*
result: Xapian::Query((xapian:(pos=1) OR que:(pos=2) OR user:(pos=3) OR test:(pos=4)))
expect: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR user:(pos=3) OR (testable:(pos=4) SYNONYM tester:(pos=4))))
FAILS.
I tried with various older versions of Xapian and I think that the problem was introduced in Xapian 1.0.2 because my tests pass with Xapian 1.0.1 but not with more recent versions:
PHP version 5.2.13, Xapian version 1.0.1
Creating a new db containing one document with terms : xapian, tester, testable, user, query,

query: test* xapian
result: Xapian::Query((testable:(pos=1) OR tester:(pos=1) OR xapian:(pos=2)))
OK.

query: xapian test*
result: Xapian::Query((xapian:(pos=1) OR testable:(pos=2) OR tester:(pos=2)))
OK.

query: test* xapian user
result: Xapian::Query((testable:(pos=1) OR tester:(pos=1) OR xapian:(pos=2) OR user:(pos=3)))
OK.

query: xapian user test*
result: Xapian::Query((xapian:(pos=1) OR user:(pos=2) OR testable:(pos=3) OR tester:(pos=3)))
OK.

query: xapian test* user
result: Xapian::Query((xapian:(pos=1) OR testable:(pos=2) OR tester:(pos=2) OR user:(pos=3)))
OK.

query: xapian query test* user
result: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR testable:(pos=3) OR tester:(pos=3) OR user:(pos=4)))
OK.

query: xapian que* test* user
result: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR testable:(pos=3) OR tester:(pos=3) OR user:(pos=4)))
OK.

query: xapian que* user test*
result: Xapian::Query((xapian:(pos=1) OR query:(pos=2) OR user:(pos=3) OR testable:(pos=4) OR tester:(pos=4)))
OK.
Sorry for only now spotting a 3-year-old problem!
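The expected behaviour described in this report can be modelled with a small, dependency-free Python sketch. It is a toy model of what the reporter expects and not Xapian's QueryParser: the function name `expand_wildcards` and the list-of-groups return shape are assumptions made for illustration. Each query word becomes one group, and a word ending in `*` expands to every indexed term sharing its prefix, regardless of where it sits in the query.

```python
# Toy model of the expected wildcard expansion (NOT Xapian's implementation).
# `expand_wildcards` is a hypothetical helper introduced for illustration only.

def expand_wildcards(query, terms):
    """Return one group of index terms per whitespace-separated query word."""
    groups = []
    for word in query.split():
        if word.endswith("*"):
            prefix = word[:-1]
            # A wildcarded word expands to all indexed terms with that prefix,
            # whatever its position in the query.
            groups.append(sorted(t for t in terms if t.startswith(prefix)))
        else:
            groups.append([word])
    return groups


# The terms indexed by the test database in the report.
TERMS = {"xapian", "tester", "testable", "user", "query"}

if __name__ == "__main__":
    for q in ("test* xapian user", "xapian user test*", "xapian test* user"):
        print(q, "->", expand_wildcards(q, TERMS))
```

In this model the middle-position query `xapian test* user` expands `test*` just like the other two orderings, which is the behaviour Xapian 1.0.1 shows above and later versions do not.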
Attachments (1)
Change History (5)
by , 14 years ago
Attachment: | bug-wildcards.php added |
---|
comment:1 by , 14 years ago
Milestone: | → 1.2.1 |
---|---|
Version: | → 1.0.2 |
If this broke in 1.0.2, it was probably the addition of synonyms which did it.
Marking for 1.2.1, at least for now.
comment:2 by , 14 years ago
Not sure if it helps, but if the wildcarded term is followed by any character other than a space (e.g. a comma), the term is correctly expanded:
NOT OK: xapian test* user
OK: xapian test*, user
OK: xapian test* (user)
OK: (xapian test*) user
Inspiration from queryparser.lemony?rev=9085, line 1209: "GROUP_TERM is a query term which follows a TERM or another GROUP_TERM and is only separated by whitespace characters."
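Based only on the observation in this comment, a possible client-side stopgap for affected versions might be to rewrite the query string so that no wildcarded term is followed directly by whitespace. The sketch below is an assumption, not a recommended or tested fix: `pad_wildcards` is a hypothetical helper, and whether inserting a comma is harmless for every query is not established by this ticket beyond the cases listed above.

```python
import re

# Matches a "*" that ends a word and is followed by whitespace and another
# token, i.e. the situation comment:2 reports as NOT OK.
_WILDCARD_BEFORE_SPACE = re.compile(r"(?<=\w)\*(?=\s+\S)")


def pad_wildcards(query):
    """Hypothetical workaround: turn 'test* user' into 'test*, user'."""
    return _WILDCARD_BEFORE_SPACE.sub("*,", query)


if __name__ == "__main__":
    for q in ("xapian test* user", "xapian que* test* user", "xapian user test*"):
        print(repr(q), "->", repr(pad_wildcards(q)))
```

A trailing wildcard is left alone, since the report shows that `xapian user test*` already expands correctly.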
comment:3 by , 14 years ago
Milestone: | 1.2.1 → 1.0.21 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Attachment bug-wildcards.php: PHP script that reproduces the problem.