Opened 9 years ago
Last modified 20 months ago
#700 new defect
Support Enquire::matching_terms_begin() without termlist table?
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 2.0.0 |
Component: | Backend-Glass | Version: | |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
(Split out of #181)
Currently Enquire::matching_terms_begin() uses the termlist of the document, comparing it with the terms in the query. This means it doesn't work if the database has no termlist table. It's also another item to look up for each result, and comparing the two lists of terms isn't free.
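For illustration, the termlist comparison amounts to intersecting two sorted term lists. This is only a sketch using the standard library, not the actual Xapian code; matching_terms_by_termlist() is a hypothetical name, and both inputs are assumed to be sorted and free of duplicates.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xapian): returns the terms which appear
// both in the document's termlist and in the query.
// Assumption: both inputs are sorted ascending with no duplicates.
std::vector<std::string>
matching_terms_by_termlist(const std::vector<std::string>& doc_termlist,
                           const std::vector<std::string>& query_terms)
{
    assert(std::is_sorted(doc_termlist.begin(), doc_termlist.end()));
    assert(std::is_sorted(query_terms.begin(), query_terms.end()));

    std::vector<std::string> result;
    // Linear merge over both lists; cost grows with the termlist length,
    // which is why this isn't free for long documents.
    std::set_intersection(doc_termlist.begin(), doc_termlist.end(),
                          query_terms.begin(), query_terms.end(),
                          std::back_inserter(result));
    return result;
}
```

Note that this needs the full termlist of every result document, which is exactly what is unavailable when the termlist table is missing.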
It's also arguably not quite correct in some cases, for example for this query:
A OR (B AND NOT C)
It'll report A and B as matching terms in a document containing all three terms, but perhaps only A should be reported in such a case, since B AND NOT C wouldn't say B matched this document.
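To make the corner case concrete, here is a toy evaluator which only reports terms from subqueries that actually matched. It is a sketch of the stricter semantics suggested above, not how the matcher is implemented; Node, Op, make_term(), make_op() and match() are all hypothetical names, and a document is modelled as a plain set of terms.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

using TermSet = std::set<std::string>;

enum class Op { Term, Or, And, AndNot };

// Hypothetical toy query tree (not Xapian's internal representation).
struct Node {
    Op op;
    std::string term;                   // used only when op == Op::Term
    std::shared_ptr<Node> left, right;  // used only for operators
};

std::shared_ptr<Node> make_term(const std::string& t) {
    return std::make_shared<Node>(Node{Op::Term, t, nullptr, nullptr});
}

std::shared_ptr<Node> make_op(Op op, std::shared_ptr<Node> l,
                              std::shared_ptr<Node> r) {
    assert(op != Op::Term && l && r);
    return std::make_shared<Node>(
        Node{op, std::string(), std::move(l), std::move(r)});
}

// Returns true if the subquery matches doc. Terms are added to out only
// when the subquery matched, so a failed branch contributes nothing.
bool match(const Node& n, const TermSet& doc, TermSet& out) {
    if (n.op == Op::Term) {
        if (doc.count(n.term) == 0) return false;
        out.insert(n.term);
        return true;
    }
    TermSet l, r;
    bool lm = match(*n.left, doc, l);
    bool rm = match(*n.right, doc, r);
    bool matched = false;
    switch (n.op) {
    case Op::Or:     matched = lm || rm; break;
    case Op::And:    matched = lm && rm; break;
    case Op::AndNot: matched = lm && !rm; r.clear(); break;  // never report the NOT side
    case Op::Term:   break;  // handled above
    }
    if (!matched) return false;
    out.insert(l.begin(), l.end());
    out.insert(r.begin(), r.end());
    return true;
}
```

With the query A OR (B AND NOT C), a document containing A, B and C yields only A under this scheme, while a document containing just A and B yields both.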
We could record the information about matching terms for each candidate entry in the proto-MSet, which would solve both of these issues. The tricky part is doing this in a way which doesn't incur a significant space or time overhead during the match. E.g. a bitmap of matching terms is fairly space efficient.
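A rough sketch of the bitmap idea follows. The Candidate struct, mark_matched() and decode_matched() are hypothetical and do not reflect the real proto-MSet layout. It also assumes at most 64 distinct query terms so that one 64-bit word per candidate is enough; larger queries would need a wider or variable-length bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical candidate entry (not the real proto-MSet item).
struct Candidate {
    unsigned docid = 0;
    double weight = 0.0;
    std::uint64_t matched = 0;  // bit i set => query term i matched this doc
};

// Record that the query term with the given index matched.
// Assumption: the query has at most 64 distinct terms.
inline void mark_matched(Candidate& c, unsigned term_index) {
    assert(term_index < 64);
    c.matched |= std::uint64_t(1) << term_index;
}

// Turn the bitmap back into term names, in query term index order.
std::vector<std::string>
decode_matched(const Candidate& c, const std::vector<std::string>& query_terms) {
    assert(query_terms.size() <= 64);
    std::vector<std::string> result;
    for (std::size_t i = 0; i < query_terms.size(); ++i) {
        if (c.matched & (std::uint64_t(1) << i))
            result.push_back(query_terms[i]);
    }
    return result;
}
```

The per-candidate cost here is 8 bytes plus one OR per matching term, and decoding only happens for the documents which end up in the final result set.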
If we don't care about the corner cases of which terms match like the one above, we could also skip through the posting lists a second time to get this information. More data to decode, but it's likely to already be in cache.
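The second pass could look roughly like the sketch below, which models each posting list as a sorted vector of docids with a forward-only cursor. PostingCursor and matching_terms_second_pass() are hypothetical stand-ins for the real posting list machinery. The result docids are visited in ascending order so that each list only ever needs to skip forwards.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical forward-only cursor over one term's posting list, modelled
// here as a sorted vector of docids.
class PostingCursor {
    std::vector<unsigned>::const_iterator it, end;
public:
    explicit PostingCursor(const std::vector<unsigned>& docids)
        : it(docids.begin()), end(docids.end()) {
        assert(std::is_sorted(it, end));
    }
    // Advance to the first entry >= target; true if that entry is target.
    bool skip_to(unsigned target) {
        it = std::lower_bound(it, end, target);
        return it != end && *it == target;
    }
};

// For each result docid, list the query terms whose posting list contains
// it. postings maps each query term to its sorted docid list.
std::map<unsigned, std::vector<std::string>>
matching_terms_second_pass(
    const std::map<std::string, std::vector<unsigned>>& postings,
    std::vector<unsigned> result_docids)
{
    // Ascending order lets every cursor move strictly forwards.
    std::sort(result_docids.begin(), result_docids.end());

    std::map<unsigned, std::vector<std::string>> out;
    for (unsigned docid : result_docids) out[docid];  // ensure an entry exists

    for (const auto& entry : postings) {
        PostingCursor cursor(entry.second);
        for (unsigned docid : result_docids) {
            if (cursor.skip_to(docid)) out[docid].push_back(entry.first);
        }
    }
    return out;
}
```

This reports a term whenever it indexes the document, so it has the same corner-case behaviour as the termlist comparison, but it never touches the termlist table.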
Probably doesn't need API or ABI changes, so suitable for 1.4.x.
Change History (2)
comment:1, 5 years ago
Description: | modified (diff) |
---|
comment:2, 20 months ago
Milestone: | 1.4.x → 2.0.0 |
---|