#23 closed enhancement (released)
Matcher could optimise by hoisting near/phrase filter
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | Library API | Version: | SVN trunk |
Severity: | minor | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
The near/phrase filter could usefully be hoisted up the tree in some cases (leaving the AND part where it is). Consider:
e-mail AND filter
This is probably more efficient as a search for `e AND mail AND filter', with results from that filtered for phrase matches on "e mail".
But that's not clear cut. It might be that every document matches that AND, but just one matches the phrase. In that case, the current code will try the phrase check on all documents, find one match, skip to that posting in "filter", find it matches, and return one result.
If the phrase match is hoisted, then the 3-way AND needs to look at *all* of the postings for "filter", and the phrase filtering still does the same amount of work.
This extreme is probably rare, but it's not totally obvious that hoisting the phrase filter is a good idea generally. Or there might be a good heuristic for when to hoist: for example, if there's an AND with a rare term which we can hoist the filter above...
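The trade-off can be illustrated with a toy model. This is only a sketch, not Xapian's matcher: the index layout, the function names (`build_index`, `is_phrase`, `unhoisted`, `hoisted`) and the sample documents are all made up for illustration, and the AND is evaluated left to right with no frequency-based reordering or real skip_to(). It counts how many positional phrase checks each plan runs and how often it consults the postings for the extra term.

```python
# Toy positional index (hypothetical, not Xapian's API): term -> {docid: [positions]}
def build_index(docs):
    index = {}
    for docid, text in docs.items():
        for pos, term in enumerate(text.split()):
            index.setdefault(term, {}).setdefault(docid, []).append(pos)
    return index

def is_phrase(index, docid, phrase):
    """True if the phrase terms occur at consecutive positions in docid."""
    starts = index.get(phrase[0], {}).get(docid, [])
    return any(
        all(p + i in index.get(t, {}).get(docid, []) for i, t in enumerate(phrase))
        for p in starts
    )

def phrase_candidates(index, phrase):
    """Documents containing every phrase term (the AND part of the phrase)."""
    return sorted(set.intersection(*(set(index.get(t, {})) for t in phrase)))

def unhoisted(index, phrase, other):
    """(phrase) AND other: the phrase check sits below the AND."""
    stats = {"phrase_checks": 0, "other_lookups": 0}
    result = []
    for docid in phrase_candidates(index, phrase):
        stats["phrase_checks"] += 1
        if is_phrase(index, docid, phrase):
            stats["other_lookups"] += 1  # stands in for skip_to(docid) on `other`
            if docid in index.get(other, {}):
                result.append(docid)
    return result, stats

def hoisted(index, phrase, other):
    """Phrase filter applied above (a AND b AND other)."""
    stats = {"phrase_checks": 0, "other_lookups": 0}
    result = []
    for docid in phrase_candidates(index, phrase):
        stats["other_lookups"] += 1  # the AND consults `other` for every candidate
        if docid in index.get(other, {}):
            stats["phrase_checks"] += 1
            if is_phrase(index, docid, phrase):
                result.append(docid)
    return result, stats

# Extreme case: every document matches the AND, only document 3 matches the phrase.
EXTREME = build_index({
    1: "mail e filter", 2: "mail e filter", 3: "e mail filter",
    4: "mail filter e", 5: "filter mail e",
})
# Rare-term case: "filter" occurs in a single document.
RARE = build_index({
    1: "mail e", 2: "mail e", 3: "e mail filter",
    4: "mail then e", 5: "e mail",
})

for name, idx in (("extreme", EXTREME), ("rare", RARE)):
    print(name, unhoisted(idx, ("e", "mail"), "filter"),
          hoisted(idx, ("e", "mail"), "filter"))
```

Under this crude model both plans return the same documents. In the extreme case hoisting saves no phrase checks and consults "filter" for every document; in the rare-term case it cuts the phrase checks from five to one in exchange for more lookups in "filter".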
Change History (7)
comment:1 by , 21 years ago
Status: | new → assigned |
---|
comment:2 by , 20 years ago
Severity: | normal → enhancement |
---|
comment:3 by , 18 years ago
Blocking: | 120 added |
---|---|
Version: | 0.7.5 → SVN HEAD |
comment:4 by , 17 years ago
Blocking: | 200 added; 120 removed |
---|
I have a working patch for this which I'm currently running performance tests on. This should make it into 1.0.4, so marking this bug for 1.0.4.
comment:5 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
(Finally) fixed in SVN HEAD. The savings are impressive, and I think the worries about making things worse in some corner cases have turned out to be unfounded, or at least swamped by the improvements in real world cases.
comment:7 by , 17 years ago
Blocking: | 200 removed |
---|---|
Operating System: | → All |
It would be nice to implement this in the 1.0 series.