#49 closed enhancement (fixed)
Ensure OP_ELITE_SET matches at least some documents
| Reported by: | Olly Betts | Owned by: | Olly Betts | 
|---|---|---|---|
| Priority: | lowest | Milestone: | 1.2.9 | 
| Component: | Library API | Version: | SVN trunk | 
| Severity: | minor | Keywords: | |
| Cc: | | Blocked By: | |
| Blocking: | | Operating System: | All |
Description (last modified by )
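OP_ELITE_SET should never select groups of subqueries which don't match any documents. (Currently, it will exclude those for which termfreq_max() is 0, but termfreq_max() is only an upper bound, so a subquery with a non-zero value may still match nothing and this may still result in a bad choice.)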
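To illustrate why a non-zero upper bound does not guarantee a match, here is a minimal sketch using a toy model rather than Xapian's own classes. `Postings`, `and_termfreq_max` and `and_termfreq_exact` are hypothetical names, and the sketch assumes the bound for "a AND b" is simply the size of the smaller input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

// Toy posting list: the set of document ids that a term matches.
using Postings = std::set<int>;

// Hypothetical upper bound on the match count of "a AND b" (an assumption
// for this sketch): it cannot exceed the size of the smaller input.
std::size_t and_termfreq_max(const Postings& a, const Postings& b) {
    return std::min(a.size(), b.size());
}

// Exact match count of "a AND b", found by actually intersecting the lists.
std::size_t and_termfreq_exact(const Postings& a, const Postings& b) {
    Postings both;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(both, both.begin()));
    return both.size();
}

// Usage: disjoint lists give a non-zero bound but no actual matches.
void demo_loose_bound() {
    const Postings a{1, 2, 3}, b{4, 5};
    assert(and_termfreq_max(a, b) == 2);
    assert(and_termfreq_exact(a, b) == 0);
}
```

In this model, a check for "bound is 0" would let `a AND b` through even though it matches no documents.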
Change History (8)
comment:1 by , 21 years ago
| Severity: | blocker → normal | 
|---|---|
| Status: | new → assigned | 
comment:2 by , 21 years ago
| Severity: | normal → enhancement | 
|---|
comment:3 by , 21 years ago
| Component: | other → Library API | 
|---|---|
| op_sys: | other → All | 
| Priority: | high → lowest | 
| rep_platform: | Other → All | 
| Version: | other → CVS HEAD | 
comment:4 by , 19 years ago
| Operating System: | → All | 
|---|---|
| Owner: | changed from to | 
| Status: | assigned → new | 
comment:6 by , 15 years ago
| Description: | modified (diff) | 
|---|---|
| Milestone: | → 1.2.x | 
| Summary: | Ensure OP_ELITE_SET matches at least some document → Ensure OP_ELITE_SET matches at least some documents | 
No API or ABI changes are required, so this is suitable for fixing in 1.2.x; setting the milestone to reflect this. Probably not a high priority though.
comment:7 by , 14 years ago
| Milestone: | 1.2.x → 1.3.0 | 
|---|---|
| Owner: | changed from to | 
| Status: | new → assigned | 
Revisiting this ticket, I'm now thinking that we should just note in the documentation that this can happen when OP_ELITE_SET is used with non-term subqueries. The natural use case for OP_ELITE_SET is to pick a sane-sized set of terms from a larger set and it's fine there.
Also, FILTER(OP_ELITE_SET(A,B,C,...),Z) might match some or no documents, depending on which of A,B,C,... are selected, so why should OP_ELITE_SET(FILTER(A,Z),FILTER(B,Z),FILTER(C,Z),...) be handled so differently?
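The comparison between the two query shapes can be sketched with a toy model. This is not Xapian's implementation: `Sub`, `filter` and `elite_set` are hypothetical names, and the sketch assumes the elite set is chosen purely from a per-subquery score that is fixed before matching.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

using Docs = std::set<int>;

// Toy subquery: the documents it matches plus a score used for selection.
struct Sub {
    Docs docs;
    double score;  // hypothetical stand-in for a weight estimate
};

// FILTER(a, z): keep only the documents of a that also match z.
Docs filter(const Docs& a, const Docs& z) {
    Docs out;
    std::set_intersection(a.begin(), a.end(), z.begin(), z.end(),
                          std::inserter(out, out.begin()));
    return out;
}

// Toy elite set: OR together the k highest-scoring subqueries.
Docs elite_set(std::vector<Sub> subs, std::size_t k) {
    std::stable_sort(subs.begin(), subs.end(),
                     [](const Sub& x, const Sub& y) { return x.score > y.score; });
    if (subs.size() > k) subs.resize(k);
    Docs out;
    for (const Sub& s : subs) out.insert(s.docs.begin(), s.docs.end());
    return out;
}

void demo_filter_vs_elite() {
    const std::vector<Sub> subs{{{1, 2}, 3.0}, {{3}, 2.0}, {{4, 5}, 1.0}};
    const Docs z{4, 5};
    // FILTER(ELITE_SET(A,B,C), Z): A and B are picked, neither overlaps Z.
    const Docs outer = filter(elite_set(subs, 2), z);
    // ELITE_SET(FILTER(A,Z), FILTER(B,Z), FILTER(C,Z)) with the same scores.
    std::vector<Sub> filtered;
    for (const Sub& s : subs) filtered.push_back({filter(s.docs, z), s.score});
    const Docs inner = elite_set(filtered, 2);
    assert(outer.empty());
    assert(inner.empty());
}
```

Under these assumptions both shapes match nothing, because the two highest-scoring subqueries have no overlap with Z, even though C would match documents 4 and 5 in either form.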
So unless there's disagreement, I suggest we document this and do it for 1.3.0 (and then backport to 1.2.x).
comment:8 by , 14 years ago
| Milestone: | 1.3.0 → 1.2.9 | 
|---|
Richard agreed on IRC that just documenting was reasonable, so done on trunk r16215.
Marking to backport for 1.2.9.


Since OP_ELITE_SET performs an OR on the subqueries it selects, this can only be a problem if the selected subqueries are all something like 'a AND b' or 'a NOT b' or NEAR/PHRASE operations, and none of these match anything.
This is pretty obscure, and I'm not sure what the solution is. Perhaps read the first posting from each subquery before picking the elite set to determine if any are in reality empty?
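The "read the first posting" idea might look roughly like the sketch below. It is a toy model only, not a proposed patch: `AndSub`, `has_first_posting` and `pick_elite` are hypothetical names, each subquery is assumed to be a plain "a AND b" over sorted document ids, and selection is assumed to be by a precomputed score.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Postings = std::set<int>;

// Toy "a AND b" subquery with a selection score.
struct AndSub {
    Postings a, b;
    double score;
};

// Walk both sorted lists in step and stop at the first common document id.
// Returns false only if the subquery is proven to match nothing.
bool has_first_posting(const AndSub& s) {
    auto i = s.a.begin();
    auto j = s.b.begin();
    while (i != s.a.end() && j != s.b.end()) {
        if (*i < *j) ++i;
        else if (*j < *i) ++j;
        else return true;
    }
    return false;
}

// Pick up to k highest-scoring subqueries, skipping those proven empty.
std::vector<AndSub> pick_elite(std::vector<AndSub> subs, std::size_t k) {
    subs.erase(std::remove_if(subs.begin(), subs.end(),
                              [](const AndSub& s) { return !has_first_posting(s); }),
               subs.end());
    std::stable_sort(subs.begin(), subs.end(),
                     [](const AndSub& x, const AndSub& y) { return x.score > y.score; });
    if (subs.size() > k) subs.resize(k);
    return subs;
}

void demo_pick_elite() {
    const std::vector<AndSub> subs{
        {{1, 2}, {3, 4}, 5.0},  // highest score, but matches nothing
        {{7}, {8}, 4.0},        // also matches nothing
        {{1, 2}, {2, 3}, 1.0},  // lowest score, matches document 2
    };
    const std::vector<AndSub> picked = pick_elite(subs, 1);
    assert(picked.size() == 1);
    assert(picked[0].score == 1.0);
}
```

The trade-off is that the check has to advance every subquery at least to its first match before any selection happens, which for an empty AND means reading one of the lists to the end.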