Opened 10 years ago
Closed 9 years ago
#663 closed enhancement (fixed)
Make ValueRangeProcessor return a Query object
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.3.6 |
Component: | QueryParser | Version: | SVN trunk |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Operating System: | All |
Description
Currently VRP takes two strings by reference, potentially modifies them, and returns a value slot number.
It would be more flexible and easier to wrap for the bindings if it took two strings by const reference and returned a Query object instead. There's an example in the getting started guide suggesting adding boolean terms to restrict the number of docs a VRP has to consider, but this doesn't really work as you can't insert an OP_FILTER for such terms at the right place in the query tree - this change would allow that to work nicely.
We could provide this in 1.4 by having two overloaded virtual forms of the operator()()
method with the existing and new signature, and have a default implementation of the new method which calls the old.
Change History (8)
comment:1 by , 10 years ago
Milestone: | 1.3.3 → 1.3.4 |
---|---|
Status: | new → assigned |
comment:3 by , 10 years ago
Blocking: | 193 removed |
---|
This would address #193 too.
One issue I noticed: QueryParser
wants to know which slot the range is for, so it knows about combining ranges on the same slot with OP_OR
and then combine these groups with OP_AND
. That isn't unsurmountable, but would probably need us to return both a Query
object and some sort of string or number "key" to identify the type of range. Or perhaps this is an overly fussy little detail, which most people wouldn't expect us to worry about. FieldProcessor
just returns a Query
object after all.
comment:4 by , 9 years ago
We can check the type of the returned query with Query::get_type()
and could easily add a method to return the slot for a range query. But then returning different query types for some cases wouldn't work so well.
Perhaps some sort of OP_FILTER_KEY
operator, which takes a single subquery and just delegates to it when evaluated, but serves as something which can store the "key" above (and could be added by FieldProcessor
as well). If we go that way, a clear name would be good.
This is a fairly simple change (once we have a plan) and the benefit seems high so I'd quite like to try to do it for 1.4.0.
comment:7 by , 9 years ago
Milestone: | 1.3.x → 1.3.6 |
---|
I have been working on this.
I've tried making operator()
return std::pair<Xapian::Query, std::string>
but it makes writing a custom range processor class feel rather clumsy. A few details:
- The second parameter is a "grouping key" - I used
str(slot)
for value ranges - I used
std::make_pair(Xapian::Query(), std::string())
to mean "range not handled" - I provided a helper method
make_value_range(Xapian::valueno slot, const std::string& begin, const std::string& end)
to create a value range (withslot == Xapian::BAD_VALUENO
meaning "range not handled", andend.empty()
meaningOP_VALUE_GE
), which helps a bit FieldProcessor
really ought to be consistent with this - we could change its API incompatibly at this point, but I am aware people have already started to use it
I'm wondering if the OP_FILTER_KEY
approach would work better overall - the key benefits I can see are:
- Simple return type (
Xapian::Query
) - Simple cases (prefixed terms, value ranges) can just work automatically by introspecting the returned
Query
object - Complex cases (e.g. wanting to group
date:
range processor anddate:
field processor together, and grouping a range which may be optimised using terms) are supported - Something like
OP_NULL
can cleanly signify "range not handled" - having a "no query"Query
object (distinct from aQuery
which matches nothing) seems like it would be more widely useful
comment:8 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Fixed in [cca8b4a5cf6f68bc44b26aff00a9a6ccb7925d74].
After trying various approaches, I've made the return type Xapian::Query
with a new OP_INVALID
op for "range not handled". But rather than OP_FILTER_KEY
, the grouping key can optionally be explicitly specified to QueryParser
methods add_boolean_prefix()
and add_rangeprocessor()
, with sane defaults if not specified.
Perhaps a cleaner way to provide this is to add a new
RangeProcessor
class, since there would no longer be a tie to usingOP_VALUE_RANGE
for the resulting query (though it would probably be the most common implementation). ThenValueRangeProcessor
can be deprecated.We should think if there are any other aspects of VRPs we'd like to adjust, as this would be a good time to address them.
Bumping to 1.3.4 so we can think about this a bit more.