What is OP_ELITE_SET for? How does it differ from OP_OR?
If you want to implement a feature which finds documents similar to a piece of text, an obvious approach is to build an "OR" query from all the terms in the text, and run this query against a database containing the documents. However, such a query can contain a lot of terms and be quite slow to run, yet many of these terms don't contribute usefully to the results.
The OP_ELITE_SET operator can be used instead of OP_OR in this situation. OP_ELITE_SET selects the most important N terms and then acts as an OP_OR query with just these, ignoring any other terms. This will usually return results just as good as the full OP_OR query, but much faster.
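The selection step can be pictured with a small standalone sketch. This is only a conceptual illustration, not Xapian's implementation: the WeightedTerm struct, the pick_elite function and the weights passed in are all hypothetical names and values made up for the example, and the way Xapian itself scores term importance is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pairing of a term with an importance score. The scores are
// supplied by the caller purely for illustration.
struct WeightedTerm {
    std::string term;
    double weight;
};

// Return the n highest-weighted terms, most important first.
// If there are n or fewer terms, all of them are kept.
std::vector<WeightedTerm> pick_elite(std::vector<WeightedTerm> terms,
                                     std::size_t n) {
    // Sort by descending weight so the most important terms come first.
    std::sort(terms.begin(), terms.end(),
              [](const WeightedTerm& a, const WeightedTerm& b) {
                  return a.weight > b.weight;
              });
    // Drop everything beyond the first n terms.
    if (terms.size() > n) {
        terms.resize(n);
    }
    return terms;
}
```

The terms returned by pick_elite would then be the only ones combined in the "OR" query; the rest are ignored entirely, which is where the speed-up comes from.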
In general, the OP_ELITE_SET operator can be used when you have a large OR query, but it doesn't matter if the search completely ignores some of the less important terms in the query.
You can specify a parameter to the query constructor which controls the number of terms OP_ELITE_SET will pick. If not specified, this defaults to 10 (Xapian used to default to ceil(sqrt(number_of_subqueries)) if there were more than 100 subqueries, but this rather arbitrary special case was dropped in 1.3.0). For example, this will pick the best 7 terms:
Xapian::Query query(Xapian::Query::OP_ELITE_SET, subqs.begin(), subqs.end(), 7);
If the number of subqueries is less than this threshold,