Opened 5 years ago

Closed 5 years ago

#790 closed defect (fixed)

Operators |= and &= with Xapian::Query{}

Reported by: Дилян Палаузов Owned by: Olly Betts
Priority: normal Milestone: 1.4.12
Component: Library API Version: 1.4.11
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

I do expect, that Xapian::Query{} constructs an empty query, that is the neutral element towards AND and OR operations. Thus in:

std::vector<std::string> v {"A", "B"};
Xapian::Query q1, q2, qOR {Xapian::Query::OP_OR, v.begin(), v.end()}, qAND {Xapian::Query::OP_OR, v.begin(), v.end()}

q1 |= qOR → q1.get_type() will be OR and the q1 terms will be A and B

q2 &= qAND → q2.get_type() will be AND and the q2 terms will be A and B

Instead q1.get_type() == q2.get_type() == LEAF_MATCH_NOTHING and contain no terms.

Reproducer:

#include <stdio.h>
#include <string>
#include <xapian.h>
void print_terms(const Xapian::Query& f, std::string s) {
  printf("%s: ", s.c_str());
  for (Xapian::TermIterator it = f.get_terms_begin();  it != f.get_terms_end(); ++it)
    printf("%s ", (*it).c_str());
  printf(" get_type=%i\n", f.get_type());
}

int main() {
  std::vector<std::string> v {"abc", "def", "ghi"};
  Xapian::Query t;
  print_terms(t, "t");
  Xapian::Query q { Xapian::Query::OP_OR, v.begin(), v.end()};
  print_terms(q, "q");
  t &= q;
  print_terms(t, "t2");
  return 0;
}

-- actual result
t:  get_type=103
q: abc def ghi  get_type=1
t2:  get_type=103

-- expected result
t:  get_type=103
q: abc def ghi  get_type=1
t2: abc def ghi get_type=1

Change History (4)

comment:1 by Olly Betts, 5 years ago

Component: OtherLibrary API
Milestone: 1.4.x1.4.12
Status: newassigned

I do expect, that Xapian::Query{} constructs an empty query, that is the neutral element towards AND and OR operations

Long, long ago (probably 15-20 years) we tried making a default-constructed query work in the context-sensitive way you're expecting, but concluded it was more confusing than helpful.

Now Xapian::Query{} works just like int{} (which is 0) in bitwise logical operations or bool{} (which is false) in boolean logical operations. So Xapian::Query{} gives you a query which matches nothing (and Xapian::Query::MatchNothing is just a static instance of this).

It's easy to handle the case of accumulating onto a query with AND:

if (t.empty()) {
    t = q;
} else {
    t &= q;
}

(And t.empty() inlines to a NULL pointer test so the extra check is low overhead.)

q1 |= qOR → q1.get_type() will be OR and the q1 terms will be A and B q2 &= qAND → q2.get_type() will be AND and the q2 terms will be A and B

Instead q1.get_type() == q2.get_type() == LEAF_MATCH_NOTHING and contain no terms.

I tested with current RELEASE/1.4 branch (this code hasn't changed since 1.4.11) and I can't reproduce q1.get_type() == LEAF_MATCH_NOTHING - instead I get q1.get_type() == OP_OR (which is what I'd expect).

The query operators do simple folding at construction time around Query{}, so for q2 the result is simplified to Query() rather than giving you an OP_AND query, and q1 is actually qOR.

The API documentation for the default Query constructor just says "Default constructor." which is uselessly vague. I'll improve that for the next release.

comment:2 by Дилян Палаузов, 5 years ago

#include <stdio.h>
#include <xapian.h>

int main() {
  std::vector<std::string> v;
  printf("%i\n", Xapian::Query {Xapian::Query::OP_OR, v.begin(), v.end()}.get_type());
}

prints 103 = LEAF_MATCH_NOTHING, which is problematic if the query is later appended with OP_AND to another query. If v is empty, that query will never match.

comment:3 by Olly Betts, 5 years ago

Xapian::Query{Xapian::Query::OP_OR, v.begin(), v.end()} is a query for any documents which match any of the terms in v. If v.empty() then no documents match any of the terms in v, so this constructs a MatchNothing query.

If you want to filter a query by OR of terms in v but handle empty v as meaning "don't apply a filter" you want something like:

if (!v.empty()) {
    q = Xapian::Query{Xapian::Query::OP_FILTER, q, Xapian::Query{Xapian::Query::OP_OR, v.begin(), v.end()}};
}

comment:4 by Olly Betts, 5 years ago

Resolution: fixed
Status: assignedclosed

I've made a lot of improvements to the Query class API docs in [00f69cf3928b44a756ddafcde248610f72babf62] (backported for 1.4.12 in [976ea09888503b9c2dc188c9b8111f928b3c19d4]).

I think that addresses all the issues raised here, so I'm going to close this. It wouldn't be feasible to change how Query() or a query constructed from an empty iterator pair works at this point as it would force an incompatible change on existing code using the API.

We could perhaps make Query(OP_INVALID) behave like you expected Query() to - if you or somebody else thinks that would be beneficial, we could try it and see how it looks in various situations, and whether it's actually helpful or just confusing.

Note: See TracTickets for help on using tickets.