Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#815 closed defect (fixed)

Omega: the query template returns different search results from the topterms, xml, and opensearch templates

Reported by: Gennadiy
Owned by: Olly Betts
Priority: normal
Milestone: 1.4.19
Component: Omega
Version: 1.4.18
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: Linux

Description

Hello and thanks for the development of xapian and omega.

It seems strange to me that the "query" template returns different results from the "topterms", "xml", and "opensearch" templates. I assume this is not intentional.

Setup: Debian 11 with the Debian packages xapian-omega (1.4.18-2) and apache2, all pretty much "out of the box". The database (attached) was created via omindex, with poppler-utils for PDF. All of the discrepancies are also present without PDF entries.

Discrepancies (the tiny database to reproduce them is attached):

  • The relevance scale always differs: in query it seems to be relative to the whole database, while in topterms|xml|opensearch it seems to be relative to the query results.

  • omega?P=bigdata&DEFAULTOP=and&DB=default&FMT=query finds 9 matches, while omega?P=bigdata&DEFAULTOP=and&DB=default&FMT=[topterms|xml|opensearch] finds 7 matches.

  • omega?P=lovely+try&DEFAULTOP=and&DB=default&FMT=query returns cobalt_A4_love.plot.html first and cookbook---using-more-complex-recipes-involving-text.html second, while omega?P=lovely+try&DEFAULTOP=and&DB=default&FMT=[topterms|xml|opensearch] returns cookbook---using-more-complex-recipes-involving-text.html first and cobalt_A4_love.plot.html second.

Best regards, Gennadiy.

Attachments (1)

default.tar.gz (469.5 KB) - added by Gennadiy 3 years ago.
tiny glass database to reproduce the discrepancies

Change History (4)

by Gennadiy, 3 years ago

Attachment: default.tar.gz added

tiny glass database to reproduce the discrepancies

comment:1 by Olly Betts, 3 years ago

Milestone: 1.4.19
Status: new → assigned

The topterms template is just a thin wrapper around the query one:

$set{topterms,$or{$query,$ne{$msize,0}}}$include{query}

The problem here is that checking $msize forces the query to be parsed and the match run, and that means that some things set up in the query template (the term prefixes at least) end up ignored because they're now set too late to make a difference.

I think we probably need to move at least the $msize check into the query template.
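The ordering problem described above can be sketched with a toy model in Python. This is not Omega's code: the `LazySearch` class, its `set_fields` and `msize` methods, and the two sample documents are all hypothetical, chosen only to show how a lazily run, cached match ignores settings that arrive after something has already forced it.

```python
class LazySearch:
    """Hypothetical toy model: the match runs lazily and is cached."""

    def __init__(self, docs, term):
        self.docs = docs          # list of dicts mapping field name -> text
        self.term = term          # single search term
        self.fields = ["body"]    # fields searched unless configured otherwise
        self._matches = None      # cached result of the first match run

    def set_fields(self, fields):
        # Stands in for setting up term prefixes. It only influences a
        # match that has not been run yet.
        self.fields = list(fields)

    def msize(self):
        # Stands in for $msize: the first call forces the match to run,
        # and later calls reuse the cached result.
        if self._matches is None:
            self._matches = [
                d for d in self.docs
                if any(self.term in d.get(f, "").split() for f in self.fields)
            ]
        return len(self._matches)


# Made-up sample documents, for illustration only.
DOCS = [
    {"title": "bigdata overview", "body": "an introduction"},
    {"title": "notes", "body": "bigdata in practice"},
]

# Configure first, then match: the title field is searched as well.
early = LazySearch(DOCS, "bigdata")
early.set_fields(["title", "body"])
early_count = early.msize()

# Match first, then configure: the setting arrives too late.
late = LazySearch(DOCS, "bigdata")
late_count = late.msize()
late.set_fields(["title", "body"])
late_count_after = late.msize()
```

In this sketch, `early` finds both documents, while `late` finds only the one matching in the body, and changing the fields afterwards does not alter its cached result. That mirrors the situation where the wrapper checks the match size before the included template has configured the search.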

The issue with opensearch and xml is that they don't have the code to set up the term prefixes. So they're only searching the body of documents, whereas the query template searches title and topic too. It would be more consistent for them to default to the same fields.

comment:2 by Olly Betts, 3 years ago

Resolution: fixed
Status: assigned → closed

comment:3 by Gennadiy, 3 years ago

Thank you!
