Changes between Version 4 and Version 11 of Ticket #216


Timestamp: 16/07/08 02:45:52 (17 years ago)
Author: Olly Betts
Comment:

Merge the ReleaseNotes entry from 1.0.7 into the description to try to keep information about this issue in one place.

  • Ticket #216

    • Property Milestone changed from 1.1 to 1.0.8
    • Property Owner changed from New Bugs to Olly Betts
    • Property Blocking 200
  • Ticket #216 – Description

    v4  v11
     1      When results are being sorted by a value, the percentage values for the results
     2      returned are normalised based on the document in the portion of the mset
         1  When results are being sorted primarily by an order other than relevance (e.g. {{{sort_by_value()}}}), the percentage values returned by the MSet object may be incorrect because they are
         2  calculated based on the document in the portion of the MSet
     3   3  requested which has the highest weight, instead of the document matching the
     4      query which has the highest weight.  I have a testcase demonstrating this which
     5      I will attach shortly.
         4  query which has the highest weight.
     6   5
     7      This is because, in multimatch.cc, we calculate "best" by looking for the
         6  This issue has existed in all previous Xapian releases, as far as we can tell.
         7
         8  There is currently no fix in progress, since it is probably not possible to fix without significant loss of efficiency, which would
         9  adversely affect users who aren't interested in the percentage scores.
        10
        11  If you really need percentage scores in this situation, one workaround would be to first run the search using relevance order, asking for only the top document, and to remember the weight and percentage assigned to that document. Then, re-run the search in sorted order, and calculate the percentages yourself from the weights assigned to the results, using this information.
        12
        13  A testcase demonstrating this is attached to this ticket.
        14
        15  The issue is that in multimatch.cc, we calculate "best" by looking for the
     8  16  highest weighted document in the candidate mset, but when sorting by anything
     9  17  other than relevance, the highest weighted document may have been discarded already.
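
The workaround in the v11 description can be sketched as plain arithmetic, independent of the Xapian bindings. This is only an illustration under stated assumptions: the function names are hypothetical, the percentage is assumed to scale linearly with the weight, and the "naive" function is a simplified model of the reported behaviour that treats the best document in the returned portion as 100%. In real use the weights and the reference percentage would come from the two searches (for example via `get_weight()` and `get_percent()` on the MSet iterator).

```python
def naive_percentages(portion_weights):
    """Simplified model of the bug: normalise against the highest weight
    found in the returned portion only (assumed here to score 100%)."""
    if not portion_weights:
        return []
    best = max(portion_weights)
    if best <= 0:
        return [0 for _ in portion_weights]
    return [round(w * 100 / best) for w in portion_weights]


def percentages_from_weights(weights, top_weight, top_percent):
    """Workaround sketch: rescale weights from the sorted search using the
    weight and percentage of the top document from a relevance-ordered
    search. Assumes percentage is proportional to weight."""
    if top_weight <= 0:
        return [0 for _ in weights]
    factor = top_percent / top_weight
    # Clamp to 0..100 in case rounding of top_percent pushes a value over.
    return [max(0, min(100, round(w * factor))) for w in weights]


# Hypothetical data: the best match overall has weight 8.0 and 100%,
# but the sorted search only returned documents with weights 2.0 and 4.0.
sorted_portion = [2.0, 4.0]
print(naive_percentages(sorted_portion))                     # [50, 100]
print(percentages_from_weights(sorted_portion, 8.0, 100))    # [25, 50]
```

Because the reference percentage is itself a rounded integer, values recovered this way may differ by a point from what a relevance-ordered search would report.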