Ticket #245 (assigned defect)

Opened 10 months ago

Last modified 10 months ago

All-stopword queries with two or more terms should ignore stopword list

Reported by: richard
Owned by: olly
Priority: normal
Milestone:
Component: QueryParser
Version: SVN trunk
Severity: normal
Keywords:
Cc:
Blocked By:
Operating System: All
Blocking:

Description

Currently, if a single-word query is parsed and that word is a stopword, the stopwording is ignored. However, if a multi-word query is parsed and all of its words are stopwords, the stopwording is applied, resulting in an empty query.

If all the words in the query are stopwords, I think it may make sense to ignore the stopwording. However, even if we decide to apply the stopwording in this case, we should be consistent in our behaviour.

Some examples, in Python:

>>> import xapian
>>> s = xapian.SimpleStopper()
>>> s.add('foo')
>>> s.add('bar')
>>> qp = xapian.QueryParser()
>>> qp.set_stopper(s)
>>> str(qp.parse_query('foo'))
'Xapian::Query(foo:(pos=1))'
>>> str(qp.parse_query('foo foo'))
'Xapian::Query()'
>>> str(qp.parse_query('foo bar'))
'Xapian::Query()'

Either the first parse_query() call should return Xapian::Query(), or the later ones should return non-empty queries.
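
To make the two behaviours concrete, here is a minimal toy model in plain Python. It does not use Xapian at all: lists of strings stand in for query objects, and the function names parse_current and parse_proposed are hypothetical, chosen only for this illustration. It assumes the "ignore the stopword list when every term is a stopword" option rather than the "always return an empty query" option.

```python
# Toy model only: plain lists of terms stand in for Xapian query objects.
# Function names are hypothetical and are not part of the Xapian API.
STOPWORDS = {"foo", "bar"}  # same stopwords as in the examples above


def parse_current(text, stopwords=STOPWORDS):
    """Model the behaviour reported in this ticket."""
    terms = text.split()
    if len(terms) == 1:
        # A lone term is kept even if it is a stopword.
        return terms
    # With two or more terms, stopwords are always dropped,
    # so an all-stopword query ends up empty.
    return [t for t in terms if t not in stopwords]


def parse_proposed(text, stopwords=STOPWORDS):
    """Model the proposal: ignore the stopword list if nothing would survive."""
    terms = text.split()
    kept = [t for t in terms if t not in stopwords]
    # Fall back to the unfiltered terms when every term was a stopword.
    return kept if kept else terms


if __name__ == "__main__":
    for query in ("foo", "foo foo", "foo bar", "foo baz"):
        print(repr(query), parse_current(query), parse_proposed(query))
```

In this model the proposed variant never returns an empty result for a non-empty query, so the one-term and multi-term cases are treated the same way. Queries that mix stopwords with ordinary terms are filtered exactly as before.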

Attachments

qp-allstop.patch (3.0 kB) - added by olly 10 months ago.
Simple approach to fixing this

Change History

Changed 10 months ago by olly

  • status changed from new to assigned
  • summary changed from Inconsistent behaviour if all words in the query are stopwords. to All-stopword queries with two or more terms should ignore stopword list

The bug is that all-stopword queries with two or more terms aren't handled specially. The single-term case was easy to handle, while the multi-term case seemed trickier for some reason, so I have only handled the easy one so far.

There's actually a FIXME comment about this in queryparser.lemony which probably shows where the fix should go:

// FIXME what if E && E->empty() (all terms are stopwords)?

Changed 10 months ago by olly

  • owner changed from newbugs to olly
  • status changed from assigned to new

Changed 10 months ago by olly

  • attachment qp-allstop.patch added

Simple approach to fixing this

Changed 10 months ago by olly

  • status changed from new to assigned

Changed 10 months ago by trac

  • platform set to All