Opened 13 years ago

Last modified 4 months ago

#521 new defect

Precedence of HATE is surprising

Reported by: Olly Betts Owned by: Olly Betts
Priority: normal Milestone: 2.0.0
Component: QueryParser Version: 1.2.3
Severity: normal Keywords:
Cc: dcolish@… Blocked By:
Blocking: Operating System: All

Description (last modified by Olly Betts)

This query:

a OR b -c

gets parsed as:

a OR (b -c)

rather than the more obvious interpretation of:

(a OR b) -c

While this appears to be just a precedence issue, actually HATE is resolved within a "term group" and b -c is a term group here, so this might take a bit of reworking to fix.

Attachments (1)

grammar.out (1.8 KB ) - added by Dan 13 years ago.
Literal Refactoring of Grammar

Download all attachments as: .zip

Change History (10)

comment:1 by Olly Betts, 13 years ago

Description: modified (diff)

comment:2 by Dan, 13 years ago

After looking briefly at the grammar, I think a good refactoring will be needed to support this style of precedence. The main issue is that the grammar will branch on the boolean operators before ever considering the hate/love operators. This means that even with precedence weight changes, the grammar could not support the expected output. One way to resolve this is to classify all binary expressions and reduce those. This way, the potential reduction of '-' before 'AND' will be controlled by the precedence of operators. Another approach would be to structure the grammar to ensure the LOVE/HATE operators are reduced before the boolean ones; however, this could also have unintended side-effects. I did dump out the lemon grammar as it stands::

// Reprint of input file "queryparser.lemony".
// Symbols:
//   0 $               10 HATE_AFTER_AND  20 BRA             30 stop_prob     
//   1 ERROR           11 SYNONYM         21 KET             31 stop_term     
//   2 OR              12 TERM            22 EMPTY_GROUP_OK  32 compound_term 
//   3 XOR             13 GROUP_TERM      23 error           33 phrase        
//   4 AND             14 PHR_TERM        24 query           34 phrased_term  
//   5 NOT             15 WILD_TERM       25 expr            35 group         
//   6 NEAR            16 PARTIAL_TERM    26 prob_expr       36 near_expr     
//   7 ADJ             17 BOOLEAN_FILTER  27 bool_arg        37 adj_expr      
//   8 LOVE            18 RANGE           28 prob          
//   9 HATE            19 QUOTE           29 term          
query ::= expr.
query ::=.
expr ::= prob_expr.
expr ::= bool_arg AND bool_arg.
expr ::= bool_arg NOT bool_arg.
expr ::= bool_arg AND NOT bool_arg. [NOT]
expr ::= bool_arg AND HATE_AFTER_AND bool_arg. [AND]
expr ::= bool_arg OR bool_arg.
expr ::= bool_arg XOR bool_arg.
bool_arg ::= expr.
bool_arg ::=. [ERROR]
prob_expr ::= prob.
prob_expr ::= term.
prob ::= RANGE.
prob ::= stop_prob RANGE.
prob ::= stop_term stop_term.
prob ::= prob stop_term.
prob ::= LOVE term.
prob ::= stop_prob LOVE term.
prob ::= HATE term.
prob ::= stop_prob HATE term.
prob ::= HATE BOOLEAN_FILTER.
prob ::= stop_prob HATE BOOLEAN_FILTER.
prob ::= BOOLEAN_FILTER.
prob ::= stop_prob BOOLEAN_FILTER.
prob ::= LOVE BOOLEAN_FILTER.
prob ::= stop_prob LOVE BOOLEAN_FILTER.
stop_prob ::= prob.
stop_prob ::= stop_term.
stop_term ::= TERM.
stop_term ::= compound_term.
term ::= TERM.
term ::= compound_term.
compound_term ::= WILD_TERM.
compound_term ::= PARTIAL_TERM.
compound_term ::= QUOTE phrase QUOTE.
compound_term ::= phrased_term.
compound_term ::= group.
compound_term ::= near_expr.
compound_term ::= adj_expr.
compound_term ::= BRA expr KET.
compound_term ::= SYNONYM TERM.
phrase ::= TERM.
phrase ::= phrase TERM.
phrased_term ::= TERM PHR_TERM.
phrased_term ::= phrased_term PHR_TERM.
group ::= TERM GROUP_TERM.
group ::= group GROUP_TERM.
group ::= group EMPTY_GROUP_OK.
near_expr ::= TERM NEAR TERM.
near_expr ::= near_expr NEAR TERM.
adj_expr ::= TERM ADJ TERM.
adj_expr ::= adj_expr ADJ TERM.
Last edited 8 years ago by Olly Betts (previous) (diff)

in reply to:  2 comment:3 by Dan, 13 years ago

Sorry, that grammar paste didnt work out too well. I've got it pasted: http://paste.pocoo.org/raw/347500/

by Dan, 13 years ago

Attachment: grammar.out added

Literal Refactoring of Grammar

comment:4 by Dan, 13 years ago

Took a shot at getting this refactored, i think it will make a little more sense not that we're not trying to explicitly match each possible combination of expressions, and instead letting the expressions reduce themselves. There might be some right recursion in it which needs to be factored out more.

comment:5 by Dan, 13 years ago

Cc: dcolish@… added

comment:6 by Olly Betts, 13 years ago

There are some problems with the grammar in grammar.out.

One is easily fixed - expr can be a term but can also be a prob which can be a term. I think you don't want to allow expr to be a term directly.

The rules for group and phrase_term also seem to be wrong - these needs to start with a TERM, but your rules seem to mean it's the last but one token which needs to be a TERM. I think the recursive rule in each case needs its RHS transposing.

But there are other issues which seem harder to address - for example, you can specify arbitrary numbers of PLUS and MINUS in a row (because unop loops back to expr). Also even without that, you can write: term1 PLUS PLUS term2 (first PLUS from binop, second from unop).

comment:7 by Olly Betts, 11 years ago

Milestone: 1.2.x1.3.x

comment:8 by Olly Betts, 8 years ago

Milestone: 1.3.x1.4.x

Doesn't affect the ABI, so not a 1.4.0 blocker.

comment:9 by Olly Betts, 4 months ago

Milestone: 1.4.x2.0.0

Reluctantly postponing again.

Note: See TracTickets for help on using tickets.