Opened 10 years ago

Last modified 4 months ago

#677 new defect

OP_WILDCARD with multidb

Reported by: Olly Betts
Owned by: Olly Betts
Priority: normal
Milestone: 2.0.0
Component: Library API
Version: git master
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description (last modified by Olly Betts)

Currently the limits for both of these are applied separately for each subdatabase in the multidatabase case - really they should both work the same as for a single database containing the same documents as all the subdatabases.

OP_ELITE_SET and QueryParser's wildcard expansion limits do work that way in 1.2.x, so this seems like it ought to be fixed before 1.4.0.
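
To illustrate the difference between the two behaviours, here is a toy model in Python. It is only a sketch and does not use the Xapian API: each shard is assumed to be a plain dict mapping term to term frequency, and the helper names (`expand_prefix`, `most_frequent`, `expand_per_shard`, `expand_merged`) are hypothetical. It models a "keep the N most frequent terms" style of limit; other limit modes would differ in detail.

```python
from collections import Counter


def expand_prefix(shard, prefix):
    """Return {term: termfreq} for the terms in one shard starting with prefix."""
    return {t: f for t, f in shard.items() if t.startswith(prefix)}


def most_frequent(termfreqs, limit):
    """Keep the `limit` terms with the highest frequency (ties broken by term)."""
    ranked = sorted(termfreqs.items(), key=lambda tf: (-tf[1], tf[0]))
    return [t for t, _ in ranked[:limit]]


def expand_per_shard(shards, prefix, limit):
    """Apply the limit to each shard separately, then take the union."""
    chosen = set()
    for shard in shards:
        chosen.update(most_frequent(expand_prefix(shard, prefix), limit))
    return sorted(chosen)


def expand_merged(shards, prefix, limit):
    """Apply the limit once, as if all shards were a single database."""
    combined = Counter()
    for shard in shards:
        combined.update(expand_prefix(shard, prefix))
    return sorted(most_frequent(combined, limit))


# Hypothetical data: two shards with overlapping "ca*" terms.
shard_a = {"cat": 5, "car": 4, "cab": 1, "dog": 9}
shard_b = {"cab": 6, "can": 5, "cat": 1}

per_shard = expand_per_shard([shard_a, shard_b], "ca", 2)
merged = expand_merged([shard_a, shard_b], "ca", 2)
print(per_shard)
print(merged)
```

With a limit of 2, the per-shard version ends up with four terms overall, because each shard picks its own top two. The merged version picks two terms using the combined frequencies, which is what a single database holding the same documents would do.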

Change History (6)

comment:1 by Olly Betts, 9 years ago

Milestone: 1.3.4 → 1.3.x

comment:2 by Olly Betts, 9 years ago

Description: modified (diff)
Milestone: 1.3.x → 1.4.x
Summary: OP_WILDCARD and OP_ELITE_SET with multidb → OP_WILDCARD with multidb

Turns out OP_ELITE_SET is easy to fix - we just remove the special case check for termfreq_max() being zero! That's done in [6f3ff69b87bf50390c09c4f0f31740857c241109].

The fix for OP_WILDCARD is unfortunately not just a matter of deleting a couple of lines of code.

The reason for the wildcard limits is to improve performance so we want the implementation to be efficient, and perhaps that's more important than it being exact. Care is certainly needed to make sure any fix isn't a lot slower.

As I noted in the description, 1.2.x's QueryParser wildcard expansion limits work differently, but the current behaviour isn't entirely unreasonable, and OP_WILDCARD and these limits are new features.

So I think for 1.4.0 we document what currently happens, and note that these details may change - done in [a6cbcf9ee8c091adef1547571ac502f8c78343c2].

comment:3 by Olly Betts, 9 years ago

Description: modified (diff)

comment:4 by Olly Betts, 5 years ago

Version: SVN trunk → git master

comment:5 by Olly Betts, 20 months ago

Milestone: 1.4.x → 2.0.0

comment:6 by Olly Betts, 4 months ago

A related problem is that we currently use the shard's termfreqs for weighting: [b0834cbac83d2cc02faa1fe0fbb03a8e6f6684f5]
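
As a rough illustration of why the choice of termfreq matters for weighting, the sketch below compares a weight computed from each shard's own statistics with one computed from the combined statistics. It assumes a simple `log(N / termfreq)` weight purely for illustration; this is not the formula any particular Xapian weighting scheme uses, and the shard figures are made up.

```python
import math


def idf(doc_count, termfreq):
    """Assumed toy weight: rarer terms score higher."""
    return math.log(doc_count / termfreq)


# Hypothetical statistics for one expanded term in two shards.
shards = [
    {"doc_count": 100, "termfreq": 1},   # rare in this shard
    {"doc_count": 100, "termfreq": 50},  # common in this shard
]

# Weight computed separately from each shard's own statistics.
shard_weights = [idf(s["doc_count"], s["termfreq"]) for s in shards]

# Weight computed once from the statistics of all shards combined.
total_docs = sum(s["doc_count"] for s in shards)
total_termfreq = sum(s["termfreq"] for s in shards)
global_weight = idf(total_docs, total_termfreq)

print(shard_weights, global_weight)
```

In this model the same term gets a much higher weight in the first shard than in the second, whereas a single merged database would give it one weight lying between the two.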
