Speed up phrase queries with a "settling pond"
|Reported by:||Olly Betts||Owned by:||Olly Betts|
The attached patch implements a "settling pond" to delay the checking of exact phrases which are "and-like with the root" (by which I mean that if the phrase doesn't match, the whole query doesn't match). Candidates go into the pond instead of being phrase-checked straight away. When the pond fills up, or the postlist tree is done, we take the highest weighted entries from the pond and check those first, which makes it more likely we'll increase min_weight. We can then discard pond entries which are below the current min_weight without needing to do the potentially expensive phrase check.
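To make the idea concrete, here is a minimal standalone sketch of the technique. It is not the code from the attached patch and does not use Xapian's internal classes: `Candidate`, `SettlingPond`, `pond_size`, `max_results` and the `PhraseCheck` callback are all hypothetical names invented for illustration. It also assumes the whole pond is drained each time it fills, which the patch may well do differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical candidate: a document which matched everything except the
// phrase check, which hasn't been performed yet.
struct Candidate {
    unsigned docid;
    double weight;
};

// Comparator which puts the LOWEST weight on top of a priority_queue.
struct LowestFirst {
    bool operator()(const Candidate& a, const Candidate& b) const {
        return a.weight > b.weight;
    }
};

class SettlingPond {
  public:
    // Stand-in for the expensive positional check.
    using PhraseCheck = std::function<bool(unsigned)>;

    SettlingPond(std::size_t pond_size, std::size_t max_results,
                 PhraseCheck check)
        : pond_size_(pond_size), max_results_(max_results),
          check_(std::move(check)) {
        assert(pond_size_ > 0 && max_results_ > 0);
    }

    // Offer a candidate; the phrase check is deferred until the pond drains.
    void add(const Candidate& c) {
        if (!can_enter(c.weight)) return;  // Too light already: drop unchecked.
        pond_.push_back(c);
        if (pond_.size() >= pond_size_) drain();
    }

    // Call once the postlist tree is done; returns results, best first.
    std::vector<Candidate> finish() {
        drain();
        std::vector<Candidate> out;
        while (!results_.empty()) {
            out.push_back(results_.top());
            results_.pop();
        }
        std::reverse(out.begin(), out.end());
        return out;
    }

    // Number of phrase checks actually performed.
    std::size_t checks() const { return checks_; }

  private:
    // True if this weight could still make it into the top max_results.
    bool can_enter(double weight) const {
        return results_.size() < max_results_ ||
               weight > results_.top().weight;
    }

    void drain() {
        // Check the heaviest entries first so min_weight rises quickly.
        std::sort(pond_.begin(), pond_.end(),
                  [](const Candidate& a, const Candidate& b) {
                      return a.weight > b.weight;
                  });
        for (const Candidate& c : pond_) {
            // Everything after this entry is no heavier, so stop checking.
            if (!can_enter(c.weight)) break;
            ++checks_;
            if (!check_(c.docid)) continue;
            if (results_.size() == max_results_) results_.pop();
            results_.push(c);
        }
        pond_.clear();  // Unchecked leftovers are below min_weight.
    }

    std::size_t pond_size_;
    std::size_t max_results_;
    PhraseCheck check_;
    std::size_t checks_ = 0;
    std::vector<Candidate> pond_;
    std::priority_queue<Candidate, std::vector<Candidate>, LowestFirst>
        results_;
};
```

As an illustration, take a pond of 4 entries, 2 wanted results, and a pretend phrase check which only even docids pass. Feeding in weights 0.5, 0.9, 0.1, 0.8, 0.2, 0.3 for docids 1 to 6 performs just two phrase checks (docids 2 and 4). Checking each candidate as it arrived would have needed four, since docids 1 and 3 would have been checked before the result set filled up.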
This patch can dramatically improve query speed for exact phrase queries on common terms when the positional data isn't cached.
This could be extended to any OP_PHRASE or OP_NEAR check which is "and-like with the root", and also to perform more than one such check.
The patch needs cleaning up in a few places, and the pond size should default to something sane (based on DB size perhaps?) but I'm putting it here to make sure it doesn't get lost or forgotten.
Patch is against trunk r13285.
Change History (23)
comment:16, 5 months ago
|Milestone:||1.4.x → 1.5.0|
|Priority:||high → normal|
|Version:||1.1.2 → git master|