#174 closed enhancement (released)
The check_at_least parameter to get_mset could be more efficient
Reported by: | Richard Boulton | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Matcher | Version: | SVN trunk |
Severity: | minor | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
Currently, judging from the implementation of check_at_least, it looks like it causes up to check_at_least potential matches to be held in memory; the benefit over setting maxitems instead is mostly just avoiding sorting so many matches.
However, I think it could be implemented simply by keeping the "min_weight" parameter passed to next_handling_prune() at its initial value until check_at_least hits have been seen.
This would require less memory to be used, and could be a big win for large values of check_at_least.
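The proposal can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Xapian matcher: `toy_match`, its arguments, and the plain `weight < min_weight` skip (standing in for what next_handling_prune() would do with the threshold) are all hypothetical. It shows that the candidate heap never needs more than maxitems entries, while the threshold stays at its initial value until check_at_least hits have been seen.

```python
import heapq


def toy_match(weighted_docs, maxitems, check_at_least):
    """Toy model of the proposal (hypothetical, not Xapian code).

    weighted_docs: iterable of (docid, weight) pairs in match order.
    Returns (docids of the top maxitems by weight, number of docs examined).
    """
    top = []          # min-heap of (weight, docid); never exceeds maxitems
    min_weight = 0.0  # threshold handed to the (hypothetical) pruning step
    docs_seen = 0     # documents actually examined, i.e. not pruned away

    for docid, weight in weighted_docs:
        # Stand-in for pruning: anything below the threshold is skipped.
        if weight < min_weight:
            continue
        docs_seen += 1

        if len(top) < maxitems:
            heapq.heappush(top, (weight, docid))
        elif (weight, docid) > top[0]:
            heapq.heapreplace(top, (weight, docid))

        # Only raise the threshold once enough hits have been seen;
        # until then it stays at its initial value.
        if docs_seen >= check_at_least and len(top) == maxitems:
            min_weight = top[0][0]

    ranked = sorted(top, key=lambda wd: (-wd[0], wd[1]))
    return [docid for _, docid in ranked], docs_seen
```

With six documents, maxitems=2 and check_at_least=4, this toy examines five documents and skips one; with check_at_least=0 it examines four. In both cases only two candidates are ever held.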
However, test coverage for check_at_least is currently very thin, so this would need to be fixed before attempting to implement this feature.
Attachments (1)
Change History (6)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Severity: | normal → enhancement |
---|
comment:3 by , 17 years ago
Owner: | changed from | to
---|
Pondering over the matcher code, the idea behind check_at_least isn't really useful if we're sorting primarily by value, since we need to consider all possible matches then anyway.
And sort by relevance-then-value is much the same as sort by relevance from this point of view.
Since it was the sort-by-value cases I was worrying about, I'm pretty much convinced this plan works.
I've added more tests for check_at_least which all pass with the patch, so I'm going to apply it once I've checked it over again.
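The point about sorting primarily by value can be seen in a small example. This is a hypothetical sketch, not matcher code: `top_by_value` and the (docid, weight, value) tuples are made up for illustration. A weight threshold can discard the very document that would rank first by value, which is why every match has to be considered in that case regardless of check_at_least.

```python
def top_by_value(docs, maxitems, min_weight=0.0):
    """Rank (docid, weight, value) tuples by value, highest first.

    min_weight mimics weight-based pruning; hypothetical helper only.
    """
    kept = [d for d in docs if d[1] >= min_weight]
    kept.sort(key=lambda d: (-d[2], d[0]))  # value descending, then docid
    return [d[0] for d in kept[:maxitems]]


# Document 2 has the highest value but the lowest weight.
EXAMPLE_DOCS = [(1, 0.9, 10), (2, 0.2, 50), (3, 0.6, 30)]
```

Here `top_by_value(EXAMPLE_DOCS, 1)` returns `[2]`, but with `min_weight=0.5` it returns `[3]`, so pruning on weight would change the result.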
comment:5 by , 17 years ago
Operating System: | → All |
---|---|
Resolution: | fixed → released |
Fix is in upcoming 1.0.2 release.
Olly comments that this would probably need to check that docs_matched - duplicates_found >= check_at_least.
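As a sketch of that condition: the helper name below is hypothetical, and it assumes duplicates_found counts matches discarded as duplicates (for example by collapsing), so that only distinct hits count towards check_at_least.

```python
def may_raise_min_weight(docs_matched, duplicates_found, check_at_least):
    """Return True once enough distinct hits have been seen.

    Hypothetical helper; assumes duplicates_found is the number of
    matched documents that were dropped as duplicates.
    """
    return docs_matched - duplicates_found >= check_at_least
```

For instance, with 10 documents matched and 3 of them dropped as duplicates, a check_at_least of 8 is not yet satisfied, whereas a check_at_least of 7 is.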