Opened 14 years ago
Closed 14 years ago
#524 closed enhancement (wontfix)
dynamic weight
| Reported by: | saridemir | Owned by: | Olly Betts |
| --- | --- | --- | --- |
| Priority: | normal | Milestone: | |
| Component: | Matcher | Version: | |
| Severity: | normal | Keywords: | |
| Cc: | | Blocked By: | |
| Blocking: | | Operating System: | Microsoft Windows |
Description
Hello, I am a C# developer and I use the C# version of Xapian. I collect product information from several e-commerce web sites. When I search by product name, Xapian favours one site's products because its product names are shorter. For example, one site writes "Nokia n95", "nokia n97", "nokia n77", while a second site writes "Nokia n95 cellphone", "nokia n97 cellphone". When I search for "nokia", Xapian shows me "nokia n95", "nokia n97", "nokia n77", "Nokia n95 cellphone", "nokia n97 cellphone", but that ordering is not relevant. The result should be "nokia n95", "nokia n97", "Nokia n95 cellphone", "nokia n77", "nokia n97 cellphone". I need to override the weight function so that each result's custom weight factor is calculated from the previous result. Is that possible?
Note: I currently use Lucene.net, where I override the weight function roughly as follows:
// siteweight: a per-site global variable, initially set to 1
weight = weight * siteweight;
siteweight *= 0.98;
return weight;
Thanks,
Change History (4)
comment:1 by , 14 years ago
comment:2 by , 14 years ago
| Component: | Other → Matcher |
| --- | --- |
| Resolution: | → wontfix |
| Status: | new → closed |
| Type: | defect → enhancement |
If I understand the original proposal correctly, the scaling should be applied to the results in descending rank order, which means that we don't know the true weight of a document until we have processed all documents, because it might be scaled down by any amount later on. So rather than tracking the best N results we've seen so far, we would have to track all results which match, then go through and scale down some weights, then pick the best N and re-sort them.
Or is the idea to simply increasingly scale down matches for the same site as we encounter them? That seems horribly arbitrary...
So either way, I don't think we want to implement this, hence closing as "wontfix".
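For anyone who still wants this behaviour, it can be approximated outside the matcher by over-fetching and re-ranking in application code. A rough C++ sketch, assuming each document stores its site name in value slot 0 (an assumption for illustration only):

```cpp
// Rough illustration of the approach described above: fetch more results
// than needed, scale weights down per site in rank order, then re-sort
// and keep the best N.  Assumes the site name is in value slot 0.
#include <xapian.h>

#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Hit {
    Xapian::docid did;
    double weight;
};

std::vector<Hit> rerank_with_site_decay(Xapian::Enquire& enquire,
                                        Xapian::doccount want,
                                        double decay = 0.98)
{
    // Over-fetch: a decayed hit may drop below one we haven't seen yet,
    // so 10x the requested size is only an approximation of "track all
    // results which match".
    Xapian::MSet mset = enquire.get_mset(0, want * 10);
    std::map<std::string, double> site_factor;
    std::vector<Hit> hits;
    for (Xapian::MSetIterator it = mset.begin(); it != mset.end(); ++it) {
        std::string site = it.get_document().get_value(0);
        // Each site starts at factor 1.0 and decays per hit from that site.
        double& factor = site_factor.insert({site, 1.0}).first->second;
        hits.push_back(Hit{*it, it.get_weight() * factor});
        factor *= decay;
    }
    std::sort(hits.begin(), hits.end(),
              [](const Hit& a, const Hit& b) { return a.weight > b.weight; });
    if (hits.size() > want) hits.resize(want);
    return hits;
}
```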
comment:3 by , 14 years ago
| Resolution: | wontfix |
| --- | --- |
| Status: | closed → reopened |
Hello, I used Lucene.net previously. (I can't keep using it because Lucene.net has a memory bug: memory usage keeps increasing.) I worked around the problem using MultiSearcher: I wrote a custom searcher with a global variable, initially set to 1, which is multiplied by 0.99 each time the weight calculation is called, and the call returns the original weight multiplied by that global variable. I had to define each site's products as a separate index and search them all with MultiSearcher. However, I couldn't do the same with Sphinx, because there the maximum weight score always changes, so this approach doesn't work unless I can fix the maximum weight at 1. If I can, I search the same term in each index. That solves my problem, not 100% but 90%, and that is enough for me.
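For reference, the "fix the maximum weight at 1" part is possible with Xapian, which reports an upper bound on the weight attainable for a query. A small C++ sketch, assuming one database per site as in the MultiSearcher setup described above:

```cpp
// Sketch of normalising weights into [0, 1] per query, so that per-site
// factors are comparable across separate databases (one per site).
#include <xapian.h>

// Return the top hit's weight normalised into [0, 1] for this query.
double normalised_top_weight(Xapian::Database& db, const Xapian::Query& query)
{
    Xapian::Enquire enquire(db);
    enquire.set_query(query);
    Xapian::MSet mset = enquire.get_mset(0, 1);
    if (mset.empty()) return 0.0;
    // get_max_possible() is the maximum weight any document could attain
    // for this query, so the ratio is comparable across databases.
    return mset.begin().get_weight() / mset.get_max_possible();
}
```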
comment:4 by , 14 years ago
| Resolution: | → wontfix |
| --- | --- |
| Status: | reopened → closed |
It sounds like what you really want is for the document length not to affect the weights. If so, the best way to achieve that is simply to set the weighting scheme parameters appropriately:
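For example, with the C++ API, BM25's document-length normalisation can be switched off by setting its b parameter to 0. A minimal sketch; apart from b, the values below are the library defaults:

```cpp
// Disable BM25 document-length normalisation by setting b (the 4th
// parameter) to 0.  The parameters are (k1, k2, k3, b, min_normlen).
#include <xapian.h>

void disable_length_normalisation(Xapian::Enquire& enquire)
{
    enquire.set_weighting_scheme(Xapian::BM25Weight(1, 0, 1, 0, 0.5));
}
```

With b = 0 the term-frequency component no longer depends on document length, so "nokia n95" and "Nokia n95 cellphone" are no longer ranked apart just because one name is shorter.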