Opened 14 years ago

Closed 14 years ago

#524 closed enhancement (wontfix)

dynamic weight

Reported by: saridemir
Owned by: Olly Betts
Priority: normal
Milestone:
Component: Matcher
Version:
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: Microsoft Windows

Description

Hello, I am a C# developer and I use the C# version of Xapian. I collect product information from several e-commerce web sites. When I search on product names, Xapian shows only one site's products at the top, because its product names are the shortest. For example, one site writes "Nokia n95", "nokia n97", "nokia n77" and a second site writes "Nokia n95 cellphone", "nokia n97 cellphone". When I search for "nokia", the Xapian engine returns "nokia n95", "nokia n97", "nokia n77", "Nokia n95 cellphone", "nokia n97 cellphone", but this order is not relevant. The result should be "nokia n95", "nokia n97", "Nokia n95 cellphone", "nokia n77", "nokia n97 cellphone". I need to override the weight function so that, for each result, a custom weight factor is calculated from the previous results. Is it possible?

Note: I currently use Lucene.net, and I override the weight function as follows:

siteweight: a global variable for each site,

    weight = weight * siteweight;
    siteweight *= 0.98;
    return weight;

Thanks,

Change History (4)

comment:1 by Olly Betts, 14 years ago

It sounds like what you're really wanting is for the document length not to affect the weights - if so, the best way to achieve that is to simply set the weighting scheme parameters appropriately:

    enquire.set_weighting_scheme(new Xapian.BM25Weight(1, 0, 1, 0, 0.5));
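To illustrate why those parameters help: the `BM25Weight` constructor takes its arguments in the order (k1, k2, k3, b, min_normlen), so the call above sets b to 0, and b is the parameter that controls how strongly document length influences the weight. The Python sketch below uses the simplified textbook form of the BM25 term weight, not Xapian's exact implementation, and the function name and sample numbers are hypothetical.

```python
def bm25_term_weight(wdf, doc_len, avg_len, idf, k1=1.0, b=0.0):
    """Textbook BM25 weight of one term in one document.

    Simplified sketch: Xapian's real formula also involves k2, k3 and
    min_normlen, which are left out here.
    """
    norm_len = doc_len / avg_len  # document length relative to the average
    return idf * (wdf * (k1 + 1)) / (wdf + k1 * ((1 - b) + b * norm_len))


# Hypothetical documents: "nokia n95" (2 terms) and "nokia n95 cellphone" (3 terms).
short_b0 = bm25_term_weight(wdf=1, doc_len=2, avg_len=3, idf=1.0, b=0.0)
long_b0 = bm25_term_weight(wdf=1, doc_len=3, avg_len=3, idf=1.0, b=0.0)

short_b1 = bm25_term_weight(wdf=1, doc_len=2, avg_len=3, idf=1.0, b=1.0)
long_b1 = bm25_term_weight(wdf=1, doc_len=3, avg_len=3, idf=1.0, b=1.0)
```

With b = 0 the two documents get the same weight for "nokia"; with b = 1 the shorter product name scores higher, which is the behaviour described in the report.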

comment:2 by Olly Betts, 14 years ago

Component: Other → Matcher
Resolution: → wontfix
Status: new → closed
Type: defect → enhancement

If I understand the original proposal correctly, the scaling should be applied to the results in descending rank order, which means that we don't know the true weight of a document until we have processed all documents, because it might be scaled down by any amount later on. So rather than tracking the best N results we've seen so far, we would have to track all results which match, then go through and scale down some weights, then pick the best N and re-sort them.
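A minimal sketch of that first interpretation, written in Python as post-processing outside the matcher and not as anything Xapian provides: all matches are collected first, then results are picked greedily in rank order while a per-site factor decays. The function name, the 0.98 decay (taken from the reporter's Lucene.net note) and the sample weights are assumptions for illustration.

```python
def rerank_with_site_decay(matches, top_n, decay=0.98):
    """Pick top_n results, scaling each site's weights down as it is used.

    matches: list of (doc, site, weight) for ALL matching documents.
    Returns a list of (doc, scaled_weight) in final rank order.
    """
    remaining = list(matches)
    site_factor = {}  # current scale factor per site, starts at 1.0
    ranked = []
    while remaining and len(ranked) < top_n:
        # The best candidate depends on the factors accumulated so far,
        # so no document's final weight is known in advance.
        best = max(remaining, key=lambda m: m[2] * site_factor.get(m[1], 1.0))
        remaining.remove(best)
        doc, site, weight = best
        factor = site_factor.get(site, 1.0)
        ranked.append((doc, weight * factor))
        site_factor[site] = factor * decay
    return ranked


# Hypothetical weights: site A has slightly higher scores than site B.
matches = [
    ("nokia n95", "A", 1.00),
    ("nokia n97", "A", 1.00),
    ("nokia n77", "A", 1.00),
    ("Nokia n95 cellphone", "B", 0.97),
    ("nokia n97 cellphone", "B", 0.97),
]
order = [doc for doc, _ in rerank_with_site_decay(matches, top_n=5)]
```

With these made-up weights the order comes out as in the report's desired result, but only because every match was available before ranking started, which is the cost described above.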

Or is the idea to simply increasingly scale down matches for the same site as we encounter them? That seems horribly arbitrary...
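The arbitrariness of this second interpretation can be seen in a small Python sketch, assuming the scaling is applied in whatever order the matcher happens to encounter documents. The names and weights are hypothetical.

```python
def scale_in_encounter_order(matches, decay=0.98):
    """Scale each site's matches down in the order they are encountered.

    matches: list of (doc, site, weight) in processing order.
    Returns (doc, scaled_weight) pairs sorted by descending scaled weight.
    """
    site_factor = {}
    scaled = []
    for doc, site, weight in matches:
        factor = site_factor.get(site, 1.0)
        scaled.append((doc, weight * factor))
        site_factor[site] = factor * decay
    return sorted(scaled, key=lambda item: item[1], reverse=True)


# Two equally weighted documents from site A and one from site B.
matches = [("a1", "A", 1.0), ("a2", "A", 1.0), ("b1", "B", 0.99)]

forward = [doc for doc, _ in scale_in_encounter_order(matches)]
backward = [doc for doc, _ in scale_in_encounter_order(list(reversed(matches)))]
```

Here `forward` ranks a1 first and a2 last, while `backward` ranks a2 first and a1 last: the same documents with the same original weights swap places purely because of processing order.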

So either way, I don't think we want to implement this, hence closing as "wontfix".

comment:3 by saridemir, 14 years ago

Resolution: wontfix →
Status: closed → reopened

Hello, I used Lucene.net previously. (I can't use it any more because Lucene.net has a bug that makes its memory usage keep increasing.) I got around the problem there by using a multisearcher: I edited the custom searcher and added a global variable which is initially set to 1 and is multiplied by 0.99 each time the weight calculation is called, and the function returns the original weight multiplied by the global variable. I have to define each site's products as a separate index and search them with the multisearcher. However, I couldn't do it with Sphinx because the maximum weight score always changes, so the approach will not work unless I can set the maximum weight to 1. If I can, I would search for the same term in each index. That solves my problem not 100 percent but 90 percent, and that is enough for me.

comment:4 by saridemir, 14 years ago

Resolution: → wontfix
Status: reopened → closed