|Work hours||0400 - 1200 UTC|
|Public GSoC page||https://summerofcode.withgoogle.com/projects/#4994403428990976|
I'll be improving the existing weighting schemes in Xapian and adding support for a new normalization (Piv+) to the existing vector space model.
I'll also evaluate and compare the existing schemes with their improved counterparts for speed and retrieval effectiveness.
I'm planning to complete the following tasks by the end of GSoC 2016:
- Implement improved versions of the existing weighting schemes (BM25, PL2 & Dir) in Xapian as BM25+, PL2+ & Dir+ respectively.
- Implement a new normalization function (Piv+) for the existing Tf-Idf vector space model.
- Evaluate the performance of the implemented functions on TREC dataset collections by calculating precision or recall, and MAP (mean average precision).
- Finally, compare the existing weighting functions with their improved counterparts based on this evaluation, a comparison that should be useful to users.
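
The evaluation step above could be sketched roughly as follows. This is only a minimal illustration of how precision, recall and MAP are usually defined, not the evaluation code used in the project. All function names and the toy document ids are hypothetical, and it assumes the ranked results and the TREC relevance judgements have already been loaded into plain Python lists and sets (parsing of TREC files is not shown).

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall(ranked: List[str], relevant: Set[str]) -> float:
    """Fraction of all relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant doc appears.

    Assumes `ranked` contains no duplicate document ids.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Average of average_precision over every query that has judgements."""
    if not qrels:
        return 0.0
    scores = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores)


# Toy data (hypothetical): two queries with their ranked results and judgements.
runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d6"}}

if __name__ == "__main__":
    print("P@2 for q1:", precision_at_k(runs["q1"], qrels["q1"], 2))
    print("Recall for q1:", recall(runs["q1"], qrels["q1"]))
    print("MAP:", mean_average_precision(runs, qrels))
```

Running the same functions once on the rankings from an existing scheme (say BM25) and once on the rankings from its improved counterpart (BM25+) gives two sets of figures that can be compared directly.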