
Diversification of search results

Work Product

The project I worked on this year added diversification of search results to the Xapian API. The API diversifies search results using an implicit method, which does not rely on external data such as query logs. Most of the implementation has been merged into master, but it still needs to be evaluated on a public data set (ClueWeb09).

The main parts of this project are:

  • Add C2-GLS
  • Add LCD Clustering
  • Evaluation on ClueWeb09
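
To give a rough idea of what "implicit" diversification means, here is a minimal Python sketch of a greedy re-ranker that uses only the retrieved documents themselves (their text and relevance scores), with no query logs. This is a generic MMR-style illustration, not C2-GLS and not the Xapian API; the function names, the `trade_off` parameter and the sample data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def diversify(ranked, k, trade_off=0.5):
    """Greedily pick up to k results, balancing relevance against redundancy.

    ranked: list of (doc_id, relevance, text) tuples from an initial search.
    trade_off: 1.0 keeps the original order, lower values favour novelty.
    (Hypothetical helper for illustration; not part of Xapian.)
    """
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, _, text in ranked}
    scores = {doc_id: rel for doc_id, rel, _ in ranked}
    remaining = [doc_id for doc_id, _, _ in ranked]
    selected = []
    while remaining and len(selected) < k:
        def gain(doc_id):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(vectors[doc_id], vectors[s]) for s in selected),
                default=0.0,
            )
            return trade_off * scores[doc_id] - (1 - trade_off) * redundancy

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up sample results for an ambiguous query.
results = [
    (1, 1.0, "jaguar car dealer prices"),
    (2, 0.9, "jaguar car dealer reviews"),
    (3, 0.8, "jaguar big cat habitat"),
]
top_two = diversify(results, k=2)
```

With the sample data, the second pick is the document about the animal rather than the near-duplicate car page, because the redundancy penalty outweighs its slightly lower relevance score.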

Merged

The diversification API has been merged this year. For details of the implementation, refer to the project plan. The merged components can be found in the commits linked below.

All merged commits are listed at https://github.com/xapian/xapian/commits/master/?author=uppinder

Work in Progress

  • Writing up documentation for the Getting Started with Xapian guide.

Since diversification is new functionality in Xapian, I will be adding information on how to use the API to the Getting Started with Xapian guide.

Future Work

  • Evaluation on ClueWeb09

The current diversification implementation still needs to be evaluated on the ClueWeb09 Category B data set using the TREC 2009/2010 topical queries, and the results compared with those of the original paper.

Last modified on 13/08/18 12:54:36