Bi-gram Language Modeling
Name | Gaurav Arora |
IRC nick | samuelharden |
Timezone | UTC+5:30 |
Work hours | 5:00-14:00 UTC |
Official mentor | James Aylett |
Code repository | https://github.com/samuelharden/xapian-gaurav-gsoc |
Current working branch | https://github.com/samuelharden/xapian-gaurav-gsoc/tree/bigram |
Evaluation Code repository | https://github.com/samuelharden/xapian-evaluation |
Documentation repository | https://github.com/samuelharden/xapian-docsprint |
Melange | http://www.google-melange.com/gsoc/project/google/gsoc2012/samuelharden/21001 |
GSOC Blog | http://gsocxapian.blogspot.com |
The bi-gram language modeling approach to information retrieval has been shown to outperform the three traditional IR approaches. Apart from better retrieval performance, a bi-gram language model yields a rich resource, the bi-grams extracted from the collection, which can be used for phrase searching, diversifying search results, and suggesting query reformulations to the user. A bi-gram language model would make Xapian a more powerful library for research in information retrieval.
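To make the idea concrete, here is a minimal sketch of bi-gram query-likelihood scoring in plain Python. It is not Xapian's API or this project's implementation: the function names, the interpolation weight `lam`, and the choice of linear interpolation between bi-gram and unigram estimates are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the adjacent token pairs of a token list."""
    return list(zip(tokens, tokens[1:]))


def bigram_log_likelihood(query, doc, lam=0.7, eps=1e-9):
    """Log-likelihood of the query's bi-grams under a model of one document.

    Assumed (hypothetical) smoothing, a linear interpolation:
        P(w | prev) = lam * c(prev, w) / c(prev) + (1 - lam) * c(w) / |doc|
    eps only guards against log(0) for unseen words.
    """
    unigram_counts = Counter(doc)
    bigram_counts = Counter(bigrams(doc))
    doc_len = len(doc)

    score = 0.0
    for prev, word in bigrams(query):
        p_uni = unigram_counts[word] / doc_len if doc_len else 0.0
        p_bi = (bigram_counts[(prev, word)] / unigram_counts[prev]
                if unigram_counts[prev] else 0.0)
        score += math.log(lam * p_bi + (1 - lam) * p_uni + eps)
    return score


# Two toy documents containing the same words in a different order.
doc_a = "information retrieval with language models".split()
doc_b = "retrieval of information about language".split()
query = ["information", "retrieval"]

score_a = bigram_log_likelihood(query, doc_a)
score_b = bigram_log_likelihood(query, doc_b)
```

Both toy documents contain the two query words, so a unigram model alone would not separate them on this query. The bi-gram term rewards `doc_a`, where the words appear adjacent and in query order, which is the property that also makes the extracted bi-grams useful for phrase searching.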
Last modified on 03/06/17 23:36:25