Bi-gram Language Modeling

Name Gaurav Arora
IRC nick samuelharden
Timezone UTC+5:30
Work hours 5:00-14:00 UTC
Official mentor James Aylett
Code repository
Current Worked on Branch
Evaluation Code repository
Documentation repository

The bi-gram language modeling approach to information retrieval has been shown to outperform the three traditional IR approaches. Apart from better retrieval performance, a bi-gram language model yields a rich resource, the bi-grams of the collection, which can be used for phrase searching, diversifying search results, and suggesting query reformulations to the user. A bi-gram language model would make Xapian a more powerful library for research in information retrieval.

