Bi-gram Language Modeling
| Name | Gaurav Arora |
| IRC nick | samuelharden |
| Timezone | UTC+5:30 |
| Work hours | 5:00-14:00 UTC |
| Official mentor | James Aylett |
| Code repository | https://github.com/samuelharden/xapian-gaurav-gsoc |
| Current working branch | https://github.com/samuelharden/xapian-gaurav-gsoc/tree/bigram |
| Evaluation Code repository | https://github.com/samuelharden/xapian-evaluation |
| Documentation repository | https://github.com/samuelharden/xapian-docsprint |
| Melange | http://www.google-melange.com/gsoc/project/google/gsoc2012/samuelharden/21001 |
| GSOC Blog | http://gsocxapian.blogspot.com |
The bi-gram language modeling approach to information retrieval has been shown to outperform the three traditional IR approaches. Apart from better retrieval performance, a bi-gram language model yields a rich resource, the bi-grams extracted from the collection, which can be used for phrase searching, diversifying search results, and suggesting query reformulations to the user. A bi-gram language model would make Xapian a more powerful library for research in information retrieval.
Last modified on 06/03/17 23:36:25
