wiki:GSoC2014/Posting list encoding improvements

Posting list encoding improvements

Name: Shangtong Zhang
Preferred communication: Email
Timezone: UTC+8
Work hours: 19:00-23:00 (UTC+8)
Official mentor: Dan Colish
Code repository: https://github.com/HurricaneTong/xapian-1/tree/fixed-width-format-chunk-for-doc-length
Melange: http://www.google-melange.com/gsoc/project/details/google/gsoc2014/shangtongzhang/5676830073815040

In Xapian, storing the list of documents for a specific term (the posting list) is an important part of the system. The current encoding is not ideal, so I propose several ways to improve it. Some parts of Xapian currently use linear search; I am going to replace it with a skip list or hashing. Storing the list of positions at which a term appears in a document (the position list) is also of great importance. I'd like to use dynamic encoding in place of the current interpolative encoding.
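To illustrate why skip information helps when seeking within a posting list, here is a minimal sketch in C++. It is not Xapian's code or on-disk format: the class name `SkipPostingList`, the method `skip_to`, and the choice of a skip interval of roughly sqrt(n) are all assumptions made for this example. It also uses a single level of skip steps over an in-memory array, which is a simplification of a full multi-level skip list.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only: a sorted list of document ids with one
// level of evenly spaced skip steps. In a real encoded posting list the
// skip entries would store byte offsets into the compressed data; here
// they are implicit in the array indices.
class SkipPostingList {
  public:
    explicit SkipPostingList(std::vector<std::uint32_t> docids)
        : docids_(std::move(docids)) {
        // Document ids in a posting list are strictly increasing.
        for (std::size_t i = 1; i < docids_.size(); ++i)
            assert(docids_[i - 1] < docids_[i]);
        // Assumed heuristic: a skip interval of about sqrt(n).
        step_ = static_cast<std::size_t>(
            std::sqrt(static_cast<double>(docids_.size())));
        if (step_ == 0) step_ = 1;
    }

    // Return the index of the first docid >= target, or size() if none.
    std::size_t skip_to(std::uint32_t target) const {
        std::size_t i = 0;
        // Jump a whole step while the entry one step ahead is still
        // below the target.
        while (i + step_ < docids_.size() && docids_[i + step_] < target)
            i += step_;
        // Finish with a short linear scan of at most step_ entries.
        while (i < docids_.size() && docids_[i] < target)
            ++i;
        return i;
    }

    std::size_t size() const { return docids_.size(); }
    std::uint32_t at(std::size_t i) const { return docids_[i]; }

  private:
    std::vector<std::uint32_t> docids_;
    std::size_t step_;
};
```

With n entries and a step of about sqrt(n), `skip_to` examines on the order of sqrt(n) entries instead of up to n for a plain linear scan. A multi-level skip list would reduce this further, at the cost of storing more skip entries.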

Last modified on 26/06/14 17:29:20