Here are some of the changes I'm planning to work on soon, probably in roughly this order:
- Alter the "7 bit" coding to eliminate the multiple encodings of values (which will reduce the number of bytes needed for some values). Note: experiments seem to show that the size reduction is minimal. Since the encoding is more complicated to work with, perhaps this change isn't worthwhile - we could redefine the redundant encodings for other purposes instead (e.g. to encode ranges in postlists).
- Write a replacement Btree manager which compresses keys, has a specialised format for branch blocks, and is structured in a more helpful way for future changes I want to make. To reduce the amount of work required to get back to a testable system, the initial version is likely to be missing the following features (which will get added once the initial version is working well):
  - reading while updating
  - deleting tags
  - modifying tags (except for a special case to cope with the magic document length tag)
- Subclass the key comparison method so we don't need to jump through hoops to make keys sort in byte order. I think we'll also need a "shortest dividing key" method.
- Store Btree tags which are larger than a block by filling whole blocks with tag data and only storing the partial block in the Btree itself.
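
As a rough illustration of the first item, here's a minimal Python sketch of how multiple encodings can be eliminated from a 7-bit coding. It assumes the current coding stores the low 7 bits first and uses the high bit of each byte to mean "more bytes follow"; that layout is an assumption made for the example, not something stated on this page. The function names (`encode_plain`, `encode_unique` and so on) are hypothetical. The trick shown is to subtract 1 from the remaining value each time a continuation byte is emitted, so a padded form such as `80 00` stops being a second spelling of 0 and becomes the only spelling of 128.

```python
def encode_plain(n: int) -> bytes:
    """Assumed current scheme: 7 data bits per byte, high bit = more follows."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def decode_plain(data: bytes) -> int:
    """Decode the assumed current scheme; padded forms decode too."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated value")


def encode_unique(n: int) -> bytes:
    """Variant with exactly one encoding per value (hypothetical sketch)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n == 0:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)
        n -= 1  # a continuation already implies "at least 1 more"


def decode_unique(data: bytes) -> int:
    """Inverse of encode_unique."""
    result = 0
    shift = 0
    for b in data:
        result += (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
        result += 1 << shift  # undo the "n -= 1" done by the encoder
    raise ValueError("truncated value")


if __name__ == "__main__":
    # 80 00 is a redundant spelling of 0 in the plain scheme...
    print(decode_plain(b"\x80\x00"))
    # ...but a distinct value in the unique scheme.
    print(decode_unique(b"\x80\x00"))
    # Values just past a length boundary get one byte shorter.
    print(len(encode_plain(16384)), len(encode_unique(16384)))
```

In this sketch only the values just past each length boundary get shorter (16384 to 16511 drop from three bytes to two, for example), which fits the observation above that the size reduction is minimal. The decoder also needs an extra addition per continuation byte, which is the kind of added complexity that has to be weighed against the saving.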