The Flint Synonym Table
This page describes the format of the Synonym table in the FlintBackend. This table is used to implement the synonym feature.
Key Format
The key is just the term.
Tag Format
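The tag contains a sorted list of synonyms for the term. Each synonym is encoded as a single length byte followed by the synonym itself, and the encoded entries are concatenated: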
char(synonym.size() ^ MAGIC_XOR_VALUE) + synonym
MAGIC_XOR_VALUE is 96. XORing a length with 96 maps the lengths 1 to 26 onto the lower-case ASCII letters 'a' (97) to 'z' (122), so the length bytes are more likely to coincide with byte values that are already common in the words themselves. This should help zlib compress these tag values better.
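The encoding described above can be sketched in C++ roughly as follows. This is a minimal illustration, not the backend's actual code: the function names encode_synonym_tag and decode_synonym_tag are hypothetical, the sketch assumes each synonym's length fits in a single byte, and the optional zlib compression step is not shown.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Value taken from the description above.
const unsigned char MAGIC_XOR_VALUE = 96;

// Hypothetical helper: build an uncompressed tag from a set of synonyms.
// std::set iterates in sorted order, which gives the sorted list.
std::string encode_synonym_tag(const std::set<std::string>& synonyms) {
    std::string tag;
    for (const std::string& synonym : synonyms) {
        // Assumption: the length must fit in one byte.
        if (synonym.size() > 255)
            throw std::length_error("synonym too long for one length byte");
        unsigned char len = static_cast<unsigned char>(synonym.size());
        tag += static_cast<char>(len ^ MAGIC_XOR_VALUE);
        tag += synonym;
    }
    return tag;
}

// Hypothetical helper: split an uncompressed tag back into its synonyms.
std::vector<std::string> decode_synonym_tag(const std::string& tag) {
    std::vector<std::string> synonyms;
    std::size_t pos = 0;
    while (pos < tag.size()) {
        // Undo the XOR to recover the real length.
        std::size_t len = static_cast<unsigned char>(tag[pos]) ^ MAGIC_XOR_VALUE;
        ++pos;
        if (len > tag.size() - pos)
            throw std::runtime_error("truncated synonym tag");
        synonyms.emplace_back(tag, pos, len);
        pos += len;
    }
    return synonyms;
}
```

For example, the synonyms "cat" and "kitty" have lengths 3 and 5, which XOR with 96 to give 99 ('c') and 101 ('e'), so the uncompressed tag in this sketch is "ccatekitty".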
The tag may be stored compressed by zlib, which gives a decent amount of extra compression. Very small tags (smaller than the minimum size of compressed output) are stored uncompressed, since compressing them could not save any space.