Version 21 (modified by olly, 3 months ago)

Missing Documentation

This page is intended to list gaps in the documentation which should be addressed.

Feel free to add to it. Feel even freer to write these documents! If you do, either put them on the wiki or write them as restructured text and attach them to a trac ticket. Even a rather rough and ready version is useful.

If you have new glossary entries, please add them to Glossary instead.

  • Documentation of the algorithm used by Enquire::get_eset(). There's a brief overview of this stuff in overview.html, but it doesn't go into details. Some examples of uses would be good too.
  • Documentation of RSet usage other than in generating the ESet. I (sja) believe this is only for affecting weights in generating the MSet.
  • The spelling documentation doesn't explain how to add spelling data to the database (either via TermGenerator or "by hand").
  • There doesn't appear to be anything documenting how OP_VALUE_* actually work.
  • Topic document covering all aspects of probabilistic and boolean prefixes.
  • For 1.1.x, we should have a topic document on implementing your own weighting scheme (i.e. Xapian::Weight subclass).
  • An FAQ on caching ("trust the OS cache") may be helpful, as this seems to come up periodically. There's some helpful discussion in the scalability doc. There's also a long email by James (that may be somewhat out of date).
  • An FAQ on "how to deal with DatabaseModifiedError" would be helpful. A discussion somewhere of how to do concurrent rolling indexing while allowing search may also be handy (perhaps in a new class of HOWTO-style documents, if we have any more like that?).
  • We have an FAQ on how the TermGenerator differs from direct API calls, and a termgenerator core doc that actually describes how term generation *works* (to be compatible with QueryParser?), but we don't seem to have a simple summary of how and why to use TermGenerator. This should probably be a HOWTO-style doc, giving a little background to the simpleindex programs we have in (most?) supported programming languages.
  • We've been asked a few times how to apply access systems on top of Xapian searches. There are a variety of approaches to this that can be used, but some document discussing authz mechanisms should prove helpful.
  • We don't have a goal-oriented document talking about the prefix convention. The main two things to cover are using prefixed terms for restriction (boolean) and for searching in a "field" of the source document (probabilistic). There's some good stuff in this thread.
  • There's some useful discussion on how many documents to feed to an RSet in this helpfully-titled thread (at the start; later messages start dealing with some bugs).
  • (1.1.x) POD docs for Search::Xapian::WritableDatabase don't mention the new close() method.
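
As a rough starting point for the DatabaseModifiedError FAQ mentioned above, the usual advice is to reopen the database and retry the search. The following is only a minimal sketch of that pattern, not an official recipe. The helper name search_with_retry and its parameters are hypothetical; the exception type is passed in so the sketch runs without the bindings installed. With the Python bindings you would presumably pass xapian.DatabaseModifiedError and a xapian.Database object.

```python
def search_with_retry(db, run_query, modified_error, max_retries=3):
    """Call run_query(db); if the database changed underneath us, reopen and retry.

    db             -- any object with a reopen() method (e.g. a Xapian database)
    run_query      -- hypothetical callable that performs the search on db
    modified_error -- exception type signalling that the database was modified
    max_retries    -- how many times to reopen before giving up
    """
    for attempt in range(max_retries + 1):
        try:
            return run_query(db)
        except modified_error:
            if attempt == max_retries:
                # The database keeps changing faster than we can search it.
                raise
            # Move this reader to the latest revision, then try again.
            db.reopen()
```

The retry limit is a design choice: an unbounded loop could spin forever against a very busy indexer, so the sketch re-raises the error once the limit is reached and leaves the decision to the caller.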