Help Wanted
This page lists possible projects for anybody who's interested in getting involved with Xapian development but isn't yet intimately familiar with the code. If you already are, we can probably assume you can think of things to work on by yourself, or else you can have a look at the bug database. Some projects would benefit from special skills in other areas (for example, experience with a language would really be needed to produce decent bindings for it).
- Add the ability to introspect on Query objects (see issue #159). This would allow queries generated by the query parser to be investigated and modified, and would be helpful in various scenarios. The trickiest bit is probably working out a suitable API - for this, you'd need to discuss possible APIs with the other developers on IRC or the mailing lists. It's probably a good task for a beginner, being fairly small and self-contained.
- Add classes which provide more choice of weighting schemes (e.g. the Divergence From Randomness family of weighting schemes) and investigate how they compare for speed and effectiveness with BM25.
- Implement bindings for Xapian in another language using SWIG. Currently we have decent support for Python, PHP, Tcl, C#, Java, Perl, and Ruby, though we could do with help from a user of C# or Ruby to maintain those bindings better. Someone has been working on Pike bindings. Bindings for other languages would be useful to have (including Guile, since the current Guile bindings aren't remotely usable).
- Reimplement the Java bindings using SWIG instead of hand-coded JNI (the hand-coded JNI we currently have is much harder to maintain, plus SWIG generates smaller code and supports CNI too). A lot of this has been done now, but it would be useful to have assistance from someone with plenty of Java experience.
- Add visibility annotations to the xapian-core library. A lot of this has been done now.
- Write an OmegaScript command to pull out some sentences/phrases from a string which contain terms from a specified list, and add an option to omindex to allow larger samples to be stored in the sample field. Then we can show a sample with term matches in context.
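
To give a feel for the last project, here is a rough sketch of the sentence-picking step, written in Python rather than as an OmegaScript command. It is only an illustration under some loud assumptions: the function name `extract_snippets` and its parameters are made up for this example, sentences are split naively on end-of-sentence punctuation, and terms are compared as plain lowercased words (a real implementation would probably need to match stemmed terms as stored in the database).

```python
import re


def extract_snippets(text, terms, max_snippets=2):
    """Return up to max_snippets sentences of text containing any of terms.

    Hypothetical helper for illustration only; not part of Xapian or Omega.
    """
    wanted = {t.lower() for t in terms}
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    scored = []
    for pos, sentence in enumerate(sentences):
        words = set(re.findall(r"\w+", sentence.lower()))
        hits = len(words & wanted)  # number of distinct wanted terms present
        if hits:
            scored.append((-hits, pos, sentence))
    # Prefer sentences matching more distinct terms; ties keep document order.
    scored.sort()
    # Present the chosen sentences in their original order.
    best = sorted(scored[:max_snippets], key=lambda item: item[1])
    return [sentence for _, _, sentence in best]


if __name__ == "__main__":
    sample = ("Xapian is a search library. It is written in C++. "
              "Omega is a search application built on Xapian.")
    for snippet in extract_snippets(sample, ["xapian", "search"]):
        print(snippet)
```

Ranking by the number of distinct matching terms is just one possible choice; the actual command might instead favour phrases where the terms occur close together, which is worth discussing with the other developers first.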
