Omega is a search application built on top of the Xapian library. You can use it to add a search feature to your website, and it also works well as a search frontend for your own indexer.

Things to cover:

  • scriptindex makes it easy to configure indexing of data from diverse sources (e.g. indexing from SQL)
    • document dbi2omega
      • environment variables:
        • DBUSER - user name to connect to the database with (defaults to $USER then $LOGNAME then "")
        • DBPASSWORD - password to connect to the database with (defaults to "")
        • DBIDRIVER - DBI driver to use (defaults to "mysql")
  • document mbox2omega
  • crawling using ht://dig:
    • document htdig2omega
    • what changes will htdig4 need?
  • crawling using GNU wget:
    • mirror web pages locally and then use omindex
    • supports resuming download after error, proxies, cookies
    • a HOWTO-style guide and/or wrapper script would be useful
    • Peter Masiar concluded ht://dig was more suitable - find out why...
  • file formats which omindex understands
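
The default handling for the dbi2omega environment variables listed above can be sketched as follows. This is only an illustration in Python, not the dbi2omega code itself: the function name resolve_db_settings is hypothetical, and treating a variable that is set but empty the same as an unset one is an assumption.

```python
import os


def resolve_db_settings(env=None):
    """Return connection settings using the defaults described above.

    Hypothetical helper for illustration; not part of Omega.
    """
    if env is None:
        env = os.environ
    # DBUSER falls back to $USER, then $LOGNAME, then the empty string.
    # Assumption: an empty value is treated the same as an unset one.
    user = env.get("DBUSER") or env.get("USER") or env.get("LOGNAME") or ""
    return {
        "user": user,
        # DBPASSWORD defaults to the empty string.
        "password": env.get("DBPASSWORD", ""),
        # DBIDRIVER defaults to "mysql".
        "driver": env.get("DBIDRIVER", "mysql"),
    }


if __name__ == "__main__":
    print(resolve_db_settings())
```

Passing a plain dictionary instead of the real environment makes the fallback order easy to check, for example resolve_db_settings({"LOGNAME": "bob"}) gives the user "bob" with an empty password and the "mysql" driver.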