
Detailed Example of Using Omega for the First Time

Reformatted from Jim's original at http://fayettedigital.com/omegaexample.html by Olly.
See also a tutorial at http://www.linux.com/archive/feature/149223?theme=print

This document will outline in detail the steps necessary to get an example search engine based on Omega and Xapian up and running. I'll point you to a set of files that you can install on your own system and index. This example uses omindex and omega.

Requirements are:

  • Apache or another http server that you are familiar with
  • A C++ compiler
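
A quick way to confirm the requirements before starting is to check that the relevant commands are on your PATH. This is only a sketch: the helper names have_cmd and check_prereqs are made up for illustration, and the command names in the usage line (g++, apache2) are assumptions for a Debian-like system.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the given commands are present.
check_prereqs() {
    for cmd in "$@"; do
        if have_cmd "$cmd"; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
        fi
    done
}

# Usage (command names are guesses; adjust for your system):
#   check_prereqs g++ apache2 tar make
```

Anything reported as missing needs to be installed, or the name adjusted, before continuing.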

This example was developed on Linux. I have no idea how to get it to run on any other OS, so it's up to you to translate the instructions here to your specific system. I'm running a Debian 3.1 system with Apache 2 and G++ 3.3.

First you must install the Xapian libraries. Download the source from the Xapian download site.

Extract the files from the archive with the following command. Note that the file name will probably be different from this example:

tar xzf xapian-core-1.0.5.tar.gz

This will create a directory xapian-core-1.0.5 so change to that directory, i.e.:

cd xapian-core-1.0.5

And configure via:

./configure

If there are no errors, then you can make the libraries with a make command:

make

Assuming make finished without errors, become root (with su or sudo) and install:

su
make install
exit

This will install the Xapian library on your system.
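
Since the archive name changes with each release, the unpack, configure and make steps above can be wrapped in a small function that derives the directory name from the tarball. This is a sketch only: srcdir_for and build_from_tarball are hypothetical names, and it assumes the archive unpacks into a directory named after itself, as in the example.

```shell
# Hypothetical helper: derive the source directory from a tarball name,
# e.g. xapian-core-1.0.5.tar.gz -> xapian-core-1.0.5
srcdir_for() {
    name=${1##*/}              # drop any leading path
    printf '%s\n' "${name%.tar.gz}"
}

# Unpack, configure and build; stop at the first step that fails.
# On success the shell is left inside the source directory, ready for
# "make install" as root. Installation is kept separate on purpose.
build_from_tarball() {
    tarball=$1
    dir=$(srcdir_for "$tarball")
    tar xzf "$tarball" &&
        cd "$dir" &&
        ./configure &&
        make
}

# Usage:
#   build_from_tarball xapian-core-1.0.5.tar.gz
```

Chaining the steps with && means a failed configure never reaches make, which mirrors the "if there are no errors" checks in the manual steps.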

Now that we have Xapian installed, we need to install the Omega utilities. To do this, download Omega from the same place you found the Xapian files, then extract, configure, make and install it the same way you did for the libraries. The following commands should work.

cd ~
tar xzf xapian-omega-1.0.5.tar.gz
cd xapian-omega-1.0.5
./configure
make
su
make install
exit

If you encounter errors during the configure or make steps for either of these packages, please check the README and INSTALL files in each directory for possible additional instructions. If that doesn't help, search the mailing list archive, and post a message to the mailing list if you are still having problems.

If you've gotten this far then we're almost home. The next step is to copy the omega program into your cgi-bin directory. If you don't know where it is, you'll need to look at the Apache (or httpd) configuration files. Here's the section of my Apache config file that tells me where to look:

ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/

So I know to put cgi binaries in the /usr/lib/cgi-bin directory. The next few lines demonstrate copying the omega binary.

cd ~
su
cd xapian-omega-1.0.5
cp omega /usr/lib/cgi-bin/omega.cgi
cp omega.conf /usr/lib/cgi-bin/
chmod 755 /usr/lib/cgi-bin/omega.cgi
exit

Some http servers require the cgi binaries to have an extension of .cgi, so we'll do that so we're sure it'll work. Note we've also copied the omega.conf file to the same directory. This is the easiest way to get things to work.
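
If you would rather not read the Apache configuration by hand to find the cgi-bin directory, the ScriptAlias line can be pulled out with awk. A minimal sketch: cgi_dirs is a hypothetical name, and the config path in the usage line is an assumption that varies between systems.

```shell
# Hypothetical helper: print the target directory of each ScriptAlias
# directive in the given Apache config file(s). Commented-out lines are
# skipped because their first field is not "ScriptAlias".
cgi_dirs() {
    awk 'tolower($1) == "scriptalias" { gsub(/"/, "", $3); print $3 }' "$@"
}

# Usage (path is a guess; check where your server keeps its config):
#   cgi_dirs /etc/apache2/sites-enabled/*
```

For the config line shown earlier, this prints /usr/lib/cgi-bin/.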

The next step is to download the sample data and install it on your system. The file is less than 7MB so hopefully you've got enough space for it and download time won't be too bad. Point your browser to http://fayettedigital.com/book/book.0.1.tar.gz and download the file to somewhere convenient. Change directory to your document root and extract the files. On my system, I used the following commands:

su
cd /var/www
tar xzf ~/book.0.1.tar.gz
exit

You may also extract the files somewhere else and copy them to your document root. There is nothing magic about the book directory.

First let's examine the /usr/lib/cgi-bin/omega.conf file we just copied. Here is the file as it is in the release (at least for this version):

database_dir /var/lib/omega/data
template_dir /var/lib/omega/templates
log_dir /var/log/omega
cdb_dir /var/lib/omega/cdb

You may leave the values as they are or you can change them. In any case you'll have to create the missing directories, e.g.:

su
mkdir -p /var/lib/omega/data
mkdir /var/lib/omega/templates
mkdir /var/lib/omega/cdb
mkdir /var/log/omega
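
If you changed the paths in omega.conf, the directories can also be created by reading them from the file rather than typing each one. A minimal sketch, assuming the four keys shown above and one "key path" pair per line; make_conf_dirs is a hypothetical name.

```shell
# Hypothetical helper: create every directory named in an omega.conf-style
# file. Only the four directory keys from the example file are acted on;
# other lines are ignored. Run it as a user allowed to create the paths.
make_conf_dirs() {
    conf=$1
    # "|| [ -n "$key" ]" also handles a last line with no trailing newline.
    while read -r key path || [ -n "$key" ]; do
        case $key in
            database_dir|template_dir|log_dir|cdb_dir)
                if [ -n "$path" ]; then
                    mkdir -p "$path"
                fi
                ;;
        esac
    done < "$conf"
}

# Usage:
#   make_conf_dirs /usr/lib/cgi-bin/omega.conf
```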

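Then copy the templates to the new templates directory and leave the root shell. If ~ resolves to root's home directory in your root shell, use the full path to the directory where you unpacked Omega instead:

cd ~/xapian-omega-1.0.5
cp templates/* /var/lib/omega/templates
exit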

Be sure the templates are readable by others. Now we are ready to index the data we just stored in the directory /var/www/book.
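
To check the readability requirement without inspecting each file, find can list anything that lacks the read bit for others. A small sketch; not_world_readable is a hypothetical name.

```shell
# Hypothetical helper: list files and directories under the given path
# that are NOT readable by "others". Prints nothing if all are readable.
not_world_readable() {
    find "$1" ! -perm -004
}

# Usage:
#   not_world_readable /var/lib/omega/templates
```

Empty output means every entry under the directory has the read bit set for others. The same check can be pointed at /var/lib/omega/data once the database exists.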

omindex is the utility that we will use to index the documents. It knows how to parse html documents so we don't have to do anything special.

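You should change the ownership of the /var/lib/omega/data directory to a non-root user and do the indexing as that user. Also make sure the database directory and all the database files are readable by others, since the user that CGI programs run as needs to be able to read them. The capital X adds search permission on directories (and on files that are already executable), and the chmod is worth repeating after indexing if your umask creates files that others cannot read:

sudo chown "`whoami`" /var/lib/omega/data
chmod -R a+rX /var/lib/omega/data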

The command I used to index the data and the output is as follows:

/usr/local/bin/omindex --db /var/lib/omega/data/default --url /book /var/www/book
[Entering directory /]
Indexing "/ci_01.htm" as text/html ...  added.
Indexing "/ci_02.htm" as text/html ...  added.
...
Indexing "/Introduction.htm" as text/html ... added.
Indexing "/Jpg4.htm" as text/html ...  added.
Indexing "/pato.htm" as text/html ... added.

Let's look at the omindex command. The --db parameter tells it to create a database named default, which is the name that omega uses as its default. That can be changed, but for this demonstration let's keep it simple. The --url parameter identifies the URL prefix that corresponds to the directory we start indexing from. Since we put the documents in /var/www/book, we need to specify --url /book. If we were adding files that were in the document root, we'd use --url /.

The last parameter, /var/www/book, tells omindex to look for the documents at that location on disk. Omindex does not crawl the web; it only looks at files on disk.
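
The relationship between the --url prefix, the indexed directory and the URL recorded for each file can be illustrated with a few lines of shell. This only mimics the mapping described above; it is not how omindex is implemented, and url_for is a hypothetical name.

```shell
# Hypothetical helper: given the --url prefix, the directory passed to
# omindex, and a file under that directory, print the URL the file
# corresponds to.
url_for() {
    prefix=${1%/}              # "/book" stays "/book"; "/" becomes ""
    base=${2%/}                # tolerate a trailing slash on the directory
    file=$3
    rel=${file#"$base"}        # path relative to the directory, e.g. /ci_01.htm
    printf '%s%s\n' "$prefix" "$rel"
}

# Usage:
#   url_for /book /var/www/book /var/www/book/ci_01.htm
#   url_for /     /var/www      /var/www/index.htm
```

The first call prints /book/ci_01.htm and the second prints /index.htm, matching the two cases of --url /book and --url / discussed above.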

Now test your installation by pointing your browser at http://localhost/cgi-bin/omega.cgi
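
The same test can be run from the command line. The sketch below builds a search URL and asks curl for the HTTP status. It assumes curl is installed, that the CGI lives at the path used in this example, and that the query is a single word needing no URL-encoding. DB and P are Omega's CGI parameters for the database name and the query; omega_url and check_omega are hypothetical names.

```shell
# Hypothetical helper: build a search URL for the CGI installed above.
omega_url() {
    host=$1
    query=$2
    printf 'http://%s/cgi-bin/omega.cgi?DB=default&P=%s\n' "$host" "$query"
}

# Fetch the page, discard the body, and print the HTTP status code.
check_omega() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(omega_url "$1" "$2")"
}

# Usage:
#   check_omega localhost introduction
```

A status of 200 suggests the CGI ran; anything else is a cue to check the web server's error log and the paths in omega.conf.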

Last modified on 30/05/17 03:30:21