Detailed Example of Running Using Omega for the First Time
This working example was based on Xapian 1.0, and has not been updated to a more recent version. Although the concepts and principles have not changed, some of the details may have changed. In particular, we strongly recommend you use a recent released version of Xapian. These days you can skip the installation steps if you are able to use operating system packages.
Reformatted from Jim's original at https://fayettedigital.com/omegaexample.html by Olly.
See also a tutorial at http://www.linux.com/archive/feature/149223?theme=print
This document will outline in detail the steps necessary to get an example search engine based on Omega and Xapian up and running. I'll point you to a set of files that you can install on your own system and index. This example uses omindex and omega.
Installing Xapian and Omega
You should generally follow the instructions in our user guide. If you use operating system packages, you can skip a lot of the following, although you may have to change some URLs and paths in the later sections depending on the details of those packages.
Requirements are:
- Apache or another http server that you are familiar with
- A C++ compiler
This example was developed on Linux. I have no idea how to get it to run on any other OS, so it's up to you to translate the instructions here to your specific system. I'm running a Mint 20 system with Apache 2.4.1 and G++ 9.3.0.
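Before starting, it can be worth confirming that the two requirements are actually present. The following is a minimal sketch, not part of the original instructions: first_available is a hypothetical helper, and the command names tried (g++, c++, clang++, apache2, httpd, nginx, lighttpd) are assumptions that may not match your distribution. On some systems the http server binary lives in a directory such as /usr/sbin that is not on a normal user's PATH, so a "not found" message here is only a hint.

```shell
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is found.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}

# Candidate names are assumptions; adjust them for your system.
first_available g++ c++ clang++ || echo "no C++ compiler found" >&2
first_available apache2 httpd nginx lighttpd || echo "no http server found" >&2
```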
First you must install the xapian libraries. Download the source from the Xapian download site.
Extract the files from the archive with the following command. Note, the file name will probably be different from this example:
tar xf xapian-core-1.4.18.tar.xz
This will create a directory xapian-core-1.4.18, so change to that directory, i.e.:
cd xapian-core-1.4.18
And configure via:
./configure --prefix=/usr/local
The --prefix isn't mandatory so if you know what you are doing, you can remove it. If there are no errors, then you can make the libraries with a make command:
make
Assuming the make went OK and you didn't get any errors, run the install step as root. With sudo, type the following (if you use su to become root instead, omit the sudo prefix):
sudo make install
This will install the xapian library on your system.
Now that we have Xapian installed, we'll have to install the Omega utilities. To do this download Omega from the same place you found the Xapian files, extract, configure, make and install the same way you did for the libraries. The following commands should work.
cd ~
tar xf xapian-omega-1.4.18.tar.xz
cd xapian-omega-1.4.18
./configure --prefix=/usr/local
make
sudo make install
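Since the library and Omega are built with the same five steps, the sequence can be wrapped in a small function. This is only a sketch under some assumptions: run and build_and_install are hypothetical helper names, the tarball is assumed to be in the current directory, and it is assumed to unpack into a directory named after the file. Setting DRY_RUN=1 prints the commands instead of running them, which is a safe way to check what would happen.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Extract, configure, build and install one source tarball.
# Usage: build_and_install TARBALL [PREFIX]
build_and_install() {
    tarball=$1
    prefix=${2:-/usr/local}
    # xapian-omega-1.4.18.tar.xz -> xapian-omega-1.4.18 (assumed directory name)
    dir=${tarball%.tar.*}
    run tar xf "$tarball" &&
    run cd "$dir" &&
    run ./configure --prefix="$prefix" &&
    run make &&
    run sudo make install
}
```

For example, DRY_RUN=1 build_and_install xapian-omega-1.4.18.tar.xz lists the steps without touching the system.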
If you are installing from source and encounter errors during the configure or make steps for either of these packages, please check the README and INSTALL files in each directory for possible additional instructions. If that doesn't help, search the mailing list archive and then post a message to the mailing list if you are still having problems.
If you've gotten this far then we're almost home. The next step is to copy the omega program into your cgi-bin directory. If you don't know where it is, you'll need to look at the Apache (or httpd) configuration files. Here's the section of my Apache config file that tells me where to look:
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
So I know to put CGI binaries in the /usr/lib/cgi-bin directory. The next few lines demonstrate copying the omega binary:
sudo cp /usr/local/lib/xapian-omega/bin/omega /usr/lib/cgi-bin/omega.cgi
cd ~/xapian-omega-1.4.18
sudo cp omega.conf /usr/lib/cgi-bin/
sudo chmod 755 /usr/lib/cgi-bin/omega.cgi
Some http servers require CGI binaries to have an extension of .cgi, so we'll do that so we're sure it'll work. Note we've also copied the omega.conf file to the same directory. This is the easiest way to get things to work.
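If you'd rather not read through the Apache configuration by hand to find the cgi-bin directory, the ScriptAlias line can be pulled out with awk. This is a rough sketch: find_cgi_dir is a hypothetical helper, and the config file path in the usage comment is an assumption based on Debian-style layouts, so yours may differ.

```shell
# Print the target directory of every ScriptAlias line in a config file.
# Surrounding double quotes, which Apache allows, are stripped.
# Usage: find_cgi_dir /path/to/apache/config
find_cgi_dir() {
    awk '$1 == "ScriptAlias" { gsub(/"/, "", $3); print $3 }' "$1"
}

# Example (path is an assumption, check your own system):
#   find_cgi_dir /etc/apache2/conf-available/serve-cgi-bin.conf
```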
Building a database
The next step is to download the sample data and install it on your system. The file is less than 7MB so hopefully you've got enough space for it and download time won't be too bad. Point your browser to https://fayettedigital.com/book/book.0.1.tar.gz and download the file to somewhere convenient. Or use this command:
wget https://fayettedigital.com/book/book.0.1.tar.gz
Change directory to your document root and extract the files. On my system, I used the following commands:
cd ~
wget https://fayettedigital.com/book/book.0.1.tar.gz
cd /var/www/html
sudo tar xf ~/book.0.1.tar.gz
You may also extract the files somewhere else and copy them to your document root. There is nothing magic about the book directory.
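Before extracting an archive into your document root as root, it can be reassuring to see what it will create there. The sketch below lists the top-level entries of a tar archive; archive_top_level is a hypothetical helper name, and it assumes the member names in the archive do not start with ./ (if they do, it will just print a dot).

```shell
# List the distinct top-level names inside a tar archive.
# Usage: archive_top_level ARCHIVE
archive_top_level() {
    tar tf "$1" | cut -d/ -f1 | sort -u
}

# Example:
#   archive_top_level ~/book.0.1.tar.gz
```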
First let's examine the /usr/lib/cgi-bin/omega.conf file we just copied. Here is the file as it is in the release (at least for this version):
# Directory containing Xapian databases:
database_dir /var/lib/omega/data

# Directory containing OmegaScript templates:
template_dir /var/lib/omega/templates

# Default template name if the CGI parameter "FMT" is not specified.
# (If not specified here, the default template name is "query"):
#default_template query

# Default database name if the CGI parameter "DB" is not specified.
# (If not specified here, the default database name is "default"):
#default_db default

# Directory to write Omega logs to:
log_dir /var/log/omega

# Directory containing any cdb files for the $lookup OmegaScript command:
cdb_dir /var/lib/omega/cdb
You may leave the values as they are or you can change them. In any case you'll have to create the missing directories, e.g.:
sudo mkdir -p /var/lib/omega/data
sudo mkdir /var/lib/omega/templates
sudo mkdir /var/lib/omega/cdb
sudo mkdir /var/log/omega
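If you changed any of the values in omega.conf, the directories to create can be read from the file rather than typed by hand. Here is a small sketch, assuming the simple "name value" layout shown above with no spaces in the paths; conf_dirs is a hypothetical helper, not something shipped with Omega.

```shell
# Print the value of every active (uncommented) *_dir setting
# in an omega.conf style file.
# Usage: conf_dirs /path/to/omega.conf
conf_dirs() {
    awk '$1 ~ /^[a-z_]+_dir$/ { print $2 }' "$1"
}

# Example: create whatever directories the config refers to.
#   conf_dirs /usr/lib/cgi-bin/omega.conf | while read -r d; do
#       sudo mkdir -p "$d"
#   done
```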
And copy the templates to the new directory.
cd ~/xapian-omega-1.4.18
sudo cp -r templates/* /var/lib/omega/templates
Be sure the templates are readable by others. Now we are ready to index the data we just stored in the directory /var/www/html/book.
omindex is the utility that we will use to index the documents. It knows how to parse HTML documents, so we don't have to do anything special.
You should change the ownership of the /var/lib/omega/data directory to a non-root user and do the indexing as that user, but also make sure all the database files are readable by others (since the user that CGI programs run as needs to be able to read them):
sudo chown "`whoami`" /var/lib/omega/data
sudo chmod -R a+r /var/lib/omega/data
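Permission problems are a common reason for the CGI not finding anything, so it helps to be able to list files that others cannot read. The following sketch uses find for that; not_world_readable is a hypothetical helper name. It checks only the "read by others" bit, so an empty result does not prove that every parent directory can be traversed by the web server user.

```shell
# List everything under a directory that lacks the
# "readable by others" permission bit.
# Usage: not_world_readable DIR
not_world_readable() {
    find "$1" ! -perm -o=r -print
}

# Example: both of these should print nothing once permissions are right.
#   not_world_readable /var/lib/omega/data
#   not_world_readable /var/lib/omega/templates
```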
The command I used to index the data and the output is as follows:
/usr/local/bin/omindex --db /var/lib/omega/data/default --url /book /var/www/html/book
[Entering directory /]
S_ci_sto1.jpg: Skipping - unknown MIME type 'image/jpeg'
S_ci_mul.jpg: Skipping - unknown MIME type 'image/jpeg'
S_ci_spr1.jpg: Skipping - unknown MIME type 'image/jpeg'
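With a larger document tree the list of skipped files can get long. As a sketch, the "Skipping" lines in the output format shown above can be counted per MIME type with awk. skipped_by_type is a hypothetical helper that reads omindex output on standard input; it assumes the message wording matches the sample above, which may differ between versions.

```shell
# Count "Skipping - unknown MIME type" lines per MIME type.
# Reads omindex output on stdin, prints "COUNT TYPE" lines.
skipped_by_type() {
    awk -F"'" '/Skipping - unknown MIME type/ { n[$2]++ }
               END { for (t in n) print n[t], t }'
}

# Example (2>&1 in case the messages go to stderr):
#   omindex --db /var/lib/omega/data/default --url /book \
#       /var/www/html/book 2>&1 | skipped_by_type
```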
Let's look at the omindex command. The --db parameter tells it to create a database with a name of default. That's the name that omega uses as its default. That can be changed, but for this demonstration let's keep it simple. The --url parameter identifies the URL prefix that corresponds to the directory we start indexing from. Since we put the documents in /var/www/html/book we need to specify --url /book. If we were adding files that were in the document root, we'd use --url /.
The last parameter, /var/www/html/book, tells omindex to look for the documents at that location on disk. Omindex does not crawl the web; it only looks at files on disk.
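To make the relationship between --url and the directory argument concrete, here is a small illustration of the mapping. It is not omindex's actual code, just a sketch of the idea: doc_url is a hypothetical function, and it ignores details such as URL escaping of unusual characters in file names.

```shell
# Illustrate how a file on disk maps to a URL, given the --url prefix
# and the directory that indexing started from.
# Usage: doc_url URL_PREFIX START_DIR FILE
doc_url() {
    prefix=$1
    root=$2
    file=$3
    rel=${file#"$root"}    # path of the file relative to START_DIR
    rel=${rel#/}           # drop the leading slash, if any
    printf '%s/%s\n' "${prefix%/}" "$rel"
}

# Example:
#   doc_url /book /var/www/html/book /var/www/html/book/ch1/index.html
#   doc_url /     /var/www/html      /var/www/html/index.html
```

The first call prints /book/ch1/index.html and the second prints /index.html, which is why the prefix has to match where the web server actually serves the directory from.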
Using delve to show stats:
$ xapian-delve /var/lib/omega/data/default/
UUID = 715c2d1d-9199-4697-978c-c9bb96944055
number of documents = 55
average document length = 7882.53
document length lower bound = 292
document length upper bound = 35317
highest document id ever used = 55
has positional information = true
revision = 1
currently open for writing = false
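If you want to check the result from a script, for instance to confirm that indexing added something, the document count can be picked out of that output. This is a sketch assuming the "name = value" layout shown above; doc_count is a hypothetical helper that reads xapian-delve output on standard input.

```shell
# Extract the "number of documents" value from xapian-delve output.
# Reads stdin, prints the number (or nothing if the line is missing).
doc_count() {
    awk -F' = ' '$1 == "number of documents" { print $2 }'
}

# Example:
#   xapian-delve /var/lib/omega/data/default/ | doc_count
```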
Searching using the Omega CGI
Now test your installation by pointing your browser at http://localhost/cgi-bin/omega.cgi
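The same check can be done from the command line. The sketch below builds a search URL; omega_url is a hypothetical helper. The DB parameter is the one mentioned in omega.conf, while using P for the query text is my understanding of Omega's CGI parameters and should be checked against the Omega documentation for your version. Only spaces are encoded, so stick to plain words.

```shell
# Build an Omega search URL for a query.
# Usage: omega_url BASE_URL QUERY [DATABASE]
omega_url() {
    base=$1
    query=$2
    db=${3:-default}
    # Encode spaces as '+'; other special characters are NOT handled.
    encoded=$(printf '%s' "$query" | tr ' ' '+')
    printf '%s?DB=%s&P=%s\n' "$base" "$db" "$encoded"
}

# Example (requires curl and a running web server):
#   curl -s "$(omega_url http://localhost/cgi-bin/omega.cgi "some words")"
```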
Questions or comments? Drop an email to jim at the website where you downloaded the files from.