Ticket #481: geospatial_clean1.patch

File geospatial_clean1.patch, 60.6 KB (added by Richard Boulton, 15 years ago)

First patch to merge the geospatial branch.

  • xapian-core/docs/geospatial.rst

     
     1.. Copyright (C) 2008 Lemur Consulting Ltd
     2
     3================================
     4Geospatial searching with Xapian
     5================================
     6
     7.. contents:: Table of contents
     8
     9Introduction
     10============
     11
     12This document describes a set of features present in Xapian which are designed
     13to allow geospatial searches to be supported.  Currently, the geospatial
     14support allows sets of locations to be stored associated with each document, as
     15latitude/longitude coordinates, and allows searches to be restricted or
     16reordered on the basis of distance from a second set of locations.
     17
     18Three types of geospatial searches are supported:
     19
     20 - Returning a list of documents in order of distance from a query location.
     21   This may be used in conjunction with any Xapian query.
     22
     23 - Returning a list of documents within a given distance of a query location.
     24   This may be used in conjunction with any other Xapian query, and with any
     25   Xapian sort order.
     26
     27 - Returning a set of documents in a combined order based on distance from a
     28   query location, and relevance.
     29
     30Locations are stored in value slots, allowing multiple independent locations to
     31be used for a single document.  It is also possible to store multiple
     32coordinates in a single value slot, in which case the closest coordinate will
     33be used for distance calculations.
     34
     35Metrics
     36=======
     37
     38A metric is a function which calculates the distance between two points.
     39
     40Calculating the exact distance between two geographical points is an involved
     41subject.  In fact, even defining the meaning of a geographical point is very
     42hard to do precisely.  Xapian leaves the system used to define latitude and
     43longitude up to the user: the important thing, as far as Xapian is concerned,
     44is that all the points it's given use the same system.
     45
     46Since there are lots of ways of calculating distances between two points, using
     47different assumptions about the approximations which are valid, Xapian allows
     48user-implemented metrics.  These are subclasses of the Xapian::LatLongMetric
     49class; see the API documentation for details on how to implement the various
     50required methods.
     51
     52There is currently only one built-in metric - the GreatCircleMetric.  As the
     53name suggests, this calculates the great-circle distance between two points,
     54based on the assumption that the world is a perfect sphere.  The radius of the
     55world can be specified as a constructor parameter, but defaults to a reasonable
     56approximation of the radius of the Earth.  The calculation uses the Haversine
     57formula, which is accurate for points which are close together, but can have
     58significant error for coordinates which are on opposite sides of the sphere: on
     59the other hand, such points are likely to be at the end of a ranked list of
     60search results, so this probably doesn't matter.
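
As a rough illustration of the Haversine calculation described above, here is a stand-alone sketch in plain C++.  It follows the same steps as the GreatCircleMetric code in this patch, but the function name ``haversine_metres`` and the constant names are made up for this example and are not part of the Xapian API.  The radius is the default value used in this patch.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical names for illustration only; not part of the Xapian API.
// Default radius in metres, taken from the GreatCircleMetric code in this patch.
const double kEarthRadiusMetres = 6372797.6;
const double kPi = 3.14159265358979323846;

// Great-circle distance in metres between two points given in degrees,
// assuming a perfectly spherical world of the given radius.
double haversine_metres(double lat_a, double lon_a,
                        double lat_b, double lon_b,
                        double radius = kEarthRadiusMetres)
{
    const double to_rad = kPi / 180.0;
    double lata = lat_a * to_rad;
    double latb = lat_b * to_rad;
    double sin_half_lat = std::sin((lata - latb) / 2);
    double sin_half_long = std::sin((lon_a - lon_b) * to_rad / 2);
    double h = sin_half_lat * sin_half_lat +
               sin_half_long * sin_half_long * std::cos(lata) * std::cos(latb);
    // Clamp so that rounding error cannot push the asin() argument above 1.
    double sqrt_h = std::min(1.0, std::sqrt(h));
    return 2 * radius * std::asin(sqrt_h);
}
```

Two points a quarter of the way around the equator, such as (0, 0) and (0, 90), should come out at roughly a quarter of the sphere's circumference, which makes a convenient sanity check.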
     61
     62Indexing
     63========
     64
     65To index a set of documents with location, you need to store serialised
     66latitude-longitude coordinates in a value slot in your documents.  To do this,
     67use the LatLongCoord class.  For example, this is how you might store a
     68latitude and longitude corresponding to "London" in value slot 0::
     69
     70  Xapian::Document doc;
     71  doc.add_value(0, Xapian::LatLongCoord(51.53, 0.08).serialise());
     72
     73Of course, often a location is a bit more complicated than a single point - for
     74example, postcode regions in the UK can cover a fairly wide area.  If a search
     75were to treat such a location as a single point, the distances returned could
     76be incorrect by as much as a couple of miles.  Xapian therefore allows you to
     77store a set of points in a single slot - the distance calculation will return
     78the distance to the closest of these points.  This is often a good enough
     79workaround for this problem - if you require greater accuracy, you will need to
     80filter the results after they are returned from Xapian.
     81
     82To store multiple coordinates in a single slot, use the LatLongCoords class::
     83
     84  Xapian::Document doc;
     85  Xapian::LatLongCoords coords;
     86  coords.insert(Xapian::LatLongCoord(51.53, 0.08));
     87  coords.insert(Xapian::LatLongCoord(51.51, 0.07));
     88  coords.insert(Xapian::LatLongCoord(51.52, 0.09));
     89  doc.add_value(0, coords.serialise());
     90
     91(Note that the serialised form of a LatLongCoords object containing a single
     92coordinate is exactly the same as the serialised form of the corresponding
     93LatLongCoord object.)
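
The "closest of these points" rule can be sketched without Xapian as a minimum over all pairs of points.  Everything in this sketch is hypothetical: ``Coord``, ``toy_metric`` and ``closest_distance`` are illustrative names, and the toy metric is a plain straight-line distance in degree space standing in for a real metric such as the great-circle distance.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a coordinate: (latitude, longitude) in degrees.
typedef std::pair<double, double> Coord;

// Toy metric for illustration only: straight-line distance in degree space.
// A real metric would be something like the great-circle distance.
double toy_metric(const Coord & a, const Coord & b)
{
    double dlat = a.first - b.first;
    double dlong = a.second - b.second;
    return std::sqrt(dlat * dlat + dlong * dlong);
}

// Distance between two sets of points: the minimum over all pairs.
double closest_distance(const std::vector<Coord> & a,
                        const std::vector<Coord> & b)
{
    if (a.empty() || b.empty())
        throw std::invalid_argument("empty coordinate list");
    double min_dist = toy_metric(a[0], b[0]);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            double d = toy_metric(a[i], b[j]);
            if (d < min_dist) min_dist = d;
        }
    }
    return min_dist;
}
```

The cost grows with the product of the two set sizes, so this approach suits a handful of points per document rather than detailed region outlines.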
     94
     95Searching
     96=========
     97
     98Sorting results by distance
     99---------------------------
     100
     101If you simply want your results to be returned in order of distance, you can
     102use the LatLongDistanceKeyMaker class to calculate sort keys.  For example, to
     103return results in order of distance from the coordinate (51.00, 0.50), based on
     104the values stored in slot 0, and using the great-circle distance::
     105
     106  Xapian::Database db("my_database");
     107  Xapian::Enquire enq(db);
     108  enq.set_query(Xapian::Query("my_query"));
     109  Xapian::GreatCircleMetric metric;
     110  Xapian::LatLongCoord centre(51.00, 0.50);
     111  Xapian::LatLongDistanceKeyMaker keymaker(0, centre, metric);
     112  enq.set_sort_by_key(&keymaker, false);
     113
     114Filtering results by distance
     115-----------------------------
     116
     117To return only those results within a given distance, you can use the
     118LatLongDistancePostingSource.  For example, to return only those results within
     1195 miles of coordinate (51.00, 0.50), based on the values stored in slot 0, and
     120using the great-circle distance::
     121
     122  Xapian::Database db("my_database");
     123  Xapian::Enquire enq(db);
     124  Xapian::Query q("my_query");
     125  Xapian::GreatCircleMetric metric;
     126  Xapian::LatLongCoord centre(51.00, 0.50);
     127  double max_range = Xapian::miles_to_metres(5);
     128  Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
     129  q = Xapian::Query(Xapian::Query::OP_FILTER, q, Xapian::Query(&ps));
     130  enq.set_query(q);
     131
     132Ranking results on a combination of distance and relevance
     133----------------------------------------------------------
     134
     135To return results ranked by a combination of their relevance and their
     136distance, you can also use the LatLongDistancePostingSource.  Beware that
     137getting the right balance of weights is tricky: there is little solid
     138theoretical basis for this, so the best approach is often to try various
     139different parameters, evaluate the results, and settle on the best.  The
     140LatLongDistancePostingSource returns a weight of 1.0 for a document which is at
     141the specified location, and a lower, but always positive, weight for points
     142further away. It has two parameters, k1 and k2, which control how fast the
     143weight decays, which can be specified to the constructor (but aren't in this
     144example) - see the API documentation for details of these parameters::
     145
     146  Xapian::Database db("my_database");
     147  Xapian::Enquire enq(db);
     148  Xapian::Query q("my_query");
     149  Xapian::GreatCircleMetric metric;
     150  Xapian::LatLongCoord centre(51.00, 0.50);
     151  double max_range = Xapian::miles_to_metres(5);
     152  Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
     153  q = Xapian::Query(Xapian::Query::OP_AND, q, Xapian::Query(&ps));
     154  enq.set_query(q);
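
To get a feel for how k1 and k2 shape the weight, the decay formula used by the posting source code in this patch can be tried out on its own.  The sketch below copies that formula.  The parameter values suggested afterwards are illustrative assumptions, not necessarily the library defaults.

```cpp
#include <cmath>

// Same formula as weight_from_distance() in this patch:
//   weight = k1 * (dist + k1) ^ (-k2)
// dist and k1 are in the units returned by the metric (metres for the
// great-circle metric); k2 is a dimensionless exponent.
double weight_from_distance(double dist, double k1, double k2)
{
    return k1 * std::pow(dist + k1, -k2);
}
```

With k1 = 1000 and k2 = 1, the weight is 1.0 at distance 0 and falls to 0.5 at 1000 metres, so k1 acts as a "half-weight" distance in that case.  Note that the weight at distance 0 works out as k1 raised to the power (1 - k2), so it is only exactly 1.0 when k2 is 1.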
     155
     156
     157Performance
     158===========
     159
     160The location information associated with each document is stored in a document
     161value.  This allows it to be looked up quickly at search time, so that the
     162exact distance from the query location can be calculated.  However, this method
     163requires that the distance of each potential match is checked, which can be
     164expensive.
     165
     166Some experimental code exists to produce terms corresponding to a hierarchical
     167index of locations (using the O-QTM algorithm - see references below), which
     168can be used to narrow down the search so that only a small number of potential
     169matches need to be checked.  Contact the Xapian developers (by email or IRC) if
     170you would like to help finish and test this code.
     171
     172It is entirely possible that a more efficient implementation could be achieved
     173using "R trees" or "KD trees" (or one of the many other tree structures used
     174for geospatial indexing - see http://en.wikipedia.org/wiki/Spatial_index for a
     175list of some of these).  However, using the QTM approach will require minimal
     176effort and make use of the existing, and well tested, Xapian database.
     177Additionally, by simply generating special terms to restrict the search, this
     178approach takes advantage of the existing optimisations of the Xapian query parser.
     179
     180References
     181==========
     182
     183The O-QTM algorithm is described in "Dutton, G. (1996). Encoding and handling
     184geospatial data with hierarchical triangular meshes. In Kraak, M.J. and
     185Molenaar, M. (eds.)  Advances in GIS Research II. London: Taylor & Francis,
     186505-518.", a copy of which is available from
     187http://www.spatial-effects.com/papers/conf/GDutton_SDH96.pdf
     188
     189Some of the geometry needed to calculate the correct set of QTM IDs to cover a
     190particular region is detailed in
     191ftp://ftp.research.microsoft.com/pub/tr/tr-2005-123.pdf
     192
     193Also, see:
     194http://www.sdss.jhu.edu/htm/doc/c++/htmInterface.html
  • xapian-core/geospatial/Makefile.mk

     
     1EXTRA_DIST += \
     2        geospatial/dir_contents \
     3        geospatial/Makefile
     4
     5lib_src += \
     6        geospatial/latlongcoord.cc \
     7        geospatial/latlong_distance_keymaker.cc \
     8        geospatial/latlong_metrics.cc \
     9        geospatial/latlong_posting_source.cc
  • xapian-core/geospatial/latlong_distance_keymaker.cc

     
     1/** \file latlong_distance_keymaker.cc
     2 * \brief LatLongDistanceKeyMaker implementation.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/document.h"
     26#include "xapian/queryparser.h" // For sortable_serialise.
     27
     28using namespace Xapian;
     29
     30std::string
     31LatLongDistanceKeyMaker::operator()(const Document &doc) const
     32{
     33    std::string val(doc.get_value(valno));
     34    LatLongCoords doccoords = LatLongCoords::unserialise(val);
     35    if (doccoords.empty()) {
     36        return defkey;
     37    }
     38    double distance = (*metric)(centre, doccoords);
     39    return sortable_serialise(distance);
     40}
     41
     42LatLongDistanceKeyMaker::~LatLongDistanceKeyMaker()
     43{
     44    delete metric;
     45}
    Property changes on: xapian-core/geospatial/latlong_distance_keymaker.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native

  • xapian-core/geospatial/latlong_posting_source.cc

     
     1/** @file latlong_posting_source.cc
     2 * @brief LatLongDistancePostingSource implementation.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#include <config.h>
     24
     25#include "xapian/geospatial.h"
     26
     27#include "xapian/document.h"
     28#include "xapian/error.h"
     29#include "xapian/registry.h"
     30
     31#include "serialise.h"
     32#include "serialise-double.h"
     33#include "utils.h"
     34
     35#include <math.h>
     36
     37namespace Xapian {
     38
     39static double
     40weight_from_distance(double dist, double k1, double k2)
     41{
     42    return k1 * pow(dist + k1, -k2);
     43}
     44
     45void
     46LatLongDistancePostingSource::calc_distance()
     47{
     48    std::string val(*value_it);
     49    LatLongCoords coords = LatLongCoords::unserialise(val);
     50    dist = (*metric)(centre, coords);
     51}
     52
     53LatLongDistancePostingSource::LatLongDistancePostingSource(
     54        valueno slot_,
     55        const LatLongCoords & centre_,
     56        const LatLongMetric * metric_,
     57        double max_range_,
     58        double k1_,
     59        double k2_)
     60        : ValuePostingSource(slot_),
     61          centre(centre_),
     62          metric(metric_),
     63          max_range(max_range_),
     64          k1(k1_),
     65          k2(k2_)
     66{
     67    if (k1 <= 0)
     68        throw InvalidArgumentError(
     69            "k1 parameter to LatLongDistancePostingSource must be greater "
     70            "than 0; was " + om_tostring(k1));
     71    if (k2 <= 0)
     72        throw InvalidArgumentError(
     73            "k2 parameter to LatLongDistancePostingSource must be greater "
     74            "than 0; was " + om_tostring(k2));
     75    set_maxweight(weight_from_distance(0, k1, k2));
     76}
     77
     78LatLongDistancePostingSource::LatLongDistancePostingSource(
     79        valueno slot_,
     80        const LatLongCoords & centre_,
     81        const LatLongMetric & metric_,
     82        double max_range_,
     83        double k1_,
     84        double k2_)
     85        : ValuePostingSource(slot_),
     86          centre(centre_),
     87          metric(metric_.clone()),
     88          max_range(max_range_),
     89          k1(k1_),
     90          k2(k2_)
     91{
     92    if (k1 <= 0)
     93        throw InvalidArgumentError(
     94            "k1 parameter to LatLongDistancePostingSource must be greater "
     95            "than 0; was " + om_tostring(k1));
     96    if (k2 <= 0)
     97        throw InvalidArgumentError(
     98            "k2 parameter to LatLongDistancePostingSource must be greater "
     99            "than 0; was " + om_tostring(k2));
     100    set_maxweight(weight_from_distance(0, k1, k2));
     101}
     102
     103LatLongDistancePostingSource::~LatLongDistancePostingSource()
     104{
     105    delete metric;
     106}
     107
     108void
     109LatLongDistancePostingSource::next(weight min_wt)
     110{
     111    ValuePostingSource::next(min_wt);
     112
     113    while (value_it != db.valuestream_end(slot)) {
     114        calc_distance();
     115        if (max_range == 0 || dist <= max_range)
     116            break;
     117        ++value_it;
     118    }
     119}
     120
     121void
     122LatLongDistancePostingSource::skip_to(docid min_docid,
     123                                      weight min_wt)
     124{
     125    ValuePostingSource::skip_to(min_docid, min_wt);
     126
     127    while (value_it != db.valuestream_end(slot)) {
     128        calc_distance();
     129        if (max_range == 0 || dist <= max_range)
     130            break;
     131        ++value_it;
     132    }
     133}
     134
     135bool
     136LatLongDistancePostingSource::check(docid min_docid,
     137                                    weight min_wt)
     138{
     139    if (!ValuePostingSource::check(min_docid, min_wt)) {
     140        // check returned false, so we know the document is not in the source.
     141        return false;
     142    }
     143    if (value_it == db.valuestream_end(slot)) {
     144        // return true, since we're definitely at the end of the list.
     145        return true;
     146    }
     147
     148    calc_distance();
     149    if (max_range > 0 && dist > max_range) {
     150        return false;
     151    }
     152    return true;
     153}
     154
     155weight
     156LatLongDistancePostingSource::get_weight() const
     157{
     158    return weight_from_distance(dist, k1, k2);
     159}
     160
     161LatLongDistancePostingSource *
     162LatLongDistancePostingSource::clone() const
     163{
     164    return new LatLongDistancePostingSource(slot, centre,
     165                                            metric->clone(),
     166                                            max_range, k1, k2);
     167}
     168
     169std::string
     170LatLongDistancePostingSource::name() const
     171{
     172    return std::string("Xapian::LatLongDistancePostingSource");
     173}
     174
     175std::string
     176LatLongDistancePostingSource::serialise() const
     177{
     178    std::string serialised_centre = centre.serialise();
     179    std::string metric_name = metric->name();
     180    std::string serialised_metric = metric->serialise();
     181
     182    std::string result = encode_length(slot);
     183    result += encode_length(serialised_centre.size());
     184    result += serialised_centre;
     185    result += encode_length(metric_name.size());
     186    result += metric_name;
     187    result += encode_length(serialised_metric.size());
     188    result += serialised_metric;
     189    result += serialise_double(max_range);
     190    result += serialise_double(k1);
     191    result += serialise_double(k2);
     192    return result;
     193}
     194
     195LatLongDistancePostingSource *
     196LatLongDistancePostingSource::unserialise_with_registry(const std::string &s,
     197                                             const Registry & registry) const
     198{
     199    const char * p = s.data();
     200    const char * end = p + s.size();
     201
     202    valueno new_slot = decode_length(&p, end, false);
     203    size_t len = decode_length(&p, end, true);
     204    std::string new_serialised_centre(p, len);
     205    p += len;
     206    len = decode_length(&p, end, true);
     207    std::string new_metric_name(p, len);
     208    p += len;
     209    len = decode_length(&p, end, true);
     210    std::string new_serialised_metric(p, len);
     211    p += len;
     212    double new_max_range = unserialise_double(&p, end);
     213    double new_k1 = unserialise_double(&p, end);
     214    double new_k2 = unserialise_double(&p, end);
     215    if (p != end) {
     216        throw NetworkError("Bad serialised LatLongDistancePostingSource - junk at end");
     217    }
     218
     219    LatLongCoords new_centre =
     220            LatLongCoords::unserialise(new_serialised_centre);
     221
     222    const Xapian::LatLongMetric * metric_type =
     223            registry.get_lat_long_metric(new_metric_name);
     224    if (metric_type == NULL) {
     225        throw InvalidArgumentError("LatLongMetric " + new_metric_name +
     226                                   " not registered");
     227    }
     228    LatLongMetric * new_metric =
     229            metric_type->unserialise(new_serialised_metric);
     230
     231    return new LatLongDistancePostingSource(new_slot, new_centre,
     232                                            new_metric,
     233                                            new_max_range, new_k1, new_k2);
     234}
     235
     236void
     237LatLongDistancePostingSource::init(const Database & db_)
     238{
     239    ValuePostingSource::init(db_);
     240    if (max_range > 0.0) {
     241        // Possible that no documents are in range.
     242        termfreq_min = 0;
     243        // Note - would be good to improve termfreq_est here, too, but
     244        // I can't think of anything we can do with the information
     245        // available.
     246    }
     247}
     248
     249std::string
     250LatLongDistancePostingSource::get_description() const
     251{
     252    return "Xapian::LatLongDistancePostingSource(slot=" + om_tostring(slot) + ")";
     253}
     254
     255}
    Property changes on: xapian-core/geospatial/latlong_posting_source.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native

  • xapian-core/geospatial/latlong_metrics.cc

     
     1/** \file latlong_metrics.cc
     2 * \brief Geospatial distance metrics.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/error.h"
     26#include "serialise-double.h"
     27
     28#include <math.h>
     29
     30using namespace Xapian;
     31
     32/** Quadratic mean radius of the earth in metres.
     33 */
     34#define QUAD_EARTH_RADIUS_METRES 6372797.6
     35
     36/** Set M_PI if it's not already set.
     37 */
     38#ifndef M_PI
     39#define M_PI 3.14159265358979323846
     40#endif
     41
     42LatLongMetric::~LatLongMetric()
     43{
     44}
     45
     46double
     47LatLongMetric::operator()(const LatLongCoords & a, const LatLongCoords &b) const
     48{
     49    if (a.empty() || b.empty()) {
     50        throw InvalidArgumentError("Empty coordinate list supplied to LatLongMetric::operator()().");
     51    }
     52    double min_dist = 0.0;
     53    bool have_min = false;
     54    for (std::set<LatLongCoord>::const_iterator a_iter = a.begin();
     55         a_iter != a.end();
     56         ++a_iter)
     57    {
     58        for (std::set<LatLongCoord>::const_iterator b_iter = b.begin();
     59             b_iter != b.end();
     60             ++b_iter)
     61        {
     62            double dist = operator()(*a_iter, *b_iter);
     63            if (!have_min) {
     64                min_dist = dist;
     65                have_min = true;
     66            } else if (dist < min_dist) {
     67                min_dist = dist;
     68            }
     69        }
     70    }
     71    return min_dist;
     72}
     73
     74
     75GreatCircleMetric::GreatCircleMetric()
     76        : radius(QUAD_EARTH_RADIUS_METRES)
     77{}
     78
     79GreatCircleMetric::GreatCircleMetric(double radius_)
     80        : radius(radius_)
     81{}
     82
     83double
     84GreatCircleMetric::operator()(const LatLongCoord & a,
     85                              const LatLongCoord & b) const
     86{
     87    double lata = a.latitude * (M_PI / 180.0);
     88    double latb = b.latitude * (M_PI / 180.0);
     89
     90    double latdiff = lata - latb;
     91    double longdiff = (a.longitude - b.longitude) * (M_PI / 180.0);
     92
     93    double sin_half_lat = sin(latdiff / 2);
     94    double sin_half_long = sin(longdiff / 2);
     95    double h = sin_half_lat * sin_half_lat +
     96            sin_half_long * sin_half_long * cos(lata) * cos(latb);
     97    double sqrt_h = sqrt(h);
     98    if (sqrt_h > 1.0) sqrt_h = 1.0;
     99    return 2 * radius * asin(sqrt_h);
     100}
     101
     102LatLongMetric *
     103GreatCircleMetric::clone() const
     104{
     105    return new GreatCircleMetric(radius);
     106}
     107
     108std::string
     109GreatCircleMetric::name() const
     110{
     111    return "Xapian::GreatCircleMetric";
     112}
     113
     114std::string
     115GreatCircleMetric::serialise() const
     116{
     117    return serialise_double(radius);
     118}
     119
     120LatLongMetric *
     121GreatCircleMetric::unserialise(const std::string & s) const
     122{
     123    const char * p = s.data();
     124    const char * end = p + s.size();
     125
     126    double new_radius = unserialise_double(&p, end);
     127    if (p != end) {
     128        throw Xapian::NetworkError("Bad serialised GreatCircleMetric - junk at end");
     129    }
     130
     131    return new GreatCircleMetric(new_radius);
     132}
    Property changes on: xapian-core/geospatial/latlong_metrics.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native

  • xapian-core/geospatial/dir_contents

     
     1<Directory>geospatial</Directory>
     2
     3<Description>
     4Support for geospatial matching, and parsing of locations.
     5</Description>
  • xapian-core/geospatial/latlongcoord.cc

     
     1/** \file latlongcoord.cc
     2 * \brief Latitude and longitude representations.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/error.h"
     26
     27#include "serialise.h"
     28#include "serialise-double.h"
     29#include "str.h"
     30
     31#include <math.h>
     32
     33using namespace Xapian;
     34
     35LatLongCoord::LatLongCoord(double latitude_, double longitude_)
     36        : latitude(latitude_),
     37          longitude(longitude_)
     38{
     39    if (latitude < -90.0 || latitude > 90.0)
     40        throw InvalidArgumentError("Latitude out-of-range");
     41    longitude = fmod(longitude_, 360);
     42    if (longitude <= -180) longitude += 360;
     43    if (longitude > 180) longitude -= 360;
     44    if (longitude == -0.0) longitude = 0.0;
     45}
     46
     47LatLongCoord
     48LatLongCoord::unserialise(const std::string & serialised)
     49{
     50    const char * ptr = serialised.data();
     51    const char * end = ptr + serialised.size();
     52    LatLongCoord result = unserialise(&ptr, end);
     53    if (ptr != end)
     54        throw InvalidArgumentError(
     55                "Junk found at end of serialised LatLongCoord");
     56    return result;
     57}
     58
     59LatLongCoord
     60LatLongCoord::unserialise(const char ** ptr, const char * end)
     61{
     62    try {
     63        // This will raise NetworkError for invalid serialisations.
     64        double latitude = unserialise_double(ptr, end);
     65        double longitude = unserialise_double(ptr, end);
     66        return LatLongCoord(latitude, longitude);
     67    } catch (const NetworkError & e) {
     68        // FIXME - modify unserialise_double somehow so we don't have to catch
     69        // and rethrow the exceptions it raises.
     70        throw InvalidArgumentError(e.get_msg());
     71    }
     72}
     73
     74std::string
     75LatLongCoord::serialise() const
     76{
     77    std::string result(serialise_double(latitude));
     78    result += serialise_double(longitude);
     79    return result;
     80}
     81
     82std::string
     83LatLongCoord::get_description() const
     84{
     85    std::string res("Xapian::LatLongCoord(");
     86    res += str(latitude);
     87    res += ", ";
     88    res += str(longitude);
     89    res += ")";
     90    return res;
     91}
     92
     93LatLongCoords
     94LatLongCoords::unserialise(const std::string & serialised)
     95{
     96    const char * ptr = serialised.data();
     97    const char * end = ptr + serialised.size();
     98    LatLongCoords coords = unserialise(ptr, end);
     99    return coords;
     100}
     101
     102LatLongCoords
     103LatLongCoords::unserialise(const char * ptr, const char * end)
     104{
     105    LatLongCoords result;
     106    try {
     107        while (ptr != end) {
     108            // This will raise NetworkError for invalid serialisations (so we
     109            // catch and re-throw it).
     110            result.coords.insert(LatLongCoord::unserialise(&ptr, end));
     111        }
     112    } catch (const NetworkError & e) {
     113        // FIXME - modify unserialise_double somehow so we don't have to catch
     114        // and rethrow the exceptions it raises.
     115        throw InvalidArgumentError(e.get_msg());
     116    }
     117    if (ptr != end)
     118        throw InvalidArgumentError(
     119                "Junk found at end of serialised LatLongCoords");
     120    return result;
     121}
     122
     123std::string
     124LatLongCoords::serialise() const
     125{
     126    std::string result;
     127    std::set<LatLongCoord>::const_iterator coord;
     128    for (coord = coords.begin(); coord != coords.end(); ++coord)
     129    {
     130        result += serialise_double(coord->latitude);
     131        result += serialise_double(coord->longitude);
     132    }
     133    return result;
     134}
     135
     136std::string
     137LatLongCoords::get_description() const
     138{
     139    std::string res("Xapian::LatLongCoords(");
     140    std::set<LatLongCoord>::const_iterator coord;
     141    for (coord = coords.begin(); coord != coords.end(); ++coord) {
     142        if (coord != coords.begin()) {
     143            res += ", ";
     144        }
     145        res += "(";
     146        res += str(coord->latitude);
     147        res += ", ";
     148        res += str(coord->longitude);
     149        res += ")";
     150    }
     151    res += ")";
     152    return res;
     153}
  • xapian-core/geospatial/Makefile

    Property changes on: xapian-core/geospatial/latlongcoord.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1# Makefile for use in directories built by non-recursive make.
     2
     3SHELL = /bin/sh
     4
     5all check:
     6        cd .. && $(MAKE) $@
     7
     8clean:
     9        rm -f *.o *.obj *.lo
  • xapian-core/tests/api_geospatial.cc

    Property changes on: xapian-core/geospatial/Makefile
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1/** @file api_geospatial.cc
     2 * @brief Tests of geospatial functionality.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#include <config.h>
     24#include "api_geospatial.h"
     25#include <xapian/geospatial.h>
     26#include <xapian/error.h>
     27
     28#include "apitest.h"
     29#include "testsuite.h"
     30#include "testutils.h"
     31#include <iomanip>
     32
     33using namespace std;
     34using namespace Xapian;
     35
     36// #######################################################################
     37// # Tests start here
     38
     39static void
     40builddb_coords1(Xapian::WritableDatabase &db, const string &)
     41{
     42    Xapian::LatLongCoord coord1(10, 10);
     43    Xapian::LatLongCoord coord2(20, 10);
     44    Xapian::LatLongCoord coord3(30, 10);
     45
     46    Xapian::Document doc;
     47    doc.add_value(0, coord1.serialise());
     48    db.add_document(doc);
     49
     50    doc = Xapian::Document();
     51    doc.add_value(0, coord2.serialise());
     52    db.add_document(doc);
     53
     54    doc = Xapian::Document();
     55    doc.add_value(0, coord3.serialise());
     56    db.add_document(doc);
     57}
     58
     59/// Test behaviour of the LatLongDistancePostingSource.
     60DEFINE_TESTCASE(latlongpostingsource1, backend && writable && !remote && !inmemory) {
     61    Xapian::Database db = get_database("coords1", builddb_coords1, "");
     62    Xapian::LatLongCoord coord1(10, 10);
     63    Xapian::LatLongCoord coord2(20, 10);
     64    Xapian::LatLongCoord coord3(30, 10);
     65
     66    // Chert doesn't currently support opening a value iterator for a writable database.
     67    SKIP_TEST_FOR_BACKEND("chert");
     68
     69    Xapian::GreatCircleMetric metric;
     70    Xapian::LatLongCoords centre;
     71    centre.insert(coord1);
     72    double coorddist = metric(coord1, coord2);
     73    TEST_EQUAL_DOUBLE(coorddist, metric(coord2, coord3));
     74
     75    // Test a search with no range restriction.
     76    {
     77        Xapian::LatLongDistancePostingSource ps(0, centre, metric);
     78        ps.init(db);
     79
     80        ps.next(0.0);
     81        TEST_EQUAL(ps.at_end(), false);
     82        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     83        TEST_EQUAL(ps.get_docid(), 1);
     84
     85        ps.next(0.0);
     86        TEST_EQUAL(ps.at_end(), false);
     87        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     88        TEST_EQUAL(ps.get_docid(), 2);
     89
     90        ps.next(0.0);
     91        TEST_EQUAL(ps.at_end(), false);
     92        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
     93        TEST_EQUAL(ps.get_docid(), 3);
     94
     95        ps.next(0.0);
     96        TEST_EQUAL(ps.at_end(), true);
     97    }
     98
     99    // Test a search with a tight range restriction
     100    {
     101        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 0.5);
     102        ps.init(db);
     103
     104        ps.next(0.0);
     105        TEST_EQUAL(ps.at_end(), false);
     106        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     107
     108        ps.next(0.0);
     109        TEST_EQUAL(ps.at_end(), true);
     110    }
     111
     112    // Test a search with a looser range restriction
     113    {
     114        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist);
     115        ps.init(db);
     116
     117        ps.next(0.0);
     118        TEST_EQUAL(ps.at_end(), false);
     119        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     120
     121        ps.next(0.0);
     122        TEST_EQUAL(ps.at_end(), false);
     123        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     124        TEST_EQUAL(ps.get_docid(), 2);
     125
     126        ps.next(0.0);
     127        TEST_EQUAL(ps.at_end(), true);
     128    }
     129
     130    // Test a search with a looser range restriction, but not enough to return
     131    // the next document.
     132    {
     133        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 1.5);
     134        ps.init(db);
     135
     136        ps.next(0.0);
     137        TEST_EQUAL(ps.at_end(), false);
     138        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     139
     140        ps.next(0.0);
     141        TEST_EQUAL(ps.at_end(), false);
     142        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     143        TEST_EQUAL(ps.get_docid(), 2);
     144
     145        ps.next(0.0);
     146        TEST_EQUAL(ps.at_end(), true);
     147    }
     148
     149    // Test a search with a loose enough range restriction that all docs should
     150    // be returned.
     151    {
     152        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 2.5);
     153        ps.init(db);
     154
     155        ps.next(0.0);
     156        TEST_EQUAL(ps.at_end(), false);
     157        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     158
     159        ps.next(0.0);
     160        TEST_EQUAL(ps.at_end(), false);
     161        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     162        TEST_EQUAL(ps.get_docid(), 2);
     163
     164        ps.next(0.0);
     165        TEST_EQUAL(ps.at_end(), false);
     166        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
     167        TEST_EQUAL(ps.get_docid(), 3);
     168
     169        ps.next(0.0);
     170        TEST_EQUAL(ps.at_end(), true);
     171    }
     172
     173    return true;
     174}
     175
     176// Test various methods of LatLongCoord and LatLongCoords
     177DEFINE_TESTCASE(latlongcoords1, !backend) {
     178    LatLongCoord c1(0, 0);
     179    LatLongCoord c2(1, 0);
     180    LatLongCoord c3(1, 0);
     181
     182    // Test comparison
     183    TEST_NOT_EQUAL(c1.get_description(), c2.get_description());
     184    TEST(c1 < c2 || c2 < c1);
     185    TEST_EQUAL(c2.get_description(), c3.get_description());
     186    TEST(!(c2 < c3) && !(c3 < c2));
     187
     188    // Test serialisation
     189    std::string s1 = c1.serialise();
     190    LatLongCoord c4 = LatLongCoord::unserialise(s1);
     191    TEST(!(c1 < c4 || c4 < c1));
     192    const char * ptr = s1.data();
     193    const char * end = ptr + s1.size();
     194    c4 = LatLongCoord::unserialise(&ptr, end);
     195    TEST_EQUAL(c1.get_description(), c4.get_description());
     196    TEST_EQUAL(c1.get_description(), "Xapian::LatLongCoord(0, 0)");
     197    TEST_EQUAL(ptr, end);
     198
     199    // Test building a set of LatLongCoords
     200    LatLongCoords g1(c1);
     201    TEST(!g1.empty());
     202    TEST_EQUAL(g1.size(), 1);
     203    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))");
     204    g1.insert(c2);
     205    TEST_EQUAL(g1.size(), 2);
     206    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     207    // c3 == c2, so already in the set, so no change if we add c3
     208    g1.insert(c3);
     209    TEST_EQUAL(g1.size(), 2);
     210    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     211    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     212    g1.erase(c3);
     213    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))");
     214
     215    // Test building an empty LatLongCoords
     216    LatLongCoords g2;
     217    TEST(g2.empty());
     218    TEST_EQUAL(g2.size(), 0);
     219    TEST_EQUAL(g2.get_description(), "Xapian::LatLongCoords()");
     220
     221    return true;
     222}
     223
     224// Test various methods of LatLongMetric
     225DEFINE_TESTCASE(latlongmetric1, !backend) {
     226    LatLongCoord c1(0, 0);
     227    LatLongCoord c2(1, 0);
     228    Xapian::GreatCircleMetric m1;
     229    double d1 = m1(c1, c2);
     230    TEST_REL(d1, >, 111226.0);
     231    TEST_REL(d1, <, 111227.0);
     232
     233    // Let's make another metric, this time using roughly the radius of Mars,
     234    // so distances should be quite a bit smaller.
     235    Xapian::GreatCircleMetric m2(3310000);
     236    double d2 = m2(c1, c2);
     237    TEST_REL(d2, >, 57770.0);
     238    TEST_REL(d2, <, 57771.0);
     239
     240    // Check serialise and unserialise.
     241    Xapian::Registry registry;
     242    std::string s1 = m2.serialise();
     243    const Xapian::LatLongMetric * m3;
     244    m3 = registry.get_lat_long_metric(m2.name());
     245    TEST(m3 != NULL);
     246    m3 = m3->unserialise(s1);
     247    double d3 = (*m3)(c1, c2);
     248    TEST_EQUAL_DOUBLE(d2, d3);
     249
     250    delete m3;
     251
     252    return true;
     253}
     254
     255// Test a LatLongDistanceKeyMaker directly.
     256DEFINE_TESTCASE(latlongkeymaker1, !backend) {
     257    Xapian::GreatCircleMetric m1(3310000);
     258    LatLongCoord c1(0, 0);
     259    LatLongCoord c2(1, 0);
     260    LatLongCoord c3(2, 0);
     261    LatLongCoord c4(3, 0);
     262
     263    LatLongCoords g1(c1);
     264    g1.insert(c2);
     265
     266    LatLongDistanceKeyMaker keymaker(0, g1, m1);
     267    Xapian::Document doc1;
     268    doc1.add_value(0, g1.serialise());
     269    Xapian::Document doc2;
     270    doc2.add_value(0, c3.serialise());
     271    Xapian::Document doc3;
     272    doc3.add_value(0, c4.serialise());
     273    Xapian::Document doc4;
     274
     275    std::string k1 = keymaker(doc1);
     276    std::string k2 = keymaker(doc2);
     277    std::string k3 = keymaker(doc3);
     278    std::string k4 = keymaker(doc4);
     279    TEST_REL(k1, <, k2);
     280    TEST_REL(k2, <, k3);
     281    TEST_REL(k3, <, k4);
     282
     283    LatLongDistanceKeyMaker keymaker2(0, g1, m1, 0);
     284    std::string k3b = keymaker2(doc3);
     285    std::string k4b = keymaker2(doc4);
     286    TEST_EQUAL(k3, k3b);
     287    TEST_REL(k3b, >, k4b);
     288
     289    return true;
     290}
  • xapian-core/tests/Makefile.am

    Property changes on: xapian-core/tests/api_geospatial.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
    155155 api_unicode.cc \
    156156 api_valuestats.cc \
    157157 api_valuestream.cc \
     158 api_geospatial.cc \
    158159 api_wrdb.cc
    159160
    160161apitest_SOURCES = apitest.cc dbcheck.cc $(collated_apitest_sources) \
  • xapian-core/include/xapian/postingsource.h

     
    3131
    3232namespace Xapian {
    3333
     34class Registry;
     35
    3436/** Base class which provides an "external" source of postings.
    3537 *
    3638 *  Warning: the PostingSource interface is currently experimental, and is
     
    285287     */
    286288    virtual PostingSource * unserialise(const std::string &s) const;
    287289
     290    /** Create object given string serialisation returned by serialise().
     291     *
     292     *  Note that the returned object will be deallocated by Xapian after use
     293     *  with "delete".  It must therefore have been allocated with "new".
     294     *
     295     *  This method is supplied with a Registry object, which can be used when
     296     *  unserialising objects contained within the posting source.  The default
     297     *  implementation simply calls unserialise() which doesn't take the
     298     *  Registry object, so you do not need to implement this method unless you
     299     *  want to take advantage of the Registry object when unserialising.
     300     *
     301     *  @param s A serialised instance of this PostingSource subclass.
     302     */
     303    virtual PostingSource * unserialise_with_registry(const std::string &s,
     304                                      const Registry & registry) const;
     305
    288306    /** Set this PostingSource to the start of the list of postings.
    289307     *
    290308     *  This is called automatically by the matcher prior to each query being
  • xapian-core/include/xapian/registry.h

     
    3030namespace Xapian {
    3131
    3232// Forward declarations.
     33class LatLongMetric;
    3334class MatchSpy;
    3435class PostingSource;
    3536class Weight;
     
    105106     */
    106107    const Xapian::MatchSpy *
    107108            get_match_spy(const std::string & name) const;
     109
     110    /// Register a user-defined lat-long metric class.
     111    void register_lat_long_metric(const Xapian::LatLongMetric &metric);
     112
     113    /** Get a lat-long metric given a name.
     114     *
     115     *  The returned metric is owned by the registry object.
     116     *
     117     *  Returns NULL if the metric could not be found.
     118     */
     119    const Xapian::LatLongMetric *
     120            get_lat_long_metric(const std::string & name) const;
     121
    108122};
    109123
    110124}
  • xapian-core/include/xapian/geospatial.h

     
     1/** @file geospatial.h
     2 * @brief Geospatial search support routines.
     3 */
     4/* Copyright 2008,2009 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#ifndef XAPIAN_INCLUDED_GEOSPATIAL_H
     24#define XAPIAN_INCLUDED_GEOSPATIAL_H
     25
     26#include <xapian/enquire.h>
     27#include <xapian/postingsource.h>
     28#include <xapian/queryparser.h> // For sortable_serialise
     29#include <xapian/keymaker.h>
     30#include <xapian/visibility.h>
     31#include <string>
     32#include <set>
     33
     34namespace Xapian {
     35
     36class Registry;
     37
     38/** Convert from miles to metres.
     39 */
     40inline XAPIAN_VISIBILITY_DEFAULT double
     41miles_to_metres(double miles)
     42{
     43    return 1609.344 * miles;
     44}
     45
     46/** Convert from metres to miles.
     47 */
     48inline XAPIAN_VISIBILITY_DEFAULT double
     49metres_to_miles(double metres)
     50{
     51    return metres * (1.0 / 1609.344);
     52}
     53
     54/** Convert from feet to metres.
     55 */
     56inline XAPIAN_VISIBILITY_DEFAULT double
     57feet_to_metres(double feet)
     58{
     59    return 0.3048 * feet;
     60}
     61
     62/** Convert from metres to feet.
     63 */
     64inline XAPIAN_VISIBILITY_DEFAULT double
     65metres_to_feet(double metres)
     66{
     67    return metres * (1.0 / 0.3048);
     68}
     69
     70/** Convert from nautical miles to metres.
     71 */
     72inline XAPIAN_VISIBILITY_DEFAULT double
     73nautical_miles_to_metres(double nautical_miles)
     74{
     75    return 1852.0 * nautical_miles;
     76}
     77
     78/** Convert from metres to nautical miles.
     79 */
     80inline XAPIAN_VISIBILITY_DEFAULT double
     81metres_to_nautical_miles(double metres)
     82{
     83    return metres * (1.0 / 1852.0);
     84}
     85
     86/** A latitude-longitude coordinate.
     87 *
     88 *  Note that latitude-longitude coordinates are only precisely meaningful if
     89 *  the datum used to define them is specified.  This class ignores this
     90 *  issue - it is up to the caller to ensure that the datum used for each
     91 *  coordinate in a system is consistent.
     92 */
     93struct XAPIAN_VISIBILITY_DEFAULT LatLongCoord {
     94  public:
     95    /** A latitude, as decimal degrees.
     96     *
     97     *  Should be in the range -90 <= latitude <= 90
     98     *
     99     *  Positive latitudes represent the northern hemisphere.
     100     */
     101    double latitude;
     102
     103    /** A longitude, as decimal degrees.
     104     *
     105     *  Should be in the range -180 < longitude <= 180
     106     *
     107     *  Positive longitudes represent the eastern hemisphere.
     108     */
     109    double longitude;
     110
     111    /** Construct a coordinate.
     112     *
     113     *  If the supplied latitude is out of range, an exception will be raised.
     114     *
     115     *  If the supplied longitude is out of range, it will be normalised to the
     116     *  appropriate range.
     117     */
     118    LatLongCoord(double latitude_, double longitude_);
     119
     120    /** Construct a coordinate by unserialising a string.
     121     *
     122     *  @param serialised the string to unserialise the coordinate from.
     123     *
     124     *  @exception Xapian::InvalidArgumentError if the string does not contain
     125     *  a valid serialised latitude-longitude pair, or contains extra data at
     126     *  the end of it.
     127     */
     128    static LatLongCoord unserialise(const std::string & serialised);
     129
     130    /** Construct a coordinate by unserialising a string.
     131     *
     132     *  The string may contain further data after that for the coordinate.
     133     *
     134     *  @param ptr A pointer to the start of the string.  This will be updated
     135     *  to point to the end of the data representing the coordinate.
     136     *  @param end A pointer to the end of the string.
     137     *
     138     *  @exception Xapian::InvalidArgumentError if the string does not contain
     139     *  a valid serialised latitude-longitude pair.
     140     */
     141    static LatLongCoord unserialise(const char ** ptr, const char * end);
     142
     143    /** Return a serialised representation of the coordinate.
     144     */
     145    std::string serialise() const;
     146
     147    /** Compare with another LatLongCoord.
     148     */
     149    bool operator<(const LatLongCoord & other) const
     150    {
     151        if (latitude != other.latitude) return latitude < other.latitude;
     152        return (longitude < other.longitude);
     153    }
     154
     155    /// Return a string describing this object.
     156    std::string get_description() const;
     157};
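
A `std::set` keyed on coordinates needs its comparison to be a strict weak ordering: latitude decides first, and longitude is consulted only when the latitudes are equal. The following is a minimal standalone sketch of that lexicographic comparison. `Coord`, `coord_less` and `CoordLess` are hypothetical names used for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Hypothetical stand-in for a latitude/longitude pair (decimal degrees).
struct Coord {
    double latitude;
    double longitude;
};

// Lexicographic comparison: latitude first, then longitude on ties.
// std::tie builds tuples of references, whose operator< already
// implements exactly this ordering.
inline bool coord_less(const Coord & a, const Coord & b)
{
    return std::tie(a.latitude, a.longitude) <
           std::tie(b.latitude, b.longitude);
}

// Comparator object suitable for use with std::set.
struct CoordLess {
    bool operator()(const Coord & a, const Coord & b) const
    {
        return coord_less(a, b);
    }
};

// Count the distinct coordinates among three inputs, using the ordering
// above to detect duplicates.
inline std::size_t count_distinct(const Coord & a, const Coord & b,
                                  const Coord & c)
{
    std::set<Coord, CoordLess> s;
    s.insert(a);
    s.insert(b);
    s.insert(c);
    assert(!s.empty());
    return s.size();
}
```

A comparison that falls through to the longitude test whenever the first latitude is not smaller would report both (1, 5) < (2, 0) and (2, 0) < (1, 5), which can corrupt the ordering inside a set.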
     158
     159/** A set of latitude-longitude coordinates.
     160 */
     161class XAPIAN_VISIBILITY_DEFAULT LatLongCoords {
     162    /// The coordinates.
     163    std::set<LatLongCoord> coords;
     164
     165  public:
     166    std::set<LatLongCoord>::const_iterator begin() const
     167    {
     168        return coords.begin();
     169    }
     170
     171    std::set<LatLongCoord>::const_iterator end() const
     172    {
     173        return coords.end();
     174    }
     175
     176    size_t size() const
     177    {
     178        return coords.size();
     179    }
     180
     181    bool empty() const
     182    {
     183        return coords.empty();
     184    }
     185
     186    void insert(const LatLongCoord & coord)
     187    {
     188        coords.insert(coord);
     189    }
     190
     191    void erase(const LatLongCoord & coord)
     192    {
     193        coords.erase(coord);
     194    }
     195
     196    /// Construct an empty set of coordinates.
     197    LatLongCoords() : coords() {}
     198
     199    /// Construct a set of coordinates containing one coordinate.
     200    LatLongCoords(const LatLongCoord & coord) : coords()
     201    {
     202        coords.insert(coord);
     203    }
     204
     205    /** Construct a set of coordinates by unserialising a string.
     206     *
     207     *  @param serialised the string to unserialise the coordinates from.
     208     *
     209     *  @exception Xapian::InvalidArgumentError if the string does not contain
     210     *  a valid serialised latitude-longitude pair, or contains junk at the end
     211     *  of it.
     212     */
     213    static LatLongCoords unserialise(const std::string & serialised);
     214
     215    /** Construct a set of coordinates by unserialising a string.
     216     *
     217     *  The string may NOT contain further data after the coordinates (the
     218     *  representation of the list of coordinates is not self-terminating).
     219     *
     220     *  @param ptr A pointer to the start of the string.
     221     *  @param end A pointer to the end of the string.
     222     *
     223     *  @exception Xapian::InvalidArgumentError if the string does not contain
     224     *  a valid serialised latitude-longitude pair, or contains junk at the end
     225     *  of it.
     226     */
     227    static LatLongCoords unserialise(const char * ptr, const char * end);
     228
     229    /** Return a serialised form of the coordinate list.
     230     */
     231    std::string serialise() const;
     232
     233    /// Return a string describing this object.
     234    std::string get_description() const;
     235};
     236
     237/** Base class for calculating distances between two lat/long coordinates.
     238 */
     239class XAPIAN_VISIBILITY_DEFAULT LatLongMetric {
     240  public:
     241    /// Destructor.
     242    virtual ~LatLongMetric();
     243
     244    /** Return the distance between two coordinates, in metres.
     245     */
     246    virtual double operator()(const LatLongCoord & a, const LatLongCoord &b) const = 0;
     247
     248    /** Return the distance between two coordinate lists, in metres.
     249     *
     250     *  The distance between the coordinate lists is defined to be the minimum
     251     *  pairwise distance between coordinates in the lists.
     252     *
     253     *  If either of the lists is empty, an InvalidArgumentError will be raised.
     254     */
     255    double operator()(const LatLongCoords & a, const LatLongCoords &b) const;
     256
     257    /** Clone the metric. */
     258    virtual LatLongMetric * clone() const = 0;
     259
     260    /** Return the full name of the metric.
     261     *
     262     *  This is used when serialising and unserialising metrics; for example,
     263     *  for performing remote searches.
     264     *
     265     *  If the subclass is in a C++ namespace, the namespace should be included
     266     *  in the name, using "::" as a separator.  For example, for a
     267     *  LatLongMetric subclass called "FooLatLongMetric" in the "Xapian"
     268     *  namespace the result of this call should be "Xapian::FooLatLongMetric".
     269     */
     270    virtual std::string name() const = 0;
     271
     272    /** Serialise object parameters into a string.
     273     *
     274     *  The serialised parameters should represent the configuration of the
     275     *  metric.
     276     */
     277    virtual std::string serialise() const = 0;
     278
     279    /** Create object given string serialisation returned by serialise().
     280     *
     281     *  @param s A serialised instance of this LatLongMetric subclass.
     282     */
     283    virtual LatLongMetric * unserialise(const std::string & s) const = 0;
     284};
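
The list-to-list distance described for `LatLongMetric` (the minimum pairwise distance, with an error when either list is empty) can be sketched independently of Xapian as below. This is an illustration only: `Point`, `flat_distance` and `min_pairwise_distance` are hypothetical names, the stand-in metric is plain Euclidean distance in degree space rather than a real geographic metric, and `std::invalid_argument` is used in place of `Xapian::InvalidArgumentError`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a latitude/longitude pair.
struct Point {
    double latitude;
    double longitude;
};

// Placeholder metric (an assumption for this sketch): Euclidean distance
// in degree space.  A real metric would return metres.
inline double flat_distance(const Point & a, const Point & b)
{
    return std::hypot(a.latitude - b.latitude, a.longitude - b.longitude);
}

// Minimum distance over all pairs (p, q) with p from a and q from b.
// Throws if either list is empty, since the minimum is then undefined.
template <typename Metric>
double min_pairwise_distance(const std::vector<Point> & a,
                             const std::vector<Point> & b,
                             Metric metric)
{
    if (a.empty() || b.empty())
        throw std::invalid_argument("Empty coordinate list");
    double best = std::numeric_limits<double>::infinity();
    for (const Point & p : a) {
        for (const Point & q : b) {
            best = std::min(best, metric(p, q));
        }
    }
    assert(best >= 0.0);
    return best;
}
```

The cost is proportional to the product of the two list sizes, which is reasonable when each document stores only a handful of coordinates.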
     285
     286/** Calculate the great-circle distance between two coordinates on a sphere.
     287 *
     288 *  This uses the haversine formula to calculate the distance.  Note that this
     289 *  formula is subject to inaccuracy due to numerical errors for coordinates on
     290 *  the opposite side of the sphere.
     291 *
     292 *  See http://en.wikipedia.org/wiki/Haversine_formula
     293 */
     294class XAPIAN_VISIBILITY_DEFAULT GreatCircleMetric : public LatLongMetric {
     295    /** The radius of the sphere in metres.
     296     */
     297    double radius;
     298
     299  public:
     300    /** Construct a GreatCircleMetric.
     301     *
     302     *  The (quadratic mean) radius of the earth will be used by this
     303     *  calculator.
     304     */
     305    GreatCircleMetric();
     306
     307    /** Construct a GreatCircleMetric using a specified radius.
     308     *
     309     *  @param radius_ The radius to use, in metres.
     310     */
     311    GreatCircleMetric(double radius_);
     312
     313    /** Return the great-circle distance between points on the sphere.
     314     */
     315    double operator()(const LatLongCoord & a, const LatLongCoord &b) const;
     316
     317    LatLongMetric * clone() const;
     318    std::string name() const;
     319    std::string serialise() const;
     320    LatLongMetric * unserialise(const std::string & s) const;
     321};
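
For reference, the haversine calculation named in the `GreatCircleMetric` comment can be sketched as a free function. This is not the patch's implementation: `LatLong` and `haversine_distance` are hypothetical names, and the default radius of 6372797.6 metres is an assumed value for the quadratic mean radius of the Earth, chosen because it is consistent with the bounds checked in the `latlongmetric1` test.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a latitude/longitude pair (decimal degrees).
struct LatLong {
    double latitude;
    double longitude;
};

inline double deg_to_rad(double degrees)
{
    return degrees * (3.14159265358979323846 / 180.0);
}

// Great-circle distance on a sphere of the given radius (metres), using
// the haversine formula:
//   h = sin^2(dlat / 2) + cos(lat1) * cos(lat2) * sin^2(dlon / 2)
//   d = 2 * radius * asin(sqrt(h))
// The default radius is an assumption (see the text above).
inline double haversine_distance(const LatLong & a, const LatLong & b,
                                 double radius = 6372797.6)
{
    assert(radius > 0.0);
    const double lat1 = deg_to_rad(a.latitude);
    const double lat2 = deg_to_rad(b.latitude);
    const double sin_dlat = std::sin((lat2 - lat1) / 2.0);
    const double sin_dlon =
        std::sin(deg_to_rad(b.longitude - a.longitude) / 2.0);
    const double h = sin_dlat * sin_dlat +
                     std::cos(lat1) * std::cos(lat2) * sin_dlon * sin_dlon;
    // Clamp to 1 so rounding error cannot push sqrt(h) outside asin's domain.
    return 2.0 * radius * std::asin(std::min(1.0, std::sqrt(h)));
}
```

With these assumptions, one degree of latitude along a meridian is the radius multiplied by pi / 180, which is the quantity the `latlongmetric1` test bounds for both radii it uses.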
     322
     323/** Posting source which returns a weight based on geospatial distance.
     324 *
     325 *  Results are weighted by the distance from a fixed point, or list of points,
     326 *  calculated according to the metric supplied.  If multiple points are
     327 *  supplied (either in the constructor, or in the coordinates stored in a
     328 *  document), the closest pointwise distance is returned.
     329 *
     330 *  Documents further away than a specified maximum range (or with no location
     331 *  stored in the specified slot) will not be returned.
     332 *
     333 *  The weight returned will be computed from the distance using the formula:
     334 *  k1 * (distance + k1) ** (- k2)
     335 *
     336 *  (Here k1 and k2 are strictly positive floating-point constants, which
     337 *  default to 1000 and 1 respectively.  Distance is measured in metres, so
     338 *  with the defaults a document at the centre gets a weight of 1.0, one 1km
     339 *  away gets a weight of 0.5, one 3km away gets a weight of 0.25, and so
     340 *  on.)
     341 */
     342class XAPIAN_VISIBILITY_DEFAULT LatLongDistancePostingSource : public ValuePostingSource
     343{
     344    /// Current distance from centre.
     345    double dist;
     346
     347    /// Centre, to compute distance from.
     348    LatLongCoords centre;
     349
     350    /// Metric to compute the distance with.
     351    const LatLongMetric * metric;
     352
     353    /// Maximum range to allow.  If set to 0, there is no maximum range.
     354    double max_range;
     355
     356    /// Constant used in weighting function.
     357    double k1;
     358
     359    /// Constant used in weighting function.
     360    double k2;
     361
     362    /** Calculate the distance for the current document.
     363     *
     364     *  Returns true if the distance was calculated ok, or false if the
     365     *  document didn't contain a valid serialised set of coordinates in the
     366     *  appropriate value slot.
     367     */
     368    void calc_distance();
     369
     370    /// Internal constructor; used by clone() and serialise().
     371    LatLongDistancePostingSource(Xapian::valueno slot_,
     372                                 const LatLongCoords & centre_,
     373                                 const LatLongMetric * metric_,
     374                                 double max_range_,
     375                                 double k1_,
     376                                 double k2_);
     377
     378  public:
     379    /** Construct a new posting source which returns only documents within
     380     *  range of one of the central coordinates.
     381     *
     383     *  @param slot_ The value slot to read values from.
     384     *  @param centre_ The centre point to use for distance calculations.
     385     *  @param metric_ The metric to use for distance calculations.
     386     *  @param max_range_ The maximum distance for documents which are returned.
     387     *  @param k1_ The k1 constant to use in the weighting function.
     388     *  @param k2_ The k2 constant to use in the weighting function.
     389     */
     390    LatLongDistancePostingSource(Xapian::valueno slot_,
     391                                 const LatLongCoords & centre_,
     392                                 const LatLongMetric & metric_,
     393                                 double max_range_ = 0.0,
     394                                 double k1_ = 1000.0,
     395                                 double k2_ = 1.0);
     396    ~LatLongDistancePostingSource();
     397
     398    void next(Xapian::weight min_wt);
     399    void skip_to(Xapian::docid min_docid, Xapian::weight min_wt);
     400    bool check(Xapian::docid min_docid, Xapian::weight min_wt);
     401
     402    Xapian::weight get_weight() const;
     403    LatLongDistancePostingSource * clone() const;
     404    std::string name() const;
     405    std::string serialise() const;
     406    LatLongDistancePostingSource *
     407            unserialise_with_registry(const std::string &s,
     408                                      const Registry & registry) const;
     409    void init(const Database & db_);
     410
     411    std::string get_description() const;
     412};
     413
     414/** KeyMaker subclass which sorts by distance from a latitude/longitude.
     415 *
     416 *  Results are ordered by the distance from a fixed point, or list of points,
     417 *  calculated according to the metric supplied.  If multiple points are
     418 *  supplied (either in the constructor, or in the coordinates stored in a
     419 *  document), the closest pointwise distance is returned.
     420 *
     421 *  If a document contains no value in the slot, a default distance is used.
     422 */
     423class XAPIAN_VISIBILITY_DEFAULT LatLongDistanceKeyMaker : public KeyMaker {
     424
     425    /// The value slot to read.
     426    Xapian::valueno valno;
     427
     428    /// The centre point (or points) for distance calculation.
     429    LatLongCoords centre;
     430
     431    /// The metric to use when calculating distances.
     432    const LatLongMetric * metric;
     433
     434    /// The default key to return, for documents with no value stored.
     435    std::string defkey;
     436
     437  public:
     438    LatLongDistanceKeyMaker(Xapian::valueno valno_,
     439                            const LatLongCoords & centre_,
     440                            const LatLongMetric & metric_,
     441                            double defdistance = 10E10)
     442            : valno(valno_),
     443              centre(centre_),
     444              metric(metric_.clone()),
     445              defkey(sortable_serialise(defdistance))
     446    {}
     447
     448    LatLongDistanceKeyMaker(Xapian::valueno valno_,
     449                            const LatLongCoord & centre_,
     450                            const LatLongMetric & metric_,
     451                            double defdistance = 10E10)
     452            : valno(valno_),
     453              centre(),
     454              metric(metric_.clone()),
     455              defkey(sortable_serialise(defdistance))
     456    {
     457        centre.insert(centre_);
     458    }
     459
     460    ~LatLongDistanceKeyMaker();
     461
     462    std::string operator()(const Xapian::Document & doc) const;
     463};
     464
     465}
     466
     467#endif /* XAPIAN_INCLUDED_GEOSPATIAL_H */
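
The "closest pointwise distance" rule described for LatLongDistanceKeyMaker can be sketched in isolation. The block below is a minimal illustration, not the code from this patch: Coord, great_circle_metres and closest_distance are hypothetical names, the haversine formula stands in for whatever GreatCircleMetric actually computes, the Earth radius is an assumed mean value, and falling back to the default when the list of centres is empty is also an assumption (the patch only mentions documents with no value stored).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for a latitude/longitude pair, in degrees.
struct Coord {
    double lat;
    double lon;
};

// Great-circle distance in metres on a sphere, using the haversine formula.
// The radius is an assumed mean Earth radius, not a value taken from the patch.
double great_circle_metres(const Coord & a, const Coord & b)
{
    const double pi = std::acos(-1.0);
    const double radius = 6371000.0;
    const double to_rad = pi / 180.0;
    const double s1 = std::sin((b.lat - a.lat) * to_rad / 2);
    const double s2 = std::sin((b.lon - a.lon) * to_rad / 2);
    const double h = s1 * s1 +
        std::cos(a.lat * to_rad) * std::cos(b.lat * to_rad) * s2 * s2;
    // Clamp to guard against rounding pushing h slightly above 1.
    return 2 * radius * std::asin(std::sqrt(std::min(1.0, h)));
}

// Smallest distance over every (centre, document coordinate) pair.
// Returns defdistance when either list is empty (assumed behaviour).
double closest_distance(const std::vector<Coord> & centres,
                        const std::vector<Coord> & doc_coords,
                        double defdistance = 10E10)
{
    if (centres.empty() || doc_coords.empty()) return defdistance;
    double best = std::numeric_limits<double>::infinity();
    for (const Coord & c : centres) {
        for (const Coord & d : doc_coords) {
            best = std::min(best, great_circle_metres(c, d));
        }
    }
    return best;
}
```

A sort key would then be derived from the returned distance, in the same way that the constructors above pass defdistance through sortable_serialise().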
    Property changes on: xapian-core/include/xapian/geospatial.h
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
  • xapian-core/include/Makefile.mk

     
    3434        include/xapian/unicode.h\
    3535        include/xapian/valueiterator.h\
    3636        include/xapian/valuesetmatchdecider.h\
     37        include/xapian/geospatial.h\
    3738        include/xapian/visibility.h\
    3839        include/xapian/weight.h
    3940
  • xapian-core/include/xapian.h

     
    6262// Unicode support
    6363#include <xapian/unicode.h>
    6464
     65// Geospatial
     66#include <xapian/geospatial.h>
     67
    6568// ELF visibility annotations for GCC.
    6669#include <xapian/visibility.h>
    6770
  • xapian-core/common/registryinternal.h

     
    3232    class Weight;
    3333    class PostingSource;
    3434    class MatchSpy;
     35    class LatLongMetric;
    3536}
    3637
    3738class Xapian::Registry::Internal : public Xapian::Internal::RefCntBase {
     
    4647    /// Registered match spies.
    4748    std::map<std::string, Xapian::MatchSpy *> matchspies;
    4849
     50    /// Registered lat-long metrics.
     51    std::map<std::string, Xapian::LatLongMetric *> lat_long_metrics;
     52
    4953    /// Add the standard subclasses provided in the API.
    5054    void add_defaults();
    5155
     
    5862    /// Clear all registered match spies.
    5963    void clear_match_spies();
    6064
     65    /// Clear all registered lat-long metrics.
     66    void clear_lat_long_metrics();
     67
    6168  public:
    6269    Internal();
    6370    ~Internal();
  • xapian-core/common/output.h

     
    66 * Copyright 2002 Ananova Ltd
    77 * Copyright 2002,2003,2004,2007,2009 Olly Betts
    88 * Copyright 2007 Lemur Consulting Ltd
     9 * Copyright 2010 Richard Boulton
    910 *
    1011 * This program is free software; you can redistribute it and/or
    1112 * modify it under the terms of the GNU General Public License as
     
    6263XAPIAN_OUTPUT_FUNCTION(Xapian::ESet)
    6364XAPIAN_OUTPUT_FUNCTION(Xapian::Enquire)
    6465
     66#include <xapian/geospatial.h>
     67XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoord)
     68XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoords)
     69
    6570#include <xapian/stem.h>
    6671XAPIAN_OUTPUT_FUNCTION(Xapian::Stem)
    6772
  • xapian-core/api/postingsource.cc

     
    102102    throw Xapian::UnimplementedError("unserialise() not supported for this PostingSource");
    103103}
    104104
     105PostingSource *
     106PostingSource::unserialise_with_registry(const std::string &s,
     107                                         const Registry &) const
     108{
     109    return unserialise(s);
     110}
     111
    105112string
    106113PostingSource::get_description() const
    107114{
  • xapian-core/api/registry.cc

     
    2424#include "xapian/registry.h"
    2525
    2626#include "xapian/error.h"
     27#include "xapian/geospatial.h"
    2728#include "xapian/matchspy.h"
    2829#include "xapian/postingsource.h"
    2930#include "xapian/weight.h"
     
    154155    RETURN(lookup_object(internal->matchspies, name));
    155156}
    156157
     158void
     159Registry::register_lat_long_metric(const Xapian::LatLongMetric &metric)
     160{
     161    LOGCALL_VOID(API, "Xapian::Registry::register_lat_long_metric", metric.name());
     162    register_object(internal->lat_long_metrics, metric);
     163}
    157164
     165const Xapian::LatLongMetric *
     166Registry::get_lat_long_metric(const string & name) const
     167{
     168    LOGCALL(API, const Xapian::LatLongMetric *, "Xapian::Registry::get_lat_long_metric", name);
     169    RETURN(lookup_object(internal->lat_long_metrics, name));
     170}
     171
    158172Registry::Internal::Internal()
    159173        : Xapian::Internal::RefCntBase(),
    160174          wtschemes(),
    161           postingsources()
     175          postingsources(),
     176          lat_long_metrics()
    162177{
    163178    add_defaults();
    164179}
     
    168183    clear_weighting_schemes();
    169184    clear_posting_sources();
    170185    clear_match_spies();
     186    clear_lat_long_metrics();
    171187}
    172188
    173189void
     
    190206    postingsources[source->name()] = source;
    191207    source = new Xapian::FixedWeightPostingSource(0.0);
    192208    postingsources[source->name()] = source;
     209    source = new Xapian::LatLongDistancePostingSource(0,
     210        Xapian::LatLongCoords(),
     211        Xapian::GreatCircleMetric());
     212    postingsources[source->name()] = source;
    193213
    194214    Xapian::MatchSpy * spy;
    195215    spy = new Xapian::ValueCountMatchSpy();
    196216    matchspies[spy->name()] = spy;
     217
     218    Xapian::LatLongMetric * metric;
     219    metric = new Xapian::GreatCircleMetric();
     220    lat_long_metrics[metric->name()] = metric;
    197221}
    198222
    199223void
     
    223247    }
    224248}
    225249
     250void
     251Registry::Internal::clear_lat_long_metrics()
     252{
     253    map<string, Xapian::LatLongMetric *>::const_iterator i;
     254    for (i = lat_long_metrics.begin(); i != lat_long_metrics.end(); ++i) {
     255        delete i->second;
     256    }
    226257}
     258
     259}
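
The name-keyed registration used for lat-long metrics above can be sketched on its own. This is a rough, self-contained illustration under stated assumptions: Metric, FlatMetric and MetricRegistry are hypothetical names, the registry is assumed to store a clone() of the object passed in (the patch does not show what register_object does), and std::unique_ptr replaces the hand-written delete loop of clear_lat_long_metrics().

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a metric base class offering clone() and name().
class Metric {
  public:
    virtual ~Metric() {}
    virtual Metric * clone() const = 0;
    virtual std::string name() const = 0;
};

// Trivial concrete metric, present only so the sketch can be exercised.
class FlatMetric : public Metric {
  public:
    Metric * clone() const override { return new FlatMetric(*this); }
    std::string name() const override { return "FlatMetric"; }
};

// Owns one prototype object per name; entries are freed automatically
// when the registry is destroyed.
class MetricRegistry {
    std::map<std::string, std::unique_ptr<Metric>> metrics;

  public:
    // Store a private copy, replacing any earlier entry with the same name.
    void register_metric(const Metric & metric) {
        std::unique_ptr<Metric> copy(metric.clone());
        std::string key = copy->name();
        metrics[key] = std::move(copy);
    }

    // Return the registered prototype, or nullptr if the name is unknown.
    const Metric * get_metric(const std::string & name) const {
        auto it = metrics.find(name);
        return it == metrics.end() ? nullptr : it->second.get();
    }
};
```

Keying on name() is what would let a serialised object that refers to a metric by name be resolved again through a registry, which is the role unserialise_with_registry() appears to play in this patch.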
  • xapian-core/Makefile.am

     
    140140include common/Makefile.mk
    141141include examples/Makefile.mk
    142142include expand/Makefile.mk
     143include geospatial/Makefile.mk
    143144include include/Makefile.mk
    144145include languages/Makefile.mk
    145146include matcher/Makefile.mk
  • xapian-bindings/xapian.i

     
    790790
    791791%include <xapian/valuesetmatchdecider.h>
    792792
     793%ignore Xapian::LatLongCoord::operator< const;
     794%include <xapian/geospatial.h>
     795
    793796namespace Xapian {
    794797
    795798#if defined SWIGPYTHON