Ticket #481: geospatial_clean2.patch

File geospatial_clean2.patch, 60.5 KB (added by Richard Boulton, 14 years ago)

Updated patch

  • xapian-core/docs/geospatial.rst

     
     1.. Copyright (C) 2008 Lemur Consulting Ltd
     2
     3================================
     4Geospatial searching with Xapian
     5================================
     6
     7.. contents:: Table of contents
     8
     9Introduction
     10============
     11
     12This document describes a set of features present in Xapian which are designed
     13to allow geospatial searches to be supported.  Currently, the geospatial
     14support allows sets of locations to be stored associated with each document, as
     15latitude/longitude coordinates, and allows searches to be restricted or
     16reordered on the basis of distance from a second set of locations.
     17
     18Three types of geospatial searches are supported:
     19
     20 - Returning a list of documents in order of distance from a query location.
     21   This may be used in conjunction with any Xapian query.
     22
     23 - Returning a list of documents within a given distance of a query location.
     24   This may be used in conjunction with any other Xapian query, and with any
     25   Xapian sort order.
     26
     27 - Returning a set of documents in a combined order based on distance from a
     28   query location, and relevance.
     29
     30Locations are stored in value slots, allowing multiple independent locations to
     31be used for a single document.  It is also possible to store multiple
     32coordinates in a single value slot, in which case the closest coordinate will
     33be used for distance calculations.
     34
     35Metrics
     36=======
     37
     38A metric is a function which calculates the distance between two points.
     39
     40Calculating the exact distance between two geographical points is an involved
     41subject.  In fact, even defining the meaning of a geographical point is very
     42hard to do precisely - not only do you need to define a mathematical projection
     43used to calculate the coordinates, you also need to choose a model of the shape
     44of the earth, and choose a few reference points to fix the coordinates of
     45particular locations.  Since the earth is constantly changing shape, these
     46coordinates also need to be defined at a particular date.
     47
     48There are a few standard datums which define all these - a very common datum is
     49the WGS84 datum, which is the datum used by the GPS system.  Unless you have a
     50good reason not to, we recommend using the WGS84 datum, since this will ensure
     51that preset parameters of the functions built in to Xapian will have the
     52correct values (currently, the only such parameter is the earth radius used by
     53the GreatCircleMetric, but more may be added in future).
     54
     55Since there are lots of ways of calculating distances between two points, using
     56different assumptions about the approximations which are valid, Xapian allows
     57user-implemented metrics.  These are subclasses of the Xapian::LatLongMetric
     58class; see the API documentation for details on how to implement the various
     59required methods.
     60
     61There is currently only one built-in metric - the GreatCircleMetric.  As the
     62name suggests, this calculates the distance between two latitude/longitude
     63points on the assumption that the world is a perfect sphere.  The radius of the
     64world can be specified as a constructor parameter, but defaults to a reasonable
     65approximation of the radius of the Earth.  The calculation uses the Haversine
     66formula, which is accurate for points which are close together, but can have
     67significant error for coordinates which are on opposite sides of the sphere: on
     68the other hand, such points are likely to be at the end of a ranked list of
     69search results, so this probably doesn't matter.
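
As a rough illustration of the Haversine calculation described above, here is a standalone sketch that computes a great-circle distance on a perfect sphere without using any Xapian classes.  The names ``haversine_metres`` and ``EARTH_RADIUS_METRES`` are hypothetical, chosen only for this example; the radius value is the quadratic mean earth radius that this patch uses as the GreatCircleMetric default.

```cpp
#include <algorithm>
#include <cmath>

// Assumed radius: quadratic mean radius of the earth in metres, matching
// the default used by the patch's GreatCircleMetric.
const double EARTH_RADIUS_METRES = 6372797.6;

// Hypothetical helper: great-circle distance in metres between two points
// given in degrees, using the Haversine formula on a perfect sphere.
double haversine_metres(double lat_a, double long_a,
                        double lat_b, double long_b,
                        double radius = EARTH_RADIUS_METRES)
{
    const double deg_to_rad = std::acos(-1.0) / 180.0;
    double lata = lat_a * deg_to_rad;
    double latb = lat_b * deg_to_rad;
    double sin_half_lat = std::sin((lata - latb) / 2);
    double sin_half_long = std::sin((long_a - long_b) * deg_to_rad / 2);
    double h = sin_half_lat * sin_half_lat +
               sin_half_long * sin_half_long * std::cos(lata) * std::cos(latb);
    // Clamp so that rounding error cannot push sqrt(h) above 1.
    double sqrt_h = std::min(1.0, std::sqrt(h));
    return 2 * radius * std::asin(sqrt_h);
}
```

Two points a quarter of the way around the equator, for example (0, 0) and (0, 90), should come out at roughly a quarter of the sphere's circumference.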
     70
     71Indexing
     72========
     73
     74To index a set of documents with locations, you need to store serialised
     75latitude-longitude coordinates in a value slot in your documents.  To do this,
     76use the LatLongCoord class.  For example, this is how you might store a
     77latitude and longitude corresponding to "London" in value slot 0::
     78
     79  Xapian::Document doc;
     80  doc.add_value(0, Xapian::LatLongCoord(51.53, 0.08).serialise());
     81
     82Of course, often a location is a bit more complicated than a single point - for
     83example, postcode regions in the UK can cover a fairly wide area.  If a search
     84were to treat such a location as a single point, the distances returned could
     85be incorrect by as much as a couple of miles.  Xapian therefore allows you to
     86store a set of points in a single slot - the distance calculation will return
     87the distance to the closest of these points.  This is often a good enough
     88workaround for this problem - if you require greater accuracy, you will need to
     89filter the results after they are returned from Xapian.
     90
     91To store multiple coordinates in a single slot, use the LatLongCoords class::
     92
     93  Xapian::Document doc;
     94  Xapian::LatLongCoords coords;
     95  coords.insert(Xapian::LatLongCoord(51.53, 0.08));
     96  coords.insert(Xapian::LatLongCoord(51.51, 0.07));
     97  coords.insert(Xapian::LatLongCoord(51.52, 0.09));
     98  doc.add_value(0, coords.serialise());
     99
     100(Note that the serialised form of a LatLongCoords object containing a single
     101coordinate is exactly the same as the serialised form of the corresponding
     102LatLongCoord object.)
     103
     104Searching
     105=========
     106
     107Sorting results by distance
     108---------------------------
     109
     110If you simply want your results to be returned in order of distance, you can
     111use the LatLongDistanceKeyMaker class to calculate sort keys.  For example, to
     112return results in order of distance from the coordinate (51.00, 0.50), based on
     113the values stored in slot 0, and using the great-circle distance::
     114
     115  Xapian::Database db("my_database");
     116  Xapian::Enquire enq(db);
     117  enq.set_query(Xapian::Query("my_query"));
     118  Xapian::GreatCircleMetric metric;
     119  Xapian::LatLongCoord centre(51.00, 0.50);
     120  Xapian::LatLongDistanceKeyMaker keymaker(0, centre, metric);
     121  enq.set_sort_by_key(&keymaker, false);
     122
     123Filtering results by distance
     124-----------------------------
     125
     126To return only those results within a given distance, you can use the
     127LatLongDistancePostingSource.  For example, to return only those results within
     1285 miles of coordinate (51.00, 0.50), based on the values stored in slot 0, and
     129using the great-circle distance::
     130
     131  Xapian::Database db("my_database");
     132  Xapian::Enquire enq(db);
     133  Xapian::Query q("my_query");
     134  Xapian::GreatCircleMetric metric;
     135  Xapian::LatLongCoord centre(51.00, 0.50);
     136  double max_range = Xapian::miles_to_metres(5);
     137  Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
     138  q = Xapian::Query(Xapian::Query::OP_FILTER, q, Xapian::Query(&ps));
     139  enq.set_query(q);
     140
     141Ranking results on a combination of distance and relevance
     142----------------------------------------------------------
     143
     144To return results ranked by a combination of their relevance and their
     145distance, you can also use the LatLongDistancePostingSource.  Beware that
     146getting the right balance of weights is tricky: there is little solid
     147theoretical basis for this, so the best approach is often to try various
     148different parameters, evaluate the results, and settle on the best.  The
     149LatLongDistancePostingSource returns its highest weight (1.0 when k2 is 1) for a
     150document which is at the specified location, and a lower, but always positive,
     151weight for points further away.  It has two parameters, k1 and k2, which control
     152how fast the weight decays, which can be specified to the constructor (but aren't
     153in this example) - see the API documentation for details of these parameters::
     154
     155  Xapian::Database db("my_database");
     156  Xapian::Enquire enq(db);
     157  Xapian::Query q("my_query");
     158  Xapian::GreatCircleMetric metric;
     159  Xapian::LatLongCoord centre(51.00, 0.50);
     160  double max_range = Xapian::miles_to_metres(5);
     161  Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
     162  q = Xapian::Query(Xapian::Query::OP_AND, q, Xapian::Query(&ps));
     163  enq.set_query(q);
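
To get a feel for how k1 and k2 shape the distance weight, the decay function can be tried on its own.  The sketch below copies the formula from ``weight_from_distance()`` in latlong_posting_source.cc in this patch; the parameter values mentioned afterwards (k1 = 1000, k2 = 1) are illustrative assumptions, not necessarily the constructor defaults.

```cpp
#include <cmath>

// Same formula as weight_from_distance() in latlong_posting_source.cc.
// dist is the distance from the centre; k1 and k2 must both be greater
// than 0 (the posting source's constructors reject other values).
double weight_from_distance(double dist, double k1, double k2)
{
    return k1 * std::pow(dist + k1, -k2);
}
```

With k2 = 1 the weight reduces to k1 / (dist + k1), so it is 1.0 at the centre and falls to 0.5 at a distance of k1.  For other values of k2 the weight at the centre is k1 raised to the power (1 - k2), which is worth bearing in mind when balancing this weight against relevance.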
     164
     165
     166Performance
     167===========
     168
     169The location information associated with each document is stored in a document
     170value.  This allows it to be looked up quickly at search time, so that the
     171exact distance from the query location can be calculated.  However, this method
     172requires that the distance of each potential match is checked, which can be
     173expensive.
     174
     175Some experimental code exists to produce terms corresponding to a hierarchical
     176index of locations (using the O-QTM algorithm - see references below), which
     177can be used to narrow down the search so that only a small number of potential
     178matches need to be checked.  Contact the Xapian developers (by email or on IRC) if
     179you would like to help finish and test this code.
     180
     181It is entirely possible that a more efficient implementation could be built
     182using "R trees" or "KD trees" (or one of the many other tree structures used
     183for geospatial indexing - see http://en.wikipedia.org/wiki/Spatial_index for a
     184list of some of these).  However, using the QTM approach will require minimal
     185effort and make use of the existing, and well tested, Xapian database.
     186Additionally, by simply generating special terms to restrict the search, the
     187existing optimisations of the Xapian query parser are taken advantage of.
     188
     189References
     190==========
     191
     192The O-QTM algorithm is described in "Dutton, G. (1996). Encoding and handling
     193geospatial data with hierarchical triangular meshes. In Kraak, M.J. and
     194Molenaar, M. (eds.)  Advances in GIS Research II. London: Taylor & Francis,
     195505-518.", a copy of which is available from
     196http://www.spatial-effects.com/papers/conf/GDutton_SDH96.pdf
     197
     198Some of the geometry needed to calculate the correct set of QTM IDs to cover a
     199particular region is detailed in
     200ftp://ftp.research.microsoft.com/pub/tr/tr-2005-123.pdf
     201
     202Also, see:
     203http://www.sdss.jhu.edu/htm/doc/c++/htmInterface.html
  • xapian-core/geospatial/Makefile.mk

     
     1EXTRA_DIST += \
     2        geospatial/dir_contents \
     3        geospatial/Makefile
     4
     5lib_src += \
     6        geospatial/latlongcoord.cc \
     7        geospatial/latlong_distance_keymaker.cc \
     8        geospatial/latlong_metrics.cc \
     9        geospatial/latlong_posting_source.cc
  • xapian-core/geospatial/latlong_distance_keymaker.cc

     
     1/** \file latlong_distance_keymaker.cc
     2 * \brief LatLongDistanceKeyMaker implementation.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/document.h"
     26#include "xapian/queryparser.h" // For sortable_serialise.
     27
     28using namespace Xapian;
     29using namespace std;
     30
     31string
     32LatLongDistanceKeyMaker::operator()(const Document &doc) const
     33{
     34    string val(doc.get_value(valno));
     35    LatLongCoords doccoords = LatLongCoords::unserialise(val);
     36    if (doccoords.empty()) {
     37        return defkey;
     38    }
     39    double distance = (*metric)(centre, doccoords);
     40    return sortable_serialise(distance);
     41}
     42
     43LatLongDistanceKeyMaker::~LatLongDistanceKeyMaker()
     44{
     45    delete metric;
     46}
  • xapian-core/geospatial/latlong_posting_source.cc

    Property changes on: xapian-core/geospatial/latlong_distance_keymaker.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1/** @file latlong_posting_source.cc
     2 * @brief LatLongDistancePostingSource implementation.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#include <config.h>
     24
     25#include "xapian/geospatial.h"
     26
     27#include "xapian/document.h"
     28#include "xapian/error.h"
     29#include "xapian/registry.h"
     30
     31#include "serialise.h"
     32#include "serialise-double.h"
     33#include "str.h"
     34
     35#include <cmath>
     36
     37using namespace Xapian;
     38using namespace std;
     39
     40static double
     41weight_from_distance(double dist, double k1, double k2)
     42{
     43    return k1 * pow(dist + k1, -k2);
     44}
     45
     46void
     47LatLongDistancePostingSource::calc_distance()
     48{
     49    string val(*value_it);
     50    LatLongCoords coords = LatLongCoords::unserialise(val);
     51    dist = (*metric)(centre, coords);
     52}
     53
     54LatLongDistancePostingSource::LatLongDistancePostingSource(
     55        valueno slot_,
     56        const LatLongCoords & centre_,
     57        const LatLongMetric * metric_,
     58        double max_range_,
     59        double k1_,
     60        double k2_)
     61        : ValuePostingSource(slot_),
     62          centre(centre_),
     63          metric(metric_),
     64          max_range(max_range_),
     65          k1(k1_),
     66          k2(k2_)
     67{
     68    if (k1 <= 0)
     69        throw InvalidArgumentError(
     70            "k1 parameter to LatLongDistancePostingSource must be greater "
     71            "than 0; was " + str(k1));
     72    if (k2 <= 0)
     73        throw InvalidArgumentError(
     74            "k2 parameter to LatLongDistancePostingSource must be greater "
     75            "than 0; was " + str(k2));
     76    set_maxweight(weight_from_distance(0, k1, k2));
     77}
     78
     79LatLongDistancePostingSource::LatLongDistancePostingSource(
     80        valueno slot_,
     81        const LatLongCoords & centre_,
     82        const LatLongMetric & metric_,
     83        double max_range_,
     84        double k1_,
     85        double k2_)
     86        : ValuePostingSource(slot_),
     87          centre(centre_),
     88          metric(metric_.clone()),
     89          max_range(max_range_),
     90          k1(k1_),
     91          k2(k2_)
     92{
     93    if (k1 <= 0)
     94        throw InvalidArgumentError(
     95            "k1 parameter to LatLongDistancePostingSource must be greater "
     96            "than 0; was " + str(k1));
     97    if (k2 <= 0)
     98        throw InvalidArgumentError(
     99            "k2 parameter to LatLongDistancePostingSource must be greater "
     100            "than 0; was " + str(k2));
     101    set_maxweight(weight_from_distance(0, k1, k2));
     102}
     103
     104LatLongDistancePostingSource::~LatLongDistancePostingSource()
     105{
     106    delete metric;
     107}
     108
     109void
     110LatLongDistancePostingSource::next(weight min_wt)
     111{
     112    ValuePostingSource::next(min_wt);
     113
     114    while (value_it != db.valuestream_end(slot)) {
     115        calc_distance();
     116        if (max_range == 0 || dist <= max_range)
     117            break;
     118        ++value_it;
     119    }
     120}
     121
     122void
     123LatLongDistancePostingSource::skip_to(docid min_docid,
     124                                      weight min_wt)
     125{
     126    ValuePostingSource::skip_to(min_docid, min_wt);
     127
     128    while (value_it != db.valuestream_end(slot)) {
     129        calc_distance();
     130        if (max_range == 0 || dist <= max_range)
     131            break;
     132        ++value_it;
     133    }
     134}
     135
     136bool
     137LatLongDistancePostingSource::check(docid min_docid,
     138                                    weight min_wt)
     139{
     140    if (!ValuePostingSource::check(min_docid, min_wt)) {
     141        // check returned false, so we know the document is not in the source.
     142        return false;
     143    }
     144    if (value_it == db.valuestream_end(slot)) {
     145        // return true, since we're definitely at the end of the list.
     146        return true;
     147    }
     148
     149    calc_distance();
     150    if (max_range > 0 && dist > max_range) {
     151        return false;
     152    }
     153    return true;
     154}
     155
     156weight
     157LatLongDistancePostingSource::get_weight() const
     158{
     159    return weight_from_distance(dist, k1, k2);
     160}
     161
     162LatLongDistancePostingSource *
     163LatLongDistancePostingSource::clone() const
     164{
     165    return new LatLongDistancePostingSource(slot, centre,
     166                                            metric->clone(),
     167                                            max_range, k1, k2);
     168}
     169
     170string
     171LatLongDistancePostingSource::name() const
     172{
     173    return string("Xapian::LatLongDistancePostingSource");
     174}
     175
     176string
     177LatLongDistancePostingSource::serialise() const
     178{
     179    string serialised_centre = centre.serialise();
     180    string metric_name = metric->name();
     181    string serialised_metric = metric->serialise();
     182
     183    string result = encode_length(slot);
     184    result += encode_length(serialised_centre.size());
     185    result += serialised_centre;
     186    result += encode_length(metric_name.size());
     187    result += metric_name;
     188    result += encode_length(serialised_metric.size());
     189    result += serialised_metric;
     190    result += serialise_double(max_range);
     191    result += serialise_double(k1);
     192    result += serialise_double(k2);
     193    return result;
     194}
     195
     196LatLongDistancePostingSource *
     197LatLongDistancePostingSource::unserialise_with_registry(const string &s,
     198                                             const Registry & registry) const
     199{
     200    const char * p = s.data();
     201    const char * end = p + s.size();
     202
     203    valueno new_slot = decode_length(&p, end, false);
     204    size_t len = decode_length(&p, end, true);
     205    string new_serialised_centre(p, len);
     206    p += len;
     207    len = decode_length(&p, end, true);
     208    string new_metric_name(p, len);
     209    p += len;
     210    len = decode_length(&p, end, true);
     211    string new_serialised_metric(p, len);
     212    p += len;
     213    double new_max_range = unserialise_double(&p, end);
     214    double new_k1 = unserialise_double(&p, end);
     215    double new_k2 = unserialise_double(&p, end);
     216    if (p != end) {
     217        throw NetworkError("Bad serialised LatLongDistancePostingSource - junk at end");
     218    }
     219
     220    LatLongCoords new_centre =
     221            LatLongCoords::unserialise(new_serialised_centre);
     222
     223    const Xapian::LatLongMetric * metric_type =
     224            registry.get_lat_long_metric(new_metric_name);
     225    if (metric_type == NULL) {
     226        throw InvalidArgumentError("LatLongMetric " + new_metric_name +
     227                                   " not registered");
     228    }
     229    LatLongMetric * new_metric =
     230            metric_type->unserialise(new_serialised_metric);
     231
     232    return new LatLongDistancePostingSource(new_slot, new_centre,
     233                                            new_metric,
     234                                            new_max_range, new_k1, new_k2);
     235}
     236
     237void
     238LatLongDistancePostingSource::init(const Database & db_)
     239{
     240    ValuePostingSource::init(db_);
     241    if (max_range > 0.0) {
     242        // Possible that no documents are in range.
     243        termfreq_min = 0;
     244        // Note - would be good to improve termfreq_est here, too, but
     245        // I can't think of anything we can do with the information
     246        // available.
     247    }
     248}
     249
     250string
     251LatLongDistancePostingSource::get_description() const
     252{
     253    return "Xapian::LatLongDistancePostingSource(slot=" + str(slot) + ")";
     254}
  • xapian-core/geospatial/latlong_metrics.cc

    Property changes on: xapian-core/geospatial/latlong_posting_source.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1/** \file latlong_metrics.cc
     2 * \brief Geospatial distance metrics.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/error.h"
     26#include "serialise-double.h"
     27
     28#include <cmath>
     29
     30using namespace Xapian;
     31using namespace std;
     32
     33/** Quadratic mean radius of the earth in metres.
     34 */
     35#define QUAD_EARTH_RADIUS_METRES 6372797.6
     36
     37/** Set M_PI if it's not already set.
     38 */
     39#ifndef M_PI
     40#define M_PI 3.14159265358979323846
     41#endif
     42
     43LatLongMetric::~LatLongMetric()
     44{
     45}
     46
     47double
     48LatLongMetric::operator()(const LatLongCoords & a, const LatLongCoords &b) const
     49{
     50    if (a.empty() || b.empty()) {
     51        throw InvalidArgumentError("Empty coordinate list supplied to LatLongMetric::operator()().");
     52    }
     53    double min_dist = 0.0;
     54    bool have_min = false;
     55    for (set<LatLongCoord>::const_iterator a_iter = a.begin();
     56         a_iter != a.end();
     57         ++a_iter)
     58    {
     59        for (set<LatLongCoord>::const_iterator b_iter = b.begin();
     60             b_iter != b.end();
     61             ++b_iter)
     62        {
     63            double dist = operator()(*a_iter, *b_iter);
     64            if (!have_min) {
     65                min_dist = dist;
     66                have_min = true;
     67            } else if (dist < min_dist) {
     68                min_dist = dist;
     69            }
     70        }
     71    }
     72    return min_dist;
     73}
     74
     75
     76GreatCircleMetric::GreatCircleMetric()
     77        : radius(QUAD_EARTH_RADIUS_METRES)
     78{}
     79
     80GreatCircleMetric::GreatCircleMetric(double radius_)
     81        : radius(radius_)
     82{}
     83
     84double
     85GreatCircleMetric::operator()(const LatLongCoord & a,
     86                              const LatLongCoord & b) const
     87{
     88    double lata = a.latitude * (M_PI / 180.0);
     89    double latb = b.latitude * (M_PI / 180.0);
     90
     91    double latdiff = lata - latb;
     92    double longdiff = (a.longitude - b.longitude) * (M_PI / 180.0);
     93
     94    double sin_half_lat = sin(latdiff / 2);
     95    double sin_half_long = sin(longdiff / 2);
     96    double h = sin_half_lat * sin_half_lat +
     97            sin_half_long * sin_half_long * cos(lata) * cos(latb);
     98    double sqrt_h = sqrt(h);
     99    if (sqrt_h > 1.0) sqrt_h = 1.0;
     100    return 2 * radius * asin(sqrt_h);
     101}
     102
     103LatLongMetric *
     104GreatCircleMetric::clone() const
     105{
     106    return new GreatCircleMetric(radius);
     107}
     108
     109string
     110GreatCircleMetric::name() const
     111{
     112    return "Xapian::GreatCircleMetric";
     113}
     114
     115string
     116GreatCircleMetric::serialise() const
     117{
     118    return serialise_double(radius);
     119}
     120
     121LatLongMetric *
     122GreatCircleMetric::unserialise(const string & s) const
     123{
     124    const char * p = s.data();
     125    const char * end = p + s.size();
     126
     127    double new_radius = unserialise_double(&p, end);
     128    if (p != end) {
     129        throw Xapian::NetworkError("Bad serialised GreatCircleMetric - junk at end");
     130    }
     131
     132    return new GreatCircleMetric(new_radius);
     133}
  • xapian-core/geospatial/dir_contents

    Property changes on: xapian-core/geospatial/latlong_metrics.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1<Directory>geospatial</Directory>
     2
     3<Description>
     4Support for geospatial matching, and parsing of locations.
     5</Description>
  • xapian-core/geospatial/latlongcoord.cc

     
     1/** \file latlongcoord.cc
     2 * \brief Latitude and longitude representations.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 *
     6 * This program is free software; you can redistribute it and/or
     7 * modify it under the terms of the GNU General Public License as
     8 * published by the Free Software Foundation; either version 2 of the
     9 * License, or (at your option) any later version.
     10 *
     11 * This program is distributed in the hope that it will be useful,
     12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     14 * GNU General Public License for more details.
     15 *
     16 * You should have received a copy of the GNU General Public License
     17 * along with this program; if not, write to the Free Software
     18 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     19 * USA
     20 */
     21
     22#include <config.h>
     23
     24#include "xapian/geospatial.h"
     25#include "xapian/error.h"
     26
     27#include "serialise.h"
     28#include "serialise-double.h"
     29#include "str.h"
     30
     31#include <cmath>
     32
     33using namespace Xapian;
     34using namespace std;
     35
     36LatLongCoord::LatLongCoord(double latitude_, double longitude_)
     37        : latitude(latitude_),
     38          longitude(longitude_)
     39{
     40    if (latitude < -90.0 || latitude > 90.0)
     41        throw InvalidArgumentError("Latitude out-of-range");
     42    longitude = fmod(longitude_, 360);
     43    if (longitude <= -180) longitude += 360;
     44    if (longitude > 180) longitude -= 360;
     45    if (longitude == -0.0) longitude = 0.0;
     46}
     47
     48LatLongCoord
     49LatLongCoord::unserialise(const string & serialised)
     50{
     51    const char * ptr = serialised.data();
     52    const char * end = ptr + serialised.size();
     53    LatLongCoord result = unserialise(&ptr, end);
     54    if (ptr != end)
     55        throw InvalidArgumentError(
     56                "Junk found at end of serialised LatLongCoord");
     57    return result;
     58}
     59
     60LatLongCoord
     61LatLongCoord::unserialise(const char ** ptr, const char * end)
     62{
     63    try {
     64        // This will raise NetworkError for invalid serialisations.
     65        double latitude = unserialise_double(ptr, end);
     66        double longitude = unserialise_double(ptr, end);
     67        return LatLongCoord(latitude, longitude);
     68    } catch (const NetworkError & e) {
     69        // FIXME - modify unserialise_double somehow so we don't have to catch
     70        // and rethrow the exceptions it raises.
     71        throw InvalidArgumentError(e.get_msg());
     72    }
     73}
     74
     75string
     76LatLongCoord::serialise() const
     77{
     78    string result(serialise_double(latitude));
     79    result += serialise_double(longitude);
     80    return result;
     81}
     82
     83string
     84LatLongCoord::get_description() const
     85{
     86    string res("Xapian::LatLongCoord(");
     87    res += str(latitude);
     88    res += ", ";
     89    res += str(longitude);
     90    res += ")";
     91    return res;
     92}
     93
     94LatLongCoords
     95LatLongCoords::unserialise(const string & serialised)
     96{
     97    const char * ptr = serialised.data();
     98    const char * end = ptr + serialised.size();
     99    LatLongCoords coords = unserialise(ptr, end);
     100    return coords;
     101}
     102
     103LatLongCoords
     104LatLongCoords::unserialise(const char * ptr, const char * end)
     105{
     106    LatLongCoords result;
     107    try {
     108        while (ptr != end) {
     109            // This will raise NetworkError for invalid serialisations (so we
     110            // catch and re-throw it).
     111            result.coords.insert(LatLongCoord::unserialise(&ptr, end));
     112        }
     113    } catch (const NetworkError & e) {
     114        // FIXME - modify unserialise_double somehow so we don't have to catch
     115        // and rethrow the exceptions it raises.
     116        throw InvalidArgumentError(e.get_msg());
     117    }
     118    if (ptr != end)
     119        throw InvalidArgumentError(
     120                "Junk found at end of serialised LatLongCoords");
     121    return result;
     122}
     123
     124string
     125LatLongCoords::serialise() const
     126{
     127    string result;
     128    set<LatLongCoord>::const_iterator coord;
     129    for (coord = coords.begin(); coord != coords.end(); ++coord)
     130    {
     131        result += serialise_double(coord->latitude);
     132        result += serialise_double(coord->longitude);
     133    }
     134    return result;
     135}
     136
     137string
     138LatLongCoords::get_description() const
     139{
     140    string res("Xapian::LatLongCoords(");
     141    set<LatLongCoord>::const_iterator coord;
     142    for (coord = coords.begin(); coord != coords.end(); ++coord) {
     143        if (coord != coords.begin()) {
     144            res += ", ";
     145        }
     146        res += "(";
     147        res += str(coord->latitude);
     148        res += ", ";
     149        res += str(coord->longitude);
     150        res += ")";
     151    }
     152    res += ")";
     153    return res;
     154}
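
Reviewer note: the unserialise loop above relies on each read advancing the pointer and throwing when the input is truncated. The following is a minimal, self-contained sketch of that pattern. It is not the patch's code: `put_double`, `get_double`, `serialise_coords` and `unserialise_coords` are hypothetical names, and the raw-bytes encoding is only an assumed stand-in for `serialise_double()` / `unserialise_double()`, whose real format is different.

```cpp
#include <cstddef>
#include <cstring>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

typedef std::pair<double, double> Coord;  // (latitude, longitude)

// Hypothetical stand-in for serialise_double(): the raw bytes of the double.
inline std::string put_double(double v)
{
    return std::string(reinterpret_cast<const char *>(&v), sizeof(v));
}

// Hypothetical stand-in for unserialise_double(): reads one double and
// advances *ptr; throws if fewer than sizeof(double) bytes remain.
inline double get_double(const char ** ptr, const char * end)
{
    if (static_cast<std::size_t>(end - *ptr) < sizeof(double))
        throw std::invalid_argument("Truncated serialised double");
    double v;
    std::memcpy(&v, *ptr, sizeof(v));
    *ptr += sizeof(v);
    return v;
}

// Concatenate latitude/longitude pairs; the list is not self-terminating.
inline std::string serialise_coords(const std::set<Coord> & coords)
{
    std::string result;
    std::set<Coord>::const_iterator i;
    for (i = coords.begin(); i != coords.end(); ++i) {
        result += put_double(i->first);
        result += put_double(i->second);
    }
    return result;
}

// Read pairs until the end of the buffer is reached exactly.
inline std::set<Coord> unserialise_coords(const std::string & s)
{
    const char * ptr = s.data();
    const char * end = ptr + s.size();
    std::set<Coord> result;
    while (ptr != end) {
        double latitude = get_double(&ptr, end);
        double longitude = get_double(&ptr, end);
        result.insert(Coord(latitude, longitude));
    }
    return result;
}
```

Because the list carries no length prefix, trailing bytes can only be detected when a read fails part-way through a pair, which is why the serialised list cannot be followed by other data.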
  • xapian-core/geospatial/Makefile

    Property changes on: xapian-core/geospatial/latlongcoord.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1# Makefile for use in directories built by non-recursive make.
     2
     3SHELL = /bin/sh
     4
     5all check:
     6        cd .. && $(MAKE) $@
     7
     8clean:
     9        rm -f *.o *.obj *.lo
  • xapian-core/tests/api_geospatial.cc

    Property changes on: xapian-core/geospatial/Makefile
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
     1/** @file api_geospatial.cc
     2 * @brief Tests of geospatial functionality.
     3 */
     4/* Copyright 2008 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#include <config.h>
     24#include "api_geospatial.h"
     25#include <xapian/geospatial.h>
     26#include <xapian/error.h>
     27
     28#include "apitest.h"
     29#include "testsuite.h"
     30#include "testutils.h"
     31#include <iomanip>
     32
     33using namespace std;
     34using namespace Xapian;
     35
     36// #######################################################################
     37// # Tests start here
     38
     39static void
     40builddb_coords1(Xapian::WritableDatabase &db, const string &)
     41{
     42    Xapian::LatLongCoord coord1(10, 10);
     43    Xapian::LatLongCoord coord2(20, 10);
     44    Xapian::LatLongCoord coord3(30, 10);
     45
     46    Xapian::Document doc;
     47    doc.add_value(0, coord1.serialise());
     48    db.add_document(doc);
     49
     50    doc = Xapian::Document();
     51    doc.add_value(0, coord2.serialise());
     52    db.add_document(doc);
     53
     54    doc = Xapian::Document();
     55    doc.add_value(0, coord3.serialise());
     56    db.add_document(doc);
     57}
     58
     59/// Test behaviour of the LatLongDistancePostingSource
     60DEFINE_TESTCASE(latlongpostingsource1, backend && writable && !remote && !inmemory) {
     61    Xapian::Database db = get_database("coords1", builddb_coords1, "");
     62    Xapian::LatLongCoord coord1(10, 10);
     63    Xapian::LatLongCoord coord2(20, 10);
     64    Xapian::LatLongCoord coord3(30, 10);
     65
     66    // Chert doesn't currently support opening a value iterator for a writable database.
     67    SKIP_TEST_FOR_BACKEND("chert");
     68
     69    Xapian::GreatCircleMetric metric;
     70    Xapian::LatLongCoords centre;
     71    centre.insert(coord1);
     72    double coorddist = metric(coord1, coord2);
     73    TEST_EQUAL_DOUBLE(coorddist, metric(coord2, coord3));
     74
     75    // Test a search with no range restriction.
     76    {
     77        Xapian::LatLongDistancePostingSource ps(0, centre, metric);
     78        ps.init(db);
     79
     80        ps.next(0.0);
     81        TEST_EQUAL(ps.at_end(), false);
     82        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     83        TEST_EQUAL(ps.get_docid(), 1);
     84
     85        ps.next(0.0);
     86        TEST_EQUAL(ps.at_end(), false);
     87        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     88        TEST_EQUAL(ps.get_docid(), 2);
     89
     90        ps.next(0.0);
     91        TEST_EQUAL(ps.at_end(), false);
     92        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
     93        TEST_EQUAL(ps.get_docid(), 3);
     94
     95        ps.next(0.0);
     96        TEST_EQUAL(ps.at_end(), true);
     97    }
     98
     99    // Test a search with a tight range restriction
     100    {
     101        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 0.5);
     102        ps.init(db);
     103
     104        ps.next(0.0);
     105        TEST_EQUAL(ps.at_end(), false);
     106        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     107
     108        ps.next(0.0);
     109        TEST_EQUAL(ps.at_end(), true);
     110    }
     111
     112    // Test a search with a looser range restriction
     113    {
     114        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist);
     115        ps.init(db);
     116
     117        ps.next(0.0);
     118        TEST_EQUAL(ps.at_end(), false);
     119        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     120
     121        ps.next(0.0);
     122        TEST_EQUAL(ps.at_end(), false);
     123        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     124        TEST_EQUAL(ps.get_docid(), 2);
     125
     126        ps.next(0.0);
     127        TEST_EQUAL(ps.at_end(), true);
     128    }
     129
     130    // Test a search with a looser range restriction, but not enough to return
     131    // the next document.
     132    {
     133        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 1.5);
     134        ps.init(db);
     135
     136        ps.next(0.0);
     137        TEST_EQUAL(ps.at_end(), false);
     138        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     139
     140        ps.next(0.0);
     141        TEST_EQUAL(ps.at_end(), false);
     142        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     143        TEST_EQUAL(ps.get_docid(), 2);
     144
     145        ps.next(0.0);
     146        TEST_EQUAL(ps.at_end(), true);
     147    }
     148
     149    // Test a search with a loose enough range restriction that all docs should
     150    // be returned.
     151    {
     152        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 2.5);
     153        ps.init(db);
     154
     155        ps.next(0.0);
     156        TEST_EQUAL(ps.at_end(), false);
     157        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
     158
     159        ps.next(0.0);
     160        TEST_EQUAL(ps.at_end(), false);
     161        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
     162        TEST_EQUAL(ps.get_docid(), 2);
     163
     164        ps.next(0.0);
     165        TEST_EQUAL(ps.at_end(), false);
     166        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
     167        TEST_EQUAL(ps.get_docid(), 3);
     168
     169        ps.next(0.0);
     170        TEST_EQUAL(ps.at_end(), true);
     171    }
     172
     173    return true;
     174}
     175
     176// Test various methods of LatLongCoord and LatLongCoords
     177DEFINE_TESTCASE(latlongcoords1, !backend) {
     178    LatLongCoord c1(0, 0);
     179    LatLongCoord c2(1, 0);
     180    LatLongCoord c3(1, 0);
     181
     182    // Test comparison
     183    TEST_NOT_EQUAL(c1.get_description(), c2.get_description());
     184    TEST(c1 < c2 || c2 < c1);
     185    TEST_EQUAL(c2.get_description(), c3.get_description());
     186    TEST(!(c2 < c3) && !(c3 < c2));
     187
     188    // Test serialisation
     189    std::string s1 = c1.serialise();
     190    LatLongCoord c4 = LatLongCoord::unserialise(s1);
     191    TEST(!(c1 < c4 || c4 < c1));
     192    const char * ptr = s1.data();
     193    const char * end = ptr + s1.size();
     194    c4 = LatLongCoord::unserialise(&ptr, end);
     195    TEST_EQUAL(c1.get_description(), c4.get_description());
     196    TEST_EQUAL(c1.get_description(), "Xapian::LatLongCoord(0, 0)");
     197    TEST_EQUAL(ptr, end);
     198
     199    // Test building a set of LatLongCoords
     200    LatLongCoords g1(c1);
     201    TEST(!g1.empty());
     202    TEST_EQUAL(g1.size(), 1);
     203    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))");
     204    g1.insert(c2);
     205    TEST_EQUAL(g1.size(), 2);
     206    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     207    // c3 == c2, so already in the set, so no change if we add c3
     208    g1.insert(c3);
     209    TEST_EQUAL(g1.size(), 2);
     210    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     211    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))");
     212    g1.erase(c3);
     213    TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))");
     214
     215    // Test building an empty LatLongCoords
     216    LatLongCoords g2;
     217    TEST(g2.empty());
     218    TEST_EQUAL(g2.size(), 0);
     219    TEST_EQUAL(g2.get_description(), "Xapian::LatLongCoords()");
     220
     221    return true;
     222}
     223
     224// Test various methods of LatLongMetric
     225DEFINE_TESTCASE(latlongmetric1, !backend) {
     226    LatLongCoord c1(0, 0);
     227    LatLongCoord c2(1, 0);
     228    Xapian::GreatCircleMetric m1;
     229    double d1 = m1(c1, c2);
     230    TEST_REL(d1, >, 111226.0);
     231    TEST_REL(d1, <, 111227.0);
     232
     233    // Let's make another metric, this time using the radius of Mars, so
     234    // distances should be quite a bit smaller.
     235    Xapian::GreatCircleMetric m2(3310000);
     236    double d2 = m2(c1, c2);
     237    TEST_REL(d2, >, 57770.0);
     238    TEST_REL(d2, <, 57771.0);
     239
     240    // Check serialise and unserialise.
     241    Xapian::Registry registry;
     242    std::string s1 = m2.serialise();
     243    const Xapian::LatLongMetric * m3;
     244    m3 = registry.get_lat_long_metric(m2.name());
     245    TEST(m3 != NULL);
     246    m3 = m3->unserialise(s1);
     247    double d3 = (*m3)(c1, c2);
     248    TEST_EQUAL_DOUBLE(d2, d3);
     249
     250    delete m3;
     251
     252    return true;
     253}
     254
     255// Test a LatLongDistanceKeyMaker directly.
     256DEFINE_TESTCASE(latlongkeymaker1, !backend) {
     257    Xapian::GreatCircleMetric m1(3310000);
     258    LatLongCoord c1(0, 0);
     259    LatLongCoord c2(1, 0);
     260    LatLongCoord c3(2, 0);
     261    LatLongCoord c4(3, 0);
     262
     263    LatLongCoords g1(c1);
     264    g1.insert(c2);
     265
     266    LatLongDistanceKeyMaker keymaker(0, g1, m1);
     267    Xapian::Document doc1;
     268    doc1.add_value(0, g1.serialise());
     269    Xapian::Document doc2;
     270    doc2.add_value(0, c3.serialise());
     271    Xapian::Document doc3;
     272    doc3.add_value(0, c4.serialise());
     273    Xapian::Document doc4;
     274
     275    std::string k1 = keymaker(doc1);
     276    std::string k2 = keymaker(doc2);
     277    std::string k3 = keymaker(doc3);
     278    std::string k4 = keymaker(doc4);
     279    TEST_REL(k1, <, k2);
     280    TEST_REL(k2, <, k3);
     281    TEST_REL(k3, <, k4);
     282
     283    LatLongDistanceKeyMaker keymaker2(0, g1, m1, 0);
     284    std::string k3b = keymaker2(doc3);
     285    std::string k4b = keymaker2(doc4);
     286    TEST_EQUAL(k3, k3b);
     287    TEST_REL(k3b, >, k4b);
     288
     289    return true;
     290}
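
Reviewer note: the weights asserted in latlongpostingsource1 (1.0 at the centre, `1000.0 / (1000.0 + coorddist)` for the next document) are instances of the weighting formula documented in geospatial.h, `k1 * (distance + k1) ** (- k2)`, with the default constants. A minimal sketch of that formula follows; `distance_weight` is a hypothetical helper name used only for illustration, not a function in the patch.

```cpp
#include <cmath>

// Weight for a document at `distance` metres from the centre, using the
// documented formula k1 * (distance + k1) ** (-k2).  The defaults mirror
// the documented defaults of 1000 and 1.
inline double distance_weight(double distance,
                              double k1 = 1000.0,
                              double k2 = 1.0)
{
    return k1 * std::pow(distance + k1, -k2);
}
```

With the defaults this reduces to `1000.0 / (1000.0 + distance)`, so a document at the centre weighs 1.0, one 1km away weighs 0.5, and one 3km away weighs 0.25.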
  • xapian-core/tests/Makefile.am

    Property changes on: xapian-core/tests/api_geospatial.cc
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
    155155 api_unicode.cc \
    156156 api_valuestats.cc \
    157157 api_valuestream.cc \
     158 api_geospatial.cc \
    158159 api_wrdb.cc
    159160
    160161apitest_SOURCES = apitest.cc dbcheck.cc $(collated_apitest_sources) \
  • xapian-core/include/xapian/postingsource.h

     
    3131
    3232namespace Xapian {
    3333
     34class Registry;
     35
    3436/** Base class which provides an "external" source of postings.
    3537 *
    3638 *  Warning: the PostingSource interface is currently experimental, and is
     
    285287     */
    286288    virtual PostingSource * unserialise(const std::string &s) const;
    287289
     290    /** Create object given string serialisation returned by serialise().
     291     *
     292     *  Note that the returned object will be deallocated by Xapian after use
     293     *  with "delete".  It must therefore have been allocated with "new".
     294     *
     295     *  This method is supplied with a Registry object, which can be used when
     296     *  unserialising objects contained within the posting source.  The default
     297     *  implementation simply calls unserialise() which doesn't take the
     298     *  Registry object, so you do not need to implement this method unless you
     299     *  want to take advantage of the Registry object when unserialising.
     300     *
     301     *  @param s A serialised instance of this PostingSource subclass.
     302     */
     303    virtual PostingSource * unserialise_with_registry(const std::string &s,
     304                                      const Registry & registry) const;
     305
    288306    /** Set this PostingSource to the start of the list of postings.
    289307     *
    290308     *  This is called automatically by the matcher prior to each query being
  • xapian-core/include/xapian/registry.h

     
    3030namespace Xapian {
    3131
    3232// Forward declarations.
     33class LatLongMetric;
    3334class MatchSpy;
    3435class PostingSource;
    3536class Weight;
     
    105106     */
    106107    const Xapian::MatchSpy *
    107108            get_match_spy(const std::string & name) const;
     109
     110    /// Register a user-defined lat-long metric class.
     111    void register_lat_long_metric(const Xapian::LatLongMetric &metric);
     112
     113    /** Get a lat-long metric given a name.
     114     *
     115     *  The returned metric is owned by the registry object.
     116     *
     117     *  Returns NULL if the metric could not be found.
     118     */
     119    const Xapian::LatLongMetric *
     120            get_lat_long_metric(const std::string & name) const;
     121
    108122};
    109123
    110124}
  • xapian-core/include/xapian/geospatial.h

     
     1/** @file geospatial.h
     2 * @brief Geospatial search support routines.
     3 */
     4/* Copyright 2008,2009 Lemur Consulting Ltd
     5 * Copyright 2010 Richard Boulton
     6 *
     7 * This program is free software; you can redistribute it and/or
     8 * modify it under the terms of the GNU General Public License as
     9 * published by the Free Software Foundation; either version 2 of the
     10 * License, or (at your option) any later version.
     11 *
     12 * This program is distributed in the hope that it will be useful,
     13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
     14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     15 * GNU General Public License for more details.
     16 *
     17 * You should have received a copy of the GNU General Public License
     18 * along with this program; if not, write to the Free Software
     19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
     20 * USA
     21 */
     22
     23#ifndef XAPIAN_INCLUDED_GEOSPATIAL_H
     24#define XAPIAN_INCLUDED_GEOSPATIAL_H
     25
     26#include <xapian/enquire.h>
     27#include <xapian/postingsource.h>
     28#include <xapian/queryparser.h> // For sortable_serialise
     29#include <xapian/keymaker.h>
     30#include <xapian/visibility.h>
     31#include <string>
     32#include <set>
     33
     34namespace Xapian {
     35
     36class Registry;
     37
     38/** Convert from miles to metres.
     39 */
     40inline XAPIAN_VISIBILITY_DEFAULT double
     41miles_to_metres(double miles)
     42{
     43    return 1609.344 * miles;
     44}
     45
     46/** Convert from metres to miles.
     47 */
     48inline XAPIAN_VISIBILITY_DEFAULT double
     49metres_to_miles(double metres)
     50{
     51    return metres * (1.0 / 1609.344);
     52}
     53
     54/** A latitude-longitude coordinate.
     55 *
     56 *  Note that latitude-longitude coordinates are only precisely meaningful if
     57 *  the datum used to define them is specified.  This class ignores this
     58 *  issue - it is up to the caller to ensure that the datum used for each
     59 *  coordinate in a system is consistent.
     60 */
     61struct XAPIAN_VISIBILITY_DEFAULT LatLongCoord {
     62  public:
     63    /** A latitude, as decimal degrees.
     64     *
     65     *  Should be in the range -90 <= latitude <= 90
     66     *
     67     *  Positive latitudes represent the northern hemisphere.
     68     */
     69    double latitude;
     70
     71    /** A longitude, as decimal degrees.
     72     *
     73     *  Should be in the range -180 < longitude <= 180
     74     *
     75     *  Positive longitudes represent the eastern hemisphere.
     76     */
     77    double longitude;
     78
     79    /** Construct a coordinate.
     80     *
     81     *  If the supplied latitude is out of range, an exception will be raised.
     82     *
     83     *  If the supplied longitude is out of range, it will be normalised to the
     84     *  appropriate range.
     85     */
     86    LatLongCoord(double latitude_, double longitude_);
     87
     88    /** Construct a coordinate by unserialising a string.
     89     *
     90     *  @param serialised the string to unserialise the coordinate from.
     91     *
     92     *  @exception Xapian::InvalidArgumentError if the string does not contain
     93     *  a valid serialised latitude-longitude pair, or contains extra data at
     94     *  the end of it.
     95     */
     96    static LatLongCoord unserialise(const std::string & serialised);
     97
     98    /** Construct a coordinate by unserialising a string.
     99     *
     100     *  The string may contain further data after that for the coordinate.
     101     *
     102     *  @param ptr A pointer to the start of the string.  This will be updated
     103     *  to point to the end of the data representing the coordinate.
     104     *  @param end A pointer to the end of the string.
     105     *
     106     *  @exception Xapian::InvalidArgumentError if the string does not contain
     107     *  a valid serialised latitude-longitude pair.
     108     */
     109    static LatLongCoord unserialise(const char ** ptr, const char * end);
     110
     111    /** Return a serialised representation of the coordinate.
     112     */
     113    std::string serialise() const;
     114
     115    /** Compare with another LatLongCoord.
     116     */
     117    bool operator<(const LatLongCoord & other) const
     118    {
     119        if (latitude != other.latitude) return latitude < other.latitude;
     120        return (longitude < other.longitude);
     121    }
     122
     123    /// Return a string describing this object.
     124    std::string get_description() const;
     125};
     126
     127/** A set of latitude-longitude coordinates.
     128 */
     129class XAPIAN_VISIBILITY_DEFAULT LatLongCoords {
     130    /// The coordinates.
     131    std::set<LatLongCoord> coords;
     132
     133  public:
     134    std::set<LatLongCoord>::const_iterator begin() const
     135    {
     136        return coords.begin();
     137    }
     138
     139    std::set<LatLongCoord>::const_iterator end() const
     140    {
     141        return coords.end();
     142    }
     143
     144    size_t size() const
     145    {
     146        return coords.size();
     147    }
     148
     149    bool empty() const
     150    {
     151        return coords.empty();
     152    }
     153
     154    void insert(const LatLongCoord & coord)
     155    {
     156        coords.insert(coord);
     157    }
     158
     159    void erase(const LatLongCoord & coord)
     160    {
     161        coords.erase(coord);
     162    }
     163
     164    /// Construct an empty set of coordinates.
     165    LatLongCoords() : coords() {}
     166
     167    /// Construct a set of coordinates containing one coordinate.
     168    LatLongCoords(const LatLongCoord & coord) : coords()
     169    {
     170        coords.insert(coord);
     171    }
     172
     173    /** Construct a set of coordinates by unserialising a string.
     174     *
     175     *  @param serialised the string to unserialise the coordinates from.
     176     *
     177     *  @exception Xapian::InvalidArgumentError if the string does not contain
     178     *  a valid serialised latitude-longitude pair, or contains junk at the end
     179     *  of it.
     180     */
     181    static LatLongCoords unserialise(const std::string & serialised);
     182
     183    /** Construct a set of coordinates by unserialising a string.
     184     *
     185     *  The string may NOT contain further data after the coordinates (the
     186     *  representation of the list of coordinates is not self-terminating).
     187     *
     188     *  @param ptr A pointer to the start of the string.
     189     *  @param end A pointer to the end of the string.
     190     *
     191     *  @exception Xapian::InvalidArgumentError if the string does not contain
     192     *  a valid serialised latitude-longitude pair, or contains junk at the end
     193     *  of it.
     194     */
     195    static LatLongCoords unserialise(const char * ptr, const char * end);
     196
     197    /** Return a serialised form of the coordinate list.
     198     */
     199    std::string serialise() const;
     200
     201    /// Return a string describing this object.
     202    std::string get_description() const;
     203};
     204
     205/** Base class for calculating distances between two lat/long coordinates.
     206 */
     207class XAPIAN_VISIBILITY_DEFAULT LatLongMetric {
     208  public:
     209    /// Destructor.
     210    virtual ~LatLongMetric();
     211
     212    /** Return the distance between two coordinates, in metres.
     213     */
     214    virtual double operator()(const LatLongCoord & a, const LatLongCoord &b) const = 0;
     215
     216    /** Return the distance between two coordinate lists, in metres.
     217     *
     218     *  The distance between the coordinate lists is defined to be the minimum
     219     *  pairwise distance between coordinates in the lists.
     220     *
     221     *  If either of the lists is empty, an InvalidArgumentError will be raised.
     222     */
     223    double operator()(const LatLongCoords & a, const LatLongCoords &b) const;
     224
     225    /** Clone the metric. */
     226    virtual LatLongMetric * clone() const = 0;
     227
     228    /** Return the full name of the metric.
     229     *
     230     *  This is used when serialising and unserialising metrics; for example,
     231     *  for performing remote searches.
     232     *
     233     *  If the subclass is in a C++ namespace, the namespace should be included
     234     *  in the name, using "::" as a separator.  For example, for a
     235     *  LatLongMetric subclass called "FooLatLongMetric" in the "Xapian"
     236     *  namespace the result of this call should be "Xapian::FooLatLongMetric".
     237     */
     238    virtual std::string name() const = 0;
     239
     240    /** Serialise object parameters into a string.
     241     *
     242     *  The serialised parameters should represent the configuration of the
     243     *  metric.
     244     */
     245    virtual std::string serialise() const = 0;
     246
     247    /** Create object given string serialisation returned by serialise().
     248     *
     249     *  @param s A serialised instance of this LatLongMetric subclass.
     250     */
     251    virtual LatLongMetric * unserialise(const std::string & s) const = 0;
     252};
     253
     254/** Calculate the great-circle distance between two coordinates on a sphere.
     255 *
     256 *  This uses the haversine formula to calculate the distance.  Note that this
     257 *  formula is subject to inaccuracy due to numerical errors for coordinates on
     258 *  the opposite side of the sphere.
     259 *
     260 *  See http://en.wikipedia.org/wiki/Haversine_formula
     261 */
     262class XAPIAN_VISIBILITY_DEFAULT GreatCircleMetric : public LatLongMetric {
     263    /** The radius of the sphere in metres.
     264     */
     265    double radius;
     266
     267  public:
     268    /** Construct a GreatCircleMetric.
     269     *
     270     *  The (quadratic mean) radius of the earth will be used by this
     271     *  calculator.
     272     */
     273    GreatCircleMetric();
     274
     275    /** Construct a GreatCircleMetric using a specified radius.
     276     *
     277     *  @param radius_ The radius to use, in metres.
     278     */
     279    GreatCircleMetric(double radius_);
     280
     281    /** Return the great-circle distance between points on the sphere.
     282     */
     283    double operator()(const LatLongCoord & a, const LatLongCoord &b) const;
     284
     285    LatLongMetric * clone() const;
     286    std::string name() const;
     287    std::string serialise() const;
     288    LatLongMetric * unserialise(const std::string & s) const;
     289};
     290
     291/** Posting source which returns a weight based on geospatial distance.
     292 *
     293 *  Results are weighted by the distance from a fixed point, or list of points,
     294 *  calculated according to the metric supplied.  If multiple points are
     295 *  supplied (either in the constructor, or in the coordinates stored in a
     296 *  document), the closest pointwise distance is returned.
     297 *
     298 *  Documents further away than a specified maximum range (or with no location
     299 *  stored in the specified slot) will not be returned.
     300 *
     301 *  The weight returned will be computed from the distance using the formula:
     302 *  k1 * (distance + k1) ** (- k2)
     303 *
     304 *  (Where k1 and k2 are (strictly) positive, floating point, constants, and
     305 *  default to 1000 and 1, respectively.  Distance is measured in metres, so
     306 *  this means that something at the centre gets a weight of 1.0, something 1km
     307 *  away gets a weight of 0.5, and something 3km away gets a weight of 0.25,
     308 *  etc)
     309 */
     310class XAPIAN_VISIBILITY_DEFAULT LatLongDistancePostingSource : public ValuePostingSource
     311{
     312    /// Current distance from centre.
     313    double dist;
     314
     315    /// Centre, to compute distance from.
     316    LatLongCoords centre;
     317
     318    /// Metric to compute the distance with.
     319    const LatLongMetric * metric;
     320
     321    /// Maximum range to allow.  If set to 0, there is no maximum range.
     322    double max_range;
     323
     324    /// Constant used in weighting function.
     325    double k1;
     326
     327    /// Constant used in weighting function.
     328    double k2;
     329
     330    /** Calculate the distance for the current document.
     331     *
     332     *  Returns true if the distance was calculated ok, or false if the
     333     *  document didn't contain a valid serialised set of coordinates in the
     334     *  appropriate value slot.
     335     */
     336    void calc_distance();
     337
     338    /// Internal constructor; used by clone() and serialise().
     339    LatLongDistancePostingSource(Xapian::valueno slot_,
     340                                 const LatLongCoords & centre_,
     341                                 const LatLongMetric * metric_,
     342                                 double max_range_,
     343                                 double k1_,
     344                                 double k2_);
     345
     346  public:
     347    /** Construct a new posting source which returns only documents within
     348     *  range of one of the central coordinates.
     349     *
     350     *  @param db_ The database to read values from.
     351     *  @param slot_ The value slot to read values from.
     352     *  @param centre_ The centre point to use for distance calculations.
     353     *  @param metric_ The metric to use for distance calculations.
     354     *  @param max_range_ The maximum distance for documents which are returned.
     355     *  @param k1_ The k1 constant to use in the weighting function.
     356     *  @param k2_ The k2 constant to use in the weighting function.
     357     */
     358    LatLongDistancePostingSource(Xapian::valueno slot_,
     359                                 const LatLongCoords & centre_,
     360                                 const LatLongMetric & metric_,
     361                                 double max_range_ = 0.0,
     362                                 double k1_ = 1000.0,
     363                                 double k2_ = 1.0);
     364    ~LatLongDistancePostingSource();
     365
     366    void next(Xapian::weight min_wt);
     367    void skip_to(Xapian::docid min_docid, Xapian::weight min_wt);
     368    bool check(Xapian::docid min_docid, Xapian::weight min_wt);
     369
     370    Xapian::weight get_weight() const;
     371    LatLongDistancePostingSource * clone() const;
     372    std::string name() const;
     373    std::string serialise() const;
     374    LatLongDistancePostingSource *
     375            unserialise_with_registry(const std::string &s,
     376                                      const Registry & registry) const;
     377    void init(const Database & db_);
     378
     379    std::string get_description() const;
     380};
     381
     382/** KeyMaker subclass which sorts by distance from a latitude/longitude.
     383 *
     384 *  Results are ordered by the distance from a fixed point, or list of points,
     385 *  calculated according to the metric supplied.  If multiple points are
     386 *  supplied (either in the constructor, or in the coordinates stored in a
     387 *  document), the closest pointwise distance is returned.
     388 *
     389 *  If a document contains no value in the slot, the key for defdistance is used.
     390 */
     391class XAPIAN_VISIBILITY_DEFAULT LatLongDistanceKeyMaker : public KeyMaker {
     392
     393    /// The value slot to read.
     394    Xapian::valueno valno;
     395
     396    /// The centre point (or points) for distance calculation.
     397    LatLongCoords centre;
     398
     399    /// The metric to use when calculating distances.
     400    const LatLongMetric * metric;
     401
     402    /// The default key to return, for documents with no value stored.
     403    std::string defkey;
     404
     405  public:
     406    LatLongDistanceKeyMaker(Xapian::valueno valno_,
     407                            const LatLongCoords & centre_,
     408                            const LatLongMetric & metric_,
     409                            double defdistance = 10E10)
     410            : valno(valno_),
     411              centre(centre_),
     412              metric(metric_.clone()),
     413              defkey(sortable_serialise(defdistance))
     414    {}
     415
     416    LatLongDistanceKeyMaker(Xapian::valueno valno_,
     417                            const LatLongCoord & centre_,
     418                            const LatLongMetric & metric_,
     419                            double defdistance = 10E10)
     420            : valno(valno_),
     421              centre(),
     422              metric(metric_.clone()),
     423              defkey(sortable_serialise(defdistance))
     424    {
     425        centre.insert(centre_);
     426    }
     427
     428    ~LatLongDistanceKeyMaker();
     429
     430    std::string operator()(const Xapian::Document & doc) const;
     431};
     432
     433}
     434
     435#endif /* XAPIAN_INCLUDED_GEOSPATIAL_H */
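
Note on the "closest pointwise distance" rule documented for LatLongDistanceKeyMaker above: the following is a minimal standalone sketch of that rule, not code from the patch. `Coord`, `great_circle_metres` and `closest_distance` are hypothetical names. The haversine formula on a sphere of radius 6371000 m is an assumed stand-in for whatever metric is supplied. Falling back to `defdistance` when the centre set is empty is also an assumption; the header only documents the fallback for documents with no stored value.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for a latitude/longitude pair, in degrees.
struct Coord {
    double latitude;
    double longitude;
};

// Great-circle distance in metres via the haversine formula, on a sphere
// of assumed radius 6371000 m (illustrative; not the patch's metric).
inline double great_circle_metres(const Coord & a, const Coord & b) {
    const double to_rad = std::acos(-1.0) / 180.0;
    const double radius = 6371000.0;
    const double lat1 = a.latitude * to_rad;
    const double lat2 = b.latitude * to_rad;
    const double s1 = std::sin((lat2 - lat1) / 2.0);
    const double s2 = std::sin((b.longitude - a.longitude) * to_rad / 2.0);
    const double h = s1 * s1 + std::cos(lat1) * std::cos(lat2) * s2 * s2;
    // Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2.0 * radius * std::asin(std::sqrt(std::min(1.0, h)));
}

// Smallest distance over every (centre, document coordinate) pair.
// Returns defdistance when either set is empty (the empty-centre case is
// an assumption made for this sketch).
inline double closest_distance(const std::vector<Coord> & centres,
                               const std::vector<Coord> & doc_coords,
                               double defdistance) {
    if (centres.empty() || doc_coords.empty()) return defdistance;
    double best = std::numeric_limits<double>::infinity();
    for (const Coord & c : centres) {
        for (const Coord & d : doc_coords) {
            best = std::min(best, great_circle_metres(c, d));
        }
    }
    return best;
}
```

In the patch itself the resulting distance is not returned as a double. It is turned into a sort key with sortable_serialise(), as the constructors above do for `defdistance`.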
  • xapian-core/include/Makefile.mk

    Property changes on: xapian-core/include/xapian/geospatial.h
    ___________________________________________________________________
    Added: svn:eol-style
       + native
    
     
    3434        include/xapian/unicode.h\
    3535        include/xapian/valueiterator.h\
    3636        include/xapian/valuesetmatchdecider.h\
     37        include/xapian/geospatial.h\
    3738        include/xapian/visibility.h\
    3839        include/xapian/weight.h
    3940
  • xapian-core/include/xapian.h

     
    6262// Unicode support
    6363#include <xapian/unicode.h>
    6464
     65// Geospatial
     66#include <xapian/geospatial.h>
     67
    6568// ELF visibility annotations for GCC.
    6669#include <xapian/visibility.h>
    6770
  • xapian-core/common/registryinternal.h

     
    3232    class Weight;
    3333    class PostingSource;
    3434    class MatchSpy;
     35    class LatLongMetric;
    3536}
    3637
    3738class Xapian::Registry::Internal : public Xapian::Internal::RefCntBase {
     
    4647    /// Registered match spies.
    4748    std::map<std::string, Xapian::MatchSpy *> matchspies;
    4849
     50    /// Registered lat-long metrics.
     51    std::map<std::string, Xapian::LatLongMetric *> lat_long_metrics;
     52
    4953    /// Add the standard subclasses provided in the API.
    5054    void add_defaults();
    5155
     
    5862    /// Clear all registered match spies.
    5963    void clear_match_spies();
    6064
     65    /// Clear all registered lat-long metrics.
     66    void clear_lat_long_metrics();
     67
    6168  public:
    6269    Internal();
    6370    ~Internal();
  • xapian-core/common/output.h

     
    66 * Copyright 2002 Ananova Ltd
    77 * Copyright 2002,2003,2004,2007,2009 Olly Betts
    88 * Copyright 2007 Lemur Consulting Ltd
     9 * Copyright 2010 Richard Boulton
    910 *
    1011 * This program is free software; you can redistribute it and/or
    1112 * modify it under the terms of the GNU General Public License as
     
    6263XAPIAN_OUTPUT_FUNCTION(Xapian::ESet)
    6364XAPIAN_OUTPUT_FUNCTION(Xapian::Enquire)
    6465
     66#include <xapian/geospatial.h>
     67XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoord)
     68XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoords)
     69
    6570#include <xapian/stem.h>
    6671XAPIAN_OUTPUT_FUNCTION(Xapian::Stem)
    6772
  • xapian-core/api/postingsource.cc

     
    102102    throw Xapian::UnimplementedError("unserialise() not supported for this PostingSource");
    103103}
    104104
     105PostingSource *
     106PostingSource::unserialise_with_registry(const std::string &s,
     107                                         const Registry &) const
     108{
     109    return unserialise(s);
     110}
     111
    105112string
    106113PostingSource::get_description() const
    107114{
  • xapian-core/api/registry.cc

     
    2424#include "xapian/registry.h"
    2525
    2626#include "xapian/error.h"
     27#include "xapian/geospatial.h"
    2728#include "xapian/matchspy.h"
    2829#include "xapian/postingsource.h"
    2930#include "xapian/weight.h"
     
    154155    RETURN(lookup_object(internal->matchspies, name));
    155156}
    156157
     158void
     159Registry::register_lat_long_metric(const Xapian::LatLongMetric &metric)
     160{
     161    LOGCALL_VOID(API, "Xapian::Registry::register_lat_long_metric", metric.name());
     162    register_object(internal->lat_long_metrics, metric);
     163}
    157164
     165const Xapian::LatLongMetric *
     166Registry::get_lat_long_metric(const string & name) const
     167{
     168    LOGCALL(API, const Xapian::LatLongMetric *, "Xapian::Registry::get_lat_long_metric", name);
     169    RETURN(lookup_object(internal->lat_long_metrics, name));
     170}
     171
    158172Registry::Internal::Internal()
    159173        : Xapian::Internal::RefCntBase(),
    160174          wtschemes(),
    161           postingsources()
     175          postingsources(),
     176          lat_long_metrics()
    162177{
    163178    add_defaults();
    164179}
     
    168183    clear_weighting_schemes();
    169184    clear_posting_sources();
    170185    clear_match_spies();
     186    clear_lat_long_metrics();
    171187}
    172188
    173189void
     
    190206    postingsources[source->name()] = source;
    191207    source = new Xapian::FixedWeightPostingSource(0.0);
    192208    postingsources[source->name()] = source;
     209    source = new Xapian::LatLongDistancePostingSource(0,
     210        Xapian::LatLongCoords(),
     211        Xapian::GreatCircleMetric());
     212    postingsources[source->name()] = source;
    193213
    194214    Xapian::MatchSpy * spy;
    195215    spy = new Xapian::ValueCountMatchSpy();
    196216    matchspies[spy->name()] = spy;
     217
     218    Xapian::LatLongMetric * metric;
     219    metric = new Xapian::GreatCircleMetric();
     220    lat_long_metrics[metric->name()] = metric;
    197221}
    198222
    199223void
     
    223247    }
    224248}
    225249
     250void
     251Registry::Internal::clear_lat_long_metrics()
     252{
     253    map<string, Xapian::LatLongMetric *>::const_iterator i;
     254    for (i = lat_long_metrics.begin(); i != lat_long_metrics.end(); ++i) {
     255        delete i->second;
     256    }
    226257}
     258
     259}
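
Note on the registry changes above: the patch adds a name-to-metric map to the registry and a PostingSource::unserialise_with_registry() whose default simply forwards to unserialise(). The sketch below illustrates that pattern in isolation, under stated assumptions. `Metric`, `DemoMetric`, `MetricRegistry`, `Source` and `MetricSource` are hypothetical stand-ins, not the Xapian classes. It uses std::unique_ptr where the patch uses raw pointers with an explicit clear function, and it returns strings rather than constructed objects to stay short. That registration stores a clone is also an assumption, since register_object() is not shown in this patch.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical metric interface: cloneable and identified by name.
struct Metric {
    virtual ~Metric() {}
    virtual Metric * clone() const = 0;
    virtual std::string name() const = 0;
};

struct DemoMetric : Metric {
    Metric * clone() const override { return new DemoMetric; }
    std::string name() const override { return "DemoMetric"; }
};

// Hypothetical registry: owns a clone of each registered metric, keyed by
// its name; the owned objects are released when the registry is destroyed.
class MetricRegistry {
    std::map<std::string, std::unique_ptr<Metric>> metrics;

  public:
    void register_metric(const Metric & m) {
        std::unique_ptr<Metric> copy(m.clone());
        std::string key = copy->name();
        metrics[key] = std::move(copy);  // replaces any earlier entry
    }

    // Returns nullptr when no metric with that name is registered.
    const Metric * get_metric(const std::string & name) const {
        auto i = metrics.find(name);
        return i == metrics.end() ? nullptr : i->second.get();
    }
};

// Base class: the registry-aware hook falls back to plain unserialise().
struct Source {
    virtual ~Source() {}
    virtual std::string unserialise(const std::string & s) const {
        return "plain:" + s;
    }
    virtual std::string unserialise_with_registry(const std::string & s,
                                                  const MetricRegistry &) const {
        return unserialise(s);
    }
};

// Subclass that needs the registry to resolve a metric name found in the
// serialised form.
struct MetricSource : Source {
    std::string unserialise_with_registry(const std::string & s,
                                          const MetricRegistry & reg) const override {
        const Metric * m = reg.get_metric(s);
        if (!m) throw std::invalid_argument("metric not registered: " + s);
        return "metric:" + m->name();
    }
};
```

The point of the extra virtual method is that existing PostingSource subclasses keep working unchanged, while a subclass such as LatLongDistancePostingSource can look up its metric by name during unserialisation.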
  • xapian-core/Makefile.am

     
    140140include common/Makefile.mk
    141141include examples/Makefile.mk
    142142include expand/Makefile.mk
     143include geospatial/Makefile.mk
    143144include include/Makefile.mk
    144145include languages/Makefile.mk
    145146include matcher/Makefile.mk
  • xapian-bindings/xapian.i

     
    790790
    791791%include <xapian/valuesetmatchdecider.h>
    792792
     793%ignore Xapian::LatLongCoord::operator< const;
     794%include <xapian/geospatial.h>
     795
    793796namespace Xapian {
    794797
    795798#if defined SWIGPYTHON