Ticket #481: geospatial_clean2.patch
File geospatial_clean2.patch, 60.5 KB (added 15 years ago)
xapian-core/docs/geospatial.rst
.. Copyright (C) 2008 Lemur Consulting Ltd

================================
Geospatial searching with Xapian
================================

.. contents:: Table of contents

Introduction
============

This document describes a set of features present in Xapian which are designed
to allow geospatial searches to be supported.  Currently, the geospatial
support allows a set of locations to be stored with each document, as
latitude/longitude coordinates, and allows searches to be restricted or
reordered on the basis of distance from a second set of locations.

Three types of geospatial searches are supported:

- Returning a list of documents in order of distance from a query location.
  This may be used in conjunction with any Xapian query.

- Returning a list of documents within a given distance of a query location.
  This may be used in conjunction with any other Xapian query, and with any
  Xapian sort order.

- Returning a set of documents in a combined order based on distance from a
  query location, and relevance.

Locations are stored in value slots, allowing multiple independent locations to
be used for a single document.  It is also possible to store multiple
coordinates in a single value slot, in which case the closest coordinate will
be used for distance calculations.

Metrics
=======

A metric is a function which calculates the distance between two points.

Calculating the exact distance between two geographical points is an involved
subject.  In fact, even defining the meaning of a geographical point is very
hard to do precisely - not only do you need to define a mathematical projection
used to calculate the coordinates, you also need to choose a model of the shape
of the earth, and identify a few sample points to fix the coordinates of
particular locations.  Since the earth is constantly changing shape, these
coordinates also need to be defined at a particular date.

There are a few standard datums which define all these - a very common datum is
the WGS84 datum, which is the datum used by the GPS system.  Unless you have a
good reason not to, we recommend using the WGS84 datum, since this will ensure
that preset parameters of the functions built in to Xapian will have the
correct values (currently, the only such parameter is the earth radius used by
the GreatCircleMetric, but more may be added in future).

Since there are lots of ways of calculating distances between two points, using
different assumptions about the approximations which are valid, Xapian allows
user-implemented metrics.  These are subclasses of the Xapian::LatLongMetric
class; see the API documentation for details on how to implement the various
required methods.

There is currently only one built-in metric - the GreatCircleMetric.  As the
name suggests, this calculates the distance between two points, each given as a
latitude and longitude, based on the assumption that the world is a perfect
sphere.  The radius of the world can be specified as a constructor parameter,
but defaults to a reasonable approximation of the radius of the Earth.  The
calculation uses the Haversine formula, which is accurate for points which are
close together, but can have significant error for coordinates which are on
opposite sides of the sphere: on the other hand, such points are likely to be
at the end of a ranked list of search results, so this probably doesn't matter.

Indexing
========

To index a set of documents with location, you need to store serialised
latitude-longitude coordinates in a value slot in your documents.  To do this,
use the LatLongCoord class.  For example, this is how you might store a
latitude and longitude corresponding to "London" in value slot 0::

    Xapian::Document doc;
    doc.add_value(0, Xapian::LatLongCoord(51.53, 0.08).serialise());

Of course, often a location is a bit more complicated than a single point - for
example, postcode regions in the UK can cover a fairly wide area.  If a search
were to treat such a location as a single point, the distances returned could
be incorrect by as much as a couple of miles.  Xapian therefore allows you to
store a set of points in a single slot - the distance calculation will return
the distance to the closest of these points.  This is often a good enough
workaround for this problem - if you require greater accuracy, you will need to
filter the results after they are returned from Xapian.

To store multiple coordinates in a single slot, use the LatLongCoords class::

    Xapian::Document doc;
    Xapian::LatLongCoords coords;
    coords.insert(Xapian::LatLongCoord(51.53, 0.08));
    coords.insert(Xapian::LatLongCoord(51.51, 0.07));
    coords.insert(Xapian::LatLongCoord(51.52, 0.09));
    doc.add_value(0, coords.serialise());

(Note that the serialised form of a LatLongCoords object containing a single
coordinate is exactly the same as the serialised form of the corresponding
LatLongCoord object.)

Searching
=========

Sorting results by distance
---------------------------

If you simply want your results to be returned in order of distance, you can
use the LatLongDistanceKeyMaker class to calculate sort keys.  For example, to
return results in order of distance from the coordinate (51.00, 0.50), based on
the values stored in slot 0, and using the great-circle distance::

    Xapian::Database db("my_database");
    Xapian::Enquire enq(db);
    enq.set_query(Xapian::Query("my_query"));
    Xapian::GreatCircleMetric metric;
    Xapian::LatLongCoord centre(51.00, 0.50);
    Xapian::LatLongDistanceKeyMaker keymaker(0, centre, metric);
    enq.set_sort_by_key(&keymaker, false);

Filtering results by distance
-----------------------------

To return only those results within a given distance, you can use the
LatLongDistancePostingSource.  For example, to return only those results within
5 miles of coordinate (51.00, 0.50), based on the values stored in slot 0, and
using the great-circle distance::

    Xapian::Database db("my_database");
    Xapian::Enquire enq(db);
    Xapian::Query q("my_query");
    Xapian::GreatCircleMetric metric;
    Xapian::LatLongCoord centre(51.00, 0.50);
    double max_range = Xapian::miles_to_metres(5);
    Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
    q = Xapian::Query(Xapian::Query::OP_FILTER, q, Xapian::Query(&ps));
    enq.set_query(q);

Ranking results on a combination of distance and relevance
----------------------------------------------------------

To return results ranked by a combination of their relevance and their
distance, you can also use the LatLongDistancePostingSource.  Beware that
getting the right balance of weights is tricky: there is little solid
theoretical basis for this, so the best approach is often to try various
different parameters, evaluate the results, and settle on the best.  The
LatLongDistancePostingSource returns a weight of 1.0 for a document which is at
the specified location, and a lower, but always positive, weight for points
further away.  It has two parameters, k1 and k2, which control how fast the
weight decays, which can be specified to the constructor (but aren't in this
example) - see the API documentation for details of these parameters.  For
example::

    Xapian::Database db("my_database");
    Xapian::Enquire enq(db);
    Xapian::Query q("my_query");
    Xapian::GreatCircleMetric metric;
    Xapian::LatLongCoord centre(51.00, 0.50);
    double max_range = Xapian::miles_to_metres(5);
    Xapian::LatLongDistancePostingSource ps(0, centre, metric, max_range);
    q = Xapian::Query(Xapian::Query::OP_AND, q, Xapian::Query(&ps));
    enq.set_query(q);


Performance
===========

The location information associated with each document is stored in a document
value.  This allows it to be looked up quickly at search time, so that the
exact distance from the query location can be calculated.  However, this method
requires that the distance of each potential match is checked, which can be
expensive.

Some experimental code exists to produce terms corresponding to a hierarchical
index of locations (using the O-QTM algorithm - see references below), which
can be used to narrow down the search so that only a small number of potential
matches need to be checked.  Contact the Xapian developers (by email or on IRC)
if you would like to help finish and test this code.

It is entirely possible that a more efficient implementation could be performed
using "R trees" or "KD trees" (or one of the many other tree structures used
for geospatial indexing - see http://en.wikipedia.org/wiki/Spatial_index for a
list of some of these).  However, using the QTM approach will require minimal
effort and make use of the existing, and well tested, Xapian database.
Additionally, by simply generating special terms to restrict the search, the
existing optimisations of the Xapian query parser are taken advantage of.

References
==========

The O-QTM algorithm is described in "Dutton, G. (1996). Encoding and handling
geospatial data with hierarchical triangular meshes. In Kraak, M.J. and
Molenaar, M. (eds.) Advances in GIS Research II. London: Taylor & Francis,
505-518.", a copy of which is available from
http://www.spatial-effects.com/papers/conf/GDutton_SDH96.pdf

Some of the geometry needed to calculate the correct set of QTM IDs to cover a
particular region is detailed in
ftp://ftp.research.microsoft.com/pub/tr/tr-2005-123.pdf

Also, see:
http://www.sdss.jhu.edu/htm/doc/c++/htmInterface.html
xapian-core/geospatial/Makefile.mk
EXTRA_DIST += \
	geospatial/dir_contents \
	geospatial/Makefile

lib_src += \
	geospatial/latlongcoord.cc \
	geospatial/latlong_distance_keymaker.cc \
	geospatial/latlong_metrics.cc \
	geospatial/latlong_posting_source.cc
xapian-core/geospatial/latlong_distance_keymaker.cc
/** \file latlong_distance_keymaker.cc
 * \brief LatLongDistanceKeyMaker implementation.
 */
/* Copyright 2008 Lemur Consulting Ltd
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
 * USA
 */

#include <config.h>

#include "xapian/geospatial.h"
#include "xapian/document.h"
#include "xapian/queryparser.h" // For sortable_serialise.

using namespace Xapian;
using namespace std;

string
LatLongDistanceKeyMaker::operator()(const Document &doc) const
{
    string val(doc.get_value(valno));
    LatLongCoords doccoords = LatLongCoords::unserialise(val);
    if (doccoords.empty()) {
        return defkey;
    }
    double distance = (*metric)(centre, doccoords);
    return sortable_serialise(distance);
}

LatLongDistanceKeyMaker::~LatLongDistanceKeyMaker()
{
    delete metric;
}
xapian-core/geospatial/latlong_posting_source.cc
Property changes on: xapian-core/geospatial/latlong_distance_keymaker.cc
___________________________________________________________________
Added: svn:eol-style
   + native
/** @file latlong_posting_source.cc
 * @brief LatLongPostingSource implementation.
 */
/* Copyright 2008 Lemur Consulting Ltd
 * Copyright 2010 Richard Boulton
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
 * USA
 */

#include <config.h>

#include "xapian/geospatial.h"

#include "xapian/document.h"
#include "xapian/error.h"
#include "xapian/registry.h"

#include "serialise.h"
#include "serialise-double.h"
#include "str.h"

#include <cmath>

using namespace Xapian;
using namespace std;

static double
weight_from_distance(double dist, double k1, double k2)
{
    return k1 * pow(dist + k1, -k2);
}

void
LatLongDistancePostingSource::calc_distance()
{
    string val(*value_it);
    LatLongCoords coords = LatLongCoords::unserialise(val);
    dist = (*metric)(centre, coords);
}

LatLongDistancePostingSource::LatLongDistancePostingSource(
        valueno slot_,
        const LatLongCoords & centre_,
        const LatLongMetric * metric_,
        double max_range_,
        double k1_,
        double k2_)
        : ValuePostingSource(slot_),
          centre(centre_),
          metric(metric_),
          max_range(max_range_),
          k1(k1_),
          k2(k2_)
{
    if (k1 <= 0)
        throw InvalidArgumentError(
            "k1 parameter to LatLongDistancePostingSource must be greater "
            "than 0; was " + str(k1));
    if (k2 <= 0)
        throw InvalidArgumentError(
            "k2 parameter to LatLongDistancePostingSource must be greater "
            "than 0; was " + str(k2));
    set_maxweight(weight_from_distance(0, k1, k2));
}

LatLongDistancePostingSource::LatLongDistancePostingSource(
        valueno slot_,
        const LatLongCoords & centre_,
        const LatLongMetric & metric_,
        double max_range_,
        double k1_,
        double k2_)
        : ValuePostingSource(slot_),
          centre(centre_),
          metric(metric_.clone()),
          max_range(max_range_),
          k1(k1_),
          k2(k2_)
{
    if (k1 <= 0)
        throw InvalidArgumentError(
            "k1 parameter to LatLongDistancePostingSource must be greater "
            "than 0; was " + str(k1));
    if (k2 <= 0)
        throw InvalidArgumentError(
            "k2 parameter to LatLongDistancePostingSource must be greater "
            "than 0; was " + str(k2));
    set_maxweight(weight_from_distance(0, k1, k2));
}

LatLongDistancePostingSource::~LatLongDistancePostingSource()
{
    delete metric;
}

void
LatLongDistancePostingSource::next(weight min_wt)
{
    ValuePostingSource::next(min_wt);

    while (value_it != db.valuestream_end(slot)) {
        calc_distance();
        if (max_range == 0 || dist <= max_range)
            break;
        ++value_it;
    }
}

void
LatLongDistancePostingSource::skip_to(docid min_docid,
                                      weight min_wt)
{
    ValuePostingSource::skip_to(min_docid, min_wt);

    while (value_it != db.valuestream_end(slot)) {
        calc_distance();
        if (max_range == 0 || dist <= max_range)
            break;
        ++value_it;
    }
}

bool
LatLongDistancePostingSource::check(docid min_docid,
                                    weight min_wt)
{
    if (!ValuePostingSource::check(min_docid, min_wt)) {
        // check returned false, so we know the document is not in the source.
        return false;
    }
    if (value_it == db.valuestream_end(slot)) {
        // return true, since we're definitely at the end of the list.
        return true;
    }

    calc_distance();
    if (max_range > 0 && dist > max_range) {
        return false;
    }
    return true;
}

weight
LatLongDistancePostingSource::get_weight() const
{
    return weight_from_distance(dist, k1, k2);
}

LatLongDistancePostingSource *
LatLongDistancePostingSource::clone() const
{
    return new LatLongDistancePostingSource(slot, centre,
                                            metric->clone(),
                                            max_range, k1, k2);
}

string
LatLongDistancePostingSource::name() const
{
    return string("Xapian::LatLongDistancePostingSource");
}

string
LatLongDistancePostingSource::serialise() const
{
    string serialised_centre = centre.serialise();
    string metric_name = metric->name();
    string serialised_metric = metric->serialise();

    string result = encode_length(slot);
    result += encode_length(serialised_centre.size());
    result += serialised_centre;
    result += encode_length(metric_name.size());
    result += metric_name;
    result += encode_length(serialised_metric.size());
    result += serialised_metric;
    result += serialise_double(max_range);
    result += serialise_double(k1);
    result += serialise_double(k2);
    return result;
}

LatLongDistancePostingSource *
LatLongDistancePostingSource::unserialise_with_registry(const string &s,
                                         const Registry & registry) const
{
    const char * p = s.data();
    const char * end = p + s.size();

    valueno new_slot = decode_length(&p, end, false);
    size_t len = decode_length(&p, end, true);
    string new_serialised_centre(p, len);
    p += len;
    len = decode_length(&p, end, true);
    string new_metric_name(p, len);
    p += len;
    len = decode_length(&p, end, true);
    string new_serialised_metric(p, len);
    p += len;
    double new_max_range = unserialise_double(&p, end);
    double new_k1 = unserialise_double(&p, end);
    double new_k2 = unserialise_double(&p, end);
    if (p != end) {
        throw NetworkError("Bad serialised LatLongDistancePostingSource - junk at end");
    }

    LatLongCoords new_centre =
        LatLongCoords::unserialise(new_serialised_centre);

    const Xapian::LatLongMetric * metric_type =
        registry.get_lat_long_metric(new_metric_name);
    if (metric_type == NULL) {
        throw InvalidArgumentError("LatLongMetric " + new_metric_name +
                                   " not registered");
    }
    LatLongMetric * new_metric =
        metric_type->unserialise(new_serialised_metric);

    return new LatLongDistancePostingSource(new_slot, new_centre,
                                            new_metric,
                                            new_max_range, new_k1, new_k2);
}

void
LatLongDistancePostingSource::init(const Database & db_)
{
    ValuePostingSource::init(db_);
    if (max_range > 0.0) {
        // Possible that no documents are in range.
        termfreq_min = 0;
        // Note - would be good to improve termfreq_est here, too, but
        // I can't think of anything we can do with the information
        // available.
    }
}

string
LatLongDistancePostingSource::get_description() const
{
    return "Xapian::LatLongDistancePostingSource(slot=" + str(slot) + ")";
}
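The weight decay implemented by weight_from_distance() in latlong_posting_source.cc can be explored in isolation. The following is a minimal standalone sketch, not part of the patch: the first function copies the formula from the file, while half_weight_distance() is a hypothetical helper introduced here only to illustrate how k1 and k2 shape the curve. The tests in api_geospatial.cc expect a weight of 1000.0 / (1000.0 + dist) when k1 and k2 are omitted, which is consistent with k1 = 1000 and k2 = 1; the header declaring the defaults is not shown in this excerpt, so treat those values as an inference.

```cpp
#include <cassert>
#include <cmath>

// Same formula as weight_from_distance() in latlong_posting_source.cc.
// dist is in the metric's units (metres for GreatCircleMetric);
// k1 and k2 must both be greater than 0, as the constructors enforce.
double weight_from_distance(double dist, double k1, double k2)
{
    assert(k1 > 0 && k2 > 0);
    return k1 * std::pow(dist + k1, -k2);
}

// Hypothetical helper (not in the patch): the distance at which the weight
// has dropped to half of its value at distance 0.
// Solving (d + k1)^-k2 = 0.5 * k1^-k2 gives d = k1 * (2^(1/k2) - 1).
double half_weight_distance(double k1, double k2)
{
    assert(k1 > 0 && k2 > 0);
    return k1 * (std::pow(2.0, 1.0 / k2) - 1.0);
}
```

With k2 = 1 the weight at distance 0 is exactly 1.0 and halves at a distance of k1. For other values of k2 the maximum weight is k1 raised to the power (1 - k2), which is the value the constructors pass to set_maxweight().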
xapian-core/geospatial/latlong_metrics.cc
Property changes on: xapian-core/geospatial/latlong_posting_source.cc
___________________________________________________________________
Added: svn:eol-style
   + native
/** \file latlong_metrics.cc
 * \brief Geospatial distance metrics.
 */
/* Copyright 2008 Lemur Consulting Ltd
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
 * USA
 */

#include <config.h>

#include "xapian/geospatial.h"
#include "xapian/error.h"
#include "serialise-double.h"

#include <cmath>

using namespace Xapian;
using namespace std;

/** Quadratic mean radius of the earth in metres.
 */
#define QUAD_EARTH_RADIUS_METRES 6372797.6

/** Set M_PI if it's not already set.
 */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

LatLongMetric::~LatLongMetric()
{
}

double
LatLongMetric::operator()(const LatLongCoords & a, const LatLongCoords &b) const
{
    if (a.empty() || b.empty()) {
        throw InvalidArgumentError("Empty coordinate list supplied to LatLongMetric::operator()().");
    }
    double min_dist = 0.0;
    bool have_min = false;
    for (set<LatLongCoord>::const_iterator a_iter = a.begin();
         a_iter != a.end();
         ++a_iter)
    {
        for (set<LatLongCoord>::const_iterator b_iter = b.begin();
             b_iter != b.end();
             ++b_iter)
        {
            double dist = operator()(*a_iter, *b_iter);
            if (!have_min) {
                min_dist = dist;
                have_min = true;
            } else if (dist < min_dist) {
                min_dist = dist;
            }
        }
    }
    return min_dist;
}


GreatCircleMetric::GreatCircleMetric()
        : radius(QUAD_EARTH_RADIUS_METRES)
{}

GreatCircleMetric::GreatCircleMetric(double radius_)
        : radius(radius_)
{}

double
GreatCircleMetric::operator()(const LatLongCoord & a,
                              const LatLongCoord & b) const
{
    double lata = a.latitude * (M_PI / 180.0);
    double latb = b.latitude * (M_PI / 180.0);

    double latdiff = lata - latb;
    double longdiff = (a.longitude - b.longitude) * (M_PI / 180.0);

    double sin_half_lat = sin(latdiff / 2);
    double sin_half_long = sin(longdiff / 2);
    double h = sin_half_lat * sin_half_lat +
               sin_half_long * sin_half_long * cos(lata) * cos(latb);
    double sqrt_h = sqrt(h);
    if (sqrt_h > 1.0) sqrt_h = 1.0;
    return 2 * radius * asin(sqrt_h);
}

LatLongMetric *
GreatCircleMetric::clone() const
{
    return new GreatCircleMetric(radius);
}

string
GreatCircleMetric::name() const
{
    return "Xapian::GreatCircleMetric";
}

string
GreatCircleMetric::serialise() const
{
    return serialise_double(radius);
}

LatLongMetric *
GreatCircleMetric::unserialise(const string & s) const
{
    const char * p = s.data();
    const char * end = p + s.size();

    double new_radius = unserialise_double(&p, end);
    if (p != end) {
        throw Xapian::NetworkError("Bad serialised GreatCircleMetric - junk at end");
    }

    return new GreatCircleMetric(new_radius);
}
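For readers who want to check the Haversine calculation in GreatCircleMetric::operator() without building Xapian, here is a standalone sketch of the same arithmetic. The function name haversine_metres and the constants kPi and kEarthRadiusMetres are hypothetical names introduced for this illustration; only the formula and the radius value are taken from latlong_metrics.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;
// Same value as QUAD_EARTH_RADIUS_METRES in latlong_metrics.cc.
const double kEarthRadiusMetres = 6372797.6;

// Great-circle distance between two points given in degrees, assuming a
// perfect sphere of the given radius.  Mirrors GreatCircleMetric::operator().
double haversine_metres(double lat_a, double long_a,
                        double lat_b, double long_b,
                        double radius = kEarthRadiusMetres)
{
    assert(radius > 0.0);
    const double to_rad = kPi / 180.0;
    double lata = lat_a * to_rad;
    double latb = lat_b * to_rad;
    double sin_half_lat = std::sin((lata - latb) / 2);
    double sin_half_long = std::sin((long_a - long_b) * to_rad / 2);
    double h = sin_half_lat * sin_half_lat +
               sin_half_long * sin_half_long * std::cos(lata) * std::cos(latb);
    // Rounding can push sqrt(h) slightly above 1, which asin() would reject.
    double sqrt_h = std::min(1.0, std::sqrt(h));
    return 2 * radius * std::asin(sqrt_h);
}
```

Two points on the same meridian make a convenient sanity check, because the formula then reduces to the radius multiplied by the latitude difference in radians. This is why the test database in api_geospatial.cc, with points at (10, 10), (20, 10) and (30, 10), has equal spacing between neighbouring documents.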
xapian-core/geospatial/dir_contents
Property changes on: xapian-core/geospatial/latlong_metrics.cc
___________________________________________________________________
Added: svn:eol-style
   + native
<Directory>geospatial</Directory>

<Description>
Support for geospatial matching, and parsing of locations.
</Description>
xapian-core/geospatial/latlongcoord.cc
/** \file latlongcoord.cc
 * \brief Latitude and longitude representations.
 */
/* Copyright 2008 Lemur Consulting Ltd
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
 * USA
 */

#include <config.h>

#include "xapian/geospatial.h"
#include "xapian/error.h"

#include "serialise.h"
#include "serialise-double.h"
#include "str.h"

#include <cmath>

using namespace Xapian;
using namespace std;

LatLongCoord::LatLongCoord(double latitude_, double longitude_)
        : latitude(latitude_),
          longitude(longitude_)
{
    if (latitude < -90.0 || latitude > 90.0)
        throw InvalidArgumentError("Latitude out-of-range");
    // Normalise the longitude into the range (-180, 180].
    longitude = fmod(longitude_, 360);
    if (longitude <= -180) longitude += 360;
    if (longitude > 180) longitude -= 360;
    if (longitude == -0.0) longitude = 0.0;
}

LatLongCoord
LatLongCoord::unserialise(const string & serialised)
{
    const char * ptr = serialised.data();
    const char * end = ptr + serialised.size();
    LatLongCoord result = unserialise(&ptr, end);
    if (ptr != end)
        throw InvalidArgumentError(
            "Junk found at end of serialised LatLongCoord");
    return result;
}

LatLongCoord
LatLongCoord::unserialise(const char ** ptr, const char * end)
{
    try {
        // This will raise NetworkError for invalid serialisations.
        double latitude = unserialise_double(ptr, end);
        double longitude = unserialise_double(ptr, end);
        return LatLongCoord(latitude, longitude);
    } catch (const NetworkError & e) {
        // FIXME - modify unserialise_double somehow so we don't have to catch
        // and rethrow the exceptions it raises.
        throw InvalidArgumentError(e.get_msg());
    }
}

string
LatLongCoord::serialise() const
{
    string result(serialise_double(latitude));
    result += serialise_double(longitude);
    return result;
}

string
LatLongCoord::get_description() const
{
    string res("Xapian::LatLongCoord(");
    res += str(latitude);
    res += ", ";
    res += str(longitude);
    res += ")";
    return res;
}

LatLongCoords
LatLongCoords::unserialise(const string & serialised)
{
    const char * ptr = serialised.data();
    const char * end = ptr + serialised.size();
    LatLongCoords coords = unserialise(ptr, end);
    return coords;
}

LatLongCoords
LatLongCoords::unserialise(const char * ptr, const char * end)
{
    LatLongCoords result;
    try {
        while (ptr != end) {
            // This will raise NetworkError for invalid serialisations (so we
            // catch and re-throw it).
            result.coords.insert(LatLongCoord::unserialise(&ptr, end));
        }
    } catch (const NetworkError & e) {
        // FIXME - modify unserialise_double somehow so we don't have to catch
        // and rethrow the exceptions it raises.
        throw InvalidArgumentError(e.get_msg());
    }
    if (ptr != end)
        throw InvalidArgumentError(
            "Junk found at end of serialised LatLongCoords");
    return result;
}

string
LatLongCoords::serialise() const
{
    string result;
    set<LatLongCoord>::const_iterator coord;
    for (coord = coords.begin(); coord != coords.end(); ++coord)
    {
        result += serialise_double(coord->latitude);
        result += serialise_double(coord->longitude);
    }
    return result;
}

string
LatLongCoords::get_description() const
{
    string res("Xapian::LatLongCoords(");
    set<LatLongCoord>::const_iterator coord;
    for (coord = coords.begin(); coord != coords.end(); ++coord) {
        if (coord != coords.begin()) {
            res += ", ";
        }
        res += "(";
        res += str(coord->latitude);
        res += ", ";
        res += str(coord->longitude);
        res += ")";
    }
    res += ")";
    return res;
}
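The longitude handling in the LatLongCoord constructor wraps any finite input into the half-open range (-180, 180] and replaces negative zero with positive zero, so that equal positions produce identical serialised values. The following standalone sketch reproduces just that step; normalise_longitude is a hypothetical name used for illustration and does not appear in the patch.

```cpp
#include <cassert>
#include <cmath>

// Wrap a longitude in degrees into the range (-180, 180], following the
// steps in the LatLongCoord constructor in latlongcoord.cc.
double normalise_longitude(double longitude)
{
    assert(std::isfinite(longitude));
    // fmod() keeps the sign of its first argument: result is in (-360, 360).
    double result = std::fmod(longitude, 360.0);
    if (result <= -180.0) result += 360.0;
    if (result > 180.0) result -= 360.0;
    // -0.0 compares equal to 0.0; assigning 0.0 discards the sign bit.
    if (result == 0.0) result = 0.0;
    return result;
}
```

Note that -180 maps to 180 rather than staying as it is, so the antimeridian has a single representation.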
xapian-core/geospatial/Makefile
Property changes on: xapian-core/geospatial/latlongcoord.cc
___________________________________________________________________
Added: svn:eol-style
   + native
# Makefile for use in directories built by non-recursive make.

SHELL = /bin/sh

all check:
	cd .. && $(MAKE) $@

clean:
	rm -f *.o *.obj *.lo
xapian-core/tests/api_geospatial.cc
Property changes on: xapian-core/geospatial/Makefile
___________________________________________________________________
Added: svn:eol-style
   + native
/** @file api_geospatial.cc
 * @brief Tests of geospatial functionality.
 */
/* Copyright 2008 Lemur Consulting Ltd
 * Copyright 2010 Richard Boulton
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
 * USA
 */

#include <config.h>
#include "api_geospatial.h"
#include <xapian/geospatial.h>
#include <xapian/error.h>

#include "apitest.h"
#include "testsuite.h"
#include "testutils.h"
#include <iomanip>

using namespace std;
using namespace Xapian;

// #######################################################################
// # Tests start here

static void
builddb_coords1(Xapian::WritableDatabase &db, const string &)
{
    Xapian::LatLongCoord coord1(10, 10);
    Xapian::LatLongCoord coord2(20, 10);
    Xapian::LatLongCoord coord3(30, 10);

    Xapian::Document doc;
    doc.add_value(0, coord1.serialise());
    db.add_document(doc);

    doc = Xapian::Document();
    doc.add_value(0, coord2.serialise());
    db.add_document(doc);

    doc = Xapian::Document();
    doc.add_value(0, coord3.serialise());
    db.add_document(doc);
}

/// Test behaviour of the LatLongPostingSource
DEFINE_TESTCASE(latlongpostingsource1, backend && writable && !remote && !inmemory) {
    Xapian::Database db = get_database("coords1", builddb_coords1, "");
    Xapian::LatLongCoord coord1(10, 10);
    Xapian::LatLongCoord coord2(20, 10);
    Xapian::LatLongCoord coord3(30, 10);

    // Chert doesn't currently support opening a value iterator for a writable database.
    SKIP_TEST_FOR_BACKEND("chert");

    Xapian::GreatCircleMetric metric;
    Xapian::LatLongCoords centre;
    centre.insert(coord1);
    double coorddist = metric(coord1, coord2);
    TEST_EQUAL_DOUBLE(coorddist, metric(coord2, coord3));

    // Test a search with no range restriction.
    {
        Xapian::LatLongDistancePostingSource ps(0, centre, metric);
        ps.init(db);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);
        TEST_EQUAL(ps.get_docid(), 1);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
        TEST_EQUAL(ps.get_docid(), 2);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
        TEST_EQUAL(ps.get_docid(), 3);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), true);
    }

    // Test a search with a tight range restriction
    {
        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 0.5);
        ps.init(db);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), true);
    }

    // Test a search with a looser range restriction
    {
        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist);
        ps.init(db);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
        TEST_EQUAL(ps.get_docid(), 2);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), true);
    }

    // Test a search with a looser range restriction, but not enough to return
    // the next document.
    {
        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 1.5);
        ps.init(db);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
        TEST_EQUAL(ps.get_docid(), 2);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), true);
    }

    // Test a search with a loose enough range restriction that all docs should
    // be returned.
    {
        Xapian::LatLongDistancePostingSource ps(0, centre, metric, coorddist * 2.5);
        ps.init(db);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1.0);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist));
        TEST_EQUAL(ps.get_docid(), 2);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), false);
        TEST_EQUAL_DOUBLE(ps.get_weight(), 1000.0 / (1000.0 + coorddist * 2));
        TEST_EQUAL(ps.get_docid(), 3);

        ps.next(0.0);
        TEST_EQUAL(ps.at_end(), true);
    }

    return true;
}

// Test various methods of LatLongCoord and LatLongCoords
DEFINE_TESTCASE(latlongcoords1, !backend) {
    LatLongCoord c1(0, 0);
    LatLongCoord c2(1, 0);
    LatLongCoord c3(1, 0);

    // Test comparison
    TEST_NOT_EQUAL(c1.get_description(), c2.get_description());
    TEST(c1 < c2 || c2 < c1);
    TEST_EQUAL(c2.get_description(), c3.get_description());
    TEST(!(c2 < c3) && !(c3 < c2));

    // Test serialisation
    std::string s1 = c1.serialise();
    LatLongCoord c4 = LatLongCoord::unserialise(s1);
    TEST(!(c1 < c4 || c4 < c1));
    const char * ptr = s1.data();
    const char * end = ptr + s1.size();
    c4 = LatLongCoord::unserialise(&ptr, end);
    TEST_EQUAL(c1.get_description(), c4.get_description());
    TEST_EQUAL(c1.get_description(), "Xapian::LatLongCoord(0, 0)");
197 TEST_EQUAL(ptr, end); 198 199 // Test building a set of LatLongCoords 200 LatLongCoords g1(c1); 201 TEST(!g1.empty()); 202 TEST_EQUAL(g1.size(), 1); 203 TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))"); 204 g1.insert(c2); 205 TEST_EQUAL(g1.size(), 2); 206 TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))"); 207 // c3 == c2, so already in the set, so no change if we add c3 208 g1.insert(c3); 209 TEST_EQUAL(g1.size(), 2); 210 TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))"); 211 TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0), (1, 0))"); 212 g1.erase(c3); 213 TEST_EQUAL(g1.get_description(), "Xapian::LatLongCoords((0, 0))"); 214 215 // Test building an empty LatLongCoords 216 LatLongCoords g2; 217 TEST(g2.empty()); 218 TEST_EQUAL(g2.size(), 0); 219 TEST_EQUAL(g2.get_description(), "Xapian::LatLongCoords()"); 220 221 return true; 222 } 223 224 // Test various methods of LatLongMetric 225 DEFINE_TESTCASE(latlongmetric1, !backend) { 226 LatLongCoord c1(0, 0); 227 LatLongCoord c2(1, 0); 228 Xapian::GreatCircleMetric m1; 229 double d1 = m1(c1, c2); 230 TEST_REL(d1, >, 111226.0); 231 TEST_REL(d1, <, 111227.0); 232 233 // Let's make another metric, this time using roughly the radius of Mars, so 234 // distances should be quite a bit smaller. 235 Xapian::GreatCircleMetric m2(3310000); 236 double d2 = m2(c1, c2); 237 TEST_REL(d2, >, 57770.0); 238 TEST_REL(d2, <, 57771.0); 239 240 // Check serialise and unserialise. 241 Xapian::Registry registry; 242 std::string s1 = m2.serialise(); 243 const Xapian::LatLongMetric * m3; 244 m3 = registry.get_lat_long_metric(m2.name()); 245 TEST(m3 != NULL); 246 m3 = m3->unserialise(s1); 247 double d3 = (*m3)(c1, c2); 248 TEST_EQUAL_DOUBLE(d2, d3); 249 250 delete m3; 251 252 return true; 253 } 254 255 // Test a LatLongDistanceKeyMaker directly. 
256 DEFINE_TESTCASE(latlongkeymaker1, !backend) { 257 Xapian::GreatCircleMetric m1(3310000); 258 LatLongCoord c1(0, 0); 259 LatLongCoord c2(1, 0); 260 LatLongCoord c3(2, 0); 261 LatLongCoord c4(3, 0); 262 263 LatLongCoords g1(c1); 264 g1.insert(c2); 265 266 LatLongDistanceKeyMaker keymaker(0, g1, m1); 267 Xapian::Document doc1; 268 doc1.add_value(0, g1.serialise()); 269 Xapian::Document doc2; 270 doc2.add_value(0, c3.serialise()); 271 Xapian::Document doc3; 272 doc3.add_value(0, c4.serialise()); 273 Xapian::Document doc4; 274 275 std::string k1 = keymaker(doc1); 276 std::string k2 = keymaker(doc2); 277 std::string k3 = keymaker(doc3); 278 std::string k4 = keymaker(doc4); 279 TEST_REL(k1, <, k2); 280 TEST_REL(k2, <, k3); 281 TEST_REL(k3, <, k4); 282 283 LatLongDistanceKeyMaker keymaker2(0, g1, m1, 0); 284 std::string k3b = keymaker2(doc3); 285 std::string k4b = keymaker2(doc4); 286 TEST_EQUAL(k3, k3b); 287 TEST_REL(k3b, >, k4b); 288 289 return true; 290 } -
xapian-core/tests/Makefile.am
Property changes on: xapian-core/tests/api_geospatial.cc ___________________________________________________________________ Added: svn:eol-style + native
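The expected weights in the posting source tests above (1.0 at the centre, then 1000.0 / (1000.0 + coorddist), and so on) follow from the weighting formula documented in geospatial.h, k1 * (distance + k1) ** (- k2). A minimal stand-alone sketch of that formula is given here for illustration only; the function name distance_weight is hypothetical and is not part of the patch, and the defaults simply mirror the documented ones (k1 = 1000, k2 = 1).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the patch: evaluates the documented
// weighting formula  k1 * (distance + k1) ** (-k2).
inline double distance_weight(double distance_metres,
                              double k1 = 1000.0,
                              double k2 = 1.0)
{
    // The header comment requires both constants to be strictly positive.
    assert(k1 > 0.0 && k2 > 0.0);
    return k1 * std::pow(distance_metres + k1, -k2);
}
```

With the default constants this gives 1.0 at the centre, 0.5 at 1km and 0.25 at 3km. Note that for k2 other than 1 the weight at the centre is k1 ** (1 - k2) rather than 1.0, which may matter when combining this weight with other query weights.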
155 155 api_unicode.cc \ 156 156 api_valuestats.cc \ 157 157 api_valuestream.cc \ 158 api_geospatial.cc \ 158 159 api_wrdb.cc 159 160 160 161 apitest_SOURCES = apitest.cc dbcheck.cc $(collated_apitest_sources) \ -
xapian-core/include/xapian/postingsource.h
31 31 32 32 namespace Xapian { 33 33 34 class Registry; 35 34 36 /** Base class which provides an "external" source of postings. 35 37 * 36 38 * Warning: the PostingSource interface is currently experimental, and is … … 285 287 */ 286 288 virtual PostingSource * unserialise(const std::string &s) const; 287 289 290 /** Create object given string serialisation returned by serialise(). 291 * 292 * Note that the returned object will be deallocated by Xapian after use 293 * with "delete". It must therefore have been allocated with "new". 294 * 295 * This method is supplied with a Registry object, which can be used when 296 * unserialising objects contained within the posting source. The default 297 * implementation simply calls unserialise() which doesn't take the 298 * Registry object, so you do not need to implement this method unless you 299 * want to take advantage of the Registry object when unserialising. 300 * 301 * @param s A serialised instance of this PostingSource subclass. 302 */ 303 virtual PostingSource * unserialise_with_registry(const std::string &s, 304 const Registry & registry) const; 305 288 306 /** Set this PostingSource to the start of the list of postings. 289 307 * 290 308 * This is called automatically by the matcher prior to each query being -
xapian-core/include/xapian/registry.h
30 30 namespace Xapian { 31 31 32 32 // Forward declarations. 33 class LatLongMetric; 33 34 class MatchSpy; 34 35 class PostingSource; 35 36 class Weight; … … 105 106 */ 106 107 const Xapian::MatchSpy * 107 108 get_match_spy(const std::string & name) const; 109 110 /// Register a user-defined lat-long metric class. 111 void register_lat_long_metric(const Xapian::LatLongMetric &metric); 112 113 /** Get a lat-long metric given a name. 114 * 115 * The returned metric is owned by the registry object. 116 * 117 * Returns NULL if the metric could not be found. 118 */ 119 const Xapian::LatLongMetric * 120 get_lat_long_metric(const std::string & name) const; 121 108 122 }; 109 123 110 124 } -
xapian-core/include/xapian/geospatial.h
1 /** @file geospatial.h 2 * @brief Geospatial search support routines. 3 */ 4 /* Copyright 2008,2009 Lemur Consulting Ltd 5 * Copyright 2010 Richard Boulton 6 * 7 * This program is free software; you can redistribute it and/or 8 * modify it under the terms of the GNU General Public License as 9 * published by the Free Software Foundation; either version 2 of the 10 * License, or (at your option) any later version. 11 * 12 * This program is distributed in the hope that it will be useful, 13 * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 * GNU General Public License for more details. 16 * 17 * You should have received a copy of the GNU General Public License 18 * along with this program; if not, write to the Free Software 19 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 20 * USA 21 */ 22 23 #ifndef XAPIAN_INCLUDED_GEOSPATIAL_H 24 #define XAPIAN_INCLUDED_GEOSPATIAL_H 25 26 #include <xapian/enquire.h> 27 #include <xapian/postingsource.h> 28 #include <xapian/queryparser.h> // For sortable_serialise 29 #include <xapian/keymaker.h> 30 #include <xapian/visibility.h> 31 #include <string> 32 #include <set> 33 34 namespace Xapian { 35 36 class Registry; 37 38 /** Convert from miles to metres. 39 */ 40 inline XAPIAN_VISIBILITY_DEFAULT double 41 miles_to_metres(double miles) 42 { 43 return 1609.344 * miles; 44 } 45 46 /** Convert from metres to miles. 47 */ 48 inline XAPIAN_VISIBILITY_DEFAULT double 49 metres_to_miles(double metres) 50 { 51 return metres * (1.0 / 1609.344); 52 } 53 54 /** A latitude-longitude coordinate. 55 * 56 * Note that latitude-longitude coordinates are only precisely meaningful if 57 * the datum used to define them is specified. This class ignores this 58 * issue - it is up to the caller to ensure that the datum used for each 59 * coordinate in a system is consistent. 
60 */ 61 struct XAPIAN_VISIBILITY_DEFAULT LatLongCoord { 62 public: 63 /** A latitude, as decimal degrees. 64 * 65 * Should be in the range -90 <= latitude <= 90 66 * 67 * Positive latitudes represent the northern hemisphere. 68 */ 69 double latitude; 70 71 /** A longitude, as decimal degrees. 72 * 73 * Should be in the range -180 < longitude <= 180 74 * 75 * Positive longitudes represent the eastern hemisphere. 76 */ 77 double longitude; 78 79 /** Construct a coordinate. 80 * 81 * If the supplied latitude is out of range, an exception will be raised. 82 * 83 * If the supplied longitude is out of range, it will be normalised to the 84 * appropriate range. 85 */ 86 LatLongCoord(double latitude_, double longitude_); 87 88 /** Construct a coordinate by unserialising a string. 89 * 90 * @param serialised the string to unserialise the coordinate from. 91 * 92 * @exception Xapian::InvalidArgumentError if the string does not contain 93 * a valid serialised latitude-longitude pair, or contains extra data at 94 * the end of it. 95 */ 96 static LatLongCoord unserialise(const std::string & serialised); 97 98 /** Construct a coordinate by unserialising a string. 99 * 100 * The string may contain further data after that for the coordinate. 101 * 102 * @param ptr A pointer to the start of the string. This will be updated 103 * to point to the end of the data representing the coordinate. 104 * @param end A pointer to the end of the string. 105 * 106 * @exception Xapian::InvalidArgumentError if the string does not contain 107 * a valid serialised latitude-longitude pair. 108 */ 109 static LatLongCoord unserialise(const char ** ptr, const char * end); 110 111 /** Return a serialised representation of the coordinate. 112 */ 113 std::string serialise() const; 114 115 /** Compare with another LatLongCoord. 
116 */ 117 bool operator<(const LatLongCoord & other) const 118 { 119 if (latitude != other.latitude) return latitude < other.latitude; 120 return (longitude < other.longitude); 121 } 122 123 /// Return a string describing this object. 124 std::string get_description() const; 125 }; 126 127 /** A set of latitude-longitude coordinates. 128 */ 129 class XAPIAN_VISIBILITY_DEFAULT LatLongCoords { 130 /// The coordinates. 131 std::set<LatLongCoord> coords; 132 133 public: 134 std::set<LatLongCoord>::const_iterator begin() const 135 { 136 return coords.begin(); 137 } 138 139 std::set<LatLongCoord>::const_iterator end() const 140 { 141 return coords.end(); 142 } 143 144 size_t size() const 145 { 146 return coords.size(); 147 } 148 149 bool empty() const 150 { 151 return coords.empty(); 152 } 153 154 void insert(const LatLongCoord & coord) 155 { 156 coords.insert(coord); 157 } 158 159 void erase(const LatLongCoord & coord) 160 { 161 coords.erase(coord); 162 } 163 164 /// Construct an empty set of coordinates. 165 LatLongCoords() : coords() {} 166 167 /// Construct a set of coordinates containing one coordinate. 168 LatLongCoords(const LatLongCoord & coord) : coords() 169 { 170 coords.insert(coord); 171 } 172 173 /** Construct a set of coordinates by unserialising a string. 174 * 175 * @param serialised the string to unserialise the coordinates from. 176 * 177 * @exception Xapian::InvalidArgumentError if the string does not contain 178 * a valid serialised latitude-longitude pair, or contains junk at the end 179 * of it. 180 */ 181 static LatLongCoords unserialise(const std::string & serialised); 182 183 /** Construct a set of coordinates by unserialising a string. 184 * 185 * The string may NOT contain further data after the coordinates (the 186 * representation of the list of coordinates is not self-terminating). 187 * 188 * @param ptr A pointer to the start of the string. 189 * @param end A pointer to the end of the string. 
190 * 191 * @exception Xapian::InvalidArgumentError if the string does not contain 192 * a valid serialised latitude-longitude pair, or contains junk at the end 193 * of it. 194 */ 195 static LatLongCoords unserialise(const char * ptr, const char * end); 196 197 /** Return a serialised form of the coordinate list. 198 */ 199 std::string serialise() const; 200 201 /// Return a string describing this object. 202 std::string get_description() const; 203 }; 204 205 /** Base class for calculating distances between two lat/long coordinates. 206 */ 207 class XAPIAN_VISIBILITY_DEFAULT LatLongMetric { 208 public: 209 /// Destructor. 210 virtual ~LatLongMetric(); 211 212 /** Return the distance between two coordinates, in metres. 213 */ 214 virtual double operator()(const LatLongCoord & a, const LatLongCoord &b) const = 0; 215 216 /** Return the distance between two coordinate lists, in metres. 217 * 218 * The distance between the coordinate lists is defined to be the minimum 219 * pairwise distance between coordinates in the lists. 220 * 221 * If either of the lists is empty, an InvalidArgumentError will be raised. 222 */ 223 double operator()(const LatLongCoords & a, const LatLongCoords &b) const; 224 225 /** Clone the metric. */ 226 virtual LatLongMetric * clone() const = 0; 227 228 /** Return the full name of the metric. 229 * 230 * This is used when serialising and unserialising metrics; for example, 231 * for performing remote searches. 232 * 233 * If the subclass is in a C++ namespace, the namespace should be included 234 * in the name, using "::" as a separator. For example, for a 235 * LatLongMetric subclass called "FooLatLongMetric" in the "Xapian" 236 * namespace the result of this call should be "Xapian::FooLatLongMetric". 237 */ 238 virtual std::string name() const = 0; 239 240 /** Serialise object parameters into a string. 241 * 242 * The serialised parameters should represent the configuration of the 243 * metric. 
244 */ 245 virtual std::string serialise() const = 0; 246 247 /** Create object given string serialisation returned by serialise(). 248 * 249 * @param s A serialised instance of this LatLongMetric subclass. 250 */ 251 virtual LatLongMetric * unserialise(const std::string & s) const = 0; 252 }; 253 254 /** Calculate the great-circle distance between two coordinates on a sphere. 255 * 256 * This uses the haversine formula to calculate the distance. Note that this 257 * formula loses accuracy to rounding error for pairs of coordinates on nearly 258 * opposite sides of the sphere. 259 * 260 * See http://en.wikipedia.org/wiki/Haversine_formula 261 */ 262 class XAPIAN_VISIBILITY_DEFAULT GreatCircleMetric : public LatLongMetric { 263 /** The radius of the sphere in metres. 264 */ 265 double radius; 266 267 public: 268 /** Construct a GreatCircleMetric. 269 * 270 * The (quadratic mean) radius of the Earth will be used by this 271 * calculator. 272 */ 273 GreatCircleMetric(); 274 275 /** Construct a GreatCircleMetric using a specified radius. 276 * 277 * @param radius_ The radius to use, in metres. 278 */ 279 GreatCircleMetric(double radius_); 280 281 /** Return the great-circle distance between points on the sphere. 282 */ 283 double operator()(const LatLongCoord & a, const LatLongCoord &b) const; 284 285 LatLongMetric * clone() const; 286 std::string name() const; 287 std::string serialise() const; 288 LatLongMetric * unserialise(const std::string & s) const; 289 }; 290 291 /** Posting source which returns a weight based on geospatial distance. 292 * 293 * Results are weighted by the distance from a fixed point, or list of points, 294 * calculated according to the metric supplied. If multiple points are 295 * supplied (either in the constructor, or in the coordinates stored in a 296 * document), the smallest pairwise distance is used. 
297 * 298 * Documents further away than a specified maximum range (or with no location 299 * stored in the specified slot) will not be returned. 300 * 301 * The weight returned will be computed from the distance using the formula: 302 * k1 * (distance + k1) ** (- k2) 303 * 304 * (Where k1 and k2 are strictly positive floating-point constants, which 305 * default to 1000 and 1 respectively. Distance is measured in metres, so 306 * with the defaults a document at the centre gets a weight of 1.0, one 1km 307 * away gets a weight of 0.5, one 3km away gets a weight of 0.25, 308 * and so on.) 309 */ 310 class XAPIAN_VISIBILITY_DEFAULT LatLongDistancePostingSource : public ValuePostingSource 311 { 312 /// Current distance from centre. 313 double dist; 314 315 /// Centre, to compute distance from. 316 LatLongCoords centre; 317 318 /// Metric to compute the distance with. 319 const LatLongMetric * metric; 320 321 /// Maximum range to allow. If set to 0, there is no maximum range. 322 double max_range; 323 324 /// Constant used in weighting function. 325 double k1; 326 327 /// Constant used in weighting function. 328 double k2; 329 330 /** Calculate the distance for the current document. 331 * 332 * Returns true if the distance was calculated ok, or false if the 333 * document didn't contain a valid serialised set of coordinates in the 334 * appropriate value slot. 335 */ 336 void calc_distance(); 337 338 /// Internal constructor; used by clone() and unserialise_with_registry(). 339 LatLongDistancePostingSource(Xapian::valueno slot_, 340 const LatLongCoords & centre_, 341 const LatLongMetric * metric_, 342 double max_range_, 343 double k1_, 344 double k2_); 345 346 public: 347 /** Construct a new posting source which returns only documents within 348 * range of one of the central coordinates. 349 * 350 * @param db_ The database to read values from. 351 * @param slot_ The value slot to read values from. 352 * @param centre_ The centre point to use for distance calculations. 
353 * @param metric_ The metric to use for distance calculations. 354 * @param max_range_ The maximum distance for documents which are returned (0 means no maximum). 355 * @param k1_ The k1 constant to use in the weighting function. 356 * @param k2_ The k2 constant to use in the weighting function. 357 */ 358 LatLongDistancePostingSource(Xapian::valueno slot_, 359 const LatLongCoords & centre_, 360 const LatLongMetric & metric_, 361 double max_range_ = 0.0, 362 double k1_ = 1000.0, 363 double k2_ = 1.0); 364 ~LatLongDistancePostingSource(); 365 366 void next(Xapian::weight min_wt); 367 void skip_to(Xapian::docid min_docid, Xapian::weight min_wt); 368 bool check(Xapian::docid min_docid, Xapian::weight min_wt); 369 370 Xapian::weight get_weight() const; 371 LatLongDistancePostingSource * clone() const; 372 std::string name() const; 373 std::string serialise() const; 374 LatLongDistancePostingSource * 375 unserialise_with_registry(const std::string &s, 376 const Registry & registry) const; 377 void init(const Database & db_); 378 379 std::string get_description() const; 380 }; 381 382 /** KeyMaker subclass which sorts by distance from a latitude/longitude. 383 * 384 * Results are ordered by the distance from a fixed point, or list of points, 385 * calculated according to the metric supplied. If multiple points are 386 * supplied (either in the constructor, or in the coordinates stored in a 387 * document), the smallest pairwise distance is used. 388 * 389 * If a document contains no value in the specified slot, the key for the default distance (defdistance) is returned. 390 */ 391 class XAPIAN_VISIBILITY_DEFAULT LatLongDistanceKeyMaker : public KeyMaker { 392 393 /// The value slot to read. 394 Xapian::valueno valno; 395 396 /// The centre point (or points) for distance calculation. 397 LatLongCoords centre; 398 399 /// The metric to use when calculating distances. 400 const LatLongMetric * metric; 401 402 /// The default key to return, for documents with no value stored. 
403 std::string defkey; 404 405 public: 406 LatLongDistanceKeyMaker(Xapian::valueno valno_, 407 const LatLongCoords & centre_, 408 const LatLongMetric & metric_, 409 double defdistance = 10E10) 410 : valno(valno_), 411 centre(centre_), 412 metric(metric_.clone()), 413 defkey(sortable_serialise(defdistance)) 414 {} 415 416 LatLongDistanceKeyMaker(Xapian::valueno valno_, 417 const LatLongCoord & centre_, 418 const LatLongMetric & metric_, 419 double defdistance = 10E10) 420 : valno(valno_), 421 centre(), 422 metric(metric_.clone()), 423 defkey(sortable_serialise(defdistance)) 424 { 425 centre.insert(centre_); 426 } 427 428 ~LatLongDistanceKeyMaker(); 429 430 std::string operator()(const Xapian::Document & doc) const; 431 }; 432 433 } 434 435 #endif /* XAPIAN_INCLUDED_GEOSPATIAL_H */ -
xapian-core/include/Makefile.mk
Property changes on: xapian-core/include/xapian/geospatial.h ___________________________________________________________________ Added: svn:eol-style + native
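The GreatCircleMetric documentation in geospatial.h says the distance is computed with the haversine formula on a sphere of a given radius. The following is a minimal stand-alone sketch of that formula, offered as an illustration only: the function name haversine_metres and its signature are hypothetical, and the implementation in the patch (not shown in this part of the diff) may differ in detail.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-alone function, not the patch's implementation.
// Takes latitudes/longitudes in decimal degrees and a sphere radius in
// metres; returns the great-circle distance in metres.
inline double haversine_metres(double lat1_deg, double lon1_deg,
                               double lat2_deg, double lon2_deg,
                               double radius_metres)
{
    assert(radius_metres > 0.0);
    const double deg_to_rad = std::acos(-1.0) / 180.0;
    const double lat1 = lat1_deg * deg_to_rad;
    const double lat2 = lat2_deg * deg_to_rad;
    const double sin_dlat = std::sin((lat2 - lat1) / 2.0);
    const double sin_dlon = std::sin((lon2_deg - lon1_deg) * deg_to_rad / 2.0);
    // h is the haversine of the central angle between the two points.
    double h = sin_dlat * sin_dlat
             + std::cos(lat1) * std::cos(lat2) * sin_dlon * sin_dlon;
    // Rounding can push h slightly above 1 for near-antipodal points.
    if (h > 1.0) h = 1.0;
    return 2.0 * radius_metres * std::asin(std::sqrt(h));
}
```

Called with the coordinates (0, 0) and (1, 0) and the radius 3310000 used in the latlongmetric1 test, this sketch gives a distance between 57770 and 57771 metres, matching the bounds that test checks.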
34 34 include/xapian/unicode.h\ 35 35 include/xapian/valueiterator.h\ 36 36 include/xapian/valuesetmatchdecider.h\ 37 include/xapian/geospatial.h\ 37 38 include/xapian/visibility.h\ 38 39 include/xapian/weight.h 39 40 -
xapian-core/include/xapian.h
62 62 // Unicode support 63 63 #include <xapian/unicode.h> 64 64 65 // Geospatial 66 #include <xapian/geospatial.h> 67 65 68 // ELF visibility annotations for GCC. 66 69 #include <xapian/visibility.h> 67 70 -
xapian-core/common/registryinternal.h
32 32 class Weight; 33 33 class PostingSource; 34 34 class MatchSpy; 35 class LatLongMetric; 35 36 } 36 37 37 38 class Xapian::Registry::Internal : public Xapian::Internal::RefCntBase { … … 46 47 /// Registered match spies. 47 48 std::map<std::string, Xapian::MatchSpy *> matchspies; 48 49 50 /// Registered lat-long metrics. 51 std::map<std::string, Xapian::LatLongMetric *> lat_long_metrics; 52 49 53 /// Add the standard subclasses provided in the API. 50 54 void add_defaults(); 51 55 … … 58 62 /// Clear all registered match spies. 59 63 void clear_match_spies(); 60 64 65 /// Clear all registered lat-long metrics. 66 void clear_lat_long_metrics(); 67 61 68 public: 62 69 Internal(); 63 70 ~Internal(); -
xapian-core/common/output.h
6 6 * Copyright 2002 Ananova Ltd 7 7 * Copyright 2002,2003,2004,2007,2009 Olly Betts 8 8 * Copyright 2007 Lemur Consulting Ltd 9 * Copyright 2010 Richard Boulton 9 10 * 10 11 * This program is free software; you can redistribute it and/or 11 12 * modify it under the terms of the GNU General Public License as … … 62 63 XAPIAN_OUTPUT_FUNCTION(Xapian::ESet) 63 64 XAPIAN_OUTPUT_FUNCTION(Xapian::Enquire) 64 65 66 #include <xapian/geospatial.h> 67 XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoord) 68 XAPIAN_OUTPUT_FUNCTION(Xapian::LatLongCoords) 69 65 70 #include <xapian/stem.h> 66 71 XAPIAN_OUTPUT_FUNCTION(Xapian::Stem) 67 72 -
xapian-core/api/postingsource.cc
102 102 throw Xapian::UnimplementedError("unserialise() not supported for this PostingSource"); 103 103 } 104 104 105 PostingSource * 106 PostingSource::unserialise_with_registry(const std::string &s, 107 const Registry &) const 108 { 109 return unserialise(s); 110 } 111 105 112 string 106 113 PostingSource::get_description() const 107 114 { -
xapian-core/api/registry.cc
24 24 #include "xapian/registry.h" 25 25 26 26 #include "xapian/error.h" 27 #include "xapian/geospatial.h" 27 28 #include "xapian/matchspy.h" 28 29 #include "xapian/postingsource.h" 29 30 #include "xapian/weight.h" … … 154 155 RETURN(lookup_object(internal->matchspies, name)); 155 156 } 156 157 158 void 159 Registry::register_lat_long_metric(const Xapian::LatLongMetric &metric) 160 { 161 LOGCALL_VOID(API, "Xapian::Registry::register_lat_long_metric", metric.name()); 162 register_object(internal->lat_long_metrics, metric); 163 } 157 164 165 const Xapian::LatLongMetric * 166 Registry::get_lat_long_metric(const string & name) const 167 { 168 LOGCALL(API, const Xapian::LatLongMetric *, "Xapian::Registry::get_lat_long_metric", name); 169 RETURN(lookup_object(internal->lat_long_metrics, name)); 170 } 171 158 172 Registry::Internal::Internal() 159 173 : Xapian::Internal::RefCntBase(), 160 174 wtschemes(), 161 postingsources() 175 postingsources(), 176 lat_long_metrics() 162 177 { 163 178 add_defaults(); 164 179 } … … 168 183 clear_weighting_schemes(); 169 184 clear_posting_sources(); 170 185 clear_match_spies(); 186 clear_lat_long_metrics(); 171 187 } 172 188 173 189 void … … 190 206 postingsources[source->name()] = source; 191 207 source = new Xapian::FixedWeightPostingSource(0.0); 192 208 postingsources[source->name()] = source; 209 source = new Xapian::LatLongDistancePostingSource(0, 210 Xapian::LatLongCoords(), 211 Xapian::GreatCircleMetric()); 212 postingsources[source->name()] = source; 193 213 194 214 Xapian::MatchSpy * spy; 195 215 spy = new Xapian::ValueCountMatchSpy(); 196 216 matchspies[spy->name()] = spy; 217 218 Xapian::LatLongMetric * metric; 219 metric = new Xapian::GreatCircleMetric(); 220 lat_long_metrics[metric->name()] = metric; 197 221 } 198 222 199 223 void … … 223 247 } 224 248 } 225 249 250 void 251 Registry::Internal::clear_lat_long_metrics() 252 { 253 map<string, Xapian::LatLongMetric *>::const_iterator i; 254 for (i = lat_long_metrics.begin(); i != 
lat_long_metrics.end(); ++i) { 255 delete i->second; 256 } 226 257 } 258 259 } -
xapian-core/Makefile.am
140 140 include common/Makefile.mk 141 141 include examples/Makefile.mk 142 142 include expand/Makefile.mk 143 include geospatial/Makefile.mk 143 144 include include/Makefile.mk 144 145 include languages/Makefile.mk 145 146 include matcher/Makefile.mk -
xapian-bindings/xapian.i
790 790 791 791 %include <xapian/valuesetmatchdecider.h> 792 792 793 %ignore Xapian::LatLongCoord::operator< const; 794 %include <xapian/geospatial.h> 795 793 796 namespace Xapian { 794 797 795 798 #if defined SWIGPYTHON