I have a large Postgres table contain geographic positions (latitude and longitude) both fields are indexed, and both are defined as
If I run a query looking for an exact position match, something like this:
WHERE latitude = 1.234567890123456789 AND longitude = 9.876543210987654321
Then get a very fast response, but I get very few results because the database is searching for a very precise match.
For my purposes, I’m looking for positions that match to within a few meters so a match to 4 or 5 decimal places should be fine. This gives me the results I’m looking for:
WHERE ABS(latitude - 1.234567890123456789) < 0.0001 AND ABS(longitude - 9.876543210987654321) < 0.0001
But NOT the performance (it can take 5 minutes to run, compared to a fraction of a second for the exact search)
Next I tried rounding the precision down:
WHERE ROUND( latitude, 4) = ROUND( 1.234567890123456789, 4) AND ROUND( longitude,4) = ROUND( 9.876543210987654321, 4)
Again, same problem. Got the results I wanted, but took far too long.
So, my question is how can I search for a close match between two numbers, without losing performance?
UPDATE – SOLVED:
As a couple of commenters have observed, using
BETWEEN seems to work fine.
The smart and fast solution for this class of problems is an index-backed “nearest neighbor” search.
For the record: if you want precise results with spatial data use PostGis and operate with
geography types. Here is a starting point. And operate with
Sticking to your setup (2-D points, no PostGis), and ignoring the additional approximation error of handling spatial data in a 2-D plain, which seems negligible for the case at hand – I suggest a space-partitioned GiST index (on an expression in your case):
CREATE INDEX tbl_location_spgist_idx ON tbl USING SPGIST (point(longitude, latitude));
Why SP-Gist? See:
To get a maximum of 10 “nearest neighbors” in next to no time:
SELECT * FROM tbl ORDER BY point (longitude, latitude) <-> point '(9.876543210987654321,1.234567890123456789)' LIMIT 10;
You can then filter the ones close enough. To get a maximum of 10 closest within a maximum distance:
SELECT *, ((longitude - 9.876543210987654321) ^ 2 + (latitude - 1.234567890123456789) ^ 2) AS squared_dist FROM tbl WHERE ((longitude - 9.876543210987654321) ^ 2 + (latitude - 1.234567890123456789) ^ 2) < 0.000000001 -- or whatever ORDER BY point (longitude, latitude) <-> point '(9.876543210987654321,1.234567890123456789)' LIMIT 10;
I use a squared distance to get nearest neighbors based on simple Pythagorean theorem. The beauty of it: the calculation is only performed on the nearest neighbors, so it’s still very fast when the calculation gets more expensive – even in big tables.