Searching for a close numeric match on spatial coordinates

Posted on

Question :

I have a large Postgres table contain geographic positions (latitude and longitude) both fields are indexed, and both are defined as NUMERIC(9,6).

If I run a query looking for an exact position match, something like this:

WHERE latitude  = 1.234567890123456789
AND   longitude = 9.876543210987654321

Then get a very fast response, but I get very few results because the database is searching for a very precise match.

For my purposes, I’m looking for positions that match to within a few meters so a match to 4 or 5 decimal places should be fine. This gives me the results I’m looking for:

WHERE ABS(latitude  - 1.234567890123456789) < 0.0001
AND   ABS(longitude - 9.876543210987654321) < 0.0001

But NOT the performance (it can take 5 minutes to run, compared to a fraction of a second for the exact search)

Next I tried rounding the precision down:

WHERE ROUND( latitude, 4) = ROUND( 1.234567890123456789, 4)
AND   ROUND( longitude,4) = ROUND( 9.876543210987654321, 4)

Again, same problem. Got the results I wanted, but took far too long.

So, my question is how can I search for a close match between two numbers, without losing performance?

As a couple of commenters have observed, using BETWEEN seems to work fine.

Answer :

The smart and fast solution for this class of problems is an index-backed “nearest neighbor” search.

For the record: if you want precise results with spatial data use PostGis and operate with geometry or geography types. Here is a starting point. And operate with ST_DWithin(). Examples:

Sticking to your setup (2-D points, no PostGis), and ignoring the additional approximation error of handling spatial data in a 2-D plain, which seems negligible for the case at hand – I suggest a space-partitioned GiST index (on an expression in your case):

CREATE INDEX tbl_location_spgist_idx ON tbl USING SPGIST (point(longitude, latitude));

Why SP-Gist? See:

To get a maximum of 10 “nearest neighbors” in next to no time:

FROM   tbl
ORDER  BY point (longitude, latitude)
      <-> point '(9.876543210987654321,1.234567890123456789)'
LIMIT  10;

You can then filter the ones close enough. To get a maximum of 10 closest within a maximum distance:

SELECT *, ((longitude - 9.876543210987654321) ^ 2
        + (latitude   - 1.234567890123456789) ^ 2) AS squared_dist
FROM   tbl
WHERE  ((longitude - 9.876543210987654321) ^ 2
      + (latitude  - 1.234567890123456789) ^ 2) < 0.000000001  -- or whatever
ORDER  BY point (longitude, latitude)
      <-> point '(9.876543210987654321,1.234567890123456789)'
LIMIT  10;

db<>fiddle here

I use a squared distance to get nearest neighbors based on simple Pythagorean theorem. The beauty of it: the calculation is only performed on the nearest neighbors, so it’s still very fast when the calculation gets more expensive – even in big tables.

Leave a Reply

Your email address will not be published. Required fields are marked *