What does “Rows Removed by Index Recheck” without “lossy” mean?

Posted on

Question :

Here’s part of an EXPLAIN ANALYZE from Postgres 9.6:

->  Bitmap Heap Scan on cities  (cost=90.05..806.49 rows=265 width=4) (actual time=4.733..45.772 rows=17 loops=1)
       Recheck Cond: (regexp_replace(regexp_replace(replace(replace(replace(replace(lower(name), 'ä'::text, 'ae'::text), 'ö'::text, 'oe'::text), 'ü'::text, 'ue'::text), 'ß'::text, 'ss'::text), 'strasseM'::text, 'strasse'::text, 'g'::text), 'W'::text, ''::text, 'g'::text) % 'coerde'::text)
      Rows Removed by Index Recheck: 567
      Heap Blocks: exact=497
      ->  Bitmap Index Scan on city_lookup_index  (cost=0.00..89.98 rows=265 width=0) (actual time=4.229..4.229 rows=584 loops=1)
             Index Cond: (regexp_replace(regexp_replace(replace(replace(replace(replace(lower(name), 'ä'::text, 'ae'::text), 'ö'::text, 'oe'::text), 'ü'::text, 'ue'::text), 'ß'::text, 'ss'::text), 'strasseM'::text, 'strasse'::text, 'g'::text), 'W'::text, ''::text, 'g'::text) % 'coerde'::text)

What does Rows Removed by Index Recheck mean?

Or in other words: The bitmap is non-lossy (only exact pages), so why did it contain tuple pointers that were (presumably) removed by the recheck?

Answer :

Some index strategies don’t definitely conclude that a tuple meets the criteria. They can definitely and rapidly eliminate most tuples which provably can’t meet the criteria (which is where the performance gain comes from) but some of the ones that pass might be false positives, so need to be rechecked.

I guess you are using gin_trgm_ops, which is an example of such an indexing strategy.

Leave a Reply

Your email address will not be published. Required fields are marked *