Would index lookup be noticeably faster with char vs varchar when all values are 36 chars

Posted on

Question :

I have a legacy schema (disclaimer!) that uses a hash-based generated id for the primary key for all tables (there are many). An example of such an id is:

922475bb-ad93-43ee-9487-d2671b886479

There is no possible hope of changing this approach, however performance with index access is poor. Setting aside the myriad of reasons this might be, there is one thing I noticed that seemed less than optimal – despite all id values in all many tables being exactly 36 characters in length, the column type is varchar(36), not char(36).

Would altering the column types to fixed length char(36) offer any significant index performance benefits, beyond the very small increase in the number of entries per index page etc?

Ie does postgres perform much faster when dealing with fixed-length types than with variable length types?

Please don’t mention the minuscule storage saving – that’s not going be matter compared with the surgery required to make the change to the columns.

Answer :

No. No gain at all. The manual explicitly states:

Tip: There is no performance difference among these three types, apart
from increased storage space when using the blank-padded type, and a
few extra CPU cycles to check the length when storing into a
length-constrained column. While character(n) has performance
advantages in some other database systems, there is no such advantage
in PostgreSQL; in fact character(n) is usually the slowest of the
three because of its additional storage costs. In most situations text
or character varying should be used instead
.

Bold emphasis mine.

char(n) is a largely outdated, useless type. Stick with varchar(n). Without need to enforce a maximum length, varchar or text are a tiny bit faster, with fewer complications.

If all strings are exactly 36 characters in length, there is no storage saving either way, not even a minuscule one. Both have exactly the same size on disk and in RAM. You can test with pg_column_size() (on an expression and on a table column).

And if all strings must have 36 characters, rather make it text with a CHECK (length(col) = 36) constraint enforcing exact length, not varchar(36) only enforcing max. length. See:

You didn’t ask for other options, but I’ll mention two:

1. COLLATION

Unless you are running your DB with the “C” collation. Collation is often overlooked and possibly expensive. Since your strings don’t seem to be meaningful in a natural language, there is probably no point in following COLLATION rules. Related:

Extensive benchmark comparing (among other) the effect of COLLATE "C" on performance:

2. UUID

Your string suspiciously looks like a UUID (32 hex digits separated by 4 delimiters in canonical way). It’s much more efficient to store UUIDs as actual uuid data type: faster in multiple ways, and only occupies 16 bytes per UUID – as opposed to 37 bytes in RAM for either char(36) or varchar(36) (stored without delimiters, just the 32 defining characters), or 33 bytes on disk. But alignment padding would result in 40 bytes either way in many cases.) COLLATION is irrelevant for the uuid data type, too.

SELECT '922475bb-ad93-43ee-9487-d2671b886479'::uuid

This may be helpful (last chapters):

See also:

Leave a Reply

Your email address will not be published.