Disk usage of integer column vs boolean column in Postgres

Posted on

Question :

I saw this question Bit vs. Boolean columns.

I’m asking myself the same for Postgres: does a single digit integer column occupy the same disk space of a boolean one? In big tables (~50 columns x ~50 million rows) which one perform best?
How can I find this out?

Answer :

You can find out the storage size with

SELECT typname, typlen FROM pg_type WHERE typname IN ('bool', 'int4');

 typname | typlen 
 bool    |      1
 int4    |      4
(2 rows)

However you need to take alignment into account:

SELECT typname, typlen, typalign FROM pg_type WHERE typname IN ('bool', 'int4');

 typname | typlen | typalign 
 bool    |      1 | c
 int4    |      4 | i
(2 rows)

c is “character” (1 byte), while i is “integer”.

If you define a table like this:

   b boolean,
   i integer

you will get three unused “padding” bytes between the columns, so that the integer can be stored at an address that is divisible by 4.

So the boolean would take up 4 rather than 1 bytes of storage.

If you specify the columns the other way around, the space taken up by the data in each row would only be 5 bytes.

The table row itself (the “tuple”) does not only consist on the raw data, but there is a 23-byte “tuple header” for each row (see the documentation). There may be padding after the header so that the actual tuple data are aligned at a multiple of MAXALIGN (typically 8).

So if you want to optimize your table to use as little storage as possible, you need to take the order of the columns in the table into account.

Leave a Reply

Your email address will not be published. Required fields are marked *