Question :
I saw this question Bit vs. Boolean columns.
I’m asking myself the same for Postgres: does a single digit integer column occupy the same disk space of a boolean one? In big tables (~50 columns x ~50 million rows) which one perform best?
How can I find this out?
Answer :
You can find out the storage size with
SELECT typname, typlen FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen
---------+--------
bool | 1
int4 | 4
(2 rows)
However you need to take alignment into account:
SELECT typname, typlen, typalign FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen | typalign
---------+--------+----------
bool | 1 | c
int4 | 4 | i
(2 rows)
c
is “character” (1 byte), while i
is “integer”.
If you define a table like this:
CREATE TABLE a (
b boolean,
i integer
);
you will get three unused “padding” bytes between the columns, so that the integer can be stored at an address that is divisible by 4.
So the boolean
would take up 4 rather than 1 bytes of storage.
If you specify the columns the other way around, the space taken up by the data in each row would only be 5 bytes.
The table row itself (the “tuple”) does not only consist on the raw data, but there is a 23-byte “tuple header” for each row (see the documentation). There may be padding after the header so that the actual tuple data are aligned at a multiple of MAXALIGN
(typically 8).
So if you want to optimize your table to use as little storage as possible, you need to take the order of the columns in the table into account.