Question :
The problem
I’m trying to reshape a table with many columns. I’m trying to do it independently of the specific table, so I’m trying to do it for any table.
Let’s use a very simple table foo
.
CREATE TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
select * from foo;
| id| a | b | c |
|---|-----|----|-----|
| 1| ant | cat|chimp|
| 2|grape|mint|basil|
I want to transform the column a
, b
and c
to rows.
This query works (for this specific table):
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS colvalue
FROM foo;
|id|colname|colvalue|
|--|-------|--------|
| 1| a | ant |
| 1| b | cat |
| 1| c | chimp |
| 2| a | grape |
| 2| b | mint |
| 2| c | basil |
But I want to make it generic for any table with lots of columns.
What I’ve already done
To get all the columns I want to transform in rows, I can use:
SELECT column_name FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'foo' and column_name ~ '^[a-z]$';
So using the previous query, I can do the following:
WITH tablecolumns AS (SELECT array_agg( column_name ) as cols FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'foo' and column_name ~ '^[a-z]$')
select id,
unnest( tablecolumns.cols ) AS colname,
unnest( array[a, b, c] ) AS colvalue
FROM foo, tablecolumns;
But I’m not able to replace array[a, b, c]
with something dynamic. If I use:
WITH tablecolumns AS (SELECT array_agg( column_name ) as cols FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'foo' and column_name ~ '^[a-z]$')
select id,
unnest( tablecolumns.cols ) AS colname,
unnest( tablecolumns.cols ) AS colvalue
FROM foo, tablecolumns;
the result is not the values of the columns, but just the name of the columns.
Question
How can I unnest
the values of the columns?
Answer :
There is no way to have the column names being evaluated as columns, as pointed out in this question https://stackoverflow.com/questions/15800367/select-columns-with-particular-column-names-in-postgresql
The way to do it, is using to_jsonb
, as suggested in unnest all columns from a given table, like:
select foo.id, x.*
from foo, jsonb_each_text(to_jsonb(foo)) as x(colname,colvalue)
where x.colname ~ '^[a-z]$';
In my specific case, I was trying to convert all data related to Corona Virus. The original data is available at: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv. I’ve imported it as table covid
.
It has many columns named after the date, like: '1/22/20','1/23/20','1/24/20','1/25/20','1/26/20', ...
. The first columns are related with the location, so I want to keep them in the reshaped table.
The final query I’m using is:
select id, geom, "province/state" , "country/region", to_date( reportdate, 'MM/DD/YY'), nofcases
from covid, jsonb_each_text(to_jsonb(covid)) as x(reportdate,nofcases)
where x.reportdate ~ '[0-9]+'
order by 5;