Question :
We have a large database with millions of rows of data from ship AIS pings.
We were recently formatting and importing the data from our holder, ping_raw
relation into a ping
relation.
During our last go at this our query was stopped: ERROR: invalid value "2.01" for "YYYY", SQL state: 22007
. We are trying to import a text
string (ping_raw.datetime_pos
) into a timestamp with timezone
field in ping.dt_pos
using the to_datetime()
function (in PostgreSQL 9.3).
Last thing – we searched for any values SELECT * FROM ping_raw WHERE datetime_pos LIKE '%2.0%'
, which returned 0 rows. It seems like all the text values in ping_raw
are formatted YYYY-MM-DD HH24:MI:SS
.
Why is this error coming up? How can we successfully convert text
value ping_raw.datetime_pos
into a timestamp with timezone
datatype in ping.dt_pos
?
Answer :
We figured it out.
Some of our timestamp
text
data was imported in scientific notation (i.e. 2.0160924 E+13
. This threw an error, as we didn’t include the .
in the template format in the to_datetime()
function.
Our solution was to pull the datetime from another column that contained more or less duplicate data.
Point is – the to_datetime()
function in PostgreSQL is very particular and requires data going in to match the format you define. If you’re having a similar issue run a few tests to figure out where your data isn’t matching up to what the function expects.
We also saw that Postgres MAY support a conversion from scientific notation. We couldn’t find anything in 9.3 that did, but it seems like it is on their list of things to implement.