SELECT + INSERT or INSERT alone

Posted on

Question :

I have an application where several devices send data every second to my server. On the database I have a table with the devices and another table with the data sent every second.

The first time a device sends the data the server script should register (INSERT) that device id in the device table and add the data to the data table.

My question is, what would be faster?

  1. To do a SELECT EXISTS query to determine whether an INSERT is required on the device table. If SELECT EXISTS returns false then execute an INSERT, otherwise do nothing.

  2. To execute always an INSERT statement. Considering the device id is primary key on the table, if the device already exists it would return an error, otherwise it would insert it.

Only the first time a device sends data would require an actual INSERT, after that no INSERT would ever be required.

The PostgreSQL version is 9.4.

The table of devices is defined as follows:

CREATE TABLE common.tag
(
  customer integer,
  hostname character varying(150),
  description text,
  model integer,
  configprofile integer,
  id character varying(150) NOT NULL,
  host integer,
  CONSTRAINT tag_pk PRIMARY KEY (id),
  CONSTRAINT tag_fk_client FOREIGN KEY (customer)
      REFERENCES common.customer (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT tag_fk_configprofile FOREIGN KEY (configprofile)
      REFERENCES common.configprofile (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT tag_fk_tagmodel FOREIGN KEY (model)
      REFERENCES common.tagmodel (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)

Answer :

In Postgres 9.5, the most elegant, simple, safe and fast solution is the new UPSERT implementation:

INSERT INTO device (device_id) VALUES (1234567)
ON CONFLICT (device_id) DO NOTHING;

-- now we have either inserted the device or it was there already
INSERT INTO data (device_id, data)
VALUES (1234567, 'some data);

Obviously, there must be a some kind of unique index or exclusion constraint on (device_id).
Details in the manual.

For older versions or for auto-generated serial IDs:

I always try to register devices first and then only they can send data.But in your case ,you can try some thing like below

if exists(select 1 from devices where deviceid=@deviceid)
insert into datatable
else
insert into devicestable
insert into datatable

Exists is always faster,but you may to have see if the column which we check for existence is indexed.Your second option requires error handling (along with rollback,over head)and then do an insert.So its good to use above pseudo code model

In my opinion Select exists should be good in terms of performance. As select is faster than insert in terms of performance.Select only reads data and return the result, The insert query must read that data, insert in the table, then it check for primary key and found guilty of duplicate, then it revert the changes and raise an error.

Leave a Reply

Your email address will not be published. Required fields are marked *