Slow insert in SQL Server 2008

Posted on

Question :

Our products’ database contains an audit trail table:

CREATE TABLE gn_AuditTable(
  gn_ObjectId int NULL,
  gn_Action smallint NULL,
  gn_Time datetime NULL,
  gn_UserId int NULL,
  gn_Login varbinary(16) NULL,
  gn_ExtraObjectId int NULL,
  gn_ExtraInfo int NULL
) ON [PRIMARY]

We add rows to this table with a simple insert statement – no stored procedures:

insert into gn_AuditTable (
  gn_ObjectId,gn_Time,gn_UserId,gn_Login,gn_Action,gn_ExtraObjectId,gn_ExtraInfo
) values (
  ?,?,?,?,?,?,?
)

passing the values of the columns as parameters to the query; there are also no triggers.

The problem is that sometimes this insert is super-slow – taking in some cases more than 30 seconds and timing-out.

We have multiple installations of the same product, and we are experienceing the problem in some – but not all – of them.

The audit table can be quite big – up to some million of rows. The very slow insert happens with databases that have big tables (quite naturally) – but we have other databases with simliar number of rows that run fine, and even the databases experiencing the problem have it intermittently, in most cases the insert is reasonably fast (sub-second).

The table has 6 non-clustered, non-unique indexes:

  • gn_Login;

  • gn_ObjectId;

  • gn_Action and gn_ExtraInfo;

  • gn_ExtraObjectId;

  • gn_Time;

  • gn_UserId.

Having all these indexes could cause the problem, but again, we have the same indexes everywhere and the problem only somewhere / sometimes.

We had the insert timing-out at least once on a test database with only one user, so the problem does not appear to be related to the overall load but to the insert and the table themselves.

We used to have a clustered index on gn_ObjectId, but we got rid of it as an attempt to solve this problem. gn_ObjectId is not sequential and so we thought that it is a poor choice for a clustered index, forcing SQL Server to re-order rows on inserts. Probably gn_Time – that is always increasing – would be a better choice.

There is no primary key because there is no column or combination of columns that is ‘naturally’ unique – we would have to add an id column, and it seems just overhead.

Does anyone have any suggestion at what we could look at to diagnose the problem? What is an insert doing that can take so much time – intermittently?

Answer :

Why no clustered index? Why no primary key?

This is most likely your problem: you don’t have any order to the table (in the sense of, say, an IDENTITY column) this you are inserting into a heap

See

Edit, after update.

Having a non-unique clustered index requires a uniquifier (DBA.se links) per duplicate gn_ObjectId entry. This is not a good clustering key, which is best when it is

  • numeric
  • monotonically increasing
  • narrow

I also assume you haven’t tried a IDENTITY column then…? Try it, report back..

Edit 2, as @JNK says, an ID column is not an overhead in this situation because of how your writes will be managed on disk

Is your database large enough? Are you sure your INSERT don’t trigger auto-growth? If your deployements don’t have Instant File Initialization enabled then this is exactly the behavior one would expect when a database file growth is triggered: random blocking of writes for the duration of file growth and initialization. You could also be experiencing log file growth (this at least can eb quickly identified by looking at the Log Growths performance counter) but for log growths to occur you need to have a non-simple recovery model and a backup strategy that is basically absent (ie. log truncation does not occur).

As for the design of your table: time series are usually clustered by the time value, because most queries cover time ranges (eg. ‘select all events in the past 3 days’) so that would make gn_Time your likely non-unique clustered key candidate.

We found the problem:

  • there is a maintenance procedure that removes data from the database, including rows from gn_AuditTable;

  • the delete gn_AuditTable . . is in a transaction together with other statements (bad)

  • it can happen that quite a lot of rows are deleted in one go – in such a case SQL Server locks the entire table – and then insert gn_AuditTable time-out waiting for the transaction to finish

We discovered this because we were able to have a look at the locks when we got a time-out, and we noticed the table lock. We are moving the delete gn_AuditTable . . . out of the transaction, and possible split it into smaller batches, not to trigger a table lock.

We are implementing a clustered index on gn_Time and the Instant File Initialization – both are beneficial independently from this specific problem – thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *