How to handle too many inserts?


Question :

I have a database which is accessed from remote locations around the world with very long delays and poor transfer speeds. I currently transfer the data inside a transaction, since either all of the data or none of it should be inserted. The values are inserted one at a time, and it takes a total of about 37 seconds for 200 inserts before the commit.

The problem is that there may be 10 sites continuously trying to insert data within a 2 minute interval. If they fail (due to concurrency errors) they will retry until they succeed. This adds up to an impossible situation as they will start growing a backlog.

I understand I can use a snapshot isolation level to allow reads in parallel, but how should I handle the parallel inserts? How do banks and other systems do it, since they should also get a backlog if they do it in a queue?

I use SQL Server, and on the remote client I use C# to talk to it.

The inserts are of 10 different types, and they are linked in pairs. The first insert of each pair is a generic one which is the same for all types and goes into the same table – let’s call it table_A.
The second insert is specific to the type of data being inserted. It goes into 1 of 10 tables, and while the data varies greatly, that table also stores the row index of the first insert in table_A. This is how they are linked.
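For context, a single linked pair presumably boils down to something like the following (a minimal sketch; the column names, the Id key on table_A and the detail table name table_B3 are made up, since the actual schema isn’t shown, and the caller opens one transaction around the whole 200-row batch):

```sql
-- Hypothetical procedure for one of the 10 insert types.
CREATE PROCEDURE dbo.InsertPair_Type3
    @SiteId  INT,
    @Payload NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;

    -- Generic insert, identical for all 10 types.
    INSERT INTO dbo.table_A (SiteId, InsertType, CreatedAt)
    VALUES (@SiteId, 3, SYSUTCDATETIME());

    -- Row index of the generic insert, used to link the pair.
    DECLARE @aId BIGINT = SCOPE_IDENTITY();

    -- Type-specific insert into 1 of the 10 detail tables,
    -- storing the table_A key so the pair stays linked.
    INSERT INTO dbo.table_B3 (table_A_Id, Payload)
    VALUES (@aId, @Payload);
END;
```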

The transfer over a good connection takes about 1 second, and that is no problem. The problem is the remote part, which takes about 40 seconds. Table_A contains about 10 million rows and is well indexed. All other tables are somewhat smaller and well indexed.

Answer :

Create a set of staging tables in the target database. Write rows to these as they are generated, which seems to be one or two at a time; this can be inside a transaction. Once the whole batch (200 rows?) is in these staging tables, use a stored procedure in the target database to move them en masse from the staging tables to the real ones. Be sure to log, somewhere, a report showing whether any records failed validation.
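A minimal sketch of that idea, with made-up staging table and column names and assuming table_A’s key is an IDENTITY column called Id (the real schema isn’t shown): the client writes the batch to the staging tables (for example with SqlBulkCopy from the C# side), then calls one procedure that moves everything in a single short transaction. Only one of the 10 detail tables is shown, and validation plus the failure report are left out.

```sql
-- Hypothetical staging tables, keyed by a client-generated batch id.
CREATE TABLE dbo.stage_A
(
    BatchId     UNIQUEIDENTIFIER NOT NULL,
    ClientRowNo INT              NOT NULL,   -- position within the batch
    SiteId      INT              NOT NULL,
    InsertType  TINYINT          NOT NULL,
    PRIMARY KEY (BatchId, ClientRowNo)
);

CREATE TABLE dbo.stage_B3
(
    BatchId     UNIQUEIDENTIFIER NOT NULL,
    ClientRowNo INT              NOT NULL,   -- links back to the stage_A row
    Payload     NVARCHAR(MAX)    NULL,
    PRIMARY KEY (BatchId, ClientRowNo)
);
GO

-- Move one staged batch into the real tables in a single short transaction.
CREATE PROCEDURE dbo.MoveStagedBatch
    @BatchId UNIQUEIDENTIFIER
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;

    -- Insert the generic rows and remember which table_A key each staged
    -- row received.  MERGE with an always-false match is used because
    -- INSERT ... OUTPUT cannot reference source columns like ClientRowNo.
    DECLARE @map TABLE (ClientRowNo INT PRIMARY KEY, AId BIGINT NOT NULL);

    MERGE dbo.table_A AS t
    USING (SELECT ClientRowNo, SiteId, InsertType
           FROM dbo.stage_A
           WHERE BatchId = @BatchId) AS s
    ON 1 = 0
    WHEN NOT MATCHED THEN
        INSERT (SiteId, InsertType, CreatedAt)
        VALUES (s.SiteId, s.InsertType, SYSUTCDATETIME())
    OUTPUT s.ClientRowNo, inserted.Id INTO @map (ClientRowNo, AId);

    -- Insert the type-specific rows, linked to their new table_A keys.
    INSERT INTO dbo.table_B3 (table_A_Id, Payload)
    SELECT m.AId, b.Payload
    FROM dbo.stage_B3 AS b
    JOIN @map         AS m ON m.ClientRowNo = b.ClientRowNo
    WHERE b.BatchId = @BatchId;

    -- Clear the staging rows for this batch.
    DELETE FROM dbo.stage_B3 WHERE BatchId = @BatchId;
    DELETE FROM dbo.stage_A  WHERE BatchId = @BatchId;

    COMMIT TRANSACTION;
END;
```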

I’ve seen similar problems addressed by decoupling the app from the central server:

  • Remote sites install a local SQL Server Express along with the app
  • Apps talk to the local Express instance. Low latency, good availability.
  • SQL Server Express uses Service Broker to deliver the updates to the central server
  • Service Broker handles the network availability, latency etc.
  • The central server uses Service Broker activation to modify the destination tables based on the updates received from the remote sites.

This way you can transparently handle pretty big backlogs when the sites have a poor network connection or no connection at all. SSB will take care of the delivery (retries etc.) while the remote apps continue to see local response times.
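On the remote Express side that setup might look roughly like this (a sketch with hypothetical message type, contract and service names; enabling Service Broker and configuring the endpoints, routes and dialog security between the two instances are separate steps and are omitted here):

```sql
-- Service Broker objects on the remote SQL Server Express instance.
CREATE MESSAGE TYPE [//Example/InsertBatch] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT     [//Example/InsertContract]
    ([//Example/InsertBatch] SENT BY INITIATOR);
CREATE QUEUE dbo.OutboundQueue;
CREATE SERVICE [//Example/RemoteService] ON QUEUE dbo.OutboundQueue;
GO

-- Queue one batch for delivery.  SSB keeps retrying until the central
-- server acknowledges receipt, however slow or flaky the link is, so the
-- local app returns immediately.
DECLARE @h UNIQUEIDENTIFIER;
DECLARE @batch XML =
    N'<batch><row type="3" site="17">sample payload</row></batch>';

BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//Example/RemoteService]
    TO SERVICE   '//Example/CentralService'
    ON CONTRACT  [//Example/InsertContract]
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @h
    MESSAGE TYPE [//Example/InsertBatch] (@batch);
```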

SSB isolates your central server data (transactions) from a slow network. There won’t be any data locks while the SSB messages are being exchanged, no matter how slow the transfer.
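On the central server the receiving side could be sketched like this (again with hypothetical names, and assuming the same XML shape as in the previous sketch): an activated procedure drains the queue and applies each batch in a short, purely local transaction, so no lock on table_A ever waits on the network.

```sql
-- The same message type and contract must also exist on the central server.
CREATE MESSAGE TYPE [//Example/InsertBatch] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT     [//Example/InsertContract]
    ([//Example/InsertBatch] SENT BY INITIATOR);
CREATE QUEUE dbo.InboundQueue;
CREATE SERVICE [//Example/CentralService]
    ON QUEUE dbo.InboundQueue ([//Example/InsertContract]);
GO

CREATE PROCEDURE dbo.ProcessInboundQueue
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @h UNIQUEIDENTIFIER, @type SYSNAME, @body XML;

    WHILE 1 = 1
    BEGIN
        BEGIN TRANSACTION;

        WAITFOR (RECEIVE TOP (1)
                     @h    = conversation_handle,
                     @type = message_type_name,
                     @body = CAST(message_body AS XML)
                 FROM dbo.InboundQueue), TIMEOUT 1000;

        IF @@ROWCOUNT = 0
        BEGIN
            COMMIT TRANSACTION;
            BREAK;
        END;

        IF @type = N'//Example/InsertBatch'
        BEGIN
            -- Shred the batch into table_A; the matching inserts into the
            -- 10 type-specific tables would follow the same pattern.
            INSERT INTO dbo.table_A (SiteId, InsertType, CreatedAt)
            SELECT r.value('@site', 'int'),
                   r.value('@type', 'tinyint'),
                   SYSUTCDATETIME()
            FROM @body.nodes('/batch/row') AS x(r);

            END CONVERSATION @h;
        END
        ELSE IF @type IN (N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog',
                          N'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
            END CONVERSATION @h;

        COMMIT TRANSACTION;
    END;
END;
GO

-- Let the queue run the procedure automatically as messages arrive.
ALTER QUEUE dbo.InboundQueue
    WITH ACTIVATION (STATUS = ON,
                     PROCEDURE_NAME = dbo.ProcessInboundQueue,
                     MAX_QUEUE_READERS = 1,
                     EXECUTE AS OWNER);
```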

In one case I know of, a remote site was disconnected for 42 days, kept working, and then resumed delivery and pushed all the updates in about 30 minutes once it was reconnected.

See also this presentation for a similar case.
