Large NOT IN Statement vs Creating/Populating Temp Table

Posted on

Question :

I have a large daily data feed which can contain up to around 5,000 ID’s. I need to delete all pre-existing records in the DB that are not referenced in the new feed. Should I create a temporary table for these 5,000 ID’s, and run a LEFT JOIN IF NULL DELETE statement? Or, is it all the same if I just pass the ID’s over and run a DELETE IF NOT IN statement. My assumption is that MySQL would do the work of creating the temp table for me when using IN statements.

Answer :

With a large set you may want to use an NOT EXISTS clause rather than a NOT IN clause. It will help if you index the id on your feed table. Using an index will allow the optimizer to use the index rather than reading the table. Try a query like:

DELETE
FROM    table
WHERE NOT EXISTS (
    SELECT  feed.id
    FROM    feed
    WHERE   feed.id = table.id )

Leave a Reply

Your email address will not be published.