MS SQL with insert into select

Today at work I had to debug an MS SQL Stored Procedure that I wrote and ran fine on the internal test systems but not on the customer’s pilot site; as is the case with strange failures. The script was cloning a row so it could be modified later without effecting the original. To do this I performed an insert with a select as below:

insert into exampleTbl
(a, b, c, link) — …etc…
select
a, b, c, @new_entity — …etc…
from
exampleTbl
where
link = @old_entity
and id not in (‘1′,’2′)

The stored procedure contained 3 queries doing similar things to those above (only 1 with the not in); the id was indexed. The query was timing out (even with a timeout of 30 seconds) when executed the C#/ASP.NET website on the customer test system. I was highly confused when I saw this taking so long. What had happened?! The select ran effectively instantaneously on the customer system and only ever returned very few rows a maximum of 5 with all my test data. How could it take more than 10 seconds (unscientifically split between the 3 queries) to retrieve and insert 15 rows?! So the investigation began…

A colleague of mine mentioned that it could be to do with the not in. This would scan all (1.3 million) rows in the table which could take a while even when using the index. Some further investigation discovered that this was not the case as all 3 versions of the insert ran equally as slow – with and without the not in!

I then spent some time trying to find another part of the system inserting into this somewhat populated table and checked to see how it ran expecting equally dogged performance, but this was not the case. The code was similar to that below:

insert into exampleTbl
(guid)
values
(’some guid’)

update exampleTbl
set a = ‘1′, b = ‘hello’, c = ‘dave’ — ..etc…
where guid = ’some guid’

This code generated the guid in the C# inserted it and then updated that row with the rest of the data.

So as this alternate method was running fast it and meant my issue was either to do with the large number of parameters (20-ish) being used during the insert or the use of the select. The former seemed too common to cause such a huge issue even with 1.3 million rows in the table; it must be the select. However I had already found that the select was executing near instantaneously outside of the insert and was very sceptical that creating a table variable would fix the issue. Surely MS SQL was running it just the same inside as out so the issue must lie with the overpopulated insert. But I had to test it as this was all really guess work with no numbers to back it up. So I added the table variable and SHAZAM it worked near instantly! >30 seconds down to <200 milliseconds. WIN! + confusion.

All I can say is that I will never be using a select in an insert statement without first performing any work on a table variable so that it only has what it needs and inserts that. This raises the question of what the query optimiser is doing when it encounters an insert with a select? Surely it can’t be too crazy to ask for a select in an insert to run just as fast outside of it?! Either way it was a fun 2 hour debugging session and I’m glad I got to the bottom of it… eventually.