Monday, August 31, 2020

SQL Server - Bulk Deletes Revisited

About 5 years ago, I wrote a blog post about how to delete data from a database (for example, to clean out old data) in smaller chunks at a time, so as not to bloat the log file. Please read it first if you haven't read it recently:

https://geek-goddess-bonnie.blogspot.com/2015/07/sql-server-chunking-large-deletes.html

I wasn't totally happy with how I ended up doing the deletes in that previous blog post. I also never took into account the effect of changing the AutoGrowth and MaxSize properties. Plus, there are various ways that these deletes can be done. I'll use the date as the condition to delete by:
  • Method #1
    • Do the DELETE by the deletion criteria in the loop:
    • DELETE TOP(X_number) MyTable WHERE saveddatetime < @time
  • Method #2
    • Select the keys matching the deletion criteria into a table variable first. Then, in the loop, DELETE TOP()
    • DECLARE @keysToDelete TABLE (keys bigint)
    • INSERT INTO @keysToDelete SELECT TheKey FROM MyTable WHERE saveddatetime < @time
    • -- in the loop:
    • DELETE TOP (X_number) MyTable WHERE TheKey IN (SELECT keys FROM @keysToDelete)
  • Method #3
    • Do both the SELECT into the table variable and the DELETE inside the loop
    • INSERT INTO @keysToDelete SELECT TOP (X_number) TheKey FROM MyTable WHERE saveddatetime < @time
    • DELETE MyTable WHERE TheKey IN (SELECT keys FROM @keysToDelete)
Here is the code that I used to do the testing. I tested one method at a time, commenting out the other two, restoring the database between each test (obviously), and changing the AutoGrowth when necessary for each set of tests.

DECLARE @StartTime datetime 
DECLARE @LoopStart datetime
DECLARE @Elapsed int

DECLARE @DaysToKeep int = 7

-- @DaysToKeep is passed into the SP as a parameter
IF @DaysToKeep > 0
SET @DaysToKeep = -@DaysToKeep -- need to make it negative

SET NOCOUNT ON

DECLARE @time datetime
DECLARE @rc INT
DECLARE @LatestTime datetime

SELECT @LatestTime = MAX(saveddatetime) FROM message
SET @time = DATEADD(day, @DaysToKeep, @LatestTime)

-- method #1
SET @rc = 1
SET @StartTime = GETDATE()
WHILE @rc > 0
BEGIN
BEGIN TRANSACTION
SET @LoopStart = GETDATE()
--DELETE TOP (10000) message WHERE saveddatetime < @time
DELETE TOP (100000) message WHERE saveddatetime < @time
SET @rc = @@ROWCOUNT
COMMIT TRANSACTION
CHECKPOINT

SET @Elapsed = DATEDIFF(MILLISECOND, @LoopStart, GETDATE())
RAISERROR ('Elapsed %d milliseconds to delete %d rows', 0, 0, @Elapsed, @rc) WITH NOWAIT
END

SET @Elapsed = DATEDIFF(MILLISECOND, @StartTime, GETDATE())
RAISERROR ('TOTAL Elapsed %d milliseconds', 0, 0, @Elapsed) WITH NOWAIT


-- method #2
DECLARE @keysToDelete TABLE (keys bigint) -- this is also needed in method #3

SET @StartTime = GETDATE()
INSERT INTO @keysToDelete SELECT messagekey FROM message WHERE saveddatetime < @time
SET @rc = 1
WHILE @rc > 0
BEGIN
BEGIN TRANSACTION
SET @LoopStart = GETDATE()
--DELETE TOP (10000) message WHERE messagekey IN (SELECT keys FROM @keysToDelete)
DELETE TOP (100000) message WHERE messagekey IN (SELECT keys FROM @keysToDelete)
SET @rc = @@ROWCOUNT
COMMIT TRANSACTION
CHECKPOINT

SET @Elapsed = DATEDIFF(MILLISECOND, @LoopStart, GETDATE())
RAISERROR ('Elapsed %d milliseconds to delete %d rows', 0, 0, @Elapsed, @rc) WITH NOWAIT
END

SET @Elapsed = DATEDIFF(MILLISECOND, @StartTime, GETDATE())
RAISERROR ('TOTAL Elapsed %d milliseconds', 0, 0, @Elapsed) WITH NOWAIT


-- method #3
SET @rc = 1
SET @StartTime = GETDATE()
WHILE @rc > 0
BEGIN
BEGIN TRANSACTION
SET @LoopStart = GETDATE()
--INSERT INTO @keysToDelete SELECT TOP (10000) messagekey FROM message WHERE saveddatetime < @time
INSERT INTO @keysToDelete SELECT TOP (100000) messagekey FROM message WHERE saveddatetime < @time
DELETE message WHERE messagekey IN (SELECT keys FROM @keysToDelete)
SET @rc = @@ROWCOUNT
DELETE @keysToDelete
COMMIT TRANSACTION
CHECKPOINT

SET @Elapsed = DATEDIFF(MILLISECOND, @LoopStart, GETDATE())
RAISERROR ('Elapsed %d milliseconds to delete %d rows', 0, 0, @Elapsed, @rc) WITH NOWAIT
END

SET @Elapsed = DATEDIFF(MILLISECOND, @StartTime, GETDATE())
RAISERROR ('TOTAL Elapsed %d milliseconds', 0, 0, @Elapsed) WITH NOWAIT


For the purposes of this new blog post, we need to see how long the deletes take for each of the three methods and with two different log file Autogrowth settings. As you’ll see in the table below, the smaller Autogrowth setting meant that the log file grew many more times. Each time the log file grows, there’s a performance hit. So, you want to minimize the number of times it grows.

If you routinely have a lot of rows to delete, like 500,000 or a million or more, then you should think about deleting in larger chunks, but be aware that the more rows that you delete at one time, the larger the database log file will grow. Be sure to account for that by not limiting your log file to too small a size.

For my testing, I have a database table with 2 million rows and will be deleting approximately 1.8 million rows (which will leave 200,000 rows of the most recent data). This is from a test system that we had been hitting for months without a cleanup in place yet. I finally got the cleanup stored procedures to my liking and figured it was time to do it. There was a lot of tweaking going on as I tested this off and on (in between other work that had higher priority), but what I learned is the subject of this blog.

The test database started with an 18 MB log file and a MaxSize of 1000 MB.
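In case you want to reproduce this setup, the log file's Autogrowth and MaxSize can be changed with ALTER DATABASE. The database and logical file names below are placeholders, so substitute your own (you can find the logical name in sys.database_files):

```sql
-- Find the logical name of the log file (database name is a placeholder)
USE MyTestDB;
SELECT name, type_desc, size, growth FROM sys.database_files;

-- Set a 32 MB Autogrowth and a 1000 MB MaxSize on the log file
ALTER DATABASE MyTestDB
MODIFY FILE (NAME = MyTestDB_log, FILEGROWTH = 32MB, MAXSIZE = 1000MB);
```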

Log File Growth of 1 MB

  Deleting 100,000 at a time:
    Method #1 (log grew 796 times to 814 MB): TOTAL Elapsed 226777 milliseconds (3:48)
    Method #2 (log grew 796 times to 814 MB): TOTAL Elapsed 603477 milliseconds (10:05)
    Method #3 (log grew 940 times to 958 MB): TOTAL Elapsed 610474 milliseconds (10:12)

  Deleting 10,000 at a time:
    Method #1 (log grew 165 times to 183 MB): TOTAL Elapsed 303567 milliseconds (5:05)
    Method #2 (log grew 110 times to 128 MB): TOTAL Elapsed 1147310 milliseconds (19:09)
    Method #3 (log grew 157 times to 175 MB): TOTAL Elapsed 580877 milliseconds (9:43)
Log File Growth of 32 MB

  Deleting 100,000 at a time:
    Method #1 (log grew 28 times to 914 MB): TOTAL Elapsed 206517 milliseconds (3:28)
    Method #2 (log grew 25 times to 818 MB): TOTAL Elapsed 605470 milliseconds (10:07)
    Method #3 (log grew 29 times to 946 MB): TOTAL Elapsed 607330 milliseconds (10:09)

  Deleting 10,000 at a time:
    Method #1 (log grew 9 times to 306 MB): TOTAL Elapsed 286067 milliseconds (4:48)
    Method #2 (log grew 9 times to 306 MB): TOTAL Elapsed 1196263 milliseconds (19:58)
    Method #3 (log grew 8 times to 274 MB): TOTAL Elapsed 599856 milliseconds (10:01)

Notice the huge difference in time with Method #2 when deleting 100,000 vs 10,000. In both Autogrowth scenarios, the time it took to delete nearly doubled (10 minutes vs 19 to 20 minutes). That was unexpected, and I have no idea why. Very strange.

The times for all 3 methods were very similar between the 1 MB and 32 MB Autogrowth settings. However, I would avoid the smaller Autogrowth setting due to how many times the log file grows (as I said earlier, that can be bad for performance). I used the 1 MB setting in this test because the database I tested with defaulted to that, which I think is too small. From the Googling I’ve done, 32 MB seemed reasonable.

The takeaway from all of this is that your individual preferences for your own databases will influence how you handle this:

If you prefer that this run as quickly as possible and you don’t mind if your log file is large, then you’ll want to do something along the lines of Method #1 (and, if you do the 10,000-row delete with that method, you even end up with a decently sized log file).

If you’d rather keep your log file as small as possible, and you don’t mind if it takes longer, then Method #3 with 10,000 deletes seems to be the way to go.

Now, all that said, these 3 Methods are fairly simplistic in their approach. Your own delete conditions may involve much more complicated SELECTs with JOINs and who knows what else. Some of my real-life conditions were simple, but some were very, very complex. YMMV
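For example, a more complex condition might look something like the following sketch. The conversation table, its columns, and the "closed" status are all made up for illustration; the point is just that the chunked DELETE TOP() pattern still works when the criteria involve a JOIN:

```sql
-- Hypothetical: delete old messages, but only those belonging to closed conversations
DELETE TOP (10000) m
FROM message m
JOIN conversation c ON c.conversationkey = m.conversationkey
WHERE m.saveddatetime < @time
  AND c.status = 'Closed';
```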

One last note: if you are running this cleanup process from a .NET application, be sure to run it in its own thread, so that it can keep on chugging until it’s done without interrupting anything else. If you set the SqlCommand.CommandTimeout = 0, then you won’t have to worry about your app crashing with a timeout exception.

2 comments:

  1. Hi Bonnie,

    Can you try replicating the tests for methods 2 and 3 but use a temp table this time. I have a sneaking suspicion that 2 performs so much worse than 3 because the optimization engine has a hard time coming up with a good plan on the table variable when it isn't just using the full table variable at once like in method 3. Additionally moving them to temp table in general might help close the gap from method 1 as well for similar reasons.

    ReplyDelete
    Replies
    1. Hi Fish, (not your real name, I'm sure --- ha ha)

      I think that I tried using a temp table at some point. Not when I was actually running the tests for the blog post, probably more like several months ago. If I recall, I didn't think it mattered. But then again, I know that I didn't test it with such a large volume of deletes that I tested with last week, as I was writing the blog.

      It might be worth a look at again --- maybe this weekend if I'm not too busy. Thanks for the suggestion!

      ~~Bonnie

      Delete