Monday, 28 November 2011

Speeding up batch SQL Processes that use temporary table variables

Problems with TABLE variables and how to improve performance for SQL Batch Jobs


We recently moved our production system to SQL Server 2008. Everything seemed okay until we realised, a fortnight later, that a particular SQL Server Agent job had not been completing correctly.

One of the steps within the job had been failing due to a missing SQL login. This job transferred large amounts of data from daily tables to historical ones, and a backlog numbering in the tens of millions of rows had now built up.

Because our batch delete process removed transferred data in small chunks (500 rows) to reduce locking, we had reached a stage where new records were being added almost as fast as we were removing them.

When I looked at the query that was doing the insert and then the delete, I saw that it was using a TABLE variable to hold the data whilst it was being transformed. As this was now 20 million plus records, that was a problem in its own right because of the limitations of TABLE variables.

This is a common SQL performance bottleneck and one I have come across a few times now.

SQL Server's TABLE variables are very good for small datasets (a few hundred rows at most) and are very useful for array-like usage within stored procedures. However, when you move to large numbers of records they become a performance nightmare.
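For example, this sort of small, array-like lookup inside a proc is exactly what they are good at (a quick sketch only; @StatusCodes is a made-up name, and JOBS_DATA is the table used in the proc further down):

DECLARE @StatusCodes TABLE (HitType char(1), Description varchar(20))

-- a handful of rows used like an array/lookup list
INSERT INTO @StatusCodes (HitType, Description)
VALUES ('S', 'Search hit'), ('V', 'View hit')

SELECT d.JobFK, s.Description
FROM JOBS_DATA as d with (nolock)
JOIN @StatusCodes as s
 ON d.HitType = s.HitType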

Unlike proper temporary tables or permanent tables, you cannot add indexes to TABLE variables (on SQL Server 2008 you are limited to PRIMARY KEY or UNIQUE constraints declared inline), and with large row counts an index is usually a must. The constant table scans the process would have had to do to find rows must have been a major cause of the slow performance.
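To show the difference (a sketch only; the table and column names here are made up):

-- TABLE variable: on SQL Server 2008 you can only declare a PRIMARY KEY/UNIQUE constraint inline,
-- there is no way to run CREATE INDEX against it
DECLARE @Hits TABLE (JobFK int PRIMARY KEY, Hits int)

-- temporary table: indexes can be created (and dropped) whenever you need them
CREATE TABLE #Hits (JobFK int, Hits int)

CREATE CLUSTERED INDEX clidx_Hits ON #Hits (JobFK)

DROP TABLE #Hits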

I rewrote the stored proc to use a temporary table, added a clustered index on the columns used in my joins to the real table, and switched to a batch DELETE process using TOP (@X) to remove old records.
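The batched delete pattern on its own looks like this (a sketch; the full proc below uses a hard-coded batch size of 1000 and @BatchSize is just an illustrative name):

DECLARE @BatchSize int, @Rows int, @Stamp datetime

-- delete everything before today (midnight) in small chunks
SELECT @BatchSize = 500, @Rows = 1, @Stamp = CONVERT(datetime, CONVERT(varchar, GETDATE(), 23))

WHILE @Rows > 0
  BEGIN
  -- TOP needs the brackets when a variable is used for the batch size
  DELETE TOP (@BatchSize)
  FROM JOBS_DATA
  WHERE Stamp < @Stamp -- indexed column

  SELECT @Rows = @@ROWCOUNT
  END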

This has sped the process up immensely, but creating and dropping tables and indexes within the proc requires higher permissions than the website login my site uses has.

Therefore, to ensure everything ran smoothly, I had to use the WITH EXECUTE AS clause so that the procedure executes as a database user with higher privileges.
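With the proc set up that way the low-privileged website login only needs permission to execute it; everything inside then runs as the higher-privileged user named in the header. Something like this (the user name below is only a placeholder, and note that whoever creates the proc needs IMPERSONATE permission on the user named in the EXECUTE AS clause):

-- the website user just needs to be able to run the proc
GRANT EXECUTE ON [dbo].[usp_sql_job_data_transfer] TO website_user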

A cut-down example of the proc is below; it shows the following:

  1. Impersonating a user with higher permissions to allow DDL statements, e.g. CREATE and DROP.
  2. Checking tempdb for existing temporary tables and indexes and dropping them if they exist.
  3. Deleting large numbers of records in batches to prevent blocking.
  4. Ensuring a SARGable WHERE clause is used instead of DATEDIFF to get yesterday's data (illustrated in the sketch below).
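
For point 4, the difference is worth spelling out, as wrapping the column in a function stops SQL Server using an index on Stamp (a sketch against the JOBS_DATA table used in the proc):

-- non-SARGable: the DATEDIFF around the Stamp column forces a scan of every row
SELECT JobFK
FROM JOBS_DATA with (nolock)
WHERE DATEDIFF(d, Stamp, GETDATE()) >= 1

-- SARGable: compare the bare column to a pre-computed value so an index on Stamp can be used
DECLARE @Stamp datetime

SELECT @Stamp = CONVERT(datetime, CONVERT(varchar, GETDATE(), 23)) -- today at midnight

SELECT JobFK
FROM JOBS_DATA with (nolock)
WHERE Stamp < @Stamp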



CREATE PROCEDURE [dbo].[usp_sql_job_data_transfer] 
 @FromProc BIT = 0, -- set to 1 when called from another proc to suppress the result set in the error handler
 @Success BIT = 0 OUTPUT,
 @RowsCopied INT = 0 OUTPUT,
 @RowsDeleted INT = 0 OUTPUT

WITH EXECUTE AS 'admin_user'
AS
  BEGIN

  SET NOCOUNT ON
  SET DATEFORMAT YMD

  DECLARE @Stamp datetime, 
   @InsertDate datetime,
   @TmpRows int,
   @Rows int,  
   @RowsInserted int,
   @error int
   

  -- check for existence of an old temp table; a local temp table won't persist between runs but we may need to use a global one (example usage)
  IF object_id('tempdb..#JobHits') IS NOT NULL
    BEGIN
    -- drop it
      
    DROP TABLE #JobHits
    END 
 
  -- create temp table   
  CREATE TABLE #JobHits(
   JobFK int, 
   Stamp Datetime, 
   ViewHits int, 
   SearchHits int
  )

  -- job runs after midnight and we want a SARGABLE WHERE clause
  SELECT @Stamp = CONVERT(datetime,CONVERT(varchar,getdate(),23))

   
  SELECT @TmpRows = 0, @RowsInserted = 0, @RowsDeleted = 0  

  -- insert into my temp table the search hits by date (SQL 2005)
  -- if this proves to be still slow or causing blocks then try inserting in batches of 1000 OR 20 min chunks
  INSERT INTO #JobHits
  (JobFK, Stamp, ViewHits, SearchHits)
  SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),0, count(jobFk) 
  FROM JOBS_DATA with (nolock)
  WHERE HitType = 'S' 
   AND Stamp < @Stamp
  GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))

  SELECT @TmpRows = @@ROWCOUNT, @Error = @@ERROR
  
  IF @Error <> 0
    GOTO HANDLE_ERROR

  -- insert into my temp table view hits
  INSERT INTO #JobHits
  (JobFK, Stamp, ViewHits, SearchHits)
  SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),count(jobFk), 0
  FROM JOBS_DATA with (nolock)
  WHERE HitType = 'V' AND 
   Stamp < @Stamp -- SARGABLE WHERE CLAUSE - note I am not using DATEDIFF
  GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))

  SELECT @TmpRows = @TmpRows + @@ROWCOUNT, @Error = @@ERROR


  -- if an error occurred jump to the error handler
  IF @Error <> 0 
    GOTO HANDLE_ERROR
  --If no error but no rows then it's just a bad day for the site with no hits
  ELSE IF @TmpRows = 0  
    GOTO EXIT_PROC
  ELSE
    BEGIN
     
    -- add an index to aid lookups and searching 

    -- ensure the index does not already exist (an index has no object_id of its own, so check sys.indexes)
    IF EXISTS (SELECT 1 FROM tempdb.sys.indexes
        WHERE [object_id] = OBJECT_ID('tempdb..#JobHits')
        AND [name] = 'clidx_#JobHits')
      BEGIN
      
     DROP INDEX clidx_#JobHits ON #JobHits

      END

       
    -- Add a clustered index to help searching - cover the main join column JobFk and group by column Stamp 
    CREATE CLUSTERED INDEX [clidx_#JobHits] ON #JobHits
    (
     [JobFK] ASC,
     [Stamp] ASC    
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 100) ON [PRIMARY]
    
    
    END
  
  
  -- join my temp table to the main table to get info about the company the job belonged to
  INSERT INTO JOBS_DATA_HISTORY
  (JobFk,Stamp,ViewHits, SearchHits, ClientFK)
  SELECT a.JobFK, a.Stamp, Sum(a.ViewHits), Sum(a.SearchHits), j.ClientFK
  FROM  #JobHits as a
  JOIN  JOBS as j with (nolock)
   ON  a.JobFk = j.JobPK
  GROUP BY a.JobFK, a.Stamp, j.ClientFk

  SELECT @RowsInserted = @@rowcount, @error = @@ERROR
  

  --if we are here then there must be > 0 rows otherwise we would have exited earlier
  IF @RowsInserted=0 or @Error<>0
    GOTO HANDLE_ERROR
 

  -- Now we have copied our data we need to delete it all from the daily table
  -- as this table is being used by the site still we delete in batches to prevent blocking locks
  SELECT @Rows = 1
  
  -- loop until no more rows are left
  WHILE @Rows > 0
    BEGIN 
    -- delete data in table using the TOP command to ensure we only delete on our indexed column in batches
    DELETE TOP(1000)
    FROM JOBS_DATA
    WHERE Stamp < @Stamp -- indexed column in JOBS_DATA
      
    SELECT @Rows = @@ROWCOUNT, @RowsDeleted = @RowsDeleted + @@ROWCOUNT, @Error = @@ERROR

    -- should i wait for a second in between batches? cannot find a definitive answer
    -- if problems occur (which they haven't so far) then try this
    --WAITFOR DELAY '00:00:01' 
    
    IF @Error <> 0 
      GOTO HANDLE_ERROR
    END

  --if no rows were deleted then something went pete tong
  IF @RowsDeleted=0
    GOTO HANDLE_ERROR
  END

-- clean up
CLEAN_UP:
    
   -- dropping the temp table will also drop its indexes, but drop the clustered index explicitly to be safe
  IF EXISTS (SELECT 1 FROM tempdb.sys.indexes
      WHERE [object_id] = OBJECT_ID('tempdb..#JobHits')
      AND [name] = 'clidx_#JobHits')
    BEGIN
  -- drop our clustered index
  DROP INDEX clidx_#JobHits ON #JobHits

    END
    
  IF object_id('tempdb..#JobHits') IS NOT NULL
    BEGIN
  -- drop our temp table  
  DROP TABLE #JobHits
    END 
  
  
  -- jump over error handler and exit 
  GOTO EXIT_PROC
  

--handle error
HANDLE_ERROR:

  SELECT @Success = 0

  IF @FromProc = 0
    SELECT @Success as success, COALESCE(@RowsCopied,0) as rowscopied,COALESCE(@RowsDeleted,0) as rowsDeleted

  -- exit
  RETURN 0



--handle a successful exit
EXIT_PROC:
  SELECT @Success = 1, @RowsCopied = @RowsInserted

  -- exit
  RETURN 1
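
For completeness, calling the proc from an Agent job step or another script looks something like this (a sketch; the local variable names are just for illustration):

DECLARE @Success bit, @RowsCopied int, @RowsDeleted int

EXEC [dbo].[usp_sql_job_data_transfer]
  @FromProc = 1, -- suppress the error-handler result set as we select the values ourselves
  @Success = @Success OUTPUT,
  @RowsCopied = @RowsCopied OUTPUT,
  @RowsDeleted = @RowsDeleted OUTPUT

SELECT @Success as success, @RowsCopied as rowscopied, @RowsDeleted as rowsdeleted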
