We recently moved our production system to SQL Server 2008. Everything seemed okay until we realised, a fortnight later, that a particular SQL Server Agent job had not been completing correctly.
One of the steps within the job had been failing due to a missing SQL login. This job transferred large amounts of data from daily tables to historical ones, and a backlog of tens of millions of rows had now built up.
Because our batch delete process removed transferred data in small chunks (500 rows) to reduce locking, things had reached the stage where new records were being added almost as fast as we could remove them.
When I looked at the query that did the insert and then the delete, I saw that it was using a TABLE variable to hold the data whilst it was being transformed. With 20 million plus records this was a problem in its own right, due to the limitations of table variables.
This is a common SQL performance bottleneck and one I have come across a few times now.
SQL's TABLE variables are very good for small datasets (a few hundred rows at most) and are very useful for array-like usage within stored procedures. However, when you move to large numbers of records they become a performance nightmare.
Unlike proper temporary tables or fixed permanent tables, you cannot add indexes to TABLE variables (only a PRIMARY KEY or UNIQUE constraint declared inline), and with large row counts an index is usually a must. The constant table scans the process had to do to find rows must have been a major cause of the slow performance.
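As a minimal sketch of the difference (the table and column names here are illustrative, not from the real system):

-- a TABLE variable cannot have indexes added after declaration
-- (only a PRIMARY KEY or UNIQUE constraint defined inline), so
-- large lookups degrade into repeated table scans
DECLARE @Hits TABLE (JobFK int, Stamp datetime)

-- a temporary table behaves like a real table, so we can add
-- whatever indexes the joins need
CREATE TABLE #Hits (JobFK int, Stamp datetime)
CREATE CLUSTERED INDEX clidx_Hits ON #Hits (JobFK ASC, Stamp ASC)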
I rewrote the stored proc to make use of a temporary table, added a clustered index on the columns I was using for my joins to the real table, and used a batched DELETE process with the TOP (@X) statement to remove old records.
This has sped the process up immensely, but creating and dropping tables and indexes requires higher permissions than the website login my site uses.
Therefore, to ensure everything ran smoothly, I had to use the WITH EXECUTE AS 'user' clause so that the procedure runs under a user with higher privileges.
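As a minimal sketch of the permission setup (web_user is an illustrative name for the database user the website login maps to):

-- the website user only needs permission to execute the proc;
-- WITH EXECUTE AS 'admin_user' inside it supplies the DDL rights
GRANT EXECUTE ON dbo.usp_sql_job_data_transfer TO web_user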
A cut-down example of the proc is below. It shows the following:
- Impersonating a user with higher permissions to allow DDL statements, e.g. CREATE and DROP.
- Checking tempdb for existing temporary tables and indexes and dropping them if they exist.
- Deleting large numbers of records in batches to prevent blocking.
- Ensuring a SARGABLE WHERE clause is used instead of DATEDIFF to get yesterday's data (see the comparison after the proc).
CREATE PROCEDURE [dbo].[usp_sql_job_data_transfer]
@Success BIT = 0 OUTPUT,
@RowsCopied INT = 0 OUTPUT,
@RowsDeleted INT = 0 OUTPUT,
@FromProc BIT = 0 -- set to 1 when called from another proc to suppress the result set
WITH EXECUTE AS 'admin_user'
AS
BEGIN
SET NOCOUNT ON
SET DATEFORMAT YMD
DECLARE @Stamp datetime,
@InsertDate datetime,
@TmpRows int,
@Rows int,
@RowsInserted int,
@error int
-- check for the existence of an old temp table; a local temp table won't persist but we may need to use a global one (example usage)
IF object_id('tempdb..#JobHits') IS NOT NULL
BEGIN
-- drop it
DROP TABLE #JobHits
END
-- create temp table
CREATE TABLE #JobHits(
JobFK int,
Stamp Datetime,
ViewHits int,
SearchHits int
)
-- job runs after midnight and we want a SARGABLE WHERE clause
SELECT @Stamp = CONVERT(datetime,CONVERT(varchar,getdate(),23))
SELECT @TmpRows = 0, @RowsInserted = 0, @RowsDeleted = 0
-- insert into my temp table the search hits by date (SQL 2005)
-- if this proves to be still slow or causing blocks then try inserting in batches of 1000 OR 20 min chunks
INSERT INTO #JobHits
(JobFK, Stamp, ViewHits, SearchHits)
SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),0, count(jobFk)
FROM JOBS_DATA with (nolock)
WHERE HitType = 'S'
AND Stamp < @Stamp
GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))
SELECT @TmpRows = @@ROWCOUNT, @Error = @@ERROR
IF @Error <> 0
GOTO HANDLE_ERROR
-- insert into my temp table view hits
INSERT INTO #JobHits
(JobFK, Stamp, ViewHits, SearchHits)
SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),count(jobFk), 0
FROM JOBS_DATA with (nolock)
WHERE HitType = 'V' AND
Stamp < @Stamp -- SARGABLE WHERE CLAUSE - note I am not using DATEDIFF
GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))
SELECT @TmpRows = @TmpRows + @@ROWCOUNT, @Error = @@ERROR
-- if an error occurred jump to the error handler
IF @Error <> 0
GOTO HANDLE_ERROR
--If no error but no rows then it's just a bad day for the site with no hits
ELSE IF @TmpRows = 0
GOTO EXIT_PROC
ELSE
BEGIN
-- add an index to aid lookups and searching
-- ensure the index doesn't already exist; indexes aren't objects,
-- so query tempdb's sys.indexes rather than using object_id
IF EXISTS (SELECT 1 FROM tempdb.sys.indexes
	WHERE [name] = 'clidx_#JobHits'
	AND [object_id] = object_id('tempdb..#JobHits'))
BEGIN
DROP INDEX clidx_#JobHits ON #JobHits
END
-- Add a clustered index to help searching - cover the main join column JobFk and group by column Stamp
CREATE CLUSTERED INDEX [clidx_#JobHits] ON #JobHits
(
[JobFK] ASC,
[Stamp] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
END
-- join my temp table to the main table to get info about the company the job belonged to
INSERT INTO JOBS_DATA_HISTORY
(JobFk,Stamp,ViewHits, SearchHits, ClientFK)
SELECT a.JobFK, a.Stamp, Sum(a.ViewHits), Sum(a.SearchHits), j.ClientFK
FROM #JobHits as a
JOIN JOBS as j with (nolock)
ON a.JobFk = j.JobPK
GROUP BY a.JobFK, a.Stamp, j.ClientFk
SELECT @RowsInserted = @@rowcount, @error = @@ERROR
--if we are here then there must be > 0 rows otherwise we would have exited earlier
IF @RowsInserted=0 or @Error<>0
GOTO HANDLE_ERROR
-- Now we have copied our data we need to delete it all from the daily table
-- as this table is being used by the site still we delete in batches to prevent blocking locks
SELECT @Rows = 1
-- loop until no more rows are left
WHILE @Rows > 0
BEGIN
-- delete data in table using the TOP command to ensure we only delete on our indexed column in batches
DELETE TOP(1000)
FROM JOBS_DATA
WHERE Stamp < @Stamp -- indexed column in JOBS_DATA
SELECT @Rows = @@ROWCOUNT, @RowsDeleted = @RowsDeleted + @Rows, @Error = @@ERROR
-- should I wait for a second in between batches? I cannot find a definitive answer
-- if problems occur (which they haven't so far) then try this
--WAITFOR DELAY '00:00:01'
IF @Error <> 0
GOTO HANDLE_ERROR
END
--if no rows were deleted then something went Pete Tong
IF @RowsDeleted=0
GOTO HANDLE_ERROR
END
-- clean up
CLEAN_UP:
-- indexes are dropped automatically along with their table, but drop
-- the clustered index explicitly first to be safe
IF EXISTS (SELECT 1 FROM tempdb.sys.indexes
	WHERE [name] = 'clidx_#JobHits'
	AND [object_id] = object_id('tempdb..#JobHits'))
BEGIN
-- drop our clustered index
DROP INDEX clidx_#JobHits ON #JobHits
END
IF object_id('tempdb..#JobHits') IS NOT NULL
BEGIN
-- drop our temp table
DROP TABLE #JobHits
END
-- jump over error handler and exit
GOTO EXIT_PROC
--handle error
HANDLE_ERROR:
SELECT @Success = 0
IF @FromProc = 0
SELECT @Success as success, COALESCE(@RowsCopied,0) as rowscopied, COALESCE(@RowsDeleted,0) as rowsDeleted
-- exit
RETURN 0
--handle a successful exit
EXIT_PROC:
SELECT @Success = 1, @RowsCopied = @RowsInserted
-- exit
RETURN 1
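To illustrate the last point in the list above, here is the difference between the DATEDIFF approach and the SARGABLE one used in the proc (a minimal sketch against the same JOBS_DATA table):

-- non-SARGABLE: DATEDIFF is evaluated against every row's Stamp,
-- so an index on Stamp cannot be used for a seek
SELECT JobFK
FROM JOBS_DATA
WHERE DATEDIFF(day, Stamp, GETDATE()) >= 1

-- SARGABLE: the bare column is compared to a precalculated value,
-- so an index on Stamp can be seeked
DECLARE @Stamp datetime
SELECT @Stamp = CONVERT(datetime, CONVERT(varchar, GETDATE(), 23))
SELECT JobFK
FROM JOBS_DATA
WHERE Stamp < @Stamp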