There are times when you may need to carry out UPDATEs on large tables that are in use and constantly being inserted into, deleted from, or updated.
If you run a large UPDATE that affects every row in the table, the table will be locked for the duration of the update, and any other process that needs to run DML statements against it will be blocked.
You may even experience deadlocks, and you will almost certainly experience performance problems; any SELECT statements that access the data without a WITH (NOLOCK) hint will also have to wait in line for the UPDATE to finish.
Obviously, wrapping WITH (NOLOCK) around every SELECT statement is not a good solution unless you know what you are doing, as it allows dirty reads and you may end up serving your users stale data. That might be fine in some scenarios, but in critical applications where data integrity is key you need another solution: one that preserves data integrity while still letting you UPDATE the table without the performance problems.
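For illustration, this is what a dirty read looks like in practice; the WITH (NOLOCK) hint is real T-SQL, but the table and columns here are hypothetical:

```sql
-- READ UNCOMMITTED behaviour: this SELECT takes no shared locks, so it can
-- return rows from transactions that are still in flight (dirty reads)
SELECT OrderID, Status
FROM   dbo.Orders WITH (NOLOCK)   -- hypothetical table used for the example
WHERE  Status = 'PENDING'
```

If the transaction that wrote those PENDING rows later rolls back, your users have already seen data that never officially existed.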
When I need to UPDATE every record in a large table, I use a batch UPDATE process that breaks the one large UPDATE statement into many small UPDATEs, each affecting only a few rows at a time.
For example, instead of the whole table being locked for an hour, with blocked processes piling up behind the UPDATE waiting for it to finish, it is only locked for lots of short periods.
These short locking periods let other processes in to do their work, and if the batch size is small enough and you have appropriate indexes, you may find that you never hit a full table lock at all.
There are various methods for carrying out this approach, and you should tailor the batch size to your own requirements. Before SQL 2005 you could use the SET ROWCOUNT 50 command to cap the number of rows each statement affects, but in SQL 2005 and beyond you can use a variable directly with an UPDATE TOP (@VAR) command.
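For completeness, here is a sketch of the older SET ROWCOUNT pattern for SQL 2000 and earlier; the table, column, and filter are placeholders, and the filter must exclude rows that have already been processed:

```sql
-- SQL 2000-era batching: SET ROWCOUNT caps rows affected per statement
SET ROWCOUNT 50

DECLARE @ROWS INT
SET @ROWS = 1

WHILE @ROWS > 0
BEGIN
    UPDATE MyTable
    SET    MyColumn = dbo.udf_SOME_FUNCTION(MyPK)
    WHERE  MyColumn IS NULL   -- example filter: must exclude already-updated rows

    SET @ROWS = @@ROWCOUNT
END

SET ROWCOUNT 0   -- always reset, or later statements in the session stay capped
```

Note that SET ROWCOUNT affects every subsequent statement in the session until it is reset to 0, which is one reason the TOP (@VAR) syntax is the better choice from SQL 2005 onwards.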
SET NOCOUNT ON
SET DATEFORMAT YMD

DECLARE @ROWS INT, @TOTALROWS INT, @BATCHSIZE INT

SELECT @ROWS = 1,
       @TOTALROWS = 0,
       @BATCHSIZE = 50

-- update the table in small batches to prevent long blocking locks
WHILE @ROWS > 0
BEGIN
    UPDATE TOP (@BATCHSIZE) MyTable
    SET    MyColumn = dbo.udf_SOME_FUNCTION(MyPK)
    WHERE  SomeDate > '2011-JAN-01'
           AND MyColumn IS NULL -- must exclude rows already updated, otherwise the loop re-hits the same rows forever

    SELECT @ROWS = @@ROWCOUNT, @TOTALROWS = @TOTALROWS + @ROWS

    PRINT 'Updated ' + CAST(@ROWS AS varchar) + ' rows in batch'
END

PRINT 'Updated ' + CAST(@TOTALROWS AS varchar) + ' total rows'
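If other sessions still struggle to get a look-in between batches, you can add a short pause at the end of each iteration so waiting processes have a window to acquire their locks. A sketch of that variation of the loop body, with an arbitrary 100 ms delay you should tune yourself:

```sql
WHILE @ROWS > 0
BEGIN
    UPDATE TOP (@BATCHSIZE) MyTable
    SET    MyColumn = dbo.udf_SOME_FUNCTION(MyPK)
    WHERE  SomeDate > '2011-JAN-01'
           AND MyColumn IS NULL -- example filter to exclude already-updated rows

    SELECT @ROWS = @@ROWCOUNT, @TOTALROWS = @TOTALROWS + @ROWS

    WAITFOR DELAY '00:00:00.100'   -- brief pause lets blocked sessions take their locks
END
```

The trade-off is a longer overall run time for the batch job in exchange for smoother concurrency for everyone else, which is usually exactly what you want on a busy system.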
I am currently using this process to update a table with over a million rows that is constantly accessed by a busy website, and it isn't causing any blocking or performance issues at all.