Saturday 27 February 2010

The difference between a fast-running script and a slow one

When logging errors is not a good idea

I like to log errors and debug messages when running scripts, as they help diagnose problems that aren't immediately apparent when the script is started. There is nothing worse than setting a script running and then coming into work on a Monday morning to find that it hasn't done what it was supposed to and not knowing why. A helpful log file with details of the operation that failed and the input parameter values can be worth its weight in gold.

However, logging is an overhead, and it can be the difference between a script that takes days to run and one that takes minutes. I recently had to run a script on a number of webservers that segmented hundreds of thousands of files (documents in the < 500KB range) into a new folder structure.

My ideal solution was to collate a list of the physical files in the folder I wanted to segment and pass that list into my SELECT statement, so that the recordset returned would contain only the files that actually existed. However, that plan was thwarted by an error message I had never come across before, which claimed the database server didn't have the resources available to compile the necessary query plan. Apparently this was due to the complexity of my query, and the suggested fix was to reduce the number of joins. As I only had one join that was no solution, and after a few more attempts, including some covering indexes that also failed, I tried another approach.
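
For illustration, the thwarted approach looked roughly like this. This is a minimal Python sketch rather than my original script, and the folder, table and column names are all invented:

```python
import os

def build_query(folder):
    """Build the 'ideal' SELECT: only return rows for files present on disk.

    The schema here is hypothetical -- the real table and column names
    aren't shown in this post.
    """
    filenames = os.listdir(folder)
    # Escape single quotes and build one enormous IN list.
    in_list = ", ".join("'%s'" % name.replace("'", "''") for name in filenames)
    # With hundreds of thousands of names the statement gets so large that
    # the optimiser can give up compiling a plan; SQL Server, for example,
    # raises error 8623 ("The query processor ran out of internal resources
    # and could not produce a query plan") on queries like this.
    return (
        "SELECT d.FileName, d.NewFolder "
        "FROM Documents d "
        "JOIN Folders f ON f.FolderId = d.FolderId "
        "WHERE d.FileName IN (%s)" % in_list
    )
```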

The fallback solution was simply to attempt each move, catch any error that was raised, and log it to a file. The script was set running, and a day later it was still going, alongside a large log file full of "File not found" errors.
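
In the same illustrative Python terms, the slow pattern boiled down to something like this (the moves list and log filename are placeholders):

```python
import shutil

def move_all(moves, log_path="move_errors.log"):
    """Attempt every move blindly and log each failure to a text file.

    `moves` is an iterable of (source, destination) path pairs, e.g. built
    from the recordset described above; the names here are placeholders.
    """
    with open(log_path, "a") as log:
        for src, dest in moves:
            try:
                shutil.move(src, dest)
            except OSError as exc:
                # With hundreds of thousands of missing files this write
                # runs constantly, and the log I/O dominates the runtime.
                log.write("File not found: %s (%s)\n" % (src, exc))
```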

The script was eventually killed for another, unrelated reason, which pissed me off as I had to get the segmented folder structure up and running sharpish. I modified the script so that before each move it checked whether the file existed. I had initially thought this check might itself be an overhead: the move method would be doing the same lookup at a lower level, so I would be duplicating the search for each file in a large folder. I had reckoned that just attempting the move and catching any error would be the quicker option.

However, because I was logging each error to a text file, the logging itself created an I/O bottleneck that slowed the procedure down immensely. Adding the FileExists check around each move ensured that only files that definitely existed were attempted, so no errors were raised, which meant no logging and no I/O overhead.
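
Again as an illustrative Python sketch, the fast version swaps the catch-and-log for an up-front existence check:

```python
import os
import shutil

def move_existing(moves):
    """Move only files confirmed to exist: no exceptions, no logging.

    The os.path.isfile call plays the role of the FileExists check; with
    nothing to log, the only disk I/O left is the moves themselves.
    """
    for src, dest in moves:
        if os.path.isfile(src):
            shutil.move(src, dest)
```

If knowing which files had vanished still mattered, the misses could be collected in memory and written out once at the end, keeping the diagnostic without paying for a write on every failure.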

I had forgone the nicety of knowing which files were no longer on the server, but I had also cut the script's running time down to a mere 25 minutes. The lesson to be learned is that although logging is useful, it can also be a major overhead, and if you can do without it you may just speed up your scripts.
