Friday 19 February 2016

Finding Text In Stored Procedures, User Defined Functions, Tables and Indexes

Finding Text In Stored Procedures, User Defined Functions, Tables and Indexes

By Strictly-Software

This is an update to an older stored procedure I had created that just looked inside the system view syscomments for a certain word.

The problem with this stored proc was:
  1. It used the old system views and we are now way past SQL 2000/2005.
  2. It would only look in Stored Procedures and User Defined Functions.
  3. It would provide mis-hits when the word was combined inside another word e.g if you were looking for the word Password and had the word PasswordHash inside the stored proc it would return that result.
  4. It ignored indexes which when you are trying to find columns to remove are obviously important.
  5. It carried out the conversion of the search word to a LTRIM(RTRIM(LOWER(@Word))) on every lookup when it could have been done once at the start.
So I have updated the code to take this in to fact.

It is still not the most efficient code due to the use of numerous LIKE statements but to ensure that you don't bring back invalid results the combination of clauses is required. 

You could use a CLR and write a C# regular expression to search for you but this is outside the scope of the article.

However to keep things simple I am just going to use the standard LIKE clause.

Also note that I have split the SELECT statements into two, one to look for occurrences of the word that is found inside stored procedures, UDF's, table columns and then another for indexes.

The code also uses the newer system views sys.objects, sys.syscomments, sys.all_columns, sys.indexes and sys.index_columns.



SET NOCOUNT ON

DECLARE @Word VARCHAR(100)

-- I am looking for the word Email, not the other columns I know exist such as UserEmail, Emailed, EmailSent etc
SELECT @Word = 'Email'

SELECT @Word = LTRIM(RTRIM(LOWER(@WORD)))

-- get columns in tables, and words inside stored procs and UDFs
SELECT DISTINCT COALESCE(c2.Name,o.NAME) AS [NAME], O.Name as [Object_Name],
  CASE [Type]
   WHEN 'P' THEN 'STORED PROC'
   WHEN 'FN' THEN 'UDF SCALAR'
   WHEN 'TF' THEN 'UDF TABLE'
   WHEN 'U' THEN 'TABLE'   
  END as Object_Type, Modify_Date
FROM SYS.OBJECTS as O
LEFT JOIN 
  SYS.SYSCOMMENTS as C
 ON C.ID = O.OBJECT_ID 
LEFT JOIN 
  SYS.ALL_COLUMNS as C2
 ON C2.OBJECT_ID = O.OBJECT_ID
WHERE 1=1
 AND O.[Type] IN('P','FN','TF','U')
 AND LOWER(COALESCE(c.Text,c2.Name)) LIKE '%' + @Word + '%'
 AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%[A-Z0-9]' + @Word + '%'
 AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%[A-Z0-9]' + @Word + '[A-Z0-9]%'
 AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%' + @Word + '[A-Z0-9]%'
ORDER BY [Object_Name]

-- now return index columns
SELECT i.name AS Index_Name
  ,COL_NAME(ic.object_id,ic.column_id) AS Column_Name  
  ,CASE ic.is_included_column WHEN 0 THEN 'KEY COL' WHEN 1 THEN 'INCLUDED COL' END as Column_Type
  ,Modify_Date
FROM SYS.INDEXES AS i
JOIN SYS.INDEX_COLUMNS AS ic 
    ON i.object_id = ic.object_id AND i.index_id = ic.index_id
JOIN SYS.OBJECTS as O
 ON i.object_id = O.OBJECT_ID
WHERE LOWER(COL_NAME(ic.object_id,ic.column_id)) LIKE '%' + @Word + '%'
 AND LOWER(COL_NAME(ic.object_id,ic.column_id)) NOT LIKE '%[A-Z0-9]' + @Word + '%'
 AND LOWER(COL_NAME(ic.object_id,ic.column_id)) NOT LIKE '%[A-Z0-9]' + @Word + '[A-Z0-9]%'
 AND LOWER(COL_NAME(ic.object_id,ic.column_id)) NOT LIKE '%' + @Word + '[A-Z0-9]%'
ORDER BY Index_Name


Note the combination of the WHERE clauses to cover all the bases with the LIKE statements.

This is to ensure that:


  1. The word is inside the text (sys.syscomments) or the column name in the first place.
  2. The word is not at the end of another word e.g for email you don't want ClientEmail.
  3. The word is not in the middle of another word e.g CandEmailReset.
  4. The word is not at the end of another word e.g EmailsSent
If you had a CLR regular expression function then you could combine all these searches into one but I am keeping it simple with the LIKE statements for this article.



AND LOWER(COALESCE(c.Text,c2.Name)) LIKE '%' + @Word + '%'
AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%[A-Z0-9]' + @Word + '%'
AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%[A-Z0-9]' + @Word + '[A-Z0-9]%'
AND LOWER(COALESCE(c.Text,c2.Name)) NOT LIKE '%' + @Word + '[A-Z0-9]%'


This code will return results like the following.

The Stored Procedure / UDF / Table Results

NameObjectNameObject_TypeModify_Date
EmailClientsTABLE2015-02-12 12:13:09.100
Emailudf_validate_emailUDF SCALAR2016-02-12 12:13:09.100
Emailusp_net_get_user_detailsSTORED PROC2011-09-27 17:09:18.530


The Index Results

Index_NameColumn_NameColumn_TypeModify_Date
nclidx_USERS_EmailemailKEY_COL2016-02-12 11:18:19.130
nclidx_USERS_EmailemailINCLUDED_COL2015-12-12 12:10:11.130


So as you can see this is a much more useful piece of code for finding strings within a database.

Obviously if you have professional tools you should be able to use them but it's always good to know the nuts n bolts behind a system and the system views are a great way of finding out information that can be very useful to you.

Why would you use this piece of code?

Well I have used it for a number of reasons including.
  1. Finding certain words that needed replacing in stored procedures e.g when moving from 32bit to 64bit servers the ADO connection string changed and so did the provider and I needed to ensure all stored procedures had SET NOCOUNT ON at the top of them. This code allowed me to find all procedures that didn't have those words inside the procs with a tweak of the LIKE statements and highlighting stored procedures only.
  2. When we changed some column names I needed to find all occurrences of their use across the database, table columns, use in code and indexes.
  3. To find new columns that have been added and the date they were modified. Change the ORDER BY statement and you can find recently added columns of a certain name ordered by the date they were added.
  4. If your system has been hacked you may want to search the columns of tables for the injected string (if you know it) e.g <script src="//hack.ru"></script> and with some tweaked code which is on my main site www.strictly-software.com or this old article about finding text inside a database (SQL 2000 & 2005) you could find and clean up your system without backup/restore methods.


And those are just a few of the reasons I have found code like this useful.

I am sure you will find many more.

Let me know how you use it or would improve it.

By Strictly-Software

© 2016 Strictly-Software

No comments: