Thursday, 19 February 2009

SQL System Views

Using SQL 2005 System Views

SQL has numerous system views containing META data about the system. In fact I would say in SQL 2005 there are probably too many system views as they have kept views from previous versions to be backwards compatible and therefore you are spoilt for choice when it comes to accessing system data. For example the following 4 SELECT statements are all examples of accessing meta data about the tables you have created within a database. Notice how each SELECT uses a different system view.

SELECT TOP 1*
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_NAME

SELECT TOP 1 *
FROM sys.tables
WHERE type='U'
ORDER BY Name

SELECT TOP 1 *
FROM sys.sysobjects
WHERE xtype='U'
ORDER BY Name

SELECT TOP 1 *
FROM sys.objects
WHERE type = 'U'
ORDER BY Name
Each statement should return the same table name with varying amounts of related data.


Using System Views to create SQL

There are so many different ways that a working knowledge of the system views can be beneficial but one of the ways is to help you write SQL statements which would otherwise take a long time by hand. For example today at work I had to change the value of a certain column from A to B. The column was a key used for identifying websites. It resided in a system that used 4 databases and appeared in nearly 80% of all the tables, so writing the statements by hand would have taken a long time. The idea was to sync up our development and production server so that this column value was the same for both. Therefore I knew the names of the databases that needed updating as well as the name of the column, the existing value and the new value. I needed to output all the relevant UPDATE statements (177 in total) to do the required changes.


Filtering databases using sys.databases

After setting up some variables I created a local temp table which I populated with the names of the databases that I needed to create UPDATE statements for. I used the system view sys.databases for this e.g

CREATE TABLE #DB(DB INT,DBNAME NVARCHAR(200))

--set DBs we want to check for
INSERT INTO #DB
SELECT database_id,Name
FROM master.sys.databases
WHERE Name LIKE 'mydatabase_system_%'

Then I used the undocumented system stored procedure sp_MSForEachDB to loop through all the databases on the server checking each one to see if it also matches a database in my temp table #DB and if so I run a SELECT statement that takes data from 4 more system views to populate another temporary table #TABLES. This temp table will hold all the information I need to create the necessary SQL including the database name, table name and schema owner.

EXEC sp_MSForEachDB 'USE [?];
IF DB_ID() IN(SELECT DB FROM #DB)
BEGIN
INSERT INTO #TABLES
SELECT d.Name,u.Name, o.Name, c.Name
FROM sys.objects as o
JOIN sys.columns as c
ON o.object_id = c.object_id
JOIN sysusers as u
ON u.uid = o.schema_id
JOIN sys.databases as d
ON d.database_id = db_id()
WHERE c.Name = ''SiteFk''
AND o.type = ''U''
AND o.name LIKE ''tbl_%''
AND c.is_identity=0
ORDER BY o.Name
END'

Notice that I am using the latest system views that come with SQL 2005 and I am returning data that will enable me to create the necessary UPDATE statements with fully qualified names.

Once I have populated my working tables I can create the UPDATE statements. As the column I am updating is used in numerous foreign key constraints I need to disable these constraints before carrying out the update and then re-enable them once the update has completed. Therefore for each table in my loop I output 3 SQL statements.

-- Create ALTER statement to disable foreign key constraints
SELECT @SQL = 'ALTER TABLE '+QUOTENAME(DB)+'.'+QUOTENAME(DBUser)+'.'+QUOTENAME(DBTable)+' NOCHECK CONSTRAINT ALL'+@NL+@NL
FROM #TABLES
WHERE RowNo = @Row

-- Create UPDATE sql to change old value to new value
SELECT @SQL = @SQL + 'UPDATE '+QUOTENAME(DB)+'.'+QUOTENAME(DBUser)+'.'+QUOTENAME(DBTable)+' '+@NL + 'SET ' + DBCol + ' = ' + CAST(@NewVal as varchar(5)) + @NL + 'WHERE '+DBCol + ' = ' + CAST(@OldVal as varchar(5)) + ';' + @NL+@NL
FROM #TABLES
WHERE RowNo = @Row

-- Create ALTER statement to enable foreign key constraints
SELECT @SQL = @SQL + 'ALTER TABLE '+QUOTENAME(DB)+'.'+QUOTENAME(DBUser)+'.'+QUOTENAME(DBTable)+' CHECK CONSTRAINT ALL'+@NL+@NL
FROM #TABLES
WHERE RowNo = @Row


Each loop iteration will PRINT out the value for the @SQL variable and after running the code within a second I have 531 SQL statements to run including 177 UPDATE statements e.g:


ALTER TABLE [mydatabase_system_examplea].[dbo].[tbl_BANNER_LOCATIONS] NOCHECK CONSTRAINT ALL

UPDATE [mydatabase_system_examplea].[dbo].[tbl_BANNER_LOCATIONS]
SET SiteFK = 5000
WHERE SiteFK = 116;

ALTER TABLE [mydatabase_system_examplea].[dbo].[tbl_BANNER_LOCATIONS] CHECK CONSTRAINT ALL



So even with the time it took me to write this little script it still outweighed the amount of time it would have taken me to do the work manually e.g find all tables across 4 databases that included the column I needed to update and then write the necessary SQL. Plus if I need to do something similar in the future I can just tweak the script.


Tuesday, 17 February 2009

Using Firebug Lite for Debugging

Debugging with Firebug Lite for IE, Safari, Opera and Chrome

I tend to do all my client side development in Firefox purely because of all the great add-ons available such as the developer toolbar, hack bar, YSlow and Firebug which is a great tool for debugging.

Once the code is working then its a case of testing in the other main browsers and if you purely rely on the inbuilt tools available you will find varying degrees of uselessness. Safari and Chrome both have inbuilt consoles which can be accessed for logging debug statements. Operas Dragonfly is pretty similar to Firebug however I have found so many annoying issues when trying to use opera.postError to output debug messages that I don't bother with it. Then at the bottom of the pile for useless unhelpful error messages is IE. I have always wondered why IE could never get their act together when it comes to Javascript error messages. An accurate line number that relates to the actual file that raised the error is not too much to ask is it? It obviously knows that an error was raised so why doesn't it know what and where. If the other browsers can manage it then IE really should be able to get it right by now. I have seen IE 8 has some better features with the ability to add breakpoints but I still find it very annoying that when you have a page that includes multiple references to external script files that when an error is raised they cannot narrow it down to file rather than page level.

So for these other browsers I use Firebug Lite which has 90% of the features that the full version of Firebug has and most importantly it gives you a wonderful console to output all your debug messages to.

You can download the latest version of Firebug Lite from this link and to use it on your site you can either include a reference to a local version of the file or use the bookmark link option which lets you include it on any site you want. For sites I am developing I will create a local copy of the file that I then reference in a global include file that will make sure the script is only loaded if:
  • The browser is not Firefox, as I use the full version of Firebug for that browser.
  • The version of the site is the development server. Check a constant or an IP address.
  • A global Debug flag is enabled.

Displaying Firebug Lite

The default view mode for the latest version of Firebug Lite is for it to open up at the bottom of your screen whether you want to or not. Obviously this can be quite annoying as it takes up quite a bit of screen space and you would usually only want to view the full console when you required it rather than all the time. This is where having a local version of the firebuglite.js file comes in handy. If you take the time to go over the code you will see the Firebug file contains a number of objects and if you edit the Firebug.init method which sets up the console you can insert the following line of code to minimize the console so that it sits at the bottom of the screen until you require it.

win.minimize();
I put that at line 232 and it works fine for me in all these browsers IE6, IE7, Chrome, Safari and Opera.


Loading Firebug after the page has loaded

There maybe reasons why you don't want to include the Firebug-lite.js file on all pages and the use of a bookmark or lazy loader is your choice option. However if you try and access the console before its loaded you will raise errors. Safari and Chrome will always have a console available so if you want to log your debug to Firebugs console rather than their inbuilt versions then you need to be checking the following:
if((typeof(firebug)!="undefined") && firebug.env && firebug.env.init)
rather than just using
if(typeof(window.console)!="undefined")
The earlier versions of FirebugLite had a simpler object model and you could reference firebug.console directly. The firebug.env object seems to hold the current status of the console, whether it has loaded fully and what display state its currently in i.e minimised, maximised, popup etc. So the check for firebug.env.init is to make sure the console is fully loaded and initialised.


Debugger Wrapper Object

If you are like me you will include debug function calls all throughout your code and rather than wait until the applications gone tits up and have to add them in later I tend to include calls to a wrapper debug function as I write my initial code. The benefit of this is that it saves time when I do need to debug plus I can run a simple clean up process to remove them all before the code goes live. However if you are including Firebug Lite from a bookmark or lazy loader you need to cache all your messages up until the console has loaded and then you can flush them all out.

This is where a simple debug wrapper object such as the one I use comes in handy as it will handle the caching, flushing and output of debug messages as well as in Safari and Chrome control whether you want to use the inbuilt console or the Firebug console.

The code is pretty simple and you can configure the behaviour by setting the debugger properties defined at the top of the object e.g


var Debugger = {
ready: false, // once a console is loaded will be set to true
debugCache: [], // holds debug messages until console is loaded and ready
hasCache: false, // flag when cache has messages
debugCount: 0, // message counter
debugInterval: null, // setInterval timer pointer
forceFirebug: true, // if true for Safari/Chrome we will use firebug lite for debug not their own console
debug: true, // flag to enable/disable debug in case you only want to output certain sections



To output any debug messages just call the ShowDebug function passing in the message you want to output.

ShowDebug("Output this debug message");

You can control whether debug is on or off throughout a page by just toggling the debug property e.g

Debugger.debug = false; //turn off debug messages

ShowDebug("This will not be shown");

Debugger.debug = true; //turn debug messages back on

ShowDebug("This will be shown");


I have tested this in IE 6, IE 7, Opera 9.61, Chrome and Safari 3.2 loading in Firebug Lite with the bookmark and it works fine. The only thing I have noticed is in the WebKit based browsers the styling is a bit off but then that's nothing to do with me.

Sunday, 8 February 2009

CAPTCHAS

CAPTCHAs don't you just love completing them?

Everybody hates filling in CAPTCHAS and even the most complex ones can be beaten either by using bots that make use of OCR (optical character recognition) to take apart the image and calculate the letters used or for those that cannot be beat they just link directly to the site using it and offer free porn to users who complete them.

Obviously for a spammer to go to that amount of effort to beat the CAPTCHA there has to be something worthwhile at the other end like a free email account to send out spammy emails. The other major reason to use automated CAPTCHA breakers is to insert comment spam mainly for links back to nefarious sites or to malware infected sites. So if all CAPTCHAS can be broken either by bots or by humans doing the work for bots is there any point in using them? Well yes I would say as unless you are running a site that offers something worthwhile like email accounts then the chances are having even a simple CAPTCHA system will reduce a large percentage of spam requests.

However a standard image based CAPTCHA is not the only means so here are some others.

Simple Robot Identification Tests

Use Javascript to identify humans

The idea is to only allow humans to submit the form and not bots so you could go down a simple route of using JavaScript to submit the form as 99.9% of bots do not run script. However you also have the problem that roughly 10% of your human traffic don't either. You could place a message at the bottom of the form asking the user to enable Javascript to submit the form. With the message itself hidden by JS so that users with it enabled don't see it.


Identify robots using IP and User-Agent

For those bots you can positively identify as crawlers by IP and User-Agent you can obviously prevent them from submitting the form however most spammers will spoof the agent and go through proxies and other cloaking mechanisms. Identifying a bot as a bot 100% of the time is the holy grail webmasters are seeking so if there was a way of doing this CAPTCHAs wouldn't be needed in the first place.


Using CAPTCHAs to help digitise books for online use

This is a neat idea where you know that those 10 or more seconds spend deciphering the image has not been a total waste of your time. The reCAPTCHA is based on a small portion of word from a book. The image is a section from a scanned version of a page and your answers are compared with other peoples responses to validate the likelihood of a sentence being correct. This is a good idea as even if you got one word wrong from a sentence you could still pass the test if the rest of the words were correct as the system checks the answer you gave for new words that have not been validated with words it knows the answer for.


Using hidden input fields to trick robots

This idea involves adding hidden input fields to your form which you want the robot to complete but not the human. When the form is submitted you check whether a value has been added to this field and if so you can block the request.

You can use either type="hidden" to hide the input or preferably use CSS and a class name that relates to a style e.g display:none; A bot could easily detect the element was hidden and ignore it the same way it could easily read inline styles to work out it was hidden. However with a class name it would have to read in the stylesheet to find out whether the class related to a hidden style or not which is obviously more effort but not impossible.

Also the aim is to trick the bots into filling it out without also tricking any form auto-complete systems such as Googles toolbar from doing the same. I have found problems with older versions of the toolbar when you give the input a name such as "EmailConfirm2" it would complete it as it mentioned a word used within the autofill profile. You could give the field a totally random name but then a clever bot would ignore it knowing it was a trick.

You can give it a name that relates to other visible form elements but prefix it or modify it slightly. Also make sure you place the field outside the flow of the other visible controls as I have found with Googles latest toolbar that it will complete inputs hidden with CSS if they are placed between other visible elements e.g between Name and Email or within Address1 and Address2. Therefore this method is not totally reliable.


Check submitted values for similarity

Also a lot of spammers will submit the same value for all form fields for example on a simple registration form of Name, Email, Confirm Email, Password and Confirm Password the spambot will enter an email address for all 5 controls. Unless you set up validation rules to ensure that email addresses are not used for passwords or names then it would submit the form. What you could do is check whether the same value has been used for Name, Email and Password and block the user. Only a percentage of bots do this and I myself when testing a new site often supply the same email address for all parts of a registration form to quickly get on the system.


Question and multi-part CAPTCHAS

This type which is not as popular as the common garden CAPTCHA is where the user is asked a question about the image. You may have four numbered animals on the image and the question would be "Which one makes the noise mooo" and you would have to pick the image related to the cow. Or the question maybe "what colour is the sky" which you may answer well its England in January so its grey and then find yourself blocked. The problem is making enough questions that can only be interpreted in one way as you are basing your CAPTCHA on a subjective question that you hope everyone will answer the same way.

Another form of this CAPTCHA which I have started using myself is the combination CAPTCHA where the distorted image holds a series of numbers. The user is then presented with a sum based question based on those numbers for example "Subtract the second number from the first and then multiply it by the third number". The sum and image is generated server side with the answer held in a database related to a key. Only the key and the question is passed to the client and the user has to answer the question within a set time period and within a set number of attempts to pass. Not only does this method not use Javascript so its available to all users it also has the affect of weeding out users who cannot do simple math.

A variation on this math based system which I have just read myself tonight so was pretty interested to see other people using is a similar math based solution when the question is only shown to the user if Javascript is disabled. If its enabled then Javascript is used to solve the CAPTCHA behind the scenes using some encrypted keys. This way the user is spared the agony of remembering their junior school times tables :). This solution is available as plug in so can be used by those not wanting to write their own.


So which one do I use?

The problem with the IT world is that its full of people who like a challenge and want to prove they can do anything. Therefore you will always have developers who will spend time writing clever bots designed to beat any form of CAPTCHA. So as with all security methods the best approach is a layered one that makes use of multiple techniques. The idea being the more hoops there are to jump through correctly the more likely you are to trip up some of those devious spammers and hackers.

If you can make your CAPTCHA solution slightly different from all the others then you also have a good chance of it not being beaten. Unless you are offering a golden honeypot on the other side of the submitted form i.e free email then there is no money in defeating it. If you are just a regular site then you want to stop the majority of spammers without making it too much trouble for your users to complete. Remember a lot of spammers are human as well so you will never stop all spam.

Related Links