I was updating my ISAPI rules the other day, adding new rules to identify XSS hacks and new SQL injection fingerprints, when I came across some articles on the web about banning image hot linking, which I thought was a cool idea. If you have spent time and effort designing images, there is nothing worse than someone stealing them or, even worse, hot linking to them so that your bandwidth is used up hosting their images!
I implemented the rules with a redirect to a logging page that recorded the referrer in my traffic DB, so that I could see what sort of requests were being made for images from pages outside the site they belonged to. After a week or so of logging I had enough data to conclude that I wouldn't be able to implement a hot linking ISAPI rule in the majority of the commercial applications I work on. The reason is simple: most sites I develop send out one or more HTML emails to users, and those emails themselves hot link to images and logos, e.g. using full URLs back to the image on the webserver. Unless I could come up with a rule that handled every type of webmail or email client, there would always be someone somewhere in the world opening their latest site-generated email and clicking the "load images" button only to get nothing in return.
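The kind of tally described above can be sketched in a few lines of Python. This is only an illustration and not the logging page itself: it assumes the logged referrers have already been pulled out of the traffic DB into a plain list of strings, and the function name and sample hosts are made up for the example.

```python
from collections import Counter
from urllib.parse import urlsplit


def referer_hosts(referers):
    """Count how many logged image requests came from each referring host."""
    counts = Counter()
    for ref in referers:
        try:
            # hostname is lowercased by urlsplit and is None for blank values
            host = urlsplit(ref.strip()).hostname
        except ValueError:
            host = None  # malformed URL, e.g. a broken IPv6 literal
        counts[host or "(blank or unparseable)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample of logged referrers
    logged = [
        "http://poczta.example.pl/inbox?id=1",
        "http://Poczta.example.pl/read",
        "",
        "http://www.thief.example/page.html",
    ]
    for host, hits in referer_hosts(logged).most_common():
        print(hits, host)
```

Sorting by hit count makes it easy to spot which webmail hosts would have been blocked by a blanket rule.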
There are so many different mail clients out there, developed in hundreds of countries, that no rule could keep up with all the possibilities. From my logging alone I saw at least 2 different Polish mail clients as well as a number of Russian, Chinese and god knows what else. So if you are developing a commercial application that has to send out HTML marketing or other forms of email, I would advise against blocking hot linking.
However, for small or personal sites there is nothing wrong with implementing a rule to ban hot linkers. One of the cool ideas I read about was to return a banner advert or some other imagery that would annoy the user. If you are going to use these rules, though, you should know that as soon as the linker finds out you have blocked them they will just download the image from your site and host it themselves if they really want it. From what I have read, the original reason for this type of rule was to switch it on periodically to log the sites that are linking, maybe without even blocking the image, and then to issue cease and desist orders to the offenders.
The ISAPI Rules for IIS
I have to work with IIS at my current company and we use both 32 bit and 64 bit servers which means we have different versions of the ISAPI DLL installed on each and therefore slightly different syntax. The component I use is ISAPI Rewrite.
You will notice that the rules are slightly different due to the regular expression engine differences between versions.
Both versions run a number of conditional checks to ensure that:
-The referer is not blank
-The referer is not the current site, which is obviously okay
-The referer is not an image search bot (I only look at the most popular robots)
-The referer is not an email client
-The user-agent is not a popular search engine bot
-The requested file is a gif, jpeg, jpg, png or bmp image
If all those conditions are matched I redirect to a 403 forbidden page.
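The checks listed above can be sketched as a single decision function in Python, which is handy for trying sample requests without touching IIS. This is a rough model and not the ISAPI Rewrite implementation: the patterns are adapted from the rules in this post, the function name and sample hosts are hypothetical, and it assumes Python's re module treats these expressions the same way the rewrite engine does.

```python
import re

# Keyword lists adapted from the rules in this post; matched case-insensitively.
SEARCH_REFERER = re.compile(
    r"https?://(?:images\.|www\.|cc\.)?"
    r"(?:cache|mail|live|google|googlebot|yahoo|msn|ask|picsearch|alexa)",
    re.I,
)
MAIL_REFERER = re.compile(
    r"https?://.*(?:webmail|e?mail|live|inbox|outbox|junk|sent)", re.I
)
BOT_AGENT = re.compile(r"google|yahoo|msn|ask|picsearch|alexa|clush|botw", re.I)
# Not anchored at the end, mirroring the RewriteRule pattern.
IMAGE_PATH = re.compile(r"\.(?:gif|jpe?g|png|bmp)", re.I)


def should_block(host, referer, user_agent, path):
    """Return True when the request looks like a hot link to an image."""
    if not IMAGE_PATH.search(path):
        return False  # not an image request
    if not referer:
        return False  # blank referer is let through
    if re.match(r"https?://" + re.escape(host), referer, re.I):
        return False  # request came from the site itself
    if SEARCH_REFERER.match(referer):
        return False  # image search or cache
    if MAIL_REFERER.match(referer):
        return False  # looks like a webmail client
    if BOT_AGENT.search(user_agent):
        return False  # search engine bot
    return True


if __name__ == "__main__":
    print(should_block("www.mysite.com", "http://www.thief.example/page.html",
                       "Mozilla/5.0", "/images/logo.png"))
```

Because the mail keywords can match anywhere in the URL, a referer such as http://presentations.example/ (which happens to contain "sent") is let through as well, so the keyword lists are a blunt instrument in both directions.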
Version 2.7 ( 32 bit server - httpd.ini)
Notice that I capture the host in the first conditional and then use a back reference to that matched value in the third conditional, which ensures that only referers from hosts other than the current site get matched.
RewriteCond Host: (.+)
RewriteCond Referer: .+
RewriteCond Referer: (?!https?://\1.*).*
RewriteCond Referer: (?!https?://(?:images\.|www\.|cc\.)?(cache|mail|live|google|googlebot|yahoo|msn|ask|picsearch|alexa)).*
RewriteCond Referer: (?!https?://.*(webmail|e?mail|live|inbox|outbox|junk|sent)).*
RewriteCond User-Agent: (?!.*(google|yahoo|msn|ask|picsearch|alexa|clush|botw)).*
RewriteRule .*\.(?:gif|jpe?g|png|bmp) /403.aspx [I,O,L]
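The back reference trick in the first and third conditionals can be tried out in Python. This is a sketch under a loud assumption: it imitates the way the conditions are combined into one expression by joining the Host and Referer values with a newline, which is a made-up stand-in and not how ISAPI Rewrite actually concatenates them. The function name and sample hosts are hypothetical.

```python
import re

# Group 1 captures the host; \1 inside the negative lookahead then refers
# back to it, so the referer must NOT start with http(s)://<host>.
# A back reference matches the captured text literally, so the dots in the
# host name are not treated as wildcards.
COMBINED = re.compile(r"(.+)\n(?!https?://\1.*).+", re.I)


def is_foreign_referer(host, referer):
    """True when a non-blank referer does not point back at the given host."""
    return COMBINED.fullmatch(f"{host}\n{referer}") is not None


if __name__ == "__main__":
    print(is_foreign_referer("www.mysite.com", "http://www.thief.example/page.html"))
    print(is_foreign_referer("www.mysite.com", "http://www.mysite.com/gallery.html"))
```

Note that the comparison is against the exact Host value, so in this sketch a referer of http://mysite.com/ counts as foreign when the request was made to www.mysite.com.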
Version 3 (64 bit server - .htaccess)
RewriteCond %{HTTP_REFERER} ^.+$
RewriteCond %{HTTP_REFERER} ^(?!https?://(?:www\.)?mysite\..*) [NC]
RewriteCond %{HTTP_REFERER} ^(?!https?://(?:images\.|www\.|cc\.)?(cache|mail|live|google|googlebot|yahoo|msn|ask|picsearch|alexa).*) [NC]
RewriteCond %{HTTP_REFERER} ^(?!https?://.*(webmail|e?mail|live|inbox|outbox|junk|sent).*) [NC]
RewriteCond %{HTTP_USER_AGENT} ^(?!.*(google|yahoo|msn|ask|picsearch|alexa|clush|botw).*) [NC]
RewriteRule .*\.(jpe?g|png|gif|bmp) /403.aspx [NC,L]
Quirks and Differences
As always, nothing is ever simple, especially when you want to implement the same rule across two different versions of an application. As well as the obvious syntax differences with the flags and HTTP header names, I found the following:
- Using the IIS converter tool to convert the 2.7 rules to version 3 did not convert all the rules. It could not handle the back references, so in the v3 rules I match the site domain explicitly.
- The negative lookahead assertions differ between versions. I could not get them working in 2.7 without putting the trailing .* outside the grouping, e.g. (?!https?://\1.*).* whereas in v3 it sits within the grouping, e.g. ^(?!https?://(?:www\.)?mysite\..*)
- The documentation recommends NOT using the ^ and $ when using rules with conditions because internally all conditions are combined together to create one rule and this can lead to unexpected behaviour.
- I could not get the Ignore Case flag [I] working in version 2.7 with the negative lookaheads. As soon as I added the flag the rules would not match. This does not seem to be a problem in version 3, where the equivalent flag [NC] works fine.
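The lookahead placement quirk can be explored outside IIS with Python's re module. The sketch below is only one possible explanation and is an assumption, not something confirmed against the ISAPI Rewrite documentation: it supposes that the 2.7 engine requires a condition to match the whole header value, while v3 is satisfied by a match at the start. A lookahead consumes no characters, so on its own it can never cover a whole non-empty string. The helper names and sample hosts are hypothetical.

```python
import re

# v3 style: the trailing .* sits inside the lookahead, nothing is consumed.
INSIDE = re.compile(r"^(?!https?://(?:www\.)?mysite\..*)", re.I)
# 2.7 style: the trailing .* sits outside and consumes the rest of the value.
OUTSIDE = re.compile(r"(?!https?://(?:www\.)?mysite\..*).*", re.I)


def prefix_match(pattern, value):
    """Match at the start of the value (assumed v3 behaviour)."""
    return pattern.match(value) is not None


def whole_match(pattern, value):
    """Match the entire value (assumed 2.7 behaviour)."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    foreign = "http://www.thief.example/page.html"
    own = "http://www.mysite.com/gallery.html"
    for name, pattern in (("inside", INSIDE), ("outside", OUTSIDE)):
        for value in (foreign, own):
            print(name, value, prefix_match(pattern, value), whole_match(pattern, value))
```

Under that assumption the "inside" form can never satisfy a whole-value match for a foreign referer, while the "outside" form works in both modes, which is consistent with what I saw.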
Apart from those quirks, both sets of rules have been tested on live systems and work fine. However, if you are going to use rules such as these you will always run into problems with the new mail clients and user-agents that come along oh so very frequently, and unless you constantly update your rule files a considerable percentage of the requests you block will be false positives.
The full ISAPI Rewrite documentation can be found here: http://www.isapirewrite.com/docs/