Wednesday, 30 May 2012

Overcoming the Demand 5 TV Catchup Performance Problem

Solving the Demand 5, Flash Movie problem - and other Catch Up TV website issues

If you are from the UK then you may often like to catch up on missed TV programmes by watching the online TV catch up services such as BBC iPlayer, Channel Four's 4OD and Channel 5's TV Catch Up service called Demand 5.

The past few nights I have been trying to catch up on a number of programmes shown on Channel 5 as for some reason Five Star has stopped showing the latest episode of Burn Notice on a Sunday afternoon and I like to watch other shows (when they are available) like NCIS and The Mentalist etc so I try to use Channel 5's catch up website Demand 5.

I really wish they would buy the rights to show for longer so that you could watch a programme for more than just 7 days but then it's not called catchup TV for nothing and most TV programmes are made in the USA.

However when I use the Demand 5 website to watch a programme I experience a very annoying problem which affects every browser and computer I tried, including Chrome, Firefox, Safari and IE 8 on a Sony Vaio Windows XP 32 bit Dual Core and IE 9 on a Dell Windows 7 64 bit Quad core.

The Demand 5 catchup TV problem is that the pre-show adverts play fine, but then after anything between a few seconds and a couple of minutes of the actual show the programme would either freeze up and not start playing again at all, or play for a while, stop for a few seconds as if it was buffering more content, and then play for another few seconds - totally unwatchable.

I couldn't even move the skip bar backwards or forward and the pause / play button just didn't work at all.

A refresh would reload the page, show the adverts before the programme again and then the problem would re-occur. It was all very annoying.

After a number of seemingly ignored comments to Demand 5 complaining about them not fixing the problem and after reading on almost every show on the site that a large number of other people have had similar problems I tried solving the issue myself.

If you go to any show on Demand 5 and read the comments at the bottom, e.g. www.channel5.com/shows/, you will see comments along the following lines.

"Really poor transmission of episode 23. Can't get more than 20 seconds of coherent dialogue and then it freezes"
"Impossible to watch because of the poor service from Demand Five's website. Even worse than Karen's experience."
"Why is demand 5 not working? It hasn't worked for several weeks. You don't get this with iplayer!!!"
"doesnt seem to be able to play anything"
"wow the longest i got to watch before it froze was 50sec. i did give up 5min in though. I'm with sue off to iplayer which works."
And those are just a snapshot of the many comments on a couple of shows I wanted to watch today all complaining about the Demand 5 website video freezing or not playing at all. 

Demand 5 really need to fix this problem if they want to keep visitors coming to the site.

Due to my own comments and complaints not being answered I tried fixing the problem with Demand Five freezing myself - only because I really wanted to watch the latest episode of NCIS.

I wasn't going to step through all their custom JavaScript and I have no idea what they are doing server side so proper debugging is out of the question but what you can do in situations like this is turn everything else off and then back on again one by one to see if any section of the browser or website is causing the problem.

Many sites run client side code that constantly rewrites the DOM to beat the plugins designed to hide adverts, images and flash. Without knowing whether Demand 5 does this, I could at least see if turning off those parts of the site helped, as they may be running code to constantly re-insert adverts into the HTML to replace any removed by plugins like AdBlocker.

If they were doing something like this (and I don't know if they are) then it's possible that a race condition might occur between Demand 5's own advertising code and any plugins or virus checkers (which have built in anti-banner options) that constantly scan and rebuild the DOM. Only proper debugging will prove if that is happening.

Debugging the Demand 5 website Performance Problem

So to get a fix for the Demand 5 problem as soon as possible the first things I did were all the normal things that developers try when met with similar issues. If you find similar problems on other sites or with plugins for websites you should always try this list first to rule them out as reasons for the problem.
  • Disabling all cookies - the Web Developer Toolbar in Firefox is great for this and if you are not signing into the website there is no need for cookies to be enabled anyway.
  • Disabling Java - Most websites don't even use Java Applets anymore so it's not really needed until you find a site that actually makes use of it.
  • Disabling JavaScript - Most websites that show TV content actually require JavaScript to be enabled for their site to work at all, even though a basic Flash OBJECT or VIDEO element outputted with server side code to play a movie doesn't require it. However companies like Demand 5 or iPlayer don't do this because it would mean their content could easily be stolen or watched overseas by non UK citizens, so JavaScript is actually required to load in the adverts and programming in chunks.
  • Turn off all Flash tracking by disabling Flash cookies. You do this by right clicking on a flash movie, click the Global Settings option and then choose "Block all sites from storing information on this". Unless you really want another way for advertisers and websites to track your online movements there is little need for this option to be enabled. You may want to check your video and microphone options as well.

Even after all this the Demand 5 flash videos were still having problems playing and although the options I just disabled are worth doing anyway to prevent online tracking by advertisers it didn't fix the Demand 5 TV catch up problem.

Flash is a well known CPU killer on websites and I have seen whole PCs crash due to one too many flash objects on a web page being left open too long. Just open a page with a few flash movies and watch the CPU rise in your task manager if you want proof - Chrome seems particularly bad for this but despite this many sites continue to try and write their whole website in Flash which is a BAD IDEA!

However I did notice that the page that held the movie also contained a large number of other banner adverts that used both flash and animated gifs and I wondered whether there was some sort of problem occurring due to all the techniques advertisers now use to try and overcome advert blockers.

Basically websites or plugins use a timer to re-insert banners that have been removed from the HTML DOM by advert blocking plugins like AdBlocker which also uses a timer to remove any adverts re-inserted this way. Both sets of code doing the opposite to each other every split second - not good for performance!

Because both the advertiser and blocker are using the same methods to scan and modify the DOM constantly it basically turns into one big performance nightmare in which your DOM is constantly being re-written on the fly. The less client code that runs the better, for a multitude of reasons: KISS IT (Keep It Simple Stupid).
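To get a feel for why this tug of war is wasteful, here is a minimal simulation. It is only a sketch: the page is a plain array rather than a real DOM, the timers are replaced by a simple loop, and the names page, reinsertBanner and removeBanner are made up for illustration. I have no idea whether Demand 5 or any particular blocker works this way.

```javascript
// Hypothetical stand-in for the page: a plain array instead of a real DOM.
var page = ["video-player"];
var writes = 0;

// What an advertiser script might do on every timer tick: put the banner back.
function reinsertBanner() {
    if (page.indexOf("banner") === -1) {
        page.push("banner");
        writes++;
    }
}

// What a blocker might do on every timer tick: take the banner out again.
function removeBanner() {
    var i = page.indexOf("banner");
    if (i !== -1) {
        page.splice(i, 1);
        writes++;
    }
}

// Simulate 100 ticks of both timers firing one after the other.
for (var tick = 0; tick < 100; tick++) {
    reinsertBanner();
    removeBanner();
}
```

After the loop the page is back exactly where it started, yet the writes counter shows it was modified on every single call. On a real page each of those writes can trigger layout and repaint work that competes with the video player.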

The Fix for the Demand 5 website Performance Problem

I opened Firefox and went to install a plugin I use on my other computer called Flash Blocker but I noticed a new plugin had been released called Image and Flash Blocker 0.7. I thought I would try it out.

At first I thought something hadn't loaded as I couldn't see any options under my Tools menu however you need to use the context menu (right click the mouse button) to use the add-on and see the various options.

Choose "Image and Flash Blocker" from the context menu and then select "Images off, Flash off" and hey presto most of the imagery, banner adverts, flash banners and movies disappear. In fact most of the Demand 5 page is blank without these features enabled.

The flash movies are replaced with a little red circle with a white "F" (for Flash) in the middle and if you want that particular flash movie to appear you just click it.

I turned every image and flash movie off on the latest episode of the show I wanted to watch (e.g. Burn Notice, NCIS, Archer, The Mentalist etc.) and then I clicked the main movie screen to turn the video back on.

When the adverts had finished playing the TV show played without a single stall, stoppage or flicker. Hey presto problem solved! Big slap on the back for moi.

Remember - too much whizz bangery can cause performance issues

Remember images, Applets, ActiveX objects, Flash and all the other fancy whizz bangery that comes with HTML 5, CSS 3 and modern JavaScript libraries is all well and good but more often than not it can cause a massive performance overhead on your computer. The best tactic is to turn it all off and then only turn on what you need once you know you need it and the client's browser supports it.

In the case of fixing Demand 5 catch-up TV it was definitely a case of the less the better, as it seemed that too much was going on behind the scenes to allow their TV shows to play smoothly. What exactly is happening I don't know without access to their source code but this is definitely a workaround for the Demand 5 performance problem that works!

Hopefully this blog article will help others overcome the same problem. I have tried writing comments with a link to this article on most of the shows I watch as they all contain similar complaints but for some reason they don't like the fact I put a link into my comment or try to help people solve their own technical issues.

So if you watch Demand 5 and have performance problems remember - turn off images and flash and then turn on the only flash movie you need - the TV programme you are trying to watch and then enjoy it without any flickering or stopping every 20 seconds.


Friday, 25 May 2012

Debugging Regular Expressions

Regular Expressions - A guide to debugging

Regular expressions are a powerful tool to master and a great skill to learn as a good expression can save many lines of procedural code. They can be used for many tasks and are especially great for validating user input such as form validation as well as great for cleaning content such as HTML entered from a content management system. When used with remote content grabbers they are great for extracting sections of text from larger pieces as well as useful for quickly reformatting it.

Although they are a useful and powerful tool when used correctly, it's also quite easy to use them either accidentally or on purpose for destructive purposes. For example a sure sign that a regular expression has gone haywire and needs looking at is when the CPU on the computer running the expression suddenly spikes to 100%.

This can often happen for numerous reasons but a couple of the most common are when the expression gets itself into an internal loop because of a partial match that then matches multiple sections of text, a complicated pattern that matches nothing or because of a negative lookahead. All of these problems may not be spotted if the text you are matching against is quite small but when the text is quite big the problem amplifies. In fact some people have designed complicated patterns that are specifically created to match nothing but consume CPU whilst doing so. For more information about the dangers of regular expressions and SQL Denial of Service Attacks read the following two articles.
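As a rough illustration of the partial match problem, the sketch below times a pattern with a nested quantifier against one without. The patterns, the test string and the timeMatch helper are my own made up examples, not taken from any real attack, and the input is kept deliberately short so nothing hangs.

```javascript
// Illustrative only: nested quantifiers give the engine many ways to split
// the same run of "a" characters, so a near-miss forces it to try them all.
var reSlow = /^(a+)+$/;

// Matches the same strings without the nested quantifier: only one way to match.
var reFast = /^a+$/;

// Time a single test() call in milliseconds.
function timeMatch(re, text) {
    var start = Date.now();
    var matched = re.test(text);
    return { matched: matched, ms: Date.now() - start };
}

// A run of "a" followed by a character that makes the whole match fail.
// Kept short on purpose: each extra "a" roughly doubles the work for reSlow.
var nearMiss = new Array(19).join("a") + "!";   // 18 "a" characters then "!"

var slow = timeMatch(reSlow, nearMiss);
var fast = timeMatch(reFast, nearMiss);
```

Both patterns correctly report no match, but if you lengthen the run of "a" characters a few at a time you should see slow.ms climb steeply while fast.ms stays flat. That is the small text versus big text effect in miniature.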


Common Problems

Sometimes trying to work out what complex regular expressions are doing can be quite hard especially if you didn't write it in the first place. For a novice looking at an expression like the following it could be quite confusing trying to work out what is going on:
var re_getme = /^<([^> ]+)[^>]*>(?:.|\n)+?<\/\1>$|^(\#?([-\w]+)|\.(\w[-\w]+))$/i
Therefore when you get stuck with a complex expression it's a good idea to break it down into tiny little steps and build it up bit by bit. If your pattern isn't matching what you expect then comment it out and start a new expression. Start with a cut down simple version of your pattern and get it matching or replacing something and then build it up from there.
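One way to apply this to the expression above is to split it at the top level OR and check each half on its own before joining them back together. This is only a sketch: the variable names and the sample strings are mine.

```javascript
// Step 1: the first alternative on its own - an element with a matching closing tag.
var reTagPair = /^<([^> ]+)[^>]*>(?:.|\n)+?<\/\1>$/i;

// Step 2: the second alternative - an optional "#" plus a name, or a ".name" form.
var reSelector = /^(\#?([-\w]+)|\.(\w[-\w]+))$/i;

// Step 3: once each piece behaves, join them back together with an OR.
// The tag pair stays first so its \1 back reference still points at group 1.
var reCombined = new RegExp(reTagPair.source + "|" + reSelector.source, "i");

// Sample inputs (hypothetical) to try against each piece and the whole.
var samples = ['<div class="x">hello</div>', "<div>hello</span>", "#main", ".item", "not a selector"];
var results = [];
for (var i = 0; i < samples.length; i++) {
    results.push(reCombined.test(samples[i]));
}
```

If one of the samples gives an unexpected result you only have to look at the half that handles it, rather than the whole expression at once.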

Even for performance reasons it might be worthwhile considering breaking a complicated expression down into multiple expressions. If you are using a pattern to replace certain content then using multiple expressions is a good idea especially if your existing expression contains lots of OR statements or negative matches.

Multiple ways to skin a cat

The great thing about regular expressions however is that they offer you the ability to handle a task in multiple ways. For example both of the following expressions look for HTML tags:
var reHTML1 = /<[^>]+?>/;

var reHTML2 = /<(.|\n)+?>/;
The first one matches an opening angle bracket and then uses a negated character class to match any character apart from the closing bracket one or more times before matching the closing bracket itself. The second one matches an opening angle bracket, then any character or a newline one or more times, and then a closing bracket.

Therefore if you are getting stuck and have spent a lot of time trying to crack an expression one way but it's just not working, try to attack it from a different angle.
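A quick way to convince yourself that the two approaches agree is to run both over the same text. In this sketch I have added the global flag so that match returns every tag, and the sample HTML is made up.

```javascript
// The two patterns from above, with the g flag added so match() returns all tags.
var reHTML1 = /<[^>]+?>/g;
var reHTML2 = /<(.|\n)+?>/g;

// Made up sample text.
var html = '<p class="intro">Hello <b>world</b></p>';

var tags1 = html.match(reHTML1);
var tags2 = html.match(reHTML2);
```

Both arrays hold the same four tags, so for this input either expression will do the job.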

Negative Matching

Trying to do negative matching can be quite complicated, especially if you are trying to match a string rather than one character or a group of characters. For example this negative expression matches any characters that are not between A to Z or 1 to 9 so that they can be removed.
[^A-Z1-9]+
Because this expression doesn't require any ordering of the characters it is not matching, it's not hard for the engine to carry out. The following expression is a bit more complicated as it's doing a negative lookahead to make sure the text is not a URL containing one of the specified strings in the OR list.
(?!https?://.*(webmail|e?mail|live|inbox|outbox|junk|sent)).*

Expressions that carry out negative lookaheads are okay as long as the text being searched is not too large however trying to do complex lookaheads with large pieces of text is a sure way of maxing out your CPU.
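Here is a small sketch of both kinds of negative match in use. Two things are my own assumptions: the sample strings, and the ^ anchor on the lookahead pattern, which I have added because without it the engine can simply start the match one character further along and succeed anyway. The slashes are escaped because this is a JavaScript regex literal.

```javascript
// Negated character class: strip everything that is not A-Z or 1-9.
var cleaned = "AB-12 cd!9".replace(/[^A-Z1-9]+/g, "");

// Negative lookahead, anchored at the start (the ^ is an assumption, see above).
var reNotMail = /^(?!https?:\/\/.*(webmail|e?mail|live|inbox|outbox|junk|sent)).*/;

// Made up URLs: the first should pass, the second should be rejected.
var okUrl = reNotMail.test("http://www.example.com/news");
var mailUrl = reNotMail.test("https://webmail.example.com/inbox");
```

Note that the character class has no ordering to worry about, whereas the lookahead has to scan ahead through the rest of the text for each word in the OR list, which is where the cost comes from on large inputs.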

Using Placeholders

If you are trying to carry out replacements of text that involve a negative match then you may want to consider using a placeholder to make your code simpler and perform better. Instead of trying to negatively match a string that you don't want to replace you do a positive match on this string replacing it with a placeholder then do your replacement before putting the value back in.

For example in my HTMLEncoder object I have a method that HTML Encodes text passed to it. When it comes to replacing ampersands with entities I cannot do a straight replacement as I want to keep any ampersands already in the text that are there as part of an existing numerical or HTML entity as I cannot guarantee the text I am encoding doesn't already contain some encoded characters. Therefore instead of trying to do a negative match for &# I do a positive match replacing that string with a placeholder. I then do my replacement of all other & ampersands before putting back my placeholder values.
 
// replace ampersands from existing numerical entities with a placeholder
s = s.replace(/&#/g,"##AMPHASH##");

// replace all remaining ampersands with their entity version
s = s.replace(/&/g, "&amp;");

// put back in my placeholder values for numeric entities
s = s.replace(/##AMPHASH##/g,"&#");
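To try this out in isolation, the three steps can be wrapped in a small function. This is a sketch only: encodeAmpersands is a made up name rather than the actual method in my HTMLEncoder object, and the lookahead version is included purely as a cross check that both approaches give the same output.

```javascript
// Placeholder approach: hide "&#", encode the remaining "&", then restore "&#".
function encodeAmpersands(s) {
    s = s.replace(/&#/g, "##AMPHASH##");
    s = s.replace(/&/g, "&amp;");
    s = s.replace(/##AMPHASH##/g, "&#");
    return s;
}

// Made up sample containing one bare ampersand and one numeric entity.
var input = "Fish & chips &#169; 2012";
var viaPlaceholder = encodeAmpersands(input);

// The same job done with a negative lookahead, for comparison only.
var viaLookahead = input.replace(/&(?!#)/g, "&amp;");
```

Both give "Fish &amp; chips &#169; 2012". The placeholder version assumes the text never already contains the string ##AMPHASH##, so pick a placeholder that cannot appear in your content.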

Re-use objects and Cache Expressions

If you are using a regular expression object and carrying out multiple expressions or executing the same expression multiple times then you should look into caching the expression if possible. If you are using wrapper functions to carry out certain tasks then try not to create a new regular expression object on each call to the function. The creation and destruction of a COM object can be very expensive so try to use global objects that are created once and used by all your functions and then destroyed at the bottom of your page or script.
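The same idea carries over to JavaScript, even though there is no COM object involved. The sketch below keeps compiled expressions in a simple lookup so a wrapper function doesn't build a new one on every call. The names regexCache, getRegex and stripTags are made up for illustration.

```javascript
// Simple cache of compiled expressions, keyed by flags and pattern.
var regexCache = {};

// Return the cached RegExp for this pattern and flags, compiling it only once.
function getRegex(pattern, flags) {
    var key = flags + "/" + pattern;
    if (!regexCache.hasOwnProperty(key)) {
        regexCache[key] = new RegExp(pattern, flags);
    }
    return regexCache[key];
}

// Example wrapper that re-uses the cached expression on every call.
function stripTags(text) {
    return text.replace(getRegex("<[^>]+?>", "g"), "");
}

var first = stripTags("<b>Burn</b> Notice");
var second = stripTags("<i>NCIS</i>");
```

One thing to watch: a cached expression with the g flag remembers its lastIndex between test or exec calls, so this kind of sharing is safest with replace, which starts from the beginning each time.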

Differences in Regular Expression Engine

Although most languages have inbuilt support for regular expressions the quality of their engines differ as well as the syntax of the expressions themselves. I have found Microsoft's Visual Basic Regular Expression Engine is very poor and doing complex expressions in VBScript or ASP Classic can often cause CPU spikes on the web server. I have also seen certain patterns that work in other languages cause problems with this engine when the text being matched contained Unicode characters.

The JScript engine is a lot better than the VBScript engine and if you are using ASP classic then I would recommend using server-side JScript for any complex regular expressions. Not only is the engine better you will have much more flexibility with your code such as using lambda expressions and being able to pass in a function as the argument for the replacement value.
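As a small example of passing a function as the replacement value, the sketch below capitalises the first letter of each lower case word. The function name and sample text are made up, and I have written it as plain JavaScript which should behave the same way in server-side JScript, though I haven't shown any ASP specific code here.

```javascript
// Capitalise the first letter of every word that starts with a lower case letter.
function titleCaseWords(text) {
    return text.replace(/\b([a-z])(\w*)/g, function (match, first, rest) {
        // first is the leading letter, rest is the remainder of the word
        return first.toUpperCase() + rest;
    });
}

var title = titleCaseWords("burn notice and ncis");
```

The callback receives the whole match followed by each captured group, so any logic you like can decide what goes back into the string, which is something a fixed replacement string cannot do.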

In SQL Server there is no inbuilt regular expression function like there is in MySQL, but you can use LIKE or PATINDEX for basic patterns. If you are using SQL 2005 - 2008 you can use the CLR to build user defined functions that hook into .NET and allow you to carry out proper regular expression functions. In SQL 2000 you can use OLE and the built in extended stored procedures sp_OACreate, sp_OASetProperty and sp_OAMethod to instantiate the VBScript.RegExp COM object to carry out regular expression tasks. However this method can be quite an overhead and should be used carefully, especially if the create and destroy methods are wrapped in a UDF, as it will mean a new COM object is created every time the function is called. We all know object re-use is a key ingredient when it comes to performance, especially with COM objects.

Knowing when you have a problem

If your regular expression code is client side JavaScript or VBScript then a good indicator that you have a problem is when your browser hangs or pops up an unresponsive script warning. Check your task manager and investigate the CPU usage, as a long running expression that is killing your machine will surely be maxing out your PC's CPU.

If the code is server-side, either in a database or web server, then watch your server's performance monitor. If you notice the CPU jump up in blocks of 25% (on a quad core), 50% (on a dual core) or 100% (on a single core), stay there for a few seconds and then drop down again, then the chances are it's regular expressions doing the maxing.

Friday, 18 May 2012

Websites For Sale - Buy Now

I am in the process of selling a number of domains and if you are interested please contact me to discuss a price.

Most of these domains also have either code associated with them, Twitter accounts and automatic TweetBots that post to those accounts or post articles to the website.

If you are interested in purchasing this code as well as the domain name then obviously this will cost you more. 

The sites are:

www.hottospot.com
  • domain
  • website
  • Twitter account
  • Tweetbot


Hottospot.com - as it looked like


www.hattrickheaven.com
  • domain
  • website
  • Twitter account
  • Tweetbot


strurl.com
  • domain
  • website

A site I have had to take down due to my hosting company not "agreeing" with people being able to provide shortlinks to their websites!

You can see a screenshot of how the site looked before it was taken offline below.

strurl.com - as it used to look like


The prices are all up for negotiation and you can contact me by using the contact link in the footer of the page or by using this email link: contact me to buy this domain

It's a bit hard to get accurate statistics seeing that I have had to take 2 of the sites down and put up holding pages in their place which is why I am using screenshots from Google Webmaster Tools.

However all 3 sites have been around for over a couple of years so they will all have site authority and a page rank of at least 3 to 4.

Plus they will have lots of back links from news sites, football sites, my own blogs and sites and many other sites - and none from spammy directory sites that are totally bad for SEO!

For example for  www.hattrickheaven.com using http://www.opensiteexplorer.org and Google web master tools I got these details:

  • Page Authority 32/100 
  • Domain Authority 21/100 
  • 6,419 URLs indexed.
  • 134,077  Links to site.

If you are interested in running your own #altnews website then I suggest buying www.hottospot.com as it already has over 15,000 articles in its database and uses WordPress as its back-end.

You should read this article on it or use this email link to discuss purchasing options: contact me to buy this domain

Let me know! 

Thursday, 17 May 2012

Update to the Twitter Hash Tag Scanner Application

This is just a note to anyone who has already bought or is thinking of buying my Twitter Hash Tag Scanner application from my www.strictly-software.com site.

I have just updated the code so that it gets round the new Twitter follower blocking actions they have implemented. If you have purchased it you might have noticed lately that you were getting the value 0 for all the accounts' followers. This is because Twitter had changed the format of their source code.

The new version should get round this change and return correct follower counts.


If you have already purchased the application (and I know if you have from my list of payments) then you can request a new version for free.

Otherwise feel free to pay the small fee of £10 to obtain such a brilliant SEO and Social Media analysis application from my website: http://www.strictly-software.com/applications/twitter-hash-tag-hunter.