Sunday 30 August 2009

Compression Comparison

Comparing Compressor Tools

There are many Javascript compressor tools online and they all do wonderful jobs. What makes the tool I made slightly different is that it allows you to customise a number of compression options which can aid you in getting the best compression rate possible. Whilst most tools offer a simple crunch method and maybe a pack method (changing the code to run through an eval statement to obfuscate the code) they don't allow you to do some simple things that can make a lot of difference such as:

Renaming global objects that are used frequently to reduce size e.g window, document, navigator.

Renaming your own global objects that you may use frequently.

Renaming your own commonly accessed functions to short names.

Replacing calls to document.getElementById with a call to a single letter function e.g $ or G.

These 4 options used together could drastically alter the compression rate of your script.

Also, if you have a small script then choosing to pack it as well as crunch or minify it will most likely increase the size of the output rather than compress it. Packing may be worthwhile if you really want to hide your code's purpose from the casual user, but it's ultimately pointless as anyone can reverse engineer a packed script with a simple line of Javascript, either within the Error Console in Firefox or by using one of the unpacker tools available online.
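To show why packing offers so little protection, here is a minimal sketch. The packed string is a made-up stand-in rather than real packer output, and the variable names are only for illustration. The idea is that a packed script rebuilds its source as a string and hands it to eval, so swapping eval for a function that returns its argument gives the source back.

```javascript
// A stand-in for a packed script (hypothetical, not real packer output):
// the real code is rebuilt as a string at runtime and handed to eval.
var packed = 'eval(function(s){return s.split("").reverse().join("")}(";)1(trela"))';

// Swap the leading eval for a function that simply returns its argument,
// so the hidden source comes back as a string instead of being executed.
var revealer = packed.replace(/^eval/, "(function(src){return src})");
var unpacked = new Function("return " + revealer)();

console.log(unpacked); // alert(1);
```

Only run this sort of thing on scripts you are happy to execute, as the decoding function inside the packed script still runs.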


Different Outputs

Also I note that on a few compressors the output may give a misleading impression of success to the user. If a file has been reduced in size by 30% it has been compressed by 30%. Some tools however will show the new file size as a percentage of the old size, which would be 70%, and that is fine. However having a label that just says "Compressed" next to the figure of 70% may lead some people to believe their file has been compressed by 70% when in fact it's only been compressed by 30%.

For example take this silly example of a function:


function tested(var2){
var fuv = "hello "
+ "mr smith "
+ "how are you ";
var pid = 1000
var ola = 3343

if(var2==fuv){

var rob = function(){

addEvent(pid,"click",function(e){

var donkey = function(e){
if(fuv == pid){
return true;
}
}
})
}
}
}

Now run it through my compressor


Which outputs the following which has reduced the size by 40.65%


function tested(a){var b="hello mr smith how are you ";var c=1000,d=3343;if(a==b){var rob=function(){addEvent(c,"click",function(e){var donkey=function(e){if(b==c){return true}}})}}}


And now this other compressor tool.


Which outputs the following code and in a box labelled "Compression" it has the value 63.67%.


function tested(A){var B="hello "+"mr smith "+"how are you ";var C=1000var D=3343if(A==B){var E=function(){addEvent(C,"click",function(e){var F=function(e){if(B==C){return true;}}})}}}

Now this is actually the size of the new code in relation to the old and not how much the code has been reduced by which is 36.33%. This is not the only tool that does this and I am sure most people will be aware what the figures actually mean. However because my tool does the opposite and shows the percentage that the new code is reduced by it may lead people to believe one tool has a better compression rate than another when in fact it doesn't.
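The difference between the two figures can be sketched in a few lines. The function names and the byte counts used here are made up for illustration and are not taken from either tool.

```javascript
// Percentage the file has shrunk by (a 30% reduction means 30% smaller).
function percentReduction(originalSize, newSize) {
  return (originalSize - newSize) * 100 / originalSize;
}

// New size expressed as a percentage of the old size.
function percentOfOriginal(originalSize, newSize) {
  return newSize * 100 / originalSize;
}

// A hypothetical 1000 byte script that compresses down to 700 bytes.
console.log(percentReduction(1000, 700));  // 30
console.log(percentOfOriginal(1000, 700)); // 70
```

The two figures always add up to 100, so a tool reporting 63.67% as the new size in relation to the old is reporting a reduction of 36.33%.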

I am not claiming my tool is perfect, as it never will be while it uses regular expressions instead of a Java Engine. However most other tools do this as well, and I have spent a lot of time handling issues such as missing terminators which other tools, like the one above, miss. Douglas Crockford's JSMin will just add a new line when a missing terminator is found whereas my tool adds the missing terminator. Other tools will just assume the code has been validated and checked before submission, which of course is the best way to avoid any errors at all.


Compression with GZIP

What's the benefit of minimising your script as compared to using GZIP?

Well, GZIP may offer superior compression but it's a way of delivering the content to the browser, which then uncompresses it. It uses server side processing to generate the compressed files, which may or may not be a problem depending on the load and the type of file being served.

With minification the code can stay compressed on the server as well as the client, plus it will run on older browsers. You also have the benefit that certain minification techniques should actually aid client side performance by reducing the work the browser has to do, e.g. reducing the number of variable declarations or reducing string concatenation by combining the strings together into one variable (see my compressed output).

There is nothing stopping you minifying as well as using GZIP. So if you haven't looked into compression then you should, as it's a very simple and easy way of increasing your site's performance.
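If you want to see how the two techniques combine on your own files, the rough sketch below measures a script before and after a crude minification, with and without GZIP. It assumes Node.js and its built-in zlib module, which is my choice for the example and has nothing to do with how any of the tools above work. The sample script and the three regular expressions standing in for a minifier are made up for illustration.

```javascript
var zlib = require("zlib"); // Node.js built-in module

// Hypothetical sample script, repeated 50 times so there is something to compress.
var source = new Array(51).join(
  "function sayHello(name) {\n" +
  "    // build the greeting\n" +
  "    var greeting = \"hello \" + name;\n" +
  "    return greeting;\n" +
  "}\n"
);

// A crude stand-in for minification: strip line comments, indentation and newlines.
var minified = source
  .replace(/^\s*\/\/.*$/gm, "")
  .replace(/^\s+/gm, "")
  .replace(/\n/g, "");

// Size in bytes of the text once gzipped.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).length;
}

var sizes = {
  original: Buffer.byteLength(source),
  minified: Buffer.byteLength(minified),
  originalGzipped: gzipSize(source),
  minifiedGzipped: gzipSize(minified)
};

console.log(sizes);
```

The exact numbers will depend on the script, so treat the output as a guide for your own files rather than a benchmark.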

Friday 28 August 2009

Firebug So So Slow

Developer Toolbars Slowing Sites Down

Following on from my whinge the other day about how IE 8.0's developer toolbar is causing pages to hang and the CPU to max out (50% on dual core), I have noticed on a few sites since upgrading to Firefox 3.5.2 and Firebug 1.4.2 that I have similar problems.

I have just been to my football site www.hattrickheaven.com and was clicking some of the leagues down the sidebars and wondering why the pages were taking so long to load. I then went to Chrome and tried the same links and the pages loaded as fast as lightning. This made me wonder about Firebug and, lo and behold, disabling Firebug stopped the delay and the pages loaded very quickly. I would like some other people to try the same thing if they have Firebug and let me know the results, e.g. try these links with Firebug enabled and with it disabled:



With Firebug disabled they should load pretty quickly, however with Firebug enabled it's taking anything up to 15 seconds! Also check the Task Manager and see what CPU level Firefox shows, as mine shows 50%.

I don't know if this is something to do with something within my page that Firebug cannot handle but the only validation errors I have are missing ALT attributes.

So as a tip for slow loading sites I would suggest in both IE and Firefox disabling the developer toolbar and Firebug and seeing if that helps speed up the load times. It seems to have worked for a number of sites now which is a shame as both toolbars should be great add-ons if only they could work without slowing everything down.

Tuesday 25 August 2009

Firebug and Highlighter.js Problem Returns

Code Highlighting Disabled with Firebug 1.4.2

The other month I wrote an article explaining how after upgrading to Firebug 1.4.0 all my code highlighting went haywire with bits disappearing all over the place. I thought I had fixed the problem as the code highlighting has been working fine in Firefox for a couple of months now, but I have recently upgraded to Firefox 3.5.2 and Firebug 1.4.2 and now the problem is back in a slightly different form.

Instead of the code I am trying to highlight disappearing or being broken, the code appears but the highlighting just doesn't activate and the code stays uncoloured. There is an error on line 1 of the packed version of highlight.js, which is line 101 of that version once unpacked with my unpacker tool.

return [O.substr(0, r.index), r[0], false]

The error being:

Cannot access optimized closure

If I disable Firebug on the page or the whole add-on then the code highlighting will work in Firefox, so it does seem to be an issue with Firebug clashing with the highlighter code in some way.

I have tried downloading the latest version of the highlighter 5.3 from Software Maniacs but it makes no difference.

The highlighting continues to work in other browsers (IE, Chrome, Opera, Safari etc) and it will work in Firefox when Firebug is disabled.

The last few versions of Firebug have been very buggy in terms of inspection and performance so maybe this is a known bug. I did post a bug at Mozilla last time and I also contacted Software Maniacs. Until this gets resolved I suggest if you really want code highlighting then to use another browser or disable Firebug.

Sunday 23 August 2009

Compression and Unpacking Javascript

Reverse Engineering the Strictly Compressor with the Unpacker Tool

I have just put up a cut down version of my own compressor tool on my website. The compressor has a number of advanced options which allow you to customise the compression process as well as take care of some very common global objects and function calls such as window and document. If you choose these options the tool will add in aliases for these objects and then replace any references to the object within the code with the alias instead. An example of the compression and then its reverse engineering with the unpacker tool is below.

Example Code

The following code is a dummy script that does nothing apart from show the process at work.


/* A make believe example object
and calling function */
var myObj = {

myFunc : function(test1,divID){
// throw in a regular expresion literal
var re = /^\/firefox.+\d+?/gi;
var test2 = "IE";
var some = "vars", anothervar=100;

var somestring = "this is a string "
+"that continues "
+"on a few lines";

if(document.getElementById(test1).value==test2){
window.open("someurl.htm","width=500");
}else{
if(/firefox/i.test(navigator.userAgent)){
document.getElementById(divID).innerHTML="you are using Firefox";
}
}
return;
}
}
myObj.myFunc(document.getElementById('div1').innerHTML,"myDiv");



I will run it through the compressor tool selecting the advanced options:
  • Minify Global Objects.
  • Create a Get function.
  • I have used the default value of G for my get function.

The compressed output which is a 37.76% reduction in size is below.

var _w=window,_n=navigator,_d=document;
G=function(i){return document.getElementById(i)}
var myObj={
myFunc:function(a,b){var re=/^\/firefox.+\d+?/gi,c="IE",d="vars",e=100,f="this is a string that continues on a few lines";if(G(a).value==c){_w.open("someurl.htm","width=500")}else{if(/firefox/i.test(_n.userAgent)){G(b).innerHTML="you are using Firefox"}};return}};myObj.myFunc(G('div1').innerHTML,"myDiv");


Notice how the compressor has added the following lines to the top of the code.


var _w=window,_n=navigator,_d=document;
G=function(i){return document.getElementById(i)}

Now if you are sensible and working with lots of compressed scripts you wouldn't want to have these 2 lines in each script and should place them in a central file that is included on all pages so all your scripts can reference them. Removing these two lines gives us a compression rate of 52.11% which, if you compare it against YUI (39%) and JSMin (27.7%), is pretty good. Even with these 2 extra lines added to the output we are compressing on the same level as YUI. Larger files will do much better and I have had compression ratios of 60-70% on certain files so far.

Some other things to note about the compressed output are:
  • Function parameters have been renamed to use single letters. I don't rename those that are already less than 2 characters in length.
  • Local variables are also renamed to use single letters. I start at a and increment up to z and then if needs be into double letters e.g aa to zz and beyond.
  • Multiple variable declarations have been combined into one.
  • Strings on multiple lines have been joined together.
  • Comments have been removed.
  • Unnecessary terminators have been removed.
  • Global objects window, document and navigator have been renamed to use short aliases.
  • Any reference to document.getElementById has been replaced with a call to the new function G.
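The naming sequence for local variables described in the list above (a to z, then aa to zz and beyond) can be sketched as follows. This is only an illustration of the sequence and not the code the compressor uses. The function name shortName is made up for the example.

```javascript
// Return the short name for the nth renamed variable (0 based):
// 0 -> a, 25 -> z, 26 -> aa, 701 -> zz, 702 -> aaa and so on.
function shortName(index) {
  var name = "";
  var n = index + 1; // work with a 1 based count so that z rolls over to aa
  while (n > 0) {
    var remainder = (n - 1) % 26;
    name = String.fromCharCode(97 + remainder) + name; // 97 is the char code for "a"
    n = Math.floor((n - 1) / 26);
  }
  return name;
}

console.log(shortName(0));  // a
console.log(shortName(26)); // aa
```

A real compressor would also need to skip any name that is a reserved word (e.g. do, if, in) or that is already in use in the same scope, which this sketch does not attempt.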

Now that we have a compressed version of the file let's see what happens if we run this compressed code through my unpacker tool. Don't be fooled by the name: the unpacker not only unpacks Javascript packed with Dean Edwards' packer but also reformats compressed code. If you have used my compressor to do the original compression you get an added bonus, as it will reverse engineer the Get function and also the global object minification to make the resulting code more understandable.

The unpacked code is below.

var myObj = {
    myFunc: function (a, b) {
        var re = /^\/firefox.+\d+?/gi,
            c = "IE",
            d = "vars",
            e = 100,
            f = "this is a string that continues on a few lines";
        if (document.getElementById(a).value == c) {
            window.open("someurl.htm", "width=500")
        } else {
            if (/firefox/i.test(navigator.userAgent)) {
                document.getElementById(b).innerHTML = "you are using Firefox"
            }
        };
        return
    }
};
myObj.myFunc(document.getElementById('div1').innerHTML, "myDiv");


Notice how the extra code that my compressor put in before has been removed and how any global objects that were being referenced through aliases and the Get function have now been replaced with their original values. Not only does this aid readability, it also helps you understand the code's purpose when you can see the original object name that is being referenced.
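As a rough idea of how this reversal could be done with regular expressions, here is a minimal sketch. It is not the unpacker's actual code. The function name expandAliases is made up, and it assumes the alias line and Get function appear exactly as the compressor wrote them and that the Get function name is a plain identifier.

```javascript
// Put the original object names back into code that uses the _w, _d, _n
// aliases and a Get function (G unless another name is passed in).
function expandAliases(code, getName) {
  var name = getName || "G";

  // Remove the Get function definition that was added to the output.
  code = code.replace(
    new RegExp("\\b" + name + "=function\\(i\\)\\{return document\\.getElementById\\(i\\)\\};?\\s*"),
    ""
  );

  // Remove the alias declaration line.
  code = code.replace(/var _w=window,_n=navigator,_d=document;?\s*/, "");

  // Expand calls to the Get function back to document.getElementById.
  code = code.replace(new RegExp("\\b" + name + "\\(", "g"), "document.getElementById(");

  // Expand the global object aliases.
  return code
    .replace(/\b_w\b/g, "window")
    .replace(/\b_d\b/g, "document")
    .replace(/\b_n\b/g, "navigator");
}

var compressed = 'var _w=window,_n=navigator,_d=document;\n' +
  'G=function(i){return document.getElementById(i)}\n' +
  'if(G(a).value==c){_w.open("someurl.htm","width=500")}';

console.log(expandAliases(compressed));
```

Being regular expression based, this will also rewrite matching text inside strings and will be fooled by a method on another object that happens to be called G, so it is only a starting point.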

Now even if you cannot get your hands on the original uncompressed version of a script you can get a pretty readable version back out from a compressed and minified version with just a few clicks of a button.

Tips for compression

Put minified references to global objects and your Get function in a global file that all your scripts can reference.

When coding your global objects build in a minified name alongside your standard name e.g

var _S = System = { }

The same goes for your important and most frequently referenced functions. You will get most benefit from minifying those functions that you reference constantly throughout your site, such as functions to get and set elements, add and remove events, traverse the DOM etc. For example:

G = getEl = function(i){ return document.getElementById(i) }
O = getObj = function(i){ return (typeof i == "string") ? getEl(i) : i }


This is good for multiple reasons including allowing you to reference your objects and functions with both names. For example if the compression fails to rename any references to the longer name then the code will still work as the function can still be referenced. You may also have files elsewhere that have not been through the compression process and this way they will also still work.

Always make sure you correct any syntactical errors and make sure any nested single line IF/ELSE statements are wrapped in brackets to avoid any problems. Although it is perfectly legal not to wrap single line conditionals in brackets, it's a lot more readable if each branch is contained correctly. You can use Douglas Crockford's JSLint tool to check this online if you need to.

Saturday 22 August 2009

Strictly Javascript Compressor

Introducing yet another Javascript compressor

If you read a previous article about creating a script compressor you will know that I like to write my own code purely for sadistic reasons and mainly due to a rare form of OCD that makes me unable to stop coding until something is complete. I am sure this is a rather common affliction amongst some in the coding community, however it does have its benefits. Although creating a compressor from scratch is a long and painful process it has the upside of increasing knowledge about code syntax, compression techniques, regular expressions, language quirks and much much more. I wouldn't recommend doing it unless you have plenty of time and a desire to know the pain that regular expression based compression can bring.

The Strictly Software Javascript Compressor Tool

I have put a cut down version of my compressor tool up on my website which you can find at the following link: Strictly Javascript Compressor

The online tool only works with Javascript but I have made available a number of options which will allow you to customise the compression process.


Compression Options


The following is a list of all the advanced options available with the online version of the compressor.

Compression Modes: Simple or Complex

Complex mode will try to format the Javascript so that each function is on its own line in the output. If you have a function that contains other functions then only the outermost function will be formatted like this. To accomplish this the engine must try to auto-correct any lines that have missing terminators or brackets by inserting them. In most cases this should work but I cannot guarantee this in 100% of cases due to the engine being based on regular expressions. However if you format your code correctly beforehand, ensuring all lines are terminated and IF, ELSE statements are wrapped in brackets, then you shouldn't get any problems.

Simple mode will not try to put each function on its own line although it will condense multiple occurrences of brackets to one line. This mode will give a longer output as it will contain more carriage returns however it will be less likely to cause syntax errors.

Complex mode will result in a better compression rate than simple mode, sometimes up to 10% or more in certain cases.

Minify Functions

If you have specific global functions that you wish to replace with smaller names then you can provide a list of the functions to replace in the format

[function:minified,function:minified]


For example if you want to replace all occurrences of the function getEl with a minified name of G and the function addEvent with the name A then you can provide a list in the format of:

[function:minified, function:minified]

[getEl:G,addEvent:A]


Important Note:
  • You can only provide 20 functions to rename with this online tool.
  • As my minification process renames long variable and parameter names in local functions with single letter versions starting from a and incrementing up to zz you should avoid using lower case letters for your minified function names to avoid clashes. It's recommended to use upper case letters or underscores.
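To give an idea of how a rename list in this format could be applied, here is a minimal sketch. It is not the compressor's own code. The function names parseRenameList and renameIdentifiers are made up for the example, and it assumes the names in the list are plain identifiers with no $ characters.

```javascript
// Turn "[getEl:G,addEvent:A]" into { getEl: "G", addEvent: "A" }.
function parseRenameList(list) {
  var map = {};
  var body = list.replace(/^\s*\[|\]\s*$/g, ""); // strip the surrounding brackets
  body.split(",").forEach(function (pair) {
    var parts = pair.split(":");
    if (parts.length === 2) {
      map[parts[0].trim()] = parts[1].trim();
    }
  });
  return map;
}

// Replace each whole word occurrence of a long name with its minified name.
function renameIdentifiers(code, map) {
  Object.keys(map).forEach(function (longName) {
    var wholeWord = new RegExp("\\b" + longName + "\\b", "g");
    code = code.replace(wholeWord, map[longName]);
  });
  return code;
}

var renames = parseRenameList("[getEl:G, addEvent:A]");
console.log(renameIdentifiers('addEvent(getEl("btn"),"click",run);', renames));
```

The word boundary check stops getEl matching inside a longer name such as getElement, but a simple pattern like this will still rename matching words inside strings and comments.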


Minify Objects

In the same way that you can change function names you can also provide up to 20 global objects that you can replace with minified names. For example if you have global objects Debugger and System you could replace them with _D and _S respectively. Provide this list in the following format of

[object:minified, object:minified]

[Debugger:_D,System:_S]

Important Notes
  • You can only provide 20 objects to rename with this online tool.
  • As my minification process renames long variable and parameter names in local functions with single letter versions starting from a and incrementing up to zz you should avoid using lower case letters for your minified object names to avoid clashes. It's recommended to use upper case letters or underscores.

Minify Global Objects

This option will replace some standard global objects with smaller versions. The objects it will replace are window, document and navigator. The engine will add the following line of code to the output:

var _w=window,_d=document,_n=navigator


Then it will replace all occurrences of window with _w and so on. Note how I have added underscores to the variable names so as not to conflict with the standard minification process of function parameters and variables.
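A minimal sketch of this kind of replacement is below. It is an illustration only and not the engine's actual code. The function name minifyGlobals is made up for the example.

```javascript
// Replace references to window, document and navigator with short aliases
// and put the alias declaration at the top of the output.
function minifyGlobals(code) {
  var aliases = { window: "_w", document: "_d", navigator: "_n" };
  Object.keys(aliases).forEach(function (name) {
    // (^|[^.\w$]) skips property names such as frame.document or mywindow
    var reference = new RegExp("(^|[^.\\w$])" + name + "\\b", "g");
    code = code.replace(reference, "$1" + aliases[name]);
  });
  return "var _w=window,_d=document,_n=navigator;" + code;
}

console.log(minifyGlobals("window.open(document.title);if(navigator.userAgent){}"));
```

Note that window.document becomes _w.document rather than _w._d, as the second name is a property of the first and must be left alone. Like any regular expression approach, it will also rewrite these words inside strings.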

Create a Get Function

This option will replace all references to document.getElementById with a minified function call. As this is a very common reference in Javascript code on the web it will save considerable bytes and is very easy to do. The name of the function created will be decided by the value supplied for the Get Function Name option.

Note: If none is supplied the value will be G.

Get Function Name

This option is related to the previous option and only available if you have decided to create a Get function. The value you supply will be used for the minified function name. For example if you provide the following value _S then the following function will be created and added to the compressed output:

_S=function(i){return document.getElementById(i)}
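The replacement itself can be sketched in a few lines. This is an illustration of the technique and not the compressor's own code. The function name addGetFunction is made up, and unlike the output shown above it adds a semicolon after the generated function.

```javascript
// Replace calls to document.getElementById with a short Get function
// and put the definition of that function at the top of the output.
function addGetFunction(code, name) {
  var fn = name || "G"; // G is the default when no name is supplied
  var body = code.replace(
    /(^|[^.\w$])document\.getElementById\(/g,
    function (match, before) {
      // a replacer function is used so that a $ in the name is treated literally
      return before + fn + "(";
    }
  );
  return fn + "=function(i){return document.getElementById(i)};" + body;
}

console.log(addGetFunction('document.getElementById("a").value=document.getElementById(b).value;', "_S"));
```

References reached through another object, such as frame.document.getElementById, are left alone as they do not belong to the page's own document.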


Remove ShowDebug Functions

This may be a very specific function catering to my own needs but it's something I would recommend to all developers. As you create your code you should build in calls to a debug function that will output messages to the console (e.g. Firebug, Firebug-lite, IE, Chrome's console). This is a much better idea than having to add debug code once a bug has been found and it will save time in fixing the bug. However you should also remove all these functions on a live production environment as you will not want your users to view these messages, and even if you turn debugging off inside the function an unnecessary call to the function is still made. I always call my debug function ShowDebug. This option will remove all these calls from the code.
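A simple way of stripping these calls with a regular expression is sketched below. It is a minimal illustration and not the compressor's actual code. The function name removeShowDebug is made up, and it assumes each ShowDebug call is a statement of its own inside properly bracketed code.

```javascript
// Remove standalone ShowDebug(...) calls from a script.
function removeShowDebug(code) {
  // Match ShowDebug( ... ) with an optional trailing semicolon. The argument
  // list may contain quoted strings (which can hold brackets) but not nested calls.
  var call = /\bShowDebug\s*\((?:[^()"']|"[^"]*"|'[^']*')*\)\s*;?/g;
  return code.replace(call, "");
}

console.log(removeShowDebug('var a=1;ShowDebug("a is " + a);a++;')); // var a=1;a++;
```

Calls that contain another function call in their arguments, or strings with escaped quotes, are not matched by this pattern and are left in place, so a real tool needs something more thorough.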

If you would like to know more about debugging and creating a custom debug object please read the following blog article.

Thursday 20 August 2009

Problems with IE 8 Developer Toolbar

IE 8 Developer Toolbar Hanging and Locking Browser

Update 25-Sep-09

I have found that one of the primary causes of IE 8 hanging with the developer toolbar is its console logger. If you have a script that outputs messages to the console I have found that IE will hang until all the messages have been outputted to the console. The more messages you have the longer it takes. Also unlike Firebug which seems to output the messages during normal page load activity IE seems to cache all the messages and then output them all in one batch (or has the appearance of doing so). The CPU seems to max out during the outputting of these messages.

I have been doing a lot of work lately with styles, fades and CSS which has obviously meant working with Internet Explorer and building in code to handle the differences between IE and other browsers, e.g. getting computed styles from IE's currentStyle, opacity differences etc. This has meant trying to use the Developer Toolbar which is shipped as standard in IE 8.

It should be a great add-on as it lets you do all the things Firebug has allowed developers to do for a long while including switching Javascript on/off, clearing sessions and cookies, viewing the DOM, inspecting elements and their styles and the ability to output debug to the console.

However I have found that this developer toolbar add-on has a tendency to hang, especially after the first use of it. I can go to a page and then open the developer window to change the browser or document modes or step through some debug. However if I minimise the window and go back to the main window and then try to open the developer window again, the majority of the time it will refuse to come back to focus, even if I try to maximise or restore the window or go to the desktop to find it. I have also noticed that when I do get focus back it will only partially be visible, as if it's trying to load content but struggling.

In fact checking the Task Manager usually reveals a CPU of 50% or more. There will be two processes open one for the developer window the other for the main IE window but even though the developer window is hanging it seems the process with the high CPU is the browser window.

I first thought it may just be some heavy Javascript code running on the page but I have had this problem with numerous sites and other people in the office have had similar issues.

I don't know if this is a common issue that developers have found but at the moment I am tending to use Firebug-lite or a custom DIV for outputting debug if I need to, and custom pages for killing sessions and checking the current document mode and browser mode, which are the 3 most common tasks I find myself doing. This is a shame as I really like the layout and functionality the toolbar offers. It's just unfortunate that it seems to run very badly at the moment.

Saturday 15 August 2009

The large number of duplicate frameworks

Living with multiple frameworks

I have just been investigating some of the source code that some of the new widgets I have added to my site lately contain. It's always good to delve into the source code of any widget or add-on as it helps you as a developer to see other ways of coding including methodology and style. Even with a compressed and packed source code you can easily reformat it to make it readable by using an unpacker tool like www.strictly-software.com/unpack-javascript.

One of the things I noticed was the large amount of duplication in functionality that my site now has due to using these 3rd party add-ons and tools. There is no doubt that these widgets offer some great functionality with little or no cost and development time and they have enhanced the site's usability. However using Google's AJAX API, jQuery, add-this and snapshots and my own core library strictly.js I now have 5 separate JS files being loaded that handle events, load scripts dynamically and other various DOM manipulation functions.

I don't know what the answer is as it's too much to ask 3rd party developers to all develop their add-ons with one framework or to ask them to create multiple versions of their widget for each framework. Plus it's very likely that any developer who creates a nice add-on will want to write their own code anyway, as any coder worth their pay thinks they know best and in most cases they choose to encapsulate any helper functions into the add-on's core files.

Now you could spend a lot of time creating local copies of these add-ons and rewriting all the code to share as much as possible but this would be futile and take a lot of effort. Plus every time a new version came out you would have to go through the same process or otherwise stay with an old version of the software.

I was looking into a really nice looking lightwindow add-on the other day but this involved the installation of another 4 large Javascript files including Prototype and Scriptaculous. Therefore the dilemma was whether to add to the existing frameworks or rewrite the code to use jQuery instead.

This started me thinking around the possibility of some sort of CDN which any developer could access and upload to that would hold files that met the requirements of a standardised API. Developers of new add-ons instead of recreating the wheel could build their widgets using these central objects. It would then be up to the site owner to decide which version of the object he wished to use but as long as any add-ons used on their site implemented this standard API the code would all work.

For example thinking about the common need for event handling by all widgets. Say I am developing a new widget that shows previews of other sites (like the Snapshot add-on). Instead of writing new code to handle event management (add, remove, onload, DOM ready etc) I implement this standard API that has a specific naming convention. There can be multiple objects written to handle events on this CDN and they all differ in their internal workings but they all support the same interface and therefore it would be very easy for a developer to hook into them or change the core framework. An example would be that for any event object written to support this API to add an event they would follow this convention: element.addEvent(type,function,capture).
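To make the idea a bit more concrete, here is a minimal sketch of one object that supports the addEvent(type, function, capture) convention. No such standard API exists, so the function name wrapElement and the way the element is wrapped are purely my own assumptions for illustration. A different framework could implement the same interface in a completely different way internally.

```javascript
// Hypothetical adapter exposing the addEvent(type, handler, capture) convention.
function wrapElement(element) {
  return {
    addEvent: function (type, handler, capture) {
      if (element.addEventListener) {
        // W3C standard browsers
        element.addEventListener(type, handler, !!capture);
      } else if (element.attachEvent) {
        // older versions of IE, which have no capture phase
        element.attachEvent("on" + type, handler);
      } else {
        // last resort: the old style event property
        element["on" + type] = handler;
      }
      return this; // allow calls to be chained
    }
  };
}

// A widget written against the convention would never need to know which
// branch was taken, e.g.
// wrapElement(document.getElementById("preview")).addEvent("click", showPreview, false);
```

The id preview and the handler showPreview in the usage comment are made up. The point is only that the widget code calls addEvent and leaves the browser differences to whichever framework the site owner has chosen.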

The main framework developers would all have to standardise their naming conventions if they wanted to use the API but it would mean any site owner using multiple add-ons could choose one core framework for their site's engine, and as long as all the add-ons and widgets were built to use the standard API it wouldn't matter if they changed the core framework from jQuery to Prototype. The site would continue to work without the need for code duplication. 3rd party widget developers could specify in their documentation which framework they felt the widget worked best with but it wouldn't depend on that framework and a site owner would make their choice without worrying about breaking the add-ons.

It is just an idea that I have had and it would obviously need lots more thought and the support of the major framework developers as well as any 3rd party widget developers. However until the time comes when some sort of standardised central codebase is implemented and widely used it looks like code duplication and large files are here to stay. Therefore to mitigate this issue site owners will need to make sure all these files are optimised as much as possible through compression, minification, local hosting or CDNs, caching and other tweaks.

This article is a great resource for covering this in detail: http://websitetips.com/optimization/

Thursday 13 August 2009

Content Preview with Snap

New feature on website

I have just added a cool new feature to my site which is available from www.snap.com. You should notice little speech bubble icons next to all my links to external sites. If you hover over an icon it will open up a nice little popdiv containing a preview of the content.

For example hovering over the following link will show a screenshot of my website www.strictly-software.com. For new sites that have not been previewed before it may take a little while for the div content to load however once a site has been loaded the cached version should be pretty fast.

The following link goes to another blog, but because it's also available as an RSS feed it opens up with that option and you should see a scrollable list of the feed items. Other content could be videos, photo galleries or even Google Maps.

It's a cool little free feature and can be added to any site with one line of code which is a reference to a custom script. You can customise the pop up from the snap website and change the colour of the background image, whether or not to show the icon, upload a logo and whether the pop-up is used for internal links as well as external ones.

You can also choose to do what I have done which is let the add-on automatically scan the page for links it thinks it can generate content for rather than hand crafting them. All in all a nice little add-on.

Tuesday 11 August 2009

Who owns the codebase my site runs on

Problems with customers not understanding the technology they purchase

The other day I wrote an article about problems that arise when customers try to implement SEO strategy for their sites through 3rd parties only to find that they cannot implement any of the recommendations due to restrictions on the application they have purchased.

Only a couple of days after that post another instance occurred at the company I work for. A company had bought a site from us which was a product that has a shared codebase which is hosted by my company. They then went out and hired someone to work for them whose job would be to edit the site's code and optimise the SEO. This would be fine if the company owned the codebase the site was written in, hosted it themselves or even had access to the files, but this customer didn't have any of that. The customer was pretty upset when we tried to explain to him that all he had was a licence to run his website on our codebase, but it's totally up to the customer to understand fully what they are purchasing and what they can and cannot do in terms of the code before buying it. They especially need to know this if they are going down the route of paying 3rd parties to do work on the site.

I cannot discuss contractual details or any problems or misunderstandings that a customer may or may not have had when they signed up for the site in the first place as I don't know however I do know that the reason our company is successful and can roll new sites out quickly at a low cost is down to our system setup which involves a shared codebase (core web files and Database).

Customers get what they pay for and we can charge very competitive prices due to our setup which involves a lot of configurable options but not full control or ownership of the code. Even though we have a shared codebase each site can have a totally unique design and layout, and all the wording and messages are customisable as well as the input fields on the core pages. Customers also have a CMS system to add their own pages and content. However it's a complex system behind the scenes and we would never allow 3rd parties to edit our source code.

If a customer does want a copy of the source code and database so that they have full control then they are welcome to pay for that, as well as for the server(s) to run it from. However they will then be looking at a cost at least 5 to 10 times what they currently pay for their site setup. You cannot expect to purchase a system's source code that has taken 4 years of development work for the same cost as the product itself.

If you want full control over the codebase and want to pay next to nothing then you can go down that road, as there is plenty of open source software about that people give away. You can download a free Joomla site off the web and have fun modifying it for the next year or two until you have all the features a fully tested site would have. It can run on a free MySQL database that works fine with one user but locks up when you have 250,000 hits a day, because the default table type is still MyISAM, which locks the whole table on every write, and none of the tables are indexed correctly, so it will perform tricks like a dog on Valium.

I could go on to talk about customers sending over random PHP scripts for message boards that they expect us to just add on to their system, which is built in ASP, but I will end my rant here. If any existing or potential customers are reading this, just take one thing from this post: when you are buying a website or any kind of software system from a development company, make sure you fully understand what is and what is not within your control to change.

  • Do you own the rights to the source code or do you just have a licence to use the code on your site?
  • If you don't own the rights to the source code, is it still possible to modify the code or send customised code to the development team to add to your site?
  • If the answer is yes, then what language is the system built in and what type of servers does it run on?
  • Do you have FTP or Terminal Services access to the web servers or can you only modify pages through a CMS?
  • Is it possible to rename files for SEO reasons or is it a shared codebase that is untouchable?

These are the questions you need to ask and it's in your own best interest to know the answers.

Monday 10 August 2009

PHP and the demise of ASP Classic

Coding with PHP and the demise of ASP Classic

I have just started dipping my toes in the water with PHP and MySQL, primarily because I have just created a new blog with WordPress. I have always been on the Microsoft side of the development world, starting off with Macros and VBA coding, Access databases and SQL Server 6.5 through to 2005, and then when the Interweb was created by Al Gore in the 90s :) I started using ASP classic, Javascript and then C# and ASP.NET.

All the companies I have worked for have used Microsoft server technology and developed primarily in MS technologies. Other languages have been used including Delphi, Perl and Java but my core experience has been Microsoft for good or bad.

One thing I am surprised at though is how PHP, which is a server side scripting language, is still flourishing whilst ASP classic has died a slow death. Obviously there are still lots of sites running on ASP classic, including large ones that I have developed, but any new development would be carried out in .NET. I can see how much more flexible PHP is as a scripting language, but apart from some better OO characteristics it's still a scripting language. I am therefore wondering why Javascript used server side didn't take off as the primary language for coding ASP classic.

VBScript has many downsides, such as its error handling and its useless, memory intensive regular expression engine, and the only object orientated feature it offers is encapsulation. With Javascript having better solutions for all 3 of these, and coders being familiar with it anyway through client side scripting, it's a shame that more coders didn't take up the Javascript mantle when they could. Creating development sites on home PCs or any laptop with an MS OS is as easy as putting a .ASP file in your inetpub/wwwroot folder and running it from your browser.
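To make the comparison concrete, here is a minimal sketch of the three areas mentioned above as they look in Javascript: structured error handling, regular expression literals and inheritance. It is written in present-day Javascript rather than the JScript dialect that shipped with ASP classic, and all of the function and object names (parseAge, safeParseAge, extractDomain, Animal, Dog) are made up purely for illustration.

```javascript
// 1. Error handling: errors are thrown as objects and caught in one place.
function parseAge(input) {
    var age = parseInt(input, 10);
    if (isNaN(age)) {
        throw new Error("Not a number: " + input);
    }
    return age;
}

function safeParseAge(input) {
    try {
        return parseAge(input);
    } catch (e) {
        // Fall back to a sentinel value instead of halting the page.
        return -1;
    }
}

// 2. Regular expressions: literal syntax with capture groups built in.
function extractDomain(email) {
    var match = /^[^@\s]+@([^@\s]+)$/.exec(email);
    return match ? match[1] : null;
}

// 3. Object orientation beyond encapsulation: prototype based inheritance.
function Animal(name) {
    this.name = name;
}
Animal.prototype.describe = function () {
    return this.name + " makes a sound";
};

function Dog(name) {
    Animal.call(this, name); // run the parent constructor
}
Dog.prototype = Object.create(Animal.prototype);
Dog.prototype.constructor = Dog;
Dog.prototype.describe = function () {
    return this.name + " barks"; // override the inherited method
};
```

A call such as safeParseAge("abc") falls into the catch block and returns the fallback, while new Dog("Rex") is still an instance of Animal.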

There are new forms of server side Javascript coding about, including Jaxer and Rhino, the latter of which, as I understand it, compiles the Javascript into Java classes, so it seems to be carrying on in different forms.

As for PHP, I am just getting used to some of the odd syntax such as -> and concatenation using periods, but it seems pretty similar to ASP classic in that it's a loosely typed language and can appear very messy in places where the HTML is mixed with the code. However it has more inbuilt functionality and is a nicer language, as I much prefer Java or C# like syntax when coding.

Sunday 2 August 2009

Implementing SEO Strategy too late in the day

Think about Search Engine Optimisation early in the day

In my 10+ years of web development I have noticed many mistakes that customers make when trying to get an application that meets their requirements, is cost effective and is delivered on time. Most of the mistakes and problems can be traced back one way or another to not having a detailed specification document that is not only signed off by both parties but also kept to 100%, with any deviation treated as new work to be costed and developed.

Loose specs lead to misunderstandings on both sides, where the customer expects one thing and the developer is building another. Even with a signed off, stringently kept spec there is also the problem of customers not understanding the technical limitations or boundaries of the system they are buying into. An example I want to give relates to SEO, which is usually treated as an afterthought by the customer rather than as a key component during the specification stage. I work for a recruitment software development company and have built a system that is now running 200+ job boards. What usually happens is that the customer is under the illusion that the 7 or 10k they have spent on a site has bought them a custom built system that is flexible enough to allow any future development they may wish to have down the line. Now this may be the fault of the sales person promising the earth for peanuts or it may not, but in reality they have bought an off the shelf generic product that can be delivered quickly and comparatively cheaply exactly because it hasn't been custom built for their requirements.

One of the ways this problem manifests itself is in Search Engine Optimisation, as the customer will usually wait a couple of months after the site has gone live before realising that they are not top of Google for the word jobs and asking why. They then discover numerous SEO specialists that offer to get them to the top of Google for certain terms, and invest lots of money in audits and optimisation reports, only to find out that we cannot implement everything they want because they are using a system with a shared codebase. Yes, our system has numerous inbuilt features for SEO that can be turned on and off, but expecting specific development that has been recommended by a 3rd party after development has completed can cause unneeded stress and tension, especially when the customer is told no because of system limitations.

What the customer should do is think about and investigate the possibilities of SEO before the specification document has been signed off rather than months after the site has gone live. This way any limitations of the system can be discussed, so the customer is made aware that spending money with a 3rd party who is also unaware of system limitations is probably a waste of £££. Also any good ideas they may have regarding SEO requirements can be planned out and possibly developed during the main development phase rather than thrown in later as an afterthought. Even if it's a generic system, good ideas are good ideas, and a benefit for one customer will be a benefit to others, as the development house can make money by reselling the feature as an add-on to existing customers.

I am not saying 3rd party SEO consultants don't do a good job, but potential customers of sites need to be aware of what they are buying and what is and is not possible before they spend money with any 3rd party. There can be nothing worse than spending money with a consultant only to find out that their recommendations cannot be implemented, or that implementing them will cost even more money for the extra development. So take my advice and think about SEO before, not after, development. Not only will it save time and money, but having good SEO from the off will mean your site gains better positioning in the search engines sooner rather than later.

Further Reading:

Please read this article of mine about techniques for good and bad search engine optimisation.

Saturday 1 August 2009

Using Online Translator Tools

Increasing your site's audience by targeting other languages

This is probably the best reason for investigating the various online translator tools and APIs that are available. If you can convert your site into other languages it will increase the number of indexed pages in the major search engines and drive users to your site. Make sure you have links to the original content so that even if the translated version is poor the user has the option of reading the article in the source language. We should remember that although most English speaking people have a poor grasp of other languages the same is not true in reverse, and as English is the main business language of the world a large percentage of the world outside England and North America can speak and read English.

I have been looking into the various methods for translating content lately for my own sites. Although translators such as BabelFish and Google Translator are not perfect, they are just good enough to allow someone with an understanding of the target language to get the overall meaning or gist of the translated text. I am not 100% positive about their internal methods, but I would reckon they work by first translating common phrases and sentences and then reverting to word for word translations. I would say this is probably good enough for most sites where the language is not critical. However if I had been paid to deliver a site in Russian, French or Chinese because the site's main audience would be reading that language then I would go down the route of using professional translators rather than a free online tool. In most cases though, using a free tool is fine for increasing your site's audience.

From my experience over the last few weeks I have noticed the following issues.

1. Trying to translate a whole webpage sometimes causes formatting issues, especially text appearing on top of other text. For example view this Chinese translation of my unpacker tool using Yahoo's Babelfish and notice the formatting issues, especially in the code examples and the text around the Google adverts.

2. Even with a translator that doesn't cause formatting issues, if you are outputting code examples, even in PRE and CODE tags, you will likely find your code getting translated when it shouldn't be. Look at this example of the same unpacker page in Chinese using Google's translator tool and see how the variable names have been translated.

3. Therefore I found that translating a whole page in one go wasn't really feasible with these tools, and I had to break each page into pieces to ensure the code examples were not translated. I converted a couple of my online tools into Russian and Chinese, e.g.
Unpacker tool in Russian and Chinese.
HTML Encoder tool in Russian and Chinese.

Luckily I work in the same office as a lady from St Petersburg and I asked her to take a look at the Russian translation of my unpacker tool. She said that it was about 80% accurate and was the sort of language and grammar that her 10-year-old daughter would write. To be honest I was expecting a lot worse and was pretty happy to hear that the main gist of the page could be understood in that language.
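As a rough illustration of breaking a page into pieces so that code examples are left alone, the sketch below splits an HTML string into segments and flags anything inside PRE or CODE tags as not to be translated. It is not the code I used on the pages above; the names splitForTranslation and translateHtml are made up for this example, translateText stands in for whatever translator you call, and a regular expression like this is only a simple approach that will not cope with every kind of nested markup.

```javascript
// Split HTML into segments, marking PRE/CODE blocks as not translatable.
function splitForTranslation(html) {
    // Matches a whole <pre>...</pre> or <code>...</code> block, ignoring case.
    var protectedBlock = /<(pre|code)\b[^>]*>[\s\S]*?<\/\1>/gi;
    var segments = [];
    var lastIndex = 0;
    var match;

    while ((match = protectedBlock.exec(html)) !== null) {
        // Everything before the code block is ordinary translatable content.
        if (match.index > lastIndex) {
            segments.push({ translate: true, text: html.slice(lastIndex, match.index) });
        }
        segments.push({ translate: false, text: match[0] });
        lastIndex = match.index + match[0].length;
    }

    // Any content left after the last code block.
    if (lastIndex < html.length) {
        segments.push({ translate: true, text: html.slice(lastIndex) });
    }
    return segments;
}

// Rebuild the page, passing only the translatable segments to translateText
// (a hypothetical synchronous function supplied by the caller).
function translateHtml(html, translateText) {
    return splitForTranslation(html).map(function (segment) {
        return segment.translate ? translateText(segment.text) : segment.text;
    }).join("");
}
```

Joining the text of every segment back together gives the original HTML, so the split itself never changes the page.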

4. Names of people and coding terms sometimes cause issues through too literal a translation. For example the name Dean Edwards comes back in Chinese with the word Dean translated and Edwards left in English. This is probably due to the word Dean being construed as meaning the Dean of a college, and it is obviously the main problem with word for word translations. Slang words, curses, youth chatter and abbreviations will also be very problematic to translate automatically, especially as these sorts of words have a very niche audience in their own country let alone worldwide. I would probably have a hard time understanding the slang terms that kids speak today, just as my elders would have had with me when I was young, so putting these sorts of terms on the web and asking an automated process to translate them and keep their meaning would be a near impossible task.

5. Carrying out mass translations I found to be quite problematic, as these online tools don't like you posting thousands of words to translate individually. However if you are aiming to get your content into multiple languages for SEO reasons then you need to have your translated content delivered server side or as static HTML rather than using client side tools such as Google's Translator API or Bing's new API. This obviously means either doing each page by hand in steps or trying to automate it. If you are going to crawl an online translator tool and don't want to be met by one of Google's "your query looks similar to automated requests from a computer virus or spyware application." messages then you need to be careful, not make too many requests too quickly, and change agents and IPs between requests as much as possible. Another option I thought about was to create an AJAX logger tool and then use Javascript and Google's AJAX API to make the requests to its translator object and log the results. From what I have read there is no limit on the number of requests you can make as long as you keep to their terms and conditions, and all you need is a Google API key.

If you are not concerned about having copies of your original content in multiple languages for SEO purposes then offering your visitors the ability to read a page in their own language by using client side tools is a good way to go. Using Google's or Bing's API you can translate your content as it loads. I have a little function that allows me to pass an array of IDs into my translator wrapper object, which then loops through the array, takes each related element's innerHTML, translates it and, as each result comes back, replaces the original content. With this method you can be very precise about what you translate and avoid issues such as translating code examples. Also, rather than a long wait whilst the whole content is translated, you achieve a ripple effect as the DOM is modified bit by bit. You can see an example of this on my football site with a test page I created which will translate from English to Spanish. Also if you are translating single words such as labels, button values and headers you are more likely to get a good translation, as you only require a word for word match.
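The loop described above can be sketched roughly as follows. This is not my actual wrapper object, just a minimal version of the idea: translateElements is a made up name, translate is assumed to be any function that takes some text and a callback (so it could wrap Google's or Bing's API), and getElement is passed in so that the lookup is not tied to a particular page.

```javascript
// Translate the innerHTML of each listed element, replacing the content
// as each individual result comes back.
function translateElements(ids, translate, getElement) {
    ids.forEach(function (id) {
        var element = getElement(id);
        if (!element) {
            return; // skip IDs that are not on the page
        }
        translate(element.innerHTML, function (translated) {
            // An empty result leaves the original content untouched.
            if (translated) {
                element.innerHTML = translated;
            }
        });
    });
}

// In a browser the lookup would usually be document.getElementById, e.g.
// translateElements(["intro", "summary"], myTranslate, function (id) {
//     return document.getElementById(id);
// });
// where "intro", "summary" and myTranslate are placeholders.
```

Because only the listed IDs are touched, any element holding a code example can simply be left out of the array.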

Another online tool I have just added to my main site is a Twitter feed translator, which uses Google's Translator API to translate tweets from any Twitter account from one language to another. You can use the form as is, or you can link to it directly from your own site, passing the name of the Twitter feed you want to translate, then the language code for the language the feed is written in and the language you want to translate it to. For example to view my Twitter feed in German you would use the following URL:


If you would like to see the code behind this form and how it links into Google's API you can read the following article about translating Twitter feeds.

Now I am not suggesting that Twitter feeds will be the easiest form of text to translate, especially as the restriction on the amount of text you can use per tweet usually leads to slang terms and abbreviations. However as a large percentage of twits (what do you call people who use Twitter?) seem to use the application to share links with a short blurb about the link's content, it's probably good enough for accessing tweets in other languages that contain key #tags of interest. Plus it enabled me to offer a tool that puts to use the following features:

-Makes use of Google's AJAX API, especially the translate object.
-The user interface is very basic with little explanatory text, and I have implemented an auto translate of the main labels, buttons and text so that if the user is not an English speaker these will be translated as the DOM loads. This makes use of Google's GEO data, which is supplied for free with their API.
-I also use this GEO data to work out the user's location, and if they are from the UK I display a specific banner advert, otherwise I default to a Google advert.
-Use of ISAPI URL rewriting to allow automatic translations and nice links to the page.
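The location based decisions in the list above boil down to a couple of small checks, sketched below. The function names, the "uk-banner" and "google-advert" labels and the idea of receiving a country code and a language code as plain strings are all assumptions for the sake of the example, and say nothing about the exact data Google's API hands back.

```javascript
// Pick which advert to show from a two letter country code.
function chooseAdvert(countryCode) {
    var code = (countryCode || "").toUpperCase();
    // UK visitors get the specific banner; everyone else gets the default.
    return (code === "GB" || code === "UK") ? "uk-banner" : "google-advert";
}

// Decide whether the labels and buttons need auto translating.
// Anything that is not English (e.g. "es" or "de-DE") returns true.
function needsTranslation(languageCode) {
    return !/^en(-|$)/i.test(languageCode || "en");
}
```

Keeping these as small separate functions means the advert and translation rules can change without touching the code that calls the API.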