Sunday, 23 August 2009

Compression and Unpacking JavaScript

Reverse Engineering the Strictly Compressor with the Unpacker Tool

I have just put up a cut-down version of my own compressor tool on my website. The compressor has a number of advanced options that let you customise the compression process and handle some very common global objects and function calls, such as window and document. If you choose these options, the tool adds aliases for these objects and then replaces any references to the objects in the code with the aliases. An example of the compression, followed by its reverse engineering with the unpacker tool, is below.

Example Code

The following code is a dummy script that does nothing apart from show the process at work.


/* A make believe example object
and calling function */
var myObj = {

    myFunc : function(test1,divID){
        // throw in a regular expression literal
        var re = /^\/firefox.+\d+?/gi;
        var test2 = "IE";
        var some = "vars", anothervar=100;

        var somestring = "this is a string "
            +"that continues "
            +"on a few lines";

        if(document.getElementById(test1).value==test2){
            window.open("someurl.htm","width=500");
        }else{
            if(/firefox/i.test(navigator.userAgent)){
                document.getElementById(divID).innerHTML="you are using Firefox";
            }
        }
        return;
    }
};
myObj.myFunc(document.getElementById('div1').innerHTML,"myDiv");



I will run it through the compressor tool selecting the advanced options:
  • Minify Global Objects.
  • Create a Get function.
  • I have used the default value of G for my get function.

The compressed output, which is a 37.76% reduction in size, is below.

var _w=window,_n=navigator,_d=document;
G=function(i){return document.getElementById(i)}
var myObj={
myFunc:function(a,b){var re=/^\/firefox.+\d+?/gi,c="IE",d="vars",e=100,f="this is a string that continues on a few lines";if(G(a).value==c){_w.open("someurl.htm","width=500")}else{if(/firefox/i.test(_n.userAgent)){G(b).innerHTML="you are using Firefox"}};return}};myObj.myFunc(G('div1').innerHTML,"myDiv");


Notice how the compressor has added the following lines to the top of the code.


var _w=window,_n=navigator,_d=document;
G=function(i){return document.getElementById(i)}

If you are working with lots of compressed scripts, you won't want these two lines in each script. Place them in a central file that is included on all pages so that all your scripts can reference them. Removing these two lines gives a compression rate of 52.11%, which compares well against YUI (39%) and JSMin (27.7%). Even with the two extra lines in the output, the compression is on the same level as YUI. Larger files will do much better, and I have had compression ratios of 60-70% on certain files so far.
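For anyone wanting to check figures like these against their own scripts, the percentage is simply the characters saved divided by the original length. The helper below is a minimal sketch: the name reductionPercent is made up for illustration, and it counts characters with String length rather than bytes on disk, so treat the result as an approximation.

```javascript
// Hypothetical helper (not part of the compressor): percentage reduction
// between an original script and its compressed form, to two decimal places.
function reductionPercent(original, compressed) {
    if (original.length === 0) {
        return 0; // avoid dividing by zero for an empty script
    }
    var saved = original.length - compressed.length;
    return Math.round((saved / original.length) * 10000) / 100;
}

// A 10 character script squeezed down to 6 characters is a 40% reduction.
var example = reductionPercent("aaaaaaaaaa", "aaaaaa");
```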

Some other things to note about the compressed output are:
  • Function parameters have been renamed to use single letters. I don't rename those that are already less than 2 characters in length.
  • Local variables are also renamed to use single letters. I start at a and increment up to z and then, if need be, move into double letters, e.g. aa to zz and beyond.
  • Multiple variable declarations have been combined into one.
  • Strings on multiple lines have been joined together.
  • Comments have been removed.
  • Unnecessary terminators have been removed.
  • Global objects window, document and navigator have been renamed to use short aliases.
  • Any reference to document.getElementById has been replaced with a call to the new function G.
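The a to z, then aa to zz naming scheme from the list above can be sketched as a counter converted to letters. This is only an illustration, not the compressor's own code: the function name shortName is an assumption, and a real compressor would also have to skip reserved words such as do, if and in, as well as any names already in use in the scope.

```javascript
// Hypothetical sketch: turn a zero-based counter into a short identifier.
// 0 gives "a", 25 gives "z", 26 gives "aa", 701 gives "zz", 702 gives "aaa".
function shortName(index) {
    var letters = "abcdefghijklmnopqrstuvwxyz";
    var name = "";
    var n = index;
    do {
        // Take the lowest "digit" as a letter and prepend it.
        name = letters.charAt(n % 26) + name;
        // Move to the next position; the -1 makes "aa" follow "z" directly.
        n = Math.floor(n / 26) - 1;
    } while (n >= 0);
    return name;
}

// First few names handed out to local variables and parameters.
var firstNames = [shortName(0), shortName(1), shortName(25), shortName(26), shortName(27)];
```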

Now that we have a compressed version of the file, let's see what happens if we run it through my unpacker tool. Don't be fooled by the name: the unpacker not only unpacks JavaScript packed with Dean Edwards' packer but also reformats compressed code. If you used my compressor for the original compression, you get an added bonus: it will reverse engineer the Get function and the global object minification to make the resulting code more understandable.

The unpacked code is below.

var myObj = {
    myFunc: function (a, b) {
        var re = /^\/firefox.+\d+?/gi,
            c = "IE",
            d = "vars",
            e = 100,
            f = "this is a string that continues on a few lines";
        if (document.getElementById(a).value == c) {
            window.open("someurl.htm", "width=500")
        } else {
            if (/firefox/i.test(navigator.userAgent)) {
                document.getElementById(b).innerHTML = "you are using Firefox"
            }
        };
        return
    }
};
myObj.myFunc(document.getElementById('div1').innerHTML, "myDiv");


Notice how the extra code that my compressor added has been removed, and how the global objects that were referenced through aliases and the Get function have been replaced with their original names. Not only does this aid readability, it also helps you understand the code's purpose when you can see the original object name being referenced.
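The alias reversal described above can be approximated with a few regular expressions. The following is a minimal sketch and not the unpacker's actual code: the function name expandAliases is made up for illustration, it assumes the two header lines have exactly the shape shown in this post, it does no reformatting, and it will also rewrite matches that happen to sit inside string literals. It relies on regular expression lookbehind, so it needs a recent JavaScript engine.

```javascript
// Hypothetical sketch, not the unpacker's real implementation.
function expandAliases(code) {
    var aliases = {};   // filled with pairs such as { _w: "window" }
    var getName = null; // name of the Get function, e.g. "G"

    // Strip a leading "var _w=window,_n=navigator,_d=document;" and record the pairs.
    code = code.replace(/^\s*var ((?:_\w+=\w+,?)+);\s*/, function (match, list) {
        list.split(",").forEach(function (pair) {
            var parts = pair.split("=");
            aliases[parts[0]] = parts[1];
        });
        return "";
    });

    // Strip the Get function definition and record its name.
    code = code.replace(
        /^\s*(\w+)=function\(i\)\{return document\.getElementById\(i\)\};?\s*/,
        function (match, name) {
            getName = name;
            return "";
        }
    );

    // Swap each alias back; the lookbehind skips property names such as obj._w.
    Object.keys(aliases).forEach(function (alias) {
        var pattern = new RegExp("(?<![\\w.$])" + alias + "\\b", "g");
        code = code.replace(pattern, function () { return aliases[alias]; });
    });

    // Swap calls to the Get function back to document.getElementById.
    if (getName !== null) {
        var call = new RegExp("(?<![\\w.$])" + getName + "\\(", "g");
        code = code.replace(call, function () { return "document.getElementById("; });
    }
    return code;
}

var packed = 'var _w=window,_n=navigator,_d=document;\n' +
    'G=function(i){return document.getElementById(i)}\n' +
    'if(G(a).value==c){_w.open("someurl.htm")}';
var unpacked = expandAliases(packed);
// unpacked is 'if(document.getElementById(a).value==c){window.open("someurl.htm")}'
```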

Now even if you cannot get your hands on the original uncompressed version of a script you can get a pretty readable version back out from a compressed and minified version with just a few clicks of a button.

Tips for compression

Put minified references to global objects and your Get function in a global file that all your scripts can reference.

When coding your global objects, build in a minified name alongside your standard name, e.g.

var System = { }, _S = System;

The same goes for your most important and most frequently referenced functions. You will get the most benefit from minifying the functions that you reference constantly throughout your site, such as functions to get and set elements, add and remove events, or traverse the DOM, e.g.

var G, getEl, O, getObj;
G = getEl = function(i){ return document.getElementById(i); };
O = getObj = function(i){ return (typeof i == "string") ? getEl(i) : i; };


This is useful for several reasons, one being that you can reference your objects and functions by either name. For example, if the compressor fails to rename a reference to the longer name, the code will still work because the function is still reachable under that name. You may also have files elsewhere that have not been through the compression process, and this way they will still work too.

Always correct any syntax errors, and make sure any nested single-line IF/ELSE statements are wrapped in curly brackets to avoid problems. Although it is perfectly legal not to wrap single-line conditionals in brackets, the code is a lot more readable if each branch is contained correctly. You can use Douglas Crockford's JSLint tool to check this online if you need to.
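To show why the brackets matter, here is a small made-up example (the function names unbraced and braced are mine). Without brackets, an else always attaches to the nearest if, whatever the indentation suggests, and once a compressor has stripped the line breaks there is nothing left to hint at what was intended.

```javascript
// Without brackets: the else belongs to the inner if, not the outer one.
function unbraced(a, b) {
    if (a)
        if (b) return "a and b";
    else return "not a"; // misleading: this runs when a is true and b is false
    return "fell through";
}

// With brackets: each branch is contained, so the else belongs to the outer if.
function braced(a, b) {
    if (a) {
        if (b) { return "a and b"; }
    } else {
        return "not a";
    }
    return "fell through";
}

var surprising = unbraced(true, false); // "not a", even though a is true
var intended = braced(false, false);    // "not a", as the wording suggests
```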
