Showing posts with label YUI. Show all posts

Sunday, 4 October 2009

Writing your own framework

People always ask me why I don't use one of the big frameworks like jQuery, Prototype, MooTools, YUI etc. My answer is that it's not that I think their code is bad, although I have found bugs or issues in all of them over time, but rather that I prefer to write my own code because that way I get to understand the language and become a better coder.

I don't claim to be a brilliant JavaScript programmer and it's only within the last few years that I have developed a big interest in it as opposed to back-end coding and database development. However, I know that I won't get to the level of John Resig, Dean Edwards and Douglas Crockford if I always rely on others to write my code for me.

For example, on a project I worked on I had to create a lightbox-style WYSIWYG editor that faded in and could float around the screen but never leave the boundaries of the viewport. This is no easy job for a novice and I could easily have loaded up Prototype, script.aculo.us and TinyMCE, found a how-to article on Google and got the code working within a few hours. However, I would have had no idea how the code worked, and if I had to make customisations I would have had to spend hours trawling through lots of code I didn't understand, hacking about until my tweak was complete. Plus I would have had to do it in a way that meant future updates to those libraries didn't overwrite my changes.

Obviously if you have time constraints, or don't care how things are done and just want them done in the quickest way possible, this may be the way forward for you, and I don't blame you for not wanting to commit the time it would take to build a widget such as the one described from scratch. However, this is what I chose to do, and it involved spending considerable time reading up on my intended task before even attempting to write a line of code for the job.

This method of development will take you some time and you will probably spend a lot of it pulling whatever hair you have left out of your head. However, along the way you will learn a lot about your craft, including browser differences, the history of the DOM, event models and future developments, as well as increasing your own skills as a developer.

The pain and time spent will be worth it: when it comes to getting a highly paid development job in the future you will have done yourself a big favour by going down the hard route. Plus the widget or application you have just built will be your own creation and you will know your own code inside out. Any bugs that need fixing will not rely on some 3rd party to release an update, and you will have pride in completing a job that many others would not do.

Now I am not totally against frameworks and even use a couple on my own sites. However, I do feel that people tend to use them without due consideration of what tasks they actually need to perform. If you are writing a DOM-heavy AJAX application that utilises the majority of jQuery's features then that is the tool for you. However, if you are not intending to use the majority of the features and just like the fact you can access an element by id with the $ sign then you have bloated your codebase unnecessarily.

Whether you write your own code or use frameworks you should at least spend the time looking at the code so that you understand what it is you're using and how it works. Once you do this, fair play, but there is nothing worse in my eyes than someone who uses a framework such as Prototype or Dojo but goes straight to a forum when they get stuck rather than looking at the source code to see what is happening under the hood. If you at least attempt to do that first and still don't know what's going on then you're well within your rights to ask for help. It may take up some of your time, but trying to figure out what the code is doing may also help you understand JavaScript a little bit more.


If you want to see some ideas on how you could go about writing your own JavaScript framework then check out this article of mine which goes over the basic concepts of using CSS selectors to find your nodes, applying functions to those nodes, chaining functions together, and passing the results of the first function to the second in the chain:

http://blog.strictly-software.com/2009/10/build-your-own-framework.html
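
As a small taste of the chaining idea, here is a minimal sketch. It is not the code from the linked article: the names Collection, each, addClass, css and find are all made up for illustration, and find assumes a browser that supports document.querySelectorAll. Each method returns the wrapper itself, which is what allows the next call in the chain to work on the same set of nodes.

```javascript
// Hypothetical wrapper around a list of nodes (illustrative names only).
function Collection(nodes) {
  // Copy into a real array so the wrapper works with any array-like list.
  this.nodes = Array.prototype.slice.call(nodes);
}

// Apply a function to every node, then return the wrapper for chaining.
Collection.prototype.each = function (fn) {
  for (var i = 0; i < this.nodes.length; i++) {
    fn.call(this.nodes[i], this.nodes[i], i);
  }
  return this;
};

// Add a class name unless the node already has it.
Collection.prototype.addClass = function (name) {
  return this.each(function (node) {
    if ((" " + node.className + " ").indexOf(" " + name + " ") === -1) {
      node.className = node.className ? node.className + " " + name : name;
    }
  });
};

// Set one inline style property on every node.
Collection.prototype.css = function (prop, value) {
  return this.each(function (node) {
    node.style[prop] = value;
  });
};

// Assumed entry point: use a CSS selector to find the nodes (browser only).
function find(selector) {
  return new Collection(document.querySelectorAll(selector));
}
```

In a page you could then write something like find("#menu li").addClass("active").css("color", "red");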

Sunday, 26 April 2009

Write your own script compressor

Compression Techniques

I have just finished writing my own script compressor to handle the compression on one of my larger systems. You may ask why I didn't just use one of the existing free compressors out there like jsmin or YUI Compressor. Well, the answer is twofold:

1. I like to write my own code for the simple reason that I learn more about my craft this way, and although it takes time and usually blood, sweat and tears I always come out the other side with more knowledge about development practices in general. See this link about whether to use a framework or write your own for more details.

2. The existing compression tools didn't do all I wanted them to do and also I found certain issues with certain syntax such as complex regular expression literals and conditional comments.

This article is just an overview of some of the things I discovered during the building of my script in case other people want to write their own.

The main features of my compressor are:
  • Takes a directory as its source and will output all compressed files to a destination folder replicating the structure of the source folder.
  • Handles multiple file types e.g. JS, PHP, ASP, INC, HTML, CSS
  • A file such as an ASP file that contains in-line JS, HTML, CSS and ASP will have each "section" compressed according to that section's code type.
  • The usual compression technique of removing comments, excess white space and empty lines is carried out.
  • JS code is also compressed by having function parameters and local variables "minified" by replacing variable names with one character names. 
  • Global objects and common functions are also replaced with short versions.
  • Debug functions are removed.
  • JS functions are saved one per line so even the compressed file is nicely formatted.
  • Adds missing terminators ; to aid compression of functions to one line.
  • Corrects HTML so that it's valid XHTML.
  • Skips files that are already compressed.
  • Creates a log file detailing the files compressed and the compression rate.

Building a Compressor

I found out during this task that getting a fully working JavaScript compressor without the aid of a Java engine is not as simple as it may at first sound, especially if, like I did, you want each function to appear on its own line and you are handling code that may not be correctly formatted in the first place (e.g. missing terminators ;).

If you just want to remove excess white space and comments then that's fine, but writing a "minifier" that renames variables with single letter names is a bit trickier as you need to identify all your functions and parameters correctly for this to work. Remember there are a multitude of ways of defining functions in JavaScript as well, so it's not just a case of looking for the word function e.g.



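// function declaration
function myFunc(var1, var2){ return; }

// function expression assigned to a variable
var myFunc = function(var1, var2){ return; };

// method added to a prototype
SomeObject.prototype.myFunc = function(var1, var2){ return; };

// method defined inside an object literal
var obj = { myFunc : function(var1, var2){ return; } };

// anonymous function that is called immediately
(function(){ /* code */ }());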


Also, unless you create a global system that checks which JavaScript functions are being referenced by each file, you can only really minify local function variables and parameters. Otherwise you may change the name of a variable that is being referenced by a file you don't know about, causing an error.
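
To make the parameter renaming concrete, here is a rough sketch and not the code from my compressor. The name minifyParams is invented for the example, and it makes some big assumptions: it is handed the source of one function at a time, string and regular expression literals have already been swapped out, the function has no more than 26 parameters, and the body does not already use one-letter names or object literal keys that match a parameter. It renames in two passes with placeholders so that swapping (b, a) for (a, b) cannot trip over itself.

```javascript
// Hypothetical helper: rename the parameters of one function to a, b, c ...
function minifyParams(source) {
  // Grab the parameter list of the first "function (...)" in the source.
  var match = /function\s*[\w$]*\s*\(([^)]*)\)/.exec(source);
  if (!match) {
    return source;
  }
  var names = match[1].split(",");
  var result = source;
  var i, name, pattern;

  // Pass 1: swap each parameter for a numbered placeholder so a new short
  // name can never clash with another parameter's original name.
  for (i = 0; i < names.length; i++) {
    name = names[i].replace(/\s+/g, "");
    if (!name) {
      continue; // function takes no parameters
    }
    // Match whole identifiers only and skip property access such as obj.name
    pattern = new RegExp(
      "(^|[^\\w$.])" + name.split("$").join("\\$") + "(?![\\w$])", "g");
    result = result.replace(pattern, "$1##P" + i + "##");
  }

  // Pass 2: turn the placeholders into one-letter names (0 -> a, 1 -> b ...).
  return result.replace(/##P(\d+)##/g, function (whole, index) {
    return String.fromCharCode(97 + Number(index));
  });
}
```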

If you want to create a very simple minification process you can concentrate on some common global objects such as the window, document and navigator objects. You can also create yourself a small function for document.getElementById and then update all references to that e.g

var _w=window,_d=document,_n=navigator;
// create wrapper function (declared with var so it is not an accidental global)
var _g = function(i){
return _d.getElementById(i);
};

So every time your site references document.getElementById, which is probably quite a lot, you save yourself 21 characters (23 for document.getElementById against 2 for _g). Add to that the common use of window and document and you will save a fair few bytes from these changes alone.
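
A minimal sketch of how that rewrite could be automated is shown here. The function name shortenGlobals is made up for the example, and it assumes that string literals have already been stored away and that no local variable shadows window, document or navigator. The longest name is replaced first so that document.getElementById is not turned into _d.getElementById.

```javascript
// Hypothetical helper: swap common globals for the short aliases and
// prepend the declarations that define those aliases.
function shortenGlobals(source) {
  var aliases = [
    ["document.getElementById", "_g"], // must come before plain "document"
    ["window", "_w"],
    ["document", "_d"],
    ["navigator", "_n"]
  ];
  var result = source;
  for (var i = 0; i < aliases.length; i++) {
    var escaped = aliases[i][0].replace(/\./g, "\\.");
    // Whole identifiers only, and never a property such as foo.document
    var pattern = new RegExp("(^|[^\\w$.])" + escaped + "(?![\\w$])", "g");
    result = result.replace(pattern, "$1" + aliases[i][1]);
  }
  var header = "var _w=window,_d=document,_n=navigator," +
    "_g=function(i){return _d.getElementById(i);};";
  return header + result;
}
```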


Store and Replace

I found the best approach to parsing each file was a layered one, as files such as PHP or ASP will contain a mixture of server-side and client-side script as well as HTML and maybe CSS defined in style tags. With the help of some good regular expressions I would look for each type of code using its identifying markers, for example the open and close style tags. I would then store each block of code in an array, parse it accordingly and, when rebuilding the compressed file, re-insert the blocks in order.

This got round issues where you have ASP script inside in-line JS script blocks within HTML inside an ASP page. Each section has its own function that compresses according to that language's syntax, but all of them first store any string and regular expression literals in another array so that they stay unaffected during compression, as you don't want to be removing white space and changing symbols within string literals.
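
The store and replace idea can be sketched roughly as follows. This is an illustration rather than my actual code: storeBlocks, restoreBlocks and the ##BLOCK_n## marker format are names invented for the example, and the marker is assumed never to occur in the real source.

```javascript
// Hypothetical helper: pull every match of `pattern` out of the source,
// keep it in `store` and leave a numbered marker in its place.
function storeBlocks(source, pattern, store) {
  return source.replace(pattern, function (block) {
    store.push(block);
    return "##BLOCK_" + (store.length - 1) + "##";
  });
}

// Hypothetical helper: put the stored blocks back in their original order,
// optionally passing each one through its own compression function first.
function restoreBlocks(source, store, transform) {
  return source.replace(/##BLOCK_(\d+)##/g, function (marker, index) {
    var block = store[Number(index)];
    return transform ? transform(block) : block;
  });
}
```

For example, the style blocks of a page could be set aside with storeBlocks(page, /<style[\s\S]*?<\/style>/gi, store), the remaining HTML compressed, and the blocks put back with restoreBlocks while a CSS-specific function is applied to each one.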


Fun with regular expressions

Most of my compression was carried out using regular expressions but some tasks, such as identifying literals and comments, were done in loops. This is pretty easy when you're looking for string literals as you know they are going to be enclosed within double or single quotes, and you can combine this with a check for single and multi-line comments at the same time. However, in JS regular expressions can also be defined as literals such as

var re = /^\S\s+[^/].*?>/gi

You cannot just start at the first / and stop at the next unescaped / because slashes can appear unescaped inside character classes e.g. [/], so you would end prematurely. You also cannot stop at the last / you come across as you may have multiple statements on the same line such as:



var re = /^\S\s+[^/].*?>/gi, str = str.replace(/##BR##/gi,"<br />");


Plus, as you can see, the replacement value in the second statement has a forward slash inside it, so you would cut off half the replacement value and cause a syntax error. At first I decided it didn't matter if a bit more than I intended got stored as a literal until it was put back in, but if that extra bit of code also contains variable names that you want to minify you will run into problems.

I got round this problem by first using a regular expression to identify unescaped forward slashes within expressions and replacing them with a placeholder. I could then use one pattern to match the arguments of string functions such as replace, match, search, split and compile, and another for literals and regular expression functions such as test and exec. Once the literal is stored I can put the original slashes back.
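
For comparison, a different technique from the placeholder approach I used is to walk the literal one character at a time, tracking escapes and character classes. The sketch below is only an illustration: readRegexLiteral is a made-up name, and it assumes the caller already knows that the / at the given position opens a regular expression rather than being a division sign, which is a separate problem.

```javascript
// Hypothetical helper: read the regular expression literal whose opening
// "/" is at source[start]. Returns its text including any flags, or null
// if no closing "/" is found before the end of the line.
function readRegexLiteral(source, start) {
  var inClass = false; // true while we are inside [...]
  for (var i = start + 1; i < source.length; i++) {
    var ch = source.charAt(i);
    if (ch === "\n") {
      return null; // regex literals cannot span lines
    }
    if (ch === "\\") {
      i++; // skip the escaped character, whatever it is
    } else if (inClass) {
      if (ch === "]") {
        inClass = false;
      }
    } else if (ch === "[") {
      inClass = true;
    } else if (ch === "/") {
      // Found the real closing slash; also take any flags such as gi.
      var end = i + 1;
      while (end < source.length && /[a-z]/i.test(source.charAt(end))) {
        end++;
      }
      return source.slice(start, end);
    }
  }
  return null;
}
```

Run against the two-statement line above, this stops at the correct closing slash of the first expression because the slash inside [^/] is seen while inClass is true.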


Negative Matches

I also found that the previous technique of using a placeholder value before carrying out a regular expression match was very useful when dealing with complex negative matches. If you have a long string of text and you are trying to carry out a replacement except in certain instances then this is a good way of avoiding a complex negative pattern match. For example, in my JS compression function I add extra terminators to the ends of lines to make sure I can get a whole function on to one line. However, doing this sometimes puts terminators in places they shouldn't be, so at the end I run some corrections which remove them again, for example from directly after an opening bracket. There are cases, though, where a terminator can legitimately appear directly after an opening bracket, such as a for statement with an empty initialiser. Therefore instead of using a complex negative match to do the excess terminator replacement I use a placeholder e.g.

// put a placeholder in for the terminator I want to keep
// (the bracket must be escaped, and the g flag catches every occurrence)
strJSCompressed = strJSCompressed.replace(/for\(;/g,"##_FOR_TERM_##");

// carry out the replacement of terminators
strJSCompressed = strJSCompressed.replace(/([\{\(\[,><\|&])(;)/g,"$1");

Then once the replacements have been carried out I put back all the original values that the placeholders were storing e.g.


// put the original value back in place of the placeholder
strJSCompressed = strJSCompressed.replace(/##_FOR_TERM_##/g,"for(;");


This is a very useful technique when you want to avoid complex regular expressions that involve negative matches because, as you should know by now, complex patterns and long strings combine to cause high CPU usage!

Another good example is HTML comments. I want to strip all HTML comments apart from the following derivatives:

Server Side Includes e.g. <!--#include virtual="/somefile.inc"-->

IE Conditional Comments e.g <!--[if lt IE 7]> OR <![endif]--> 

Server Side META Includes e.g <!--METADATA TYPE="typelib" Blah -->

As you can see, trying to write one regular expression that strips all HTML comments apart from those that start with #, [ or METADATA would involve some hardcore matching. It's very easy, though, to match each individual comment type first and replace it with a placeholder, then do my replace for everything between <!-- and -->, and then put the placeholder values back in.
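
Those three steps might look something like the sketch below. It is an illustration only: stripHtmlComments and the ##_KEEP_n_## marker are invented for the example, the marker is assumed not to appear in the page, and the keep pattern is simply my reading of the three comment types listed above.

```javascript
// Hypothetical helper: strip HTML comments but keep server side includes,
// IE conditional comments and METADATA comments.
function stripHtmlComments(html) {
  var kept = [];

  // 1. Swap the comments we want to keep for numbered placeholders.
  //    The second alternative catches a closing <![endif]--> on its own.
  var keepPattern = /<!--\s*(?:#|\[|METADATA)[\s\S]*?-->|<!\[endif\]-->/gi;
  var guarded = html.replace(keepPattern, function (comment) {
    kept.push(comment);
    return "##_KEEP_" + (kept.length - 1) + "_##";
  });

  // 2. Remove every comment that is still left.
  var stripped = guarded.replace(/<!--[\s\S]*?-->/g, "");

  // 3. Put the protected comments back.
  return stripped.replace(/##_KEEP_(\d+)_##/g, function (marker, index) {
    return kept[Number(index)];
  });
}
```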

A tidy page is a godly page

As well as carrying out all the usual compression functions I also incorporated a number of replacements to tidy up the code. If you're going to loop through each page in a system then this seems like a good place to do such things as:

  • Make my HTML XHTML compliant by encoding characters, expanding attributes, making sure all attributes are quoted and that my tags are lower case and some other HTML related tweaks.
  • Removing comments from within SCRIPT blocks as they are not required anymore, as well as shortening SCRIPT tags down to the minimum e.g. removing the language and type attributes. Obviously this breaks your XHTML compliance but then again you can't have everything.
  • Combine multi-line string literals together into one variable.
  • Combine variable declarations (server and client side) into one declaration.
  • Remove certain function calls such as calls to my custom ShowDebug function that outputs messages for client and server side script. I always build my codebase with the debug statements built in rather than add them in later as it makes debugging quicker and easier. However on a production system these function calls are expensive and unnecessary and should be removed.
  • Remove excess white space, usually TABS within dynamic SQL strings. Obviously I don't do UPDATES or DELETES only SELECTS but I usually format my code with TABS.
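
As an illustration of the debug removal step in the list above, here is a rough sketch. It is not my actual code: removeCalls is a made-up name, and it assumes that each call is a standalone statement (not the only statement after an if, for instance) and that string literals containing brackets have already been stored away. Brackets are counted rather than matched with a regular expression so that nested calls in the arguments are handled.

```javascript
// Hypothetical helper: remove every call to fnName(...) from the source,
// together with the terminator that follows it.
function removeCalls(source, fnName) {
  var out = "";
  var pos = 0;
  var start;
  while ((start = source.indexOf(fnName + "(", pos)) !== -1) {
    // Skip names that merely end with fnName, e.g. myShowDebug( or x.ShowDebug(
    var before = start > 0 ? source.charAt(start - 1) : "";
    if (/[\w$.]/.test(before)) {
      out += source.slice(pos, start + fnName.length);
      pos = start + fnName.length;
      continue;
    }
    // Count brackets to find the one that closes the call.
    var depth = 0;
    var end = -1;
    for (var i = start + fnName.length; i < source.length; i++) {
      var ch = source.charAt(i);
      if (ch === "(") {
        depth++;
      } else if (ch === ")") {
        depth--;
        if (depth === 0) {
          end = i + 1;
          break;
        }
      }
    }
    if (end === -1) {
      break; // unbalanced brackets: leave the rest of the source untouched
    }
    if (source.charAt(end) === ";") {
      end++; // take the terminator as well
    }
    out += source.slice(pos, start);
    pos = end;
  }
  return out + source.slice(pos);
}
```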

Why Compress Server-Side Code?

Well yes, I know that even interpreted languages such as ASP get compiled into a token-based form which is cached by the web server, so there is not much scope for compression, but the smaller I can make the files the better in terms of storage and maybe caching. Plus the main point of the server-side compression was to remove all my ShowDebug function calls to aid performance.


So can I get a copy?

Not at the moment as I am in the process of testing it on a live system to iron out any bugs. At the moment the code is a script that I point at a directory and run. I am hoping to make a C# based Windows application version of it and then I might put a copy up on the site.

Should I use a framework?

When to use a framework and when to write your own code

I am always asked at work why I don't just use a framework such as jQuery, Prototype, YUI, MooTools etc. rather than spend time writing my own code, and it's a fair point. I have spent time looking at the major frameworks and it's all good code written by clever people, and if you haven't got the time to spend then I would definitely recommend using a library. Then again, if John Resig had thought like that then millions of people would be using YUI instead of jQuery and Microsoft would be packaging another library with Visual Studio to handle selectors instead.

Libraries are good for many reasons: they hide browser incompatibilities from the developer, and it's good for a team of developers to stick to a standard code base rather than all adding their own functions and bloating a site up with several versions of the same addEvent or toggleClass function.

The downside is that most libraries will contain lots of code that is never even used by the developer. If you're not even going to be using selectors to return DOM objects in your JavaScript and are just looking for a shorter version of document.getElementById then using $('#blah') is not the way to go. 

The other good thing about writing your own code is that you get to understand the language of your trade a whole lot better than if you just relied on a library. There is no better way, in my opinion, of learning anything than being thrown in at the deep end and having to sink or swim, so to speak. Yes, it takes much longer, as you will have to read up about the early browser wars and compatibility issues, learn about objects and their properties and understand event models and script syntax, but it will make you a much better programmer, and when bugs appear due to a new version of Internet Explorer you won't have to wait for an update to your framework to be released.

As with all things it's swings and roundabouts, and just because I like to write my own code and know why things work the way they do does not mean I won't use a library. However, having spent the time researching the language for my own code has given me invaluable knowledge, and it helps being able to step through something like jQuery and actually understand what it's doing and why rather than just knowing that it works.