Monday 30 November 2009

Changing all links and source attributes in the DOM

Working with hosted merchant payment solutions

If you have ever worked with hosted payment solutions such as SecPay (now PayPoint) or WorldPay you will have dealt with callback pages. These are pages containing server-side code (e.g. .NET, ASP or PHP) that are located on your webserver but are loaded up and displayed within the payment gateway's secure domain.

This means that any relative links on images, stylesheets, scripts and anchors will be relative to the payment gateway's domain and not your webserver. Therefore if you don't apply some code to correct these links the styles won't load and the links won't go anywhere apart from 404 error pages.

You could ensure that all your links are absolute anyway, in which case you won't have a problem, but often this isn't possible for numerous reasons. If you don't want to create a very basic, minimal template page for your callback page to get round this issue, you can use some client-side Javascript to loop through all the relevant collections and change the links to reflect the true location of the files.

The following function is one that I use on my own system. It is called once the page loads and loops through the A, LINK, SCRIPT and IMG collections, checking the current src or href attributes and making sure any relative links are changed into absolute ones pointing to the true base URL (e.g. your site) and not the payment gateway. For absolute links that have already been resolved incorrectly it replaces the payment gateway's domain with the true domain. This ensures that all links point to absolute URIs that reference your site and not the payment gateway which has loaded the content to display on its own system.

If you are using server-side code in your callback page then you can replace the top two properties, makeAbs.domain and makeAbs.directory, which refer to the base URL and the virtual directory that contains the callback page on your webserver, with some code to dynamically populate those values. The full function code is below.
var makeAbs = {
    // the domain we want to reference
    domain : "http://www.mysite.com",

    // the virtual directory containing the file that will be referenced
    directory : "/somedomain/subdomain/",

    // function to modify the DOM, call once the page has loaded
    ModifyDOM : function(){

        // change Anchors
        this.ChangeLocation("A","href");

        // change CSS Links
        this.ChangeLocation("LINK","href");

        // change SCRIPT
        this.ChangeLocation("SCRIPT","src");

        // change IMG
        this.ChangeLocation("IMG","src");

    },

    ChangeLocation : function(tag,att){

        var o,n,e=document.getElementsByTagName(tag);
        for(var i=0,l=e.length;i<l;i++){
            // some browsers return the raw attribute value here, others an already resolved URL
            o = (att=="href")?e[i].href:e[i].src;

            // if current href/src is blank then skip
            if(o && o!=""){

                // if its a relative link
                if(!/^https?:\/\//.test(o)){

                    // if its just a filename then we need the domain + virtual to create absolute URL otherwise just need our domain
                    n = ((o.substring(0,1)=="/") ? this.domain : this.domain + this.directory) + o;

                // if its an absolute URL make sure the payment servers domain is replaced with our own in case relative links
                // have already been associated with the wrong location
                }else{
                    n = o.replace(document.location.protocol + "//" + document.domain,this.domain);
                }

                // now reset with our new value
                if(att=="href"){
                    e[i].href = n;
                }else{
                    e[i].src = n;
                }
            }
        }

    }
};


The code can be downloaded as a file from the following location: makeAbsolute.js
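
The decision that ChangeLocation makes for each URL can also be looked at in isolation. The following is only a minimal sketch and not part of the downloadable file: toAbsolute and its gatewayOrigin parameter are made-up names, and secure.gateway.example stands in for whatever domain your payment provider uses. Unlike the original it only swaps the domain when the URL actually starts with the gateway's origin.

```javascript
// Sketch only: toAbsolute and gatewayOrigin are illustrative names, not part of makeAbs.
// domain        - your own site, e.g. "http://www.mysite.com"
// directory     - virtual directory holding the callback page, e.g. "/shop/"
// gatewayOrigin - protocol + host the page is being displayed under
function toAbsolute(url, domain, directory, gatewayOrigin) {
  // blank values are left alone
  if (!url) return url;

  // relative link: root-relative paths only need the domain,
  // bare filenames need the domain plus the virtual directory
  if (!/^https?:\/\//.test(url)) {
    return (url.charAt(0) === "/" ? domain : domain + directory) + url;
  }

  // absolute link already resolved against the gateway: swap the origin for ours
  if (url.indexOf(gatewayOrigin + "/") === 0) {
    return domain + url.substring(gatewayOrigin.length);
  }

  // any other absolute link (e.g. a third party script) is returned untouched
  return url;
}

var site = "http://www.mysite.com";
var dir = "/shop/";
var gateway = "https://secure.gateway.example"; // placeholder gateway domain

var examples = [
  toAbsolute("css/site.css", site, dir, gateway),
  toAbsolute("/images/logo.gif", site, dir, gateway),
  toAbsolute(gateway + "/images/logo.gif", site, dir, gateway),
  toAbsolute("http://cdn.example/lib.js", site, dir, gateway)
];
```

In the page itself you would still call makeAbs.ModifyDOM() once the page has loaded, for example from a window.onload handler.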

Sunday 22 November 2009

Disabling Bold Highlighting

Using bold highlighting for Search Engine Optimisation

As you may have noticed I tend to use bold highlighting on keywords specific to my articles on this blog. The primary reason for this is to aid SEO as Google and other search engine bots will consider words wrapped in bold, strong, em and header tags to be more important than other content. If you are marking certain text out for the user it means you consider these words to be important and so the SEO bots will as well.

Wrapping all your text in bold or header tags will not work and will in fact get you marked out as a spammer, so you should use this technique sparingly. I have used it for a year now and I think it has worked very well, as a lot of my articles are ranked very highly for certain keywords.

Obviously other key factors are also important, such as the length of the article, the percentage of wording marked as highlighted in relation to overall content and the words marked. I would consider under 10% to be optimum for this technique as any more is getting into the realms of spam.

The other reason I do it is to mark out the key sentences for users who have bad eyesight or for people who speed read articles. Obviously not everyone likes this technique and I have had a few moaning minnies, but you cannot please everyone all of the time.

As my first aim is to optimise for SEO so that more people get to read the articles in the first place, I am going to keep using this technique. However I have added two links in my sidebar menu on the right which you can use to disable this highlighting if you so wish.

The "Turn off bold highlighting" option will just disable any bold highlighting on the current blog page. Once you do this the link should change to "Turn on bold highlighting" to reverse the change.

To turn it off on all articles so you don't have to click the link on each visit you can use the "Turn off for all pages" option which will turn it off on the current page and also set a cookie so that whenever the blog loads it remembers if you want this option. Again once clicked it will toggle the link to "Turn on for all pages".

The quickest technique I found for turning the highlighting on and off was to use selectors to select all my content and then apply or remove a class to those elements. 

For some reason Blogger has a mix of highlighting with the old B tags as well as <span style="font-weight:bold;"> rather than the preferred method of using STRONG tags. I know a lot of WYSIWYG editors will automatically convert SPAN formatting into STRONG tags but for some reason some of my articles have this mixture. So if you look at my code that disables it I have applied all 3 methods to cover all HTML tags.

function unbold(){
 
 // select all B tags within the main post-body div and apply a class
 G('DIV.post-body B').setAtts('className','unbold');
 
 // same method on SPAN tags
 G('DIV.post-body SPAN').setAtts('className','unbold');
 
 // same method on STRONG tags
 G('DIV.post-body STRONG').setAtts('className','unbold'); 
 
}

As you can see I am using my super G method to select the nodes I want and then my setAtts method to apply a class to all the nodes. Obviously if you are using a framework like jQuery or Prototype you would be using the $ to select your objects and some method like attr or addClass to do a similar chained call.
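
For anyone without the G library, the toggle and the "all pages" cookie could be sketched roughly as follows. This is not the code running on this blog: the cookie name boldOff, the function names and the use of querySelectorAll (which older browsers lack) are all assumptions made for the example. The document is passed in as a parameter so the functions can be tried outside a browser.

```javascript
// Sketch only: PREF_COOKIE and all function names are hypothetical.
var PREF_COOKIE = "boldOff";
var SELECTOR = "div.post-body b, div.post-body span, div.post-body strong";

// Returns true when the cookie string says highlighting should be switched off
function readBoldPreference(cookieString) {
  var parts = (cookieString || "").split(/;\s*/);
  for (var i = 0; i < parts.length; i++) {
    if (parts[i] === PREF_COOKIE + "=1") return true;
  }
  return false;
}

// Builds the text to assign to document.cookie; an expiry in the past removes the cookie
function buildBoldCookie(off, days) {
  var ms = (off ? days : -1) * 24 * 60 * 60 * 1000;
  var expires = new Date(new Date().getTime() + ms).toUTCString();
  return PREF_COOKIE + "=" + (off ? "1" : "") + "; expires=" + expires + "; path=/";
}

// Sets or clears the "unbold" class on every matching node and returns how many were touched.
// Like setAtts('className', ...) this overwrites any class already on the node.
function applyBoldPreference(doc, off) {
  var nodes = doc.querySelectorAll(SELECTOR);
  for (var i = 0; i < nodes.length; i++) {
    nodes[i].className = off ? "unbold" : "";
  }
  return nodes.length;
}
```

On page load you would then call something like applyBoldPreference(document, readBoldPreference(document.cookie)).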

Let me know of any problems.

Saturday 14 November 2009

Strictly Software Jobs - Jobs in IT

Looking for a job in IT? Check out jobs.strictly-software.com

As you may or may not know I work for one of the UK's leading providers of recruiter software and we currently have over 200 jobboards based around the world running on software I developed. If you are visiting my site you are most likely a techie of some sort, so I have created a search page that you can use to search the latest IT related jobs from the majority of my sites. The system will scan all these sites for IT related jobs so it might be helpful if you are ever thinking of changing career.



Want to work in the UK?

Most of the jobs come from UK based jobboards, however there are jobs from Europe and Australia so it's worth a look even if you don't want to work in the UK. I am going to be updating the jobs.strictly-software domain very soon with a lot more features, but until then you can use the search page to view all jobs or you can access the RSS job feed jobs.strictly-software.com/rss which will get updated hourly with the latest 500 jobs taken from a total of over 3000 IT related vacancies.


What kind of job are you looking for?

There are web development jobs, web designer jobs, back end database developer and database administrator jobs as well as jobs related to network management, search engine optimisation and various other new media and Internet technology vacancies. Whether you are looking for a full time, permanent or contract job you should check out what's on offer by clicking on one of the following links.

Sunday 8 November 2009

Displaying Flash and Video content

The various methods of outputting flash and video content

I was looking at some YouTube videos earlier and the code that they use to allow users to embed the movies into other HTML has changed. I know it changed quite a while back actually but it got me thinking about the various methods for displaying video content on the web.

I am pretty sure that they used to use the old combo method: an outer OBJECT tag designed to work in IE with classid and codebase attributes, then some PARAM tags and then an EMBED tag to handle all other browsers. EMBED is not a standards-compliant method for displaying content, but because it works across all browsers it's used everywhere.
<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0" data="http://www.youtube.com/v/FrYlNNy929Y&hl=en&fs=1" align="middle" width="425" height="344" >
<param name="movie" value="http://www.youtube.com/v/FrYlNNy929Y&hl=en&fs=1" />
<param name="allowFullScreen" value="true"></param>
<param name="allowScriptAccess" value="sameDomain" />
<embed src="http://www.youtube.com/v/FrYlNNy929Y&hl=en&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed>
</object>

The way they do it now is to have a very bare outer OBJECT tag and then some PARAM tags and an EMBED tag. Rather than reference the movie source in the OBJECT's data attribute, it's only referenced in the PARAM and EMBED tags.
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/FrYlNNy929Y&hl=en&fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param>
<embed src="http://www.youtube.com/v/FrYlNNy929Y&hl=en&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed>
</object>

Now I know there are a multitude of ways of delivering movie content on the web. You can deliver it server-side, or client-side with one of the libraries such as SWFObject or UFO, or with the newer SWFObject 2 which combines those two.

I tend to do my delivering server-side, mainly because at least 10% of users have Javascript disabled when they surf the web, which is quite a large audience to skip over. To keep to standard XHTML I was going down the route of copying what the Javascript libraries do but server-side, e.g.:
  • Check for browser type e.g IE, a known standard compliant browser or Unknown
  • For IE deliver the OBJECT method with classid and codebase
  • For non IE deliver the OBJECT method with type application/x-shockwave-flash
  • For unknown deliver the nested method OBJECT and EMBED
  • Load some Javascript to handle the EOLAS patent issue for Opera (IE has fixed this)
Therefore we keep to the standards and get maximum audience coverage. For Opera users with Javascript enabled the movies will play normally, but those without it will have to click the movies to play them. Although an annoyance, this is a smaller percentage than the 10% who would have no flash play at all due to NoScript being enabled etc.
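
The steps above can be sketched as follows. I have written it in Javascript to match the rest of the code on this page, but the same logic would sit in whatever server-side language you use. It is only a rough sketch: classifyBrowser and flashMarkup are made-up names, the user-agent patterns are deliberately crude assumptions, and the Opera EOLAS script is left out.

```javascript
// Sketch only: hypothetical helpers, not the code used on this site.
function classifyBrowser(userAgent) {
  var ua = userAgent || "";
  // Opera can include "MSIE" in its user-agent, so test for it first
  if (/Opera/.test(ua)) return "standard";
  if (/MSIE/.test(ua)) return "ie";
  if (/Gecko|AppleWebKit|KHTML/.test(ua)) return "standard";
  return "unknown";
}

// escape a value so it is safe inside a double quoted attribute
function escapeAttr(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

function flashMarkup(userAgent, src, width, height) {
  var url = escapeAttr(src);
  var size = ' width="' + width + '" height="' + height + '"';
  var params = '<param name="movie" value="' + url + '" />' +
               '<param name="allowFullScreen" value="true" />';
  var embed = '<embed src="' + url + '" type="application/x-shockwave-flash"' +
              size + '></embed>';

  switch (classifyBrowser(userAgent)) {
    case "ie":
      // IE: OBJECT with classid and codebase
      return '<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"' +
             ' codebase="http://fpdownload.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,0,0"' +
             size + '>' + params + '</object>';
    case "standard":
      // known standards compliant browser: OBJECT with type and data
      return '<object type="application/x-shockwave-flash" data="' + url + '"' +
             size + '>' + params + '</object>';
    default:
      // unknown browser: nested OBJECT and EMBED
      return '<object' + size + '>' + params + embed + '</object>';
  }
}
```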

Another addition to the current myriad of ways to deliver movie content is the new VIDEO tag, which is now supported by Firefox 3.5, SeaMonkey 2, Thunderbird 3 and Chrome 3. Currently it supports limited file types such as Ogg Theora, which is an open format, and you can reference a movie pretty simply as the following syntax shows.
<video id="video6" src="http://www.dailymotion.com/cdn/OGG-320x240/video/x9euyb?auth=1269605698_a8b629faf0d043e1b538971997ff9ba5" width="425" height="344"></video>

Accessing video content through Javascript

If you are loading all your video content through Javascript which a lot of people do then although you may be missing 10% of your audience you won't have to worry about the EOLAS issue in Opera and you will have a variety of functions in your chosen library to access the movie and manipulate it.

However if you are loading your flash server side like me you still may need some Javascript functionality to access the movie and check whether it has loaded or not.

The following code contains two functions. The first can be used to access a movie delivered by an OBJECT, EMBED or VIDEO tag cross browser, and it should handle very old browsers as well as it has an extensive fallback.

The other function lets you check whether a movie has loaded yet, which you may want to do before making controls available to manipulate the movie or, as I do with my flash bullet counter, before starting the movie and moving it to a certain frame.

function getMovie(movie){      
var r = null;
// try for standards way of accessing movie through OBJECT tag if you use the new ADOBE/SWFObject JS library
// to create your flash then this will work.
var o = document.getElementById(movie);
if(o){
if (o.nodeName == "OBJECT") {
// check SetVariable this could return undefined, unknown or function depending on browser
if (typeof o.SetVariable != "undefined") {
r = o;
}else{
var n = o.getElementsByTagName("object")[0];
if (n) r = n;
}
// handle the new VIDEO tag which plays Ogg files and can handle multiple fallbacks
}else if(o.nodeName == "VIDEO"){
r = o;
}
}
// if we still have no flash movie revert back to old tried and tested methods.
// these are used when you deliver your flash server sides in certain browsers
if(!r){
// access the window or document object
r = (document[movie]) ? document[movie] : window[movie];
if(r) return r;

// last resort use the embeds collection
if(document.embeds){
r = document.embeds[movie];
}
}
// return either null or a reference to our movie
return r;
}

// function to test whether a movie has loaded yet
function isMovieActive(movie){
movie = typeof(movie)=="string" ? getMovie(movie) : movie;
// if we have no reference quit
if(!movie) return false;

// default our percent loaded variable to 0 = not loaded
var pl=0;

// for VIDEO tags we can check the readyState property https://developer.mozilla.org/En/nsIDOMHTMLMediaElement
// readyState 3 video can be played a bit
// readyState 4 video can be played to the end without interruption
if(movie.nodeName=="VIDEO"){
return (movie.readyState>=3) ? true : false;
}else{
//if movie not loaded then this will raise an error but if its loaded we can check the PercentLoaded property
try{
pl=movie.PercentLoaded();
}catch(e){}
return (pl==100) ? true : false;
}
}
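
If you want to wait for a movie to load before enabling your controls, one option is to poll a check such as isMovieActive on a timer. The following is a minimal sketch of that idea. whenMovieActive and its options are made-up names, and the check is passed in as a function so the sketch does not depend on the code above.

```javascript
// Sketch only: whenMovieActive is a hypothetical helper.
// isActive - function returning true once the movie has loaded
// onReady  - called with true when loaded, or false if we gave up
// options  - interval (ms between checks), maxTries, and an optional schedule
//            function (defaults to setTimeout) so the timing can be replaced
function whenMovieActive(isActive, onReady, options) {
  options = options || {};
  var interval = options.interval || 250;
  var tries = options.maxTries || 40;
  var schedule = options.schedule || function (fn, ms) { setTimeout(fn, ms); };

  function poll() {
    if (isActive()) {
      onReady(true);
      return;
    }
    // give up once we have used all our attempts
    if (--tries <= 0) {
      onReady(false);
      return;
    }
    schedule(poll, interval);
  }

  poll();
}
```

With the functions above this might be called as whenMovieActive(function(){ return isMovieActive("myMovie"); }, enableControls), where "myMovie" and enableControls are placeholders for your own movie id and callback.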

The following test page has been created to show all the numerous methods of outputting OBJECT and EMBED tags with a test to show which Javascript methods allow access to the movie. Test it in various browsers to see the differences, and when viewing in Chrome be prepared to wait a while for the VIDEO content to load, which will give you a chance to see the isMovieActive function work both ways. Also the movie is pretty funny anyway so it's worth watching as it's a fight between a Yoga master and two kung fu fighters.

Saturday 7 November 2009

Possible fixes for protocol violation errors

HttpWebRequest object returning a protocol violation error

I have just been working on my robot which is written in C# and I was trying to resolve some issues with redirects on certain pages which the bot wasn't following. The default setting for the HttpWebRequest object is to follow up to 50 redirects however you can override the default settings with the following properties:

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(_url);

request.AllowAutoRedirect = true;
request.MaximumAutomaticRedirections = 5;

However I was having some issue with a link to an article that was going to an advert page first and then redirecting afterwards. I was trying to come up with a solution to bypass this page but running a test console app from my laptop on the URL with my HttpRequest wrapper class was only returning the following error:
The server committed a protocol violation. Section=ResponseStatusLine
However when I tried to access the same URL from a webpage the error wasn't being raised and the HTTP response was being returned with a 200 status code.

A quick look on the web and I found that this problem can be caused by invalid headers in the response such as extra carriage returns or incomplete or invalid headers. To get round the problem of invalid headers the following code can be added to the app.config file.

<system.net>
<settings>
<httpWebRequest useUnsafeHeaderParsing="true">
</httpWebRequest>
</settings>
</system.net>
However I tried this solution and it didn't resolve the issue. I then took a closer look at the other headers I was passing and noticed that on the website I was passing the current user's browser string as the User-Agent header, e.g.

// set user-agent to the same agent as the current browser
string useragent = Request.ServerVariables["HTTP_USER_AGENT"].ToString();

// use my HTTP request object to make requests
HTTPRequest webReq = new HTTPRequest(url, proxy, useragent);

However when I was running my test console application I was using a default user-agent which I have made up myself e.g:

Mozilla/4.0 (compatible; RobsRobot 1.3; www.strictly-software.com;)

I changed this default agent to IE 6.0 e.g:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)

and then I was able to retrieve a valid response from the remote server. However on the second response I got the error again. Changing the agent to IE 7.0 once again allowed me to retrieve a response but only for one attempt. So I am wondering whether the server in question which is an nginx server has some sort of IP/Agent logging and was blocking multiple requests within a certain time limit.

So I tried using a proxy server and found that no matter which user-agent I used I got a valid response back from the server every time. I could do multiple quick requests and use my own user-agent string.

Therefore I am not quite sure what the problem is as I haven't managed to narrow it down 100% but I am pretty sure that the page in question is doing some sort of server side agent sniffing and then delivering advertising content related to the request. This advertisement seems to have an issue with its headers which causes the protocol error which is now handled by my object to return an empty string as the response.
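
The Section=ResponseStatusLine part of the error points at the very first line of the response. As a rough illustration of what a well-formed status line looks like, here is a simplified check in Javascript. It is not how .NET performs its own validation, and isValidStatusLine is a made-up name.

```javascript
// Simplified check based on the HTTP/1.1 status line format:
//   HTTP-Version SP Status-Code SP Reason-Phrase
// isValidStatusLine is a hypothetical helper, not a .NET or browser API.
function isValidStatusLine(line) {
  return /^HTTP\/\d+\.\d+ \d{3} [^\r\n]*$/.test(line);
}

var checks = {
  good: isValidStatusLine("HTTP/1.1 200 OK"),
  // an extra carriage return and line feed before the status line
  leadingBlankLine: isValidStatusLine("\r\nHTTP/1.1 200 OK"),
  // an incomplete status line with no status code
  missingCode: isValidStatusLine("HTTP/1.1 OK")
};
```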

Friday 6 November 2009

Testing for browser event support without sniffing

Browser Event Support by object detection

One of the things I have often wondered since I really got into Javascript a few years back is whether there is a way to check for event support without resorting to browser sniffing.

I had a task the other day which meant that I had to add some code to prevent a user from pasting in content into an email confirmation box. I used the onpaste event for most browsers but Opera doesn't support this so I had to use a keypress event to look for the CTRL+V key combo and then block it. This got me looking at ways of checking for browser event support without resorting to a sniff.

Then I came across this article by Ryan Morr: http://ryanmorr.com/archives/ondomready-no-browser-sniffing

A lot of the DOMReady functions I have seen, including my own, use a browser sniff to pick a method: a timer that checks for the loaded state in old Opera, WebKit and KHTML, a DOMContentLoaded listener for DOM2 supporting browsers, a kludge for IE using defer or doScroll, and then a fallback to window.onload to handle anything that hasn't fired by the time the window loads.

Ryan's solution is to do all of them without any browser checks. He adds a DOMContentLoaded listener for DOM2 browsers as well as a window.onload and then he sets a timer up for all browsers. All of these call a function which checks the appropriate event type and for IE does the doScroll check. Once a load has been confirmed the desired function is called and the timer is killed.

It's a shame that a timer has to be used for all browsers when in reality only a very small percentage of browsers will fall into the class that require it, however it's an example of thinking outside the box.
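
To make the idea concrete, here is a minimal sketch of that approach. It is my own reading of the description above and not Ryan's actual code: onDomReady is a made-up name, the readyState values checked are an assumption, and the window and document are passed in as parameters so it can be tried with stand-in objects.

```javascript
// Sketch only: not Ryan Morr's implementation.
function onDomReady(win, doc, callback) {
  var done = false, timer = null;

  // run the callback once only and kill the timer
  function fire() {
    if (done) return;
    done = true;
    if (timer !== null) win.clearInterval(timer);
    callback();
  }

  // timer check used by every browser, no sniffing
  function check() {
    if (doc.readyState === "complete" || doc.readyState === "loaded") {
      fire();
      return;
    }
    // IE: doScroll throws an error until the DOM is ready
    if (doc.documentElement && doc.documentElement.doScroll) {
      try {
        doc.documentElement.doScroll("left");
        fire();
      } catch (e) {}
    }
  }

  // DOM2 browsers
  if (doc.addEventListener) {
    doc.addEventListener("DOMContentLoaded", fire, false);
  }

  // fallback for anything that has not fired by window load
  var previous = win.onload;
  win.onload = function () {
    if (typeof previous === "function") previous.apply(this, arguments);
    fire();
  };

  timer = win.setInterval(check, 10);
  check();
}
```

In a page this would be called as onDomReady(window, document, init), where init is your own start-up function.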

This somehow got me to another article by Kangax where he had a brilliant function for checking for event support in any browser using a combination of two methods:
var isSupported = ('onpaste' in element)
and, for those that fail, setting the event attribute to a simple return statement and then checking that the resulting handler is a typeof function, e.g.
el.setAttribute('onpaste', 'return;');
isSupported = typeof el['onpaste'] == 'function';

So I read some more and then checked out some similar articles and his test page, which had a number of event tests, and saw that in IE and Chrome/Safari the unload and resize tests failed using the existing checks. These events are definitely supported so should result in a positive when tested for. Therefore I have amended the original function to use the window object for these checks if the first check fails. I have also added a little cache to prevent the same event type being checked multiple times, and combined it with another check by Diego Perini which looks at the global Event object. I don't actually know if this last check is required as I haven't seen a browser where the first tests fail, but it's there anyway.

var isEventSupported = (function(){
    var win=this,
        cache={},
        TAGNAMES = {
            'select':'input','change':'input',
            'submit':'form','reset':'form',
            'error':'img','load':'img','abort':'img'
        };
    function isEventSupported(eventName) {
        // brackets are needed round the ternary otherwise the || is evaluated first
        var key = (TAGNAMES[eventName] || ((eventName=="unload"||eventName=="resize")?"window":'div')) + "_" + eventName;
        // check for the key itself so that cached false results are re-used as well
        if(cache.hasOwnProperty(key))return cache[key];
        var el = document.createElement(TAGNAMES[eventName] || 'div');
        var oneventName = 'on' + eventName.toLowerCase();
        var isSupported = (oneventName in el);
        // cannot create a window object so to get a correct test for IE/Webkit on resize/unload check the window
        if(!isSupported && (eventName=="unload" || eventName=="resize")){
            isSupported = (oneventName in win);
        }
        if (!isSupported && el.setAttribute) {
            el.setAttribute(oneventName, 'return;');
            isSupported = typeof el[oneventName] == 'function';
        }
        // the above tests should work in majority of cases but this test checks the EVENT object
        if(!isSupported && win.Event && typeof(win.Event)=="object"){
            isSupported = (eventName.toUpperCase() in win.Event);
        }
        el = null;
        cache[key]=isSupported;
        return isSupported;
    }
    return isEventSupported;
})();


You can check out the test page here which compares a number of events against this function as well as Kangax's original and also a version by Diego Perini who I believe was the person who came up with the doScroll method for IE used in many a DOMReady function.

http://www.strictly-software.com/eventsupport.htm
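
Going back to the paste blocking task that started all this, the function could be used along the following lines. This is only a sketch: blockPaste and isPasteShortcut are made-up names, the key codes for V are an assumption about what keypress reports, and the support test is passed in as a parameter so any of the versions discussed here could be plugged in.

```javascript
// Sketch only: hypothetical helpers built around an isEventSupported style test.

// true when the key event looks like CTRL+V (or CMD+V)
function isPasteShortcut(e) {
  var code = e.keyCode || e.which || e.charCode;
  return !!(e.ctrlKey || e.metaKey) && (code === 86 || code === 118);
}

// blocks pasting into the input and returns the name of the event that was used
function blockPaste(input, supportsEvent) {
  if (supportsEvent("paste")) {
    input.onpaste = function () { return false; };
    return "paste";
  }
  // no onpaste support so watch for the key combination instead
  input.onkeypress = function (e) {
    e = e || window.event;
    return !isPasteShortcut(e);
  };
  return "keypress";
}
```

In a page it might be called as blockPaste(document.getElementById("confirmEmail"), isEventSupported), where "confirmEmail" is a placeholder id.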

Unfortunately this method of event testing doesn't work with events such as DOMContentLoaded or the DOM mutation events, which have no matching on* property to look for, and the only way you can really test for these is by running them anyway. Therefore although this function was perfect for my onpaste check it wouldn't work in a DOMReady function to test whether DOMContentLoaded was supported or not.

Articles Mentioned:

http://thinkweb2.com/projects/prototype/detecting-event-support-without-browser-sniffing

http://ryanmorr.com/archives/ondomready-no-browser-sniffing