Thursday 12 September 2013

SEO - Search Engine Optimization

My two cents' worth about Search Engine Optimisation - SEO

Originally Posted - 2009
UPDATED - 12th Sep 2013

SEO is big bucks at the moment and it seems to be one of those areas of the web that is full of snake oil salesmen and "SEO experts" who will promise No. 1 positioning on Google, Bing and Yahoo for $$$ per month.

It is one of those areas that I didn't really pay much attention to when I started web development, mainly because I was not the person paying for the site and relying on leads coming from the web. However, as I have worked on more and more sites over the years it's become blatantly apparent to me that SEO comes in two forms from a development or sales point of view.

There are the forms of SEO which are basically good web development practice and which come about naturally from having a good site structure and making the site usable and readable, as well as helping in terms of accessibility.

Then there are the forms which people will try to bolt onto a site afterwards, either as an afterthought or because an SEO expert has charged the site lots of money, promised the impossible, and wants to use some dubious link-sharing scheme that is believed to work.


Cover the SEO Basics when developing the site


It's a lot harder to just "add some Search Engine Optimization" once a site has been developed, especially if you are developing generic systems that have to work for numerous clients.

I am not an SEO expert and I don't claim to be, otherwise I would be charging you lots of money for this advice and making promises that are impossible to keep. However, following these basic tips will only help your site's SEO.

Make sure all links have title attributes on them and contain descriptive text rather than words like "click here". The content within the anchor tags matters when those bots come a crawling in the dead of night.

You should also make sure all images have ALT attributes on them as well as titles, and make sure the content of the two differs. As far as I know Googlebot will rate ALT content higher than title content, but it cannot hurt to have both.
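As a quick made-up example (the URLs and wording are placeholders, not from a real site), that kind of link and image mark-up looks like this:

<a href="/java-developer-jobs/" title="Browse our current Java developer jobs">Java developer jobs in Norwich</a>

<img src="/images/norwich-office.jpg" alt="Our Norwich office building" title="The front of our Norwich office on a sunny day" />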

Make sure you make use of header tags to mark out the important sections of your site, and try to use descriptive wording rather than "Section 1" etc.

Also, as I'm sure you have noticed if you have read my blogs before, I wrap keywords and keyword-rich sentences in strong tags.

I know that Google will also rank emphasised content or content marked as strong over normal content. So as well as helping those readers who skim read to pick out just the important parts, it also tells Google which words are important in my article.
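For example, a made-up snippet covering both points, a descriptive header plus the key phrase wrapped in strong tags, might look like this:

<h2>How to reduce server load on a busy WordPress site</h2>
<p>Checking the <strong>current server load</strong> before kicking off a heavy job stops an already busy server from being hammered even further.</p>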

Write decent content and don't just fill up your pages with visible or non-visible spammy keywords.

In the old days keyword density mattered when ranking content. This was calculated by removing all the noise words and other guff (CSS, JavaScript etc) and then working out what percentage of the overall page content was made up of relevant keywords.

Nowadays the bots are a lot cleverer and will penalise content that does this as it looks like spam.

Also, it's good for your users to have readable content, and you shouldn't strip out the words between keywords, as doing so makes the text less readable and you will lose out on the longer 3, 4, 5 word indexable search terms (called long-tail in the SEO world).

Saying this, though, it's always good to remove filler from your pages, for example by putting your CSS and JavaScript code into external files when possible and removing large commented-out sections of HTML.

You should also aim to put your most important content at the top of the page so it's the first thing crawled.

Try moving main menus and other content that can be positioned by CSS to the bottom of the file. This is so that social media sites and other BOTS that take the "first image" on an article and use it in their own social snippets don't accidentally use an advertiser's banner instead of your logo or main article picture.

The same thing goes for links. If you have important links that currently sit in the footer, such as links to site indexes, then try getting them higher up the HTML source.

I have seen Google recommend a maximum of 100 links per page. Therefore having a homepage where your most important links are at the bottom of the HTML source with 200+ links above them, e.g. links to searches, even if not all of them are visible, can be harmful.

If you are using a tabbed interface to switch between tabs of links then the links will at least still be in the source code, but if they are loaded in by JavaScript on demand then that's no good at all, as a lot of crawlers don't run JavaScript.

Techniques such as ISAPI URL rewriting are very good for SEO, plus they give you much nicer URLs for your site to display.

For example, using a site I have just worked on, http://jobs.professionalpassport.com/companies/perfect-placement-uk-ltd is a much nicer URL for viewing a particular company profile than the underlying real URL, which could also be accessed as http://jobs.professionalpassport.com/jobboard/cands/compview.asp?c=6101

If you can access that page by both links and you don't want to be penalised for duplicate content then you should specify which link you want to be indexed by setting a canonical link. You should also use your robots.txt file to specify that the non-rewritten URLs are not to be indexed, e.g.

Disallow: /jobboard/cands/compview.asp
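The canonical link itself is just a tag in the HEAD of the page pointing at the version of the URL you want indexed, so for the company profile above it would be something like:

<link rel="canonical" href="http://jobs.professionalpassport.com/companies/perfect-placement-uk-ltd" />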

META tags such as the keywords tag are not considered as important as they once were, and having good keyword-rich content in the main section of the page is the way to go rather than filling up that META tag with hundreds of keywords.

The META Description will still be used to help describe your page on search results pages, and the META Title tag is very important for describing your page's content to both the user and the BOT.

However some people are still living in the 90s and seem to think that stuffing their META Keywords with spam is the ultimate SEO trick, when in reality that tag is probably ignored by most crawlers nowadays.

Set up a sitemap straight away containing your site's pages ranked by their importance, how often they change, last modified date etc. The sooner you do this the quicker your site will be getting indexed and gaining site authority. It doesn't matter if the site isn't 100% ready yet; the sooner it's in the indexes the better.
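As a rough illustration (the URL and values are just placeholders), each page gets an entry along these lines inside the sitemap's urlset element:

<url>
  <loc>http://www.strictly-software.com/some-article</loc>
  <lastmod>2013-09-12</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>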

Whilst you can submit the sitemap through Google's Webmaster Tools or Microsoft's Bing equivalent, you don't actually need to use their tools; as long as you use a Sitemap directive in your robots.txt file BOTS will find it, e.g.

Sitemap: http://www.strictly-software.com/sitemap_110908.xml

You can also use tools such as the wonderful SEOBook Toolbar, an add-on for Firefox which combines numerous free online SEO tools into one helpful toolbar. It lets you see your PageRank and compare your site to competitors on various keywords across the major search engines.

Also, using a text browser such as Lynx to see how your site would look to a crawler such as Yahoo or Google is a good trick for seeing how BOTS view your site, as it skips all the styling and JavaScript.

There are many other good practices which are basic "musts" in this day and age, and the major search engines are moving more and more towards social media when it comes to indexing sites and seeing how popular they are.

You should set up a Twitter account and make sure each article is published to it as well as engaging with your followers.

A Facebook fan page is also a good method of getting people to view snippets of your content and then find your site through the world's most popular social media website.

Making your website friendly for people viewing it on tablets or smart phones is also good advice as more and more people are using these devices to view Internet content.

The Other form of SEO, Black Magic Optimization

The other form of search engine optimization is what I would call "black magic SEO", and it comes in the form of SEO specialists who will charge you lots of money and make impossible claims about getting you to the number one spot on Google for your major keywords and so on.

The problem with SEO is that no-one knows exactly how Google and the others calculate their rankings so no-one can promise anything regarding search engine positioning.

There is Google's PageRank, which is used in combination with other forms of analysis, and it basically means that if a site with a high PR links to your site, and your site does not link back to it, then that tells Google your site has higher site authority than the linking site.

If your site only links out to other sites but doesn't have any links coming in from high page ranked relevant sites then you are unlikely to get a high page rank yourself. This is just one of the ways which Google will use to determine how high to place you in the rankings when a search is carried out.

Having lots of links coming in from sites that have nothing whatsoever to do with your site may help drive traffic but will probably not help your PR. Therefore engaging in all these link exchange systems is probably worth jack nipple, as unless the content that links to your site is relevant or related in some way it's just seen as a link for a link's sake, i.e. spam.

Some "SEO specialists" promote special schemes which have automated 3 way linking between sites enrolled on the scheme.

They know that just having two unrelated sites link to each other basically negates the PageRank benefit, so they try to hide this by having your site A link to site B, which in turn links to site C, which then links back to you.

The problem is obviously getting relevant sites linking to you rather than every Tom, Dick and Harry.

Also, advertising on other sites purely to get indexed links from that site to yours to increase PR may not work, because most of the large advert management systems output banner adverts using JavaScript. So although the advert will appear on the site and drive traffic when people click it, you will not get the benefit of an indexed link, because when the crawlers come to index the page containing the advert the banner image and any link to your site won't be there.

Anyone who claims that they can get you to the top spot in Google is someone to avoid!

The fact is that Google and the others are constantly changing the way they rank and what they penalise for, so something dubious that works currently could actually harm you down the line.

For example, in the old days people would put hidden links on white backgrounds or position them out of sight so that the crawlers would hit them but the users wouldn't see them. That worked for a while until Google and the others cracked down and penalised for it.

Putting any form of content up specifically for a crawler is seen as dubious and you will be penalised for doing it.

Google and BING want to crawl the content that a normal user would see, and they have actually been known to mask their own identity (IP and User-Agent) when crawling your site so that they can check whether this is the case or not.

My advice would be to stick to the basics, don't pay anybody who makes any kind of promise about result ranking and avoid like the plague any scheme that is "unbeatable" and promises unrivalled PR within only a month or two.

Tuesday 10 September 2013

New Version of Strictly AutoTags - Version 2.8.6

Strictly Auto Tags 2.8.6 Has Been Released!

Due to the severe lack of donations, plus too many broken promises of "I'll pay if you just fix or add this", I am stopping support for the Strictly AutoTags plugin.

The last free version, 2.8.5 is up on the WordPress repository: wordpress.org/plugins/strictly-autotags

It fixes a number of bugs and adds some new features such as:
  1. Updated the storage array to store the content inside important elements such as bold, strong, headers and links, so that words don't get tagged inside them, e.g. putting bolded words inside an existing h4.
  2. Changed the storage array to run "RETURN" twice to handle nested code, because of the previous change.
  3. Fixed a bug in the admin that wasn't showing the correct value for the minimum number of tags a post must have before deeplinking to their tag pages.
  4. Fixed a bug in the admin to allow noise words to have dots in them, e.g. for links like youtube.com.
  5. Added more default noise words to the list.
  6. Cleaned up code that wasn't needed any more due to changes in the way I handle href/src/title/alt attributes to prevent nested tagging.
  7. Removed regular expressions which are no longer needed.
Version 2.8.6 of Strictly AutoTags, available from www.strictly-software.com/plugins/strictly-auto-tags, is going to be a "donate £40+ and get a copy" version.

I am going to be sexing this plugin up into more of an SEO, text spinning, content cleaning, auto-blogging tool full of sex and violence in the future and I am running it on my own sites at the moment to see how well it does.

New features in 2.8.6 include:

Set a minimum length of characters for a tag to be used.

Set equivalent words to be used as tags. I have devised a "mark-up" code for doing this which allows you to add as many tag equivalents as you want. For example, this is a cut-down version of the mark-up currently used on one of my sites, using Edward Snowden (very topical at the moment!).

[NSA,Snowden,Prism,GCHQ]=[Police State]|[Snowden,Prism]=[Edward Snowden]|[Prism,XKeyscore,NSA Spying,NSA Internet surveillance]=[Internet Surveillance]|[TRAPWIRE,GCHQ,NSA Spying,Internet surveillance,XKeyscore,PRISM]=[Privacy]|[Snowden,Julian Assange,Bradley Manning,Sibel Edmonds,Thomas Drake]=[Whistleblower]

As you can see from that example you can use the same words multiple times and give them equivalent tags to use. So if the word Snowden appears a lot I will also tag the post with "Police State", "Edward Snowden" and "Whistleblower" as well as Snowden.

This feature is designed so that you can use related words as tags that may be more relevant to people's searches.
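To make the format clearer, here is a rough sketch (not the plugin's actual code) of how a string in that mark-up could be parsed into an array of word => extra tags:

// Illustration only - parse the equivalents mark-up into an array of word => extra tags
function ParseTagEquivalents($markup){
    $equivalents = array();

    // each [word list]=[tag] pair is separated by a pipe
    foreach(explode('|', $markup) as $pair){

        // split the pair into the comma separated word list and the tag it maps to
        if(preg_match('/^\[(.+?)\]=\[(.+?)\]$/', trim($pair), $match)){
            $tag = trim($match[2]);

            // the same word can appear in several pairs so append rather than overwrite
            foreach(explode(',', $match[1]) as $word){
                $equivalents[trim($word)][] = $tag;
            }
        }
    }

    return $equivalents;
}

$map = ParseTagEquivalents('[NSA,Snowden,Prism,GCHQ]=[Police State]|[Snowden,Prism]=[Edward Snowden]');
print_r($map['Snowden']); // Array ( [0] => Police State [1] => Edward Snowden )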

I have also added a feature to convert textual links that may appear from importing or scraping into real links; for example www.strictly-software.com will become a real link to that domain, e.g. http://www.strictly-software.com.
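As a rough idea of how that kind of conversion can be done (again, just a sketch and not the plugin's actual code), a bare www domain in plain text can be turned into a real anchor with a regular expression along these lines:

// Illustration only - turn bare www. domains in plain text into real links.
// The plugin's own logic handles more cases (http:// prefixes, trailing punctuation etc).
function MakeTextualLinksClickable($text){
    return preg_replace(
        '/(?<![">\/])\b(www\.[a-z0-9.-]+\.[a-z]{2,})\b/i',
        '<a href="http://$1">$1</a>',
        $text
    );
}

echo MakeTextualLinksClickable('Visit www.strictly-software.com for more plugins.');
// Visit <a href="http://www.strictly-software.com">www.strictly-software.com</a> for more plugins.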

I have added the new data attributes and their derivatives, e.g. data-description or data-image-description (basically anything that has data- at the front of it), to my list of attributes to store and then replace after auto-tagging, to prevent nested tags being added inside them.

I will be extending this plugin lots in the future but only people prepared to pay for it will be able to get the goodies. I am so fed up of open-source coding that there is little point in me carrying on working my ass off for free for other people's benefit any more.

If you want a copy, email me and I will respond.

You can then donate me the money and I will send you a unique copy.

Any re-distribution of the code will mean hacks, DDOS, viruses from hell and Trojans coming out your ass for years!





Testing Server Load Before Running Plugin Code On WordPress


UPDATED - 10th Sep 2013

I have updated this function to handle issues on Windows machines where the COM object might not be created due to security settings.

If you have an underpowered Linux server and run the bag of shite that is the WordPress CMS on it then you will have spent ages trying to squeeze every bit of power and performance out of your machine.

You've probably already installed caching plugins at every level, from WordPress to the server and maybe even beyond... into the cloud... all stuff normal websites shouldn't have to do, but it seems WordPress / Apache / PHP programmers love doing it.

A fast optimised database, queries that return data in sets (not record by record) and some static pages for content that doesn't change constantly should be all you need but it seems that this is not the case in the world of WordPress!

Therefore, if you have your own server or virtual server and the right permissions, you might want to consider adding some code to important plugins that stops the job you intend to run from causing more performance problems if the server is already overloaded.

You can do this by testing for the current server load, setting a threshold limit and then only running the code you want if the server load is below that limit.

Of course security is key, so lock down permissions to your apps and only let admin or the system itself run the code - never a user, and never via a querystring that could be hacked!

The code is pretty simple.

It does a split for Windows and non-Windows machines and then checks for a way to test the server load in each branch.

For Windows it has two methods, one for old PHP code and one for PHP 5+.

In the Linux branch it tests for access to the /proc/loadavg file, which contains the current load averages on Linux machines.

If that's not there it tries to use the shell_exec function (which may or may not be locked down due to permissions - it's up to you whether you allow access or not), and if it can run shell commands it calls the uptime command to get the current server load from its output.

You can then call this function in whatever plugin or function you want and make sure your server isn't already overloaded before running a big job.

I already use it in all my own plugins, the Strictly Google Sitemap and my own version of the WP-O-Matic plugin.


/**
 * Checks the current server load
 *
 * @return string|false The 1 minute load average (Linux), the CPU load percentage (Windows) or false on failure
 *
 */
function GetServerLoad(){

 $os = strtolower(PHP_OS);

 // handle non windows machines
 if(substr($os, 0, 3) !== 'win'){
  if(file_exists("/proc/loadavg")) {    
   $load = file_get_contents("/proc/loadavg"); 
   $load = explode(' ', $load);     
   return $load[0]; 
  }elseif(function_exists("shell_exec")) {     
   $load = @shell_exec("uptime");
   $load = explode(' ', $load);        
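   // the last three fields of the uptime output are the 1, 5 and 15 minute load averages, so this grabs the 1 minute figure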
   return $load[count($load)-3]; 
  }else { 
   return false; 
  } 
 // handle windows servers
 }else{ 
  if(class_exists("COM")) {     
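   // use WMI to get the current load percentage of each processor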
   $wmi  = new COM("WinMgmts:\\\\."); 
   if(is_object($wmi)){
    $cpus  = $wmi->InstancesOf("Win32_Processor"); 
    $cpuload = 0; 
    $i   = 0;   
    // Old PHP
    if(version_compare('4.50.0', PHP_VERSION) == 1) { 
     // PHP 4      
     while ($cpu = $cpus->Next()) { 
      $cpuload += $cpu->LoadPercentage; 
      $i++; 
     } 
    } else { 
     // PHP 5      
     foreach($cpus as $cpu) { 
      $cpuload += $cpu->LoadPercentage; 
      $i++; 
     } 
    } 
    $cpuload = round($cpuload / $i, 2); 
    return "$cpuload%"; 
   }
  } 
  return false;     
 } 
}

A simple server load testing function that should work across both Windows and Linux machines.
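As a quick illustration of the kind of guard I mean (the threshold values and the MyPlugin_SafeToRun name below are made up for the example, and remember the function returns a load average on Linux but a CPU percentage string on Windows):

// Hypothetical example of guarding a heavy plugin job with the load check.
// Pick thresholds that make sense for your own box - these numbers are just examples.
define('MY_PLUGIN_MAX_LOAD', 2.5); // Linux 1 minute load average
define('MY_PLUGIN_MAX_CPU', 80);   // Windows CPU percentage

function MyPlugin_SafeToRun(){
    $load = GetServerLoad();

    // couldn't measure the load at all - err on the side of caution
    if($load === false){
        return false;
    }

    // the Windows branch returns something like "75.5%"
    if(strpos($load, '%') !== false){
        return floatval(str_replace('%', '', $load)) < MY_PLUGIN_MAX_CPU;
    }

    // the Linux branch returns the 1 minute load average as a string
    return floatval($load) < MY_PLUGIN_MAX_LOAD;
}

if(MyPlugin_SafeToRun()){
    // run the expensive import / tagging / sitemap job here
}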