Showing posts with label Social Media. Show all posts

Thursday, 12 December 2019

Browsers' New Automatic Settings Are Slowing Site Loads Down

Blocking All Social Media Cookies and Trackers Seems To Be Slowing Down Chrome and Firefox

By Strictly-Software

I have recently had automatic updates for Firefox and Chrome, taking them to these versions:

Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0

and 

Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36

How these two have got to versions nearing 80 so quickly is funny, when IE rolls out a new version of its browser every year or so, not every time I try to open Firefox from my taskbar.

However these updates seem to contain some important settings, some of which may have been around for a while without me noticing. What is doing my head in, though, is the slowness of these browsers compared to Opera, which uses a proxy server for its pretend VPN, so you are actually going through two servers to reach a site, unlike with the other browsers.

I liked Firefox, and Chrome when it first came out, for their speed and add-ons. However, either my laptop is deciding to slow these two browsers down for some reason, letting Opera hop through its proxy and still load a page faster, or something else is going on.

My version of Opera with this "VPN" is:

Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36 OPR/65.0.3467.62

When I open it, it is faster than both Chrome and Firefox despite its VPN, which, when I check with a geolocation website, shows I have these details instead of my real ones:
  • Your Public IPv6 is: 2001:67c:2660:425:7::dfa
  • Your IPv4 is: 77.111.247.105
  • Location: Amsterdam, NH NL 
  • ISP: Hern Labs AB
I notice they have removed the word Opera and replaced it with OPR, now also nearing its 70th version. So not only does it protect my privacy a bit, it is currently faster than Chrome and Firefox for me.

So today, after a manual upgrade to Chrome (due to its insane slowness) and an automatic update to Firefox that took almost 20 minutes, I noticed major slowdowns.

Chrome seems to be forever showing a "resolving host" message in the status bar as it loads remote scripts from the big three spyware social networks: Twitter, Facebook and Google.

Of course site builders have put these outbound scripts into their code as they want people to Like, Follow and Tweet whatever crap they are selling, and you may like seeing a Twitter scroller on the blog you are following and the ability to share the page to Facebook.

However I watched a video on www.darkpolitricks.com the other night about how many #altnews sites are moving their videos from YouTube to BitChute, because YouTube is broken and Google is an evil company. And it seems they are indeed.

It is all about how these niche news outlets helped make YouTube the biggest video sharing site online, and although the company claims they are only 1% of all videos watched, it still feels the need to de-rank these alternative views and put "authoritative sources" above your searches in its algorithm tweaks. These tweaks were all admitted by a Google employee, who described getting those up-ticks and Likes as a "drug" that ensures people continue with their outpourings of everything they do in life on social media.

Yes, we do want to see where Jane is having lunch, who with, where the cafe is located, and then everything else she does that day as we follow her goings-about on Facebook and Instagram. Well, you may want to, but I don't. However to do so you need to load in Facebook scripts from their servers.

However, if you delve into the privacy and security settings for Firefox, you will see that its default setting is to block tracking cookies from cross-site and social media trackers. This obviously means that if a page loads a 3rd party script from another location, you could see - as I did today on one site - the "trying to connect to t.co" message appear dozens of times as the page tried to load. All the while the page was hung and unusable.

You can go into the Firefox settings and change this behaviour under Privacy & Security. The headings are...

Browser Privacy

Enhanced Tracking Protection

Trackers follow you around online to collect information about your browsing habits and interests. Firefox blocks many of these trackers and other malicious scripts.

The default setting will block Social media trackers, Cross-site tracking cookies, Tracking content in Private Windows and Cryptominers.

Obviously the latter few are definitely required, but let me know if you have noticed a slowdown with the Standard setting that is supposedly "Balanced for protection and performance. Pages will load normally." They may load normally, but they seem very slow to do so, and off-server scripts like Twitter's and Facebook's are attempted multiple times before a page is usable.

What happened to just loading the core code first to make the page usable, then loading any off-server scripts by Ajax in the background? It seems too many sites now use pure JavaScript and Ajax to load their content, probably to prevent content-scraping BOTs, however it does mean a lot of code has to be loaded and run before the page is usable. Have you had a look at the source HTML of www.google.com lately?

Apart from some META tags after the HTML tag, the whole source is JavaScript, probably using Ajax to load in the content, for what is really nothing more than a white page with a different image every now and then above a text input box for searching.

The links to your Google account and Gmail in the top right corner are just that: links. We could shorten the load time and the code to a few lines of HTML in reality. I really think Google have gone overboard with their API Jizz all across their systems, as their need to stop scrapers seems to have just caused slow-loading pages.
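Getting back to that behaviour is often as simple as marking third-party widget scripts as async (and/or defer) so the browser fetches them in the background instead of blocking the page. A minimal sketch - the Twitter and Facebook loader URLs shown are their publicly documented widget scripts, included purely for illustration:

```html
<!-- Core page HTML renders first; async tells the browser to fetch
     these third-party scripts without blocking page rendering -->
<script async src="https://platform.twitter.com/widgets.js"></script>
<script async defer src="https://connect.facebook.net/en_US/sdk.js"></script>
```

If either host is slow or blocked by the browser, the page itself still renders and only the widget goes missing.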

Would you like all 3rd party scripts and cookies blocked, or would you like the site to work and load quickly? It is a dilemma these browsers are making over-complicated, especially for non-techies who wouldn't know half the words in Firefox's Privacy & Security settings.

The difference between Standard, which says "Balanced for protection and performance. Pages will load normally", and Strict, which says "Stronger protection, but may cause some sites or content to break", seems to be only the addition of:
  • Tracking content in all windows - rather than standard mode which only blocks "Tracking content in Private Windows" and
  • Fingerprinters (blocking browser fingerprinting - logging your add-ons, window size and other ways to identify you just from your browser)
They don't actually explain what a fingerprinter is, and the average user would be scratching their head thinking about their latest Samsung phone and the ability to log in with a fingerprint. However these two extra blocks seem to be deadly for a working website, as Firefox states underneath:
Heads up! Blocking trackers could impact the functionality of some sites. Reload a page with trackers to load all content.
So god knows how someone is supposed to manage the 3rd option, which is a custom way of blocking things you don't understand, or know why they would break a site.

Of course they have a number of complicated Knowledge Base articles for you to read and get your head round, to try and understand whether you need to use Private Windows for browsing all the time, and why the blocking of certain features is going to stop a site loading.

Of course Firefox has a "simple way" to help you understand what is going on: by just clicking on the shield in the address bar you can turn the protection for a site on or off.

You can view a site with "Enhanced Tracking Protection is OFF for this site" or ON, and if you don't know what the difference is, they have helpful little graphs that tell you about their Enhanced Tracking Protection, how many trackers they have blocked over the week, and ways to look for data breaches. All very interesting, but not very helpful, information.

They helpfully clarify the situation by saying "Social networks place trackers on other websites to follow what you do, see, and watch online. This allows social media companies to learn more about you beyond what you share on your social media profiles."

Of course you could just disable 3rd party cookies and JavaScript by default with a web developer toolbar and see if the page loads or not. If it doesn't work, turn JavaScript back on and try again, before whitelisting the site so it can use JavaScript in future.

It seems that as Windows is numptifying the front end of its latest operating systems, making it harder for developers to dig in and get into the back end than Windows 8.1, which I still have - now without Skype support - the browsers are offering their users far too many options they probably don't understand or need to know about.

What I want from a browser is for websites to load quickly, and for any 3rd party hosted widgets, like Facebook or Twitter widgets, to load in asynchronously without preventing the working of the site. I want the browser to do the dirty stuff behind the scenes, and I don't want hundreds of options to play about with. It should block dangerous content, warn users about dangerous sites and stop anything that may have a dangerous effect on my browsing or privacy.

Yes - ask me if I want to load this soon-to-be-outdated Flash movie or allow notifications, but don't give me too much to tweak about with.

The speed of loading a site is the most important factor for most users and also affects the site's SEO. If they want to give us an option for being as private as possible or allowing tracking cookies, then just have a single "Privacy HIGH or OFF" option, and then use their own browser logic to work out why a page won't load, instead of offering the user a whole list of options to try out if a page doesn't load.

What is wrong with just keeping incognito windows as private as possible - no trackers, no fingerprinting, no logging of pages visited - with a clear-out of cookies automatically when I leave?

It just seems that, as Chrome enters the laptop world with its Chromebooks, and as operating systems like Windows 10+ continually ask for your admin password when opening an application, hiding all the nitty gritty that really slows your PC down behind automated "maintenance jobs", browsers are trying to become their own little PC within a PC.

Just give me fast loading pages and if I want to hide what I am doing from sites and other users of my laptop then make the incognito windows as private as possible. Stop trackers, fingerprinting, 3rd party cookies, and anything else you are now making a "choice" for the user under the settings.

It is bad enough that, as everyone moves to HTTPS, we constantly see the TLS handshake message in the taskbar, which is obviously slowing down the loading of pages and their content, especially if it's mixed.

Just give me a fast browser. I thought using Firefox today would speed things up, as Chrome is just getting unusable, and as everyone realises Google are actually evil, I don't want to help them pass on my data from their browser, search engine and analytics trackers to advertisers and god knows who else.

From now on Opera with its extra server hop is going to be my standard browser. The "VPN" offers enough privacy, and whilst some pages won't remember certain settings due to my location changing, the browser is fast.

Anyone find their settings too complicated nowadays and the speed an issue?


By Strictly-Software

© 2019 Strictly-Software

 

Thursday, 28 May 2015

Twitter Rush - The Rush Just Gets Bigger And Bigger!

By Strictly-Software

The number of BOTs, social media sites and scrapers that hit your site after you post a Tweet containing a link just gets bigger and bigger. When I first started recording the BOTs that hit my site after a post to Twitter it was about 15; now it has grown to over 100!

You can read about my previous analysis of Twitter Rushes here and here; however, today I am posting the findings from a recent blog post, published to Twitter with my Strictly TweetBOT WordPress plugin, and the 108 HTTP requests that followed in the minutes after posting.

If you are not careful these Twitter Rushes can consume your web server's CPU and memory, as well as creating a daisy chain of processes waiting to be completed, which can cause high server loads and long connection/wait times before pages load.

You will notice that the first item in the list is a POST to the article.

That is because in the PRO version of my Strictly TweetBOT I have an option to send an HTTP request to the page before Tweeting. You can then wait a few seconds (a setting you control) before any Tweets are sent out, to ensure the plugin has had enough time to cache the page.

This is so that if you have a caching plugin installed (e.g. WP Super Cache on WordPress) or another system, the page is hopefully cached into memory or written out as a static HTML file to prevent any overload when the Twitter Rush comes.

It is always quicker to deliver a static HTML file to users than a dynamic PHP/.NET page that needs DB access etc.
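The pre-request idea can be sketched in a few lines. This is an illustrative Python version, not the plugin's actual PHP code - warm_cache, the fetch callable and the default wait are all hypothetical names:

```python
import time

def warm_cache(url, fetch, wait_seconds=5):
    """Hit the page once so a caching plugin (e.g. WP Super Cache)
    can store a static copy, then pause before any Tweets go out."""
    status = fetch(url)       # e.g. an HTTP GET returning the status code
    time.sleep(wait_seconds)  # give the cache time to write the page
    return status
```

When the Twitter Rush arrives a few seconds later, the BOTs should then hit the cached static copy rather than forcing repeated PHP/DB work.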

So here are the results of today's test.

Notice how I return 403 status codes to many of the requests. 

This is because I block any bandwidth wasters that bring no benefit at all to my site.

The latest batch of these bandwidth wasters seems to be social media and brand awareness BOTs that want to see if their brand or site is mentioned in the article.

They are of no benefit to you at all, and you should block them either with your firewall or with a 403 status code from your .htaccess file.
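For Apache, a .htaccess block along these lines returns a 403 to known bandwidth wasters. The user agent strings are examples taken from the log below, the bad_bot variable name is arbitrary, and this assumes Apache 2.2-style mod_setenvif/mod_access directives:

```apache
# Flag bandwidth-wasting user agents (case-insensitive substring match)
SetEnvIfNoCase User-Agent "ShowyouBot"      bad_bot
SetEnvIfNoCase User-Agent "OpenHoseBot"     bad_bot
SetEnvIfNoCase User-Agent "grokkit-crawler" bad_bot

# Serve flagged agents a 403 Forbidden instead of the page
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```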

Please also note the number of duplicate requests made to the page from the same IP address or the same company, e.g. Twitterbot or Facebook. Why they do this I do not know!
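One quick way to spot those duplicates is to tally the user agent - the third quoted field in Apache's combined log format - across the access log lines; a minimal sketch (count_agents is just an illustrative helper, not part of any plugin):

```python
import re
from collections import Counter

def user_agent(line):
    # Combined log format quotes three fields: "request" "referer" "user-agent"
    quoted = re.findall(r'"([^"]*)"', line)
    return quoted[2] if len(quoted) >= 3 else None

def count_agents(lines):
    # Tally requests per user agent so repeat visitors stand out
    return Counter(ua for ua in map(user_agent, lines) if ua)
```

Run over the log below, it would show, for example, Twitterbot and the Facebook external hit crawler each fetching the same URL several times.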

The Recent Twitter Rush Test - 28-MAY-2015

XXX.XXX.XXX.XXX - - [28/May/2015:17:08:17 +0100] "POST /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?r=12 HTTP/1.1" 200 22265 "-" "Mozilla/5.0 (http://www.strictly-software.com) Strictly TweetBot/1.1.2" 1/1582929
184.173.106.130 - - [28/May/2015:17:08:22 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "ShowyouBot (http://showyou.com/crawler)" 0/3372
199.16.156.124 - - [28/May/2015:17:08:21 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22377 "-" "Twitterbot/1.0" 1/1301263
199.16.156.125 - - [28/May/2015:17:08:21 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22375 "-" "Twitterbot/1.0" 1/1441183
185.20.4.220 - - [28/May/2015:17:08:21 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22377 "-" "Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0" 1/1224266
17.142.151.49 - - [28/May/2015:17:08:21 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22375 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1; +http://www.apple.com/go/applebot)" 1/1250324
151.252.28.203 - - [28/May/2015:17:08:22 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22374 "http://bit.ly/1eA4GYZ" "Go 1.1 package http" 1/1118106
46.236.26.102 - - [28/May/2015:17:08:23 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22376 "-" "Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0" 0/833367
199.16.156.124 - - [28/May/2015:17:08:23 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22376 "-" "Twitterbot/1.0" 0/935200
142.4.216.19 - - [28/May/2015:17:08:24 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; OpenHoseBot/2.1; +http://www.openhose.org/bot.html)" 0/1964
17.142.152.131 - - [28/May/2015:17:08:24 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22375 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1; +http://www.apple.com/go/applebot)" 0/875740
52.5.154.238 - - [28/May/2015:17:08:25 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22376 "-" "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31" 1/1029660
4.71.170.35 - - [28/May/2015:17:08:26 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate HTTP/1.1" 403 251 "-" "grokkit-crawler (pdsupport@purediscovery.com)" 0/1883
192.99.19.38 - - [28/May/2015:17:08:26 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 - "-" "Mozilla/5.0 (compatible; OpenHoseBot/2.1; +http://www.openhose.org/bot.html)" 0/1927
141.223.91.115 - - [28/May/2015:17:08:28 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "_bit=55673d4a-0024b-030a1-261cf10a;domain=.bit.ly;expires=Tue Nov 24 16:07:38 2015;path=/; HttpOnly" "-" 1/1592735
17.142.151.101 - - [28/May/2015:17:08:32 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1; +http://www.apple.com/go/applebot)" 17/17210294
184.173.106.130 - - [28/May/2015:17:08:49 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "ShowyouBot (http://showyou.com/crawler)" 0/1870
142.4.216.19 - - [28/May/2015:17:08:49 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Mozilla/5.0 (compatible; OpenHoseBot/2.1; +http://www.openhose.org/bot.html)" 0/1601
52.6.187.68 - - [28/May/2015:17:08:28 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22262 "-" "Typhoeus - https://github.com/typhoeus/typhoeus" 20/20260090
45.33.89.102 - - [28/May/2015:17:08:28 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22262 "-" "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 20/20370939
134.225.2.7 - - [28/May/2015:17:08:26 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0" 22/22337338
2a03:2880:1010:3ff4:face:b00c:0:8000 - - [28/May/2015:17:08:25 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22261 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" 23/23973749
134.225.2.7 - - [28/May/2015:17:08:27 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0" 21/21602431
54.167.214.223 - - [28/May/2015:17:08:25 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" 24/24164062
4.71.170.35 - - [28/May/2015:17:08:51 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "grokkit-crawler (pdsupport@purediscovery.com)" 0/1688
192.99.19.38 - - [28/May/2015:17:08:51 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Mozilla/5.0 (compatible; OpenHoseBot/2.1; +http://www.openhose.org/bot.html)" 0/1594
54.246.137.243 - - [28/May/2015:17:08:51 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 - "-" "python-requests/1.2.3 CPython/2.7.6 Linux/3.13.0-44-generic" 0/1736
92.246.16.201 - - [28/May/2015:17:08:51 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0" 0/725424
4.71.170.35 - - [28/May/2015:17:08:55 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "grokkit-crawler (pdsupport@purediscovery.com)" 0/1808
2a03:2880:2130:9ff3:face:b00c:0:1 - - [28/May/2015:17:08:57 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate HTTP/1.1" 301 144 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" 0/657830
54.198.122.232 - - [28/May/2015:17:08:51 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/534.24 (KHTML, like Gecko) (Contact: backend@getprismatic.com)" 7/7227418
2a03:2880:1010:3ff7:face:b00c:0:8000 - - [28/May/2015:17:08:51 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22255 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" 7/7169003
54.198.122.232 - - [28/May/2015:17:08:51 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22257 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/534.24 (KHTML, like Gecko) (Contact: backend@getprismatic.com)" 7/7185701
2607:5300:60:3b37:: - - [28/May/2015:17:08:53 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0" 5/5298648
2a03:2880:2130:9ff7:face:b00c:0:1 - - [28/May/2015:17:08:56 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22267 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" 1/1999466
178.32.216.193 - - [28/May/2015:17:08:49 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "http://bit.ly/1eA4GYZ" "LivelapBot/0.2 (http://site.livelap.com/crawler)" 9/9518327
199.59.148.209 - - [28/May/2015:17:08:58 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Twitterbot/1.0" 1/1680322
54.178.210.226 - - [28/May/2015:17:08:58 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22257 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 1/1842148
54.198.122.232 - - [28/May/2015:17:08:58 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/534.24 (KHTML, like Gecko) (Contact: backend@getprismatic.com)" 1/1903731
54.198.122.232 - - [28/May/2015:17:09:00 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/534.24 (KHTML, like Gecko) (Contact: backend@getprismatic.com)" 1/1131792
2607:5300:60:3b37:: - - [28/May/2015:17:09:00 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0" 1/1048667
199.59.148.209 - - [28/May/2015:17:09:02 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Twitterbot/1.0" 1/1024583
54.178.210.226 - - [28/May/2015:17:09:02 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 1/1251088
65.52.240.20 - - [28/May/2015:17:09:03 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)" 0/814087
54.92.69.38 - - [28/May/2015:17:09:04 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/925457
54.92.69.38 - - [28/May/2015:17:09:05 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22266 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/932984
54.178.210.226 - - [28/May/2015:17:09:06 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/927202
54.178.210.226 - - [28/May/2015:17:09:08 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22260 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/717344
54.167.123.237 - - [28/May/2015:17:09:09 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:28.0) Gecko/20100101 Firefox/28.0 (FlipboardProxy/1.1; +http://flipboard.com/browserproxy)" 0/2286
54.178.210.226 - - [28/May/2015:17:09:12 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22254 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/971022
37.187.165.195 - - [28/May/2015:17:09:52 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Mozilla/5.0 (compatible; PaperLiBot/2.1; http://support.paper.li/entries/20023257-what-is-paper-li)" 0/688208
74.112.131.244 - - [28/May/2015:17:10:24 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22257 "-" "Mozilla/5.0 ()" 3/3572262
52.68.118.157 - - [28/May/2015:17:11:35 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/688056
52.68.118.157 - - [28/May/2015:17:11:35 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/719851
52.68.118.157 - - [28/May/2015:17:11:37 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22256 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/739706
52.68.118.157 - - [28/May/2015:17:11:38 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22258 "-" "Crowsnest/0.5 (+http://www.crowsnest.tv/)" 0/760912
74.6.254.121 - - [28/May/2015:17:12:05 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/248578
66.249.67.148 - - [28/May/2015:17:12:38 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22259 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0/493468
54.145.93.204 - - [28/May/2015:17:13:25 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "jack" 0/1495
54.145.93.204 - - [28/May/2015:17:13:26 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22257 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2008091620 Firefox/3.0.2" 0/597310
178.33.236.214 - - [28/May/2015:17:13:41 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Mozilla/5.0 (compatible; Kraken/0.1; http://linkfluence.net/; bot@linkfluence.net)" 0/2065
173.203.107.206 - - [28/May/2015:17:13:50 +0100] "POST /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?r=12 HTTP/1.1" 200 22194 "-" "Mozilla/5.0 (http://www.strictly-software.com) Strictly TweetBot/1.1.2" 4/4801717
184.173.106.130 - - [28/May/2015:17:13:58 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "ShowyouBot (http://showyou.com/crawler)" 0/96829
178.32.216.193 - - [28/May/2015:17:13:57 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22303 "http://bit.ly/1eA4GYZ" "LivelapBot/0.2 (http://site.livelap.com/crawler)" 1/1032211
192.99.1.145 - - [28/May/2015:17:13:59 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22303 "http://bit.ly/1eA4GYZ" "LivelapBot/0.2 (http://site.livelap.com/crawler)" 1/1535270
184.173.106.130 - - [28/May/2015:17:14:02 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "ShowyouBot (http://showyou.com/crawler)" 0/1764
52.6.187.68 - - [28/May/2015:17:14:01 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22293 "-" "Typhoeus - https://github.com/typhoeus/typhoeus" 66/66512611
146.148.22.255 - - [28/May/2015:17:15:10 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22292 "-" "Mozilla/5.0 (compatible; Climatebot/1.0; +http://climate.k39.us/bot.html)" 0/885387
74.6.254.121 - - [28/May/2015:17:15:11 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 1/1789256
146.148.22.255 - - [28/May/2015:17:15:17 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22290 "-" "Mozilla/5.0 (compatible; Climatebot/1.0; +http://climate.k39.us/bot.html)" 1/1275245
54.162.7.197 - - [28/May/2015:17:15:18 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22291 "http://bit.ly/1eA4GYZ" "Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.107 Safari/534.13 v1432829642.1352" 0/711142
146.148.22.255 - - [28/May/2015:17:15:24 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22293 "-" "Mozilla/5.0 (compatible; Climatebot/1.0; +http://climate.k39.us/bot.html)" 0/742404
54.162.7.197 - - [28/May/2015:17:15:32 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22289 "-" "msnbot/2.0b v1432829684.8617" 0/717679
23.96.208.137 - - [28/May/2015:17:16:05 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22294 "-" "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)" 0/560954
69.164.211.40 - - [28/May/2015:17:17:38 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22293 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/516967
96.126.110.221 - - [28/May/2015:17:18:24 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22300 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/464585
69.164.217.210 - - [28/May/2015:17:18:42 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22288 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/482230
173.255.232.252 - - [28/May/2015:17:19:03 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22293 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/514587
173.255.232.252 - - [28/May/2015:17:19:12 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22288 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/858459
96.126.110.222 - - [28/May/2015:17:19:26 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22288 "-" "Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)" 0/469048
92.222.100.96 - - [28/May/2015:17:19:28 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22291 "-" "sfFeedReader/0.9" 0/574409
54.176.17.88 - - [28/May/2015:17:20:20 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+DarkPolitricks+%28Dark+Politricks%29 HTTP/1.1" 200 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36" 1/1112283
184.106.123.180 - - [28/May/2015:17:20:56 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22285 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14" 0/522039
74.6.254.121 - - [28/May/2015:17:22:35 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/260972
54.163.57.132 - - [28/May/2015:17:23:05 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Ruby" 0/2749
54.163.57.132 - - [28/May/2015:17:23:07 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Ruby" 0/1647
54.163.57.132 - - [28/May/2015:17:23:09 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Ruby" 0/1487
178.33.236.214 - - [28/May/2015:17:23:14 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+DarkPolitricks+%28Dark+Politricks%29 HTTP/1.1" 403 252 "-" "Mozilla/5.0 (compatible; Kraken/0.1; http://linkfluence.net/; bot@linkfluence.net)" 0/1996
168.63.10.14 - - [28/May/2015:17:23:23 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 - "-" "Apache-HttpClient/4.1.2 (java 1.5)" 0/1602
168.63.10.14 - - [28/May/2015:17:23:23 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Apache-HttpClient/4.1.2 (java 1.5)" 0/1486
74.6.254.121 - - [28/May/2015:17:24:05 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/260635
45.33.35.236 - - [28/May/2015:17:24:59 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22284 "-" "Mozilla/5.0 ( compatible ; Veooz/1.0 ; +http://www.veooz.com/veoozbot.html )" 0/618370
74.6.254.121 - - [28/May/2015:17:25:35 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/255700
70.39.246.37 - - [28/May/2015:17:26:10 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22283 "-" "Mozilla/5.0 Moreover/5.1 (+http://www.moreover.com; webmaster@moreover.com)" 0/469127
82.25.13.46 - - [28/May/2015:17:28:55 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22285 "https://www.facebook.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36" 0/568199
157.55.39.84 - - [28/May/2015:17:29:17 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22284 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" 0/478186
188.138.124.201 - - [28/May/2015:17:30:10 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22284 "-" "ADmantX Platform Semantic Analyzer - ADmantX Inc. - www.admantx.com - support@admantx.com" 2/2500606
54.204.149.66 - - [28/May/2015:17:30:30 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22285 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0 (NetShelter ContentScan, contact abuse@inpwrd.com for information)" 0/680643
54.198.122.232 - - [28/May/2015:17:30:31 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+DarkPolitricks+%28Dark+Politricks%29 HTTP/1.1" 200 22220 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/534.24 (KHTML, like Gecko) (Contact: backend@getprismatic.com)" 0/650482
64.49.241.208 - - [28/May/2015:17:30:32 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22284 "-" "ScooperBot www.customscoop.com" 0/658243
50.16.81.18 - - [28/May/2015:17:30:46 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22283 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0 (NetShelter ContentScan, contact abuse@inpwrd.com for information)" 0/673211
54.166.112.98 - - [28/May/2015:17:30:47 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A%20DarkPolitricks%20%28Dark%20Politricks%29 HTTP/1.1" 403 252 "-" "Recorded Future" 0/1645
54.80.130.191 - - [28/May/2015:17:30:59 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 403 252 "-" "Recorded Future" 0/2777
74.6.254.121 - - [28/May/2015:17:31:35 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22282 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/606530
74.6.254.121 - - [28/May/2015:17:33:05 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22283 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/490218
54.208.89.59 - - [28/May/2015:17:34:31 +0100] "HEAD /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 - "-" "-" 0/309191
54.208.89.59 - - [28/May/2015:17:34:32 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22281 "-" "-" 0/647914
74.6.254.121 - - [28/May/2015:17:34:35 +0100] "GET /2015/05/study-finds-severe-cold-snap-during-the-geological-age-known-for-its-extreme-greenhouse-climate/ HTTP/1.1" 200 22284 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" 0/555798

So the moral of the story is this:

  • Be careful when you post to Twitter as you will get a rush of traffic to your site in the following minutes which could cause your site problems.
  • Try and block social media / brand awareness / spam BOTS if you can so they don't consume your bandwidth, CPU and memory.
  • Use either your server's firewall or your .htaccess file to block BOTS you consider a waste of your money. Remember, any HTTP request to your site when you are using a VPS costs you money, so why waste it on BOTS that provide no benefit to you?
  • Try and mitigate the rush by using the Crawl-Delay Robots.txt command to stop the big SERP BOTS from hammering you straight away.

I am sure I will post another Twitter Rush analysis in the coming months and the number of BOTS will have grown from the initial 15+ or so when I first tested it to 200+!

Saturday, 18 May 2013

Why I hate the new Google+ API

I absolutely hate the new Google+ API

Yes, Google+ has had a revamp, and if you are not already on it you won't know what the old version was like if you join now.

To me it's as if someone has read too many books on the jQuery effects library and basically orgasmed code across the API.

If you go to type a new status message into a box the whole page shifts round so that your box moves to the centre of the screen and the rest of the messages and segments of the page do a little jig around it so that you are supposed to go "wow".

Not me. Too much API Jizz is something I hate. 

Not only does it repeatedly turn my PC into a helicopter as the CPU rises and falls like a coke head on the lash, but it is just too much for my ageing eyes.

It really seems to me as if someone is showing off by writing their "funky" API code. Hey boss look what I can do with a shit load of JavaScript that takes ages for all the page segments to load but makes non techies go "oooh" as they see it in action.

Whilst an API should be friendly and easy to use there is nothing "useful" about the whole screen moving around just so your current type box is in the middle of the screen.

Why not just put the "new message" box in the middle to start with?

Not only that, but the number of times I go to reply to a conversation down the right hand side and someone I have never seen before pops up in a box on top of the place I am trying to write is beyond annoying.

It means not only can't I hit the send button, but even when I do find a way to get rid of the annoying box (and that's not 100% of the time) the message I was writing disappears!

I know writing the whole page in JavaScript stops (or at least limits) script kiddies from scraping it easily, but there really is a limit. Personally I just think Google+ has crossed it, and there was nothing too wrong with their old API.

What do you think?


Sunday, 3 March 2013

Stop BOTS and Scrapers from bringing your site down

Blocking Traffic using WebMin on LINUX at the Firewall

If you have read my survival guides on Wordpress you will see that you have to do a lot of work just to get a stable and fast site due to all the code that is included.

The Wordpress Survival Guide

  1. Wordpress Basics - Tools of the trade, useful commands, handling emergencies, banning bad traffic.
  2. Wordpress Performance - Caching, plugins, bottlenecks, Indexing, turning off features
  3. Wordpress Security - plugins, htaccess rules, denyhosts.


For instance, not only do you have to handle badly written plugins that could contain security holes and slow the performance of your site, but the general WordPress codebase is, in my opinion, a very badly written piece of code.

However they are slowly learning, and I remember once (only a few versions back) that on the home page over 200 queries were being run, most of them returning single rows.

For example, if you used a plugin like Debug Queries you would see lots of SELECT statements on your homepage, each returning a single row for every post shown, as well as for the META data, categories and tags associated with each post.

So instead of one query that returned the whole data set for the page in one go (post data, category, tag and meta data), the page was filled with lots of single queries like this:

SELECT wp_posts.* FROM wp_posts WHERE ID IN (36800)

However they have improved their code, and a recent check of one of my sites showed that although they are still using separate queries for the post, category/tag and meta data, they are at least getting all of the records in one go e.g.

SELECT wp_posts.* FROM wp_posts WHERE ID IN (36800,36799,36798,36797,36796)

So the total number of queries has dropped, which aids performance. However in my opinion they could write one query for the whole page that returned all the data needed, and hopefully in a future version they will.
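As an illustration of the idea (this is not WordPress's actual internal query; the table and column names follow the standard WordPress schema), the post rows and their meta data could come back in a single round trip with a join:

```sql
-- illustration only: fetch the posts for a page and their meta data in one
-- round trip, using the standard wp_posts / wp_postmeta tables
SELECT p.*, m.meta_key, m.meta_value
FROM wp_posts p
LEFT JOIN wp_postmeta m ON m.post_id = p.ID
WHERE p.ID IN (36800, 36799, 36798, 36797, 36796);
```

The trade-off is that each post row is repeated once per meta row, so the application code has to group the results, which is presumably why WordPress keeps the queries separate.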

However one of the things that will kill a site like Wordpress is the number of BOTS that hit you all day long. These could be good BOTS like GoogleBOT and BingBOT, which crawl your site to work out where it should appear in their search results, or they could be social media BOTS that check every link Twitter shows, or scrapers trying to steal your data.

One thing you can try, to stop legitimate BOTS like Google and BING from hammering your site, is to set up a Webmaster Tools account in Google and then change the Crawl Rate to a much slower one.

You can also do the same with BING and their webmaster tools account. BING also apparently respects the ROBOTS.txt Crawl-Delay command e.g


Crawl-delay: 3


Which supposedly tells BOTS that respect Robots.txt commands to wait 3 seconds between each crawl. However as far as I know only BING supports this at the moment, and it would be nice if more SERP BOTS did in future.

If you want a basic C# Robots.txt parser that will tell you whether your agent can crawl a page on a site and extract any sitemap command, check out http://www.strictly-software.com/robotstxt. If you wanted to extend it to handle the Crawl-Delay command it wouldn't be hard to add (line 175 in Robot.cs), so that you could extract and respect it when crawling yourself.
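If you just want to eyeball a site's Crawl-Delay value without a full parser, standard shell tools will do. A quick sketch (the robots.txt content here is made up for the example):

```shell
# sample robots.txt - in practice you would fetch the real file from the site
cat > robots.txt <<'EOF'
User-agent: *
Crawl-delay: 3
Sitemap: http://www.example.com/sitemap.xml
EOF

# pull out the Crawl-delay value (in seconds), ignoring case and whitespace
delay=$(grep -i '^crawl-delay:' robots.txt | head -n 1 | cut -d: -f2 | tr -d '[:space:]')
echo "$delay"
```

This prints 3 for the sample file above; an empty result means the site sets no delay.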

Obviously you want all the SERP BOTS like GoogleBot and Bingbot to crawl you, but there are so many Social Media BOTS and spammers out there nowadays that they can literally hammer your site into the ground no matter how many caching plugins and .htaccess rules you put in to return 403 codes.

The best way to deal with traffic you don't want to hit your site is as high up the chain as possible. 

Just leaving Wordpress to deal with it means the overhead of PHP code running, include files being loaded, regular expressions testing for harmful parameters, and so on.

Moving it up to the .htaccess level is better, but it still means your webserver has to process all the .htaccess rules in your file to decide whether or not to let the traffic through.

Therefore if you can move the worst offenders up to your Firewall it will save any code below that level from running, as the TCP traffic is stopped before any regular expressions have to be run elsewhere.

Therefore what I tend to do is follow this process:


  • Use the Wordpress plugin "Limit Login Attempts" to log people trying to login (without permission) to my WordPress website. This will log all the IP addresses that have attempted and failed as well as those that have been blocked. This is a good starting list for your DENY HOSTS IP ban table.
  • Check the same IP's as well as using the command: tail -n 10000 access_log|cut -f 1 -d ' '|sort|uniq -c|sort -nr|more to see which IP addresses are visiting my site the most each day.
  • I then check the log files either in WebMin or in an SSH tool like PUTTY to see how many times they have been trying to visit my site. If I see lots of HEAD or POST/GET requests within a few seconds from the same IP I will investigate further with an nslookup and a whois, and see how many times the IP address has been visiting the site.
  • If they look suspicious e.g the same IP with multiple user-agents, or lots of requests within a short time period, I will consider banning them. Anyone who is using IE 6 as a user-agent is a good suspect (who uses IE 6 anymore apart from scrapers and hackers!).
  • I will then add them to my .htaccess file and return a [F] (403 status code) to all their requests.
  • If they keep hammering my site I will then move them from the DENY list in my .htaccess file and add them to my firewall and Deny Hosts table.
  • The aim is to move the most troublesome IP's and BOTS up the chain so they cause the least damage to your site. 
  • Using PHP to block access is not good as it consumes memory and CPU; the .htaccess file is better but still requires APACHE to run the regular expressions on every DENY or [F] command. Therefore the most troublesome users should be moved up to the Firewall level to cause the least server usage on your system.
  • Regularly shut down your APACHE server and use the MySQL REPAIR and OPTIMIZE options to de-frag your table indexes and ensure the tables are performing as well as possible. I have many articles on this site about other free tools which can help you increase your WordPress site's performance.
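To make the log-checking step above concrete, here is the same tail/cut/sort/uniq pipeline run against a tiny fabricated access_log (real combined-format lines are longer, but the IP address is always the first space-separated field):

```shell
# fabricated access_log entries - real log lines carry the full request details
cat > access_log <<'EOF'
54.163.57.132 - - [28/May/2015:17:23:05 +0100] "GET / HTTP/1.1" 403 252
54.163.57.132 - - [28/May/2015:17:23:07 +0100] "GET / HTTP/1.1" 403 252
74.6.254.121 - - [28/May/2015:17:24:05 +0100] "HEAD / HTTP/1.1" 200 -
EOF

# count requests per IP over the last 10000 lines, busiest first
tail -n 10000 access_log | cut -f 1 -d ' ' | sort | uniq -c | sort -nr
```

The busiest IPs appear at the top; anything making hundreds of requests an hour is worth an nslookup and a whois before you decide whether to ban it.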

In More Details

You should regularly check the access log files for the IP's hitting your site most often, check them out with a reverse DNS tool to see where they come from, and if they are of no benefit to you (e.g not a SERP or Social Media agent you want hitting your site) add them to your .htaccess file under the DENY commands e.g

order allow,deny
deny from 208.115.224.0/24
deny from 37.9.53.71
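The same .htaccess file can also refuse requests by user-agent rather than IP, which is handy for scrapers that hop between addresses. A sketch using mod_rewrite (the agent strings here are invented examples; substitute the BOTS you actually see in your logs):

```apache
# return a 403 to requests by user-agent as well as by IP
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (EvilScraper|NastySpamBot) [NC]
RewriteRule .* - [F,L]
```

The [F] flag returns the 403 Forbidden status and [NC] makes the match case-insensitive.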

Then if I find they are still hammering my site after a week or month of getting 403 responses and ignoring them, I add them to the firewall in WebMin.


Blocking Traffic at the Firewall level

If you use LINUX and have WebMin installed it is pretty easy to do.

Just go to the WebMin panel and under the "Networking" menu is an item called "Linux Firewall". Select that and a panel will open up with all the current IP addresses, ports and packets that are allowed or denied access to your server.

Choose the "Add Rule" command, or if you have an existing Deny rule set up it's quicker to just clone it and change the IP address. If you don't have any set up yet then just do the following.

In the window that opens up just follow these steps to block an IP address from accessing your server.

In the Chain and Action Details Panel at the top:


  • Add a Rule Comment such as "Block 71.32.122.222 Some Horrible BOT"
  • In the Action to take option select "Drop"
  • In the Reject with ICMP Type option select "Default"

In the Condition Details Panel:

  • In the source address or network option select "Equals" and then add the IP address you want to ban e.g 71.32.122.222
  • In the network protocol option select "Equals" and then "TCP"

Hit "Save"

The rule should now be saved and your firewall should now ban all TCP traffic from that IP address by dropping any packets it receives as soon as it gets them.
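For reference, the same rule can be added from the command line with iptables instead of the WebMin GUI (a sketch only; the IP address is the example one from above, and you need root to run these):

```shell
# drop all TCP packets from the offending IP; -I inserts the rule at the
# TOP of the INPUT chain so it runs before the general ACCEPT rules
iptables -I INPUT -s 71.32.122.222 -p tcp -j DROP

# list the INPUT chain with line numbers to confirm the DROP rule ordering
iptables -L INPUT -n --line-numbers
```

Because -I inserts at the top of the chain, this also avoids the rule-ordering problem described in the important note below.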

Watch as your performance improves and the number of 403 status codes in your access files drops - until the next horrible social media BOT comes on the scene and tries scraping all your data.

IMPORTANT NOTE

WebMin isn't very clear on this and I found out the hard way by noticing that IP addresses I had supposedly blocked were still appearing in my access log.

You need to make sure all your DENY RULES are above the default ALLOW rules in the table WebMin will show you.

Therefore your rules to block bad bots, and IP addresses that are hammering away at your server - which you can check in PUTTY with a command like this:
tail -n 10000 access_log|cut -f 1 -d ' '|sort|uniq -c|sort -nr|more 

should be put above all your other commands e.g:


Drop If protocol is TCP and source is 91.207.8.110
Drop If protocol is TCP and source is 95.122.101.52
Accept If input interface is not eth0
Accept If protocol is TCP and TCP flags ACK (of ACK) are set
Accept If state of connection is ESTABLISHED
Accept If state of connection is RELATED
Accept If protocol is ICMP and ICMP type is echo-reply
Accept If protocol is ICMP and ICMP type is destination-unreachable
Accept If protocol is ICMP and ICMP type is source-quench
Accept If protocol is ICMP and ICMP type is parameter-problem
Accept If protocol is ICMP and ICMP type is time-exceeded
Accept If protocol is TCP and destination port is auth
Accept If protocol is TCP and destination port is 443
Accept If protocol is TCP and destination ports are 25,587
Accept If protocol is ICMP and ICMP type is echo-request
Accept If protocol is TCP and destination port is 80
Accept If protocol is TCP and destination port is 22
Accept If protocol is TCP and destination ports are 143,220,993,21,20
Accept If protocol is TCP and destination port is 10000


If you have added loads at the bottom then you might need to copy the IPTables list out to a text editor, change the order by putting all the DENY rules at the top, then re-save the whole IPTables list to your server and re-apply the firewall rules.

Or you can use the arrows at the side of each rule to move it up or down in the table - which is a very laborious task if you have lots of rules.

So if you find yourself still being hammered by IP addresses you thought you had blocked, check the order of the commands in your firewall and make sure the DENY rules are at the top, NOT the bottom, of your list of IP addresses.

Tuesday, 30 October 2012

New version of the SEO Twitter Hunter Application

Introducing version 1.0.4 of the Twitter Hashtag Hunter Application

I have just released the latest version of the popular Windows application used by SEO experts and tweeters, in combination with my Strictly Tweetbot Wordpress plugin, to find new @accounts and #hashtags to follow and use.

Version 1.0.4 of the Twitter HashTag Hunter application has the following features:
  • A progress bar to keep you informed of the application's progress while scanning.
  • More detailed error reporting, including handling the fail whale e.g the 503 service unavailable error.
  • More HTTP status code errors including 400, 404, 403 and the 420 Twitter scan rate exceeded limit.
  • Clickable URL's that open the relevant Twitter account or Hash Tag search in your browser.
  • Multiple checks to find the accounts follower numbers to try and future proof the application in case Twitter change their code again.
  • A new settings tab that controls your HTTP request behaviour.
  • The ability to add proxy server details e.g IP address and Port number to scan with.
  • The ability to change your user-agent as well as a random user-agent switcher that picks between multiple agent strings for each HTTP request when a blank user-agent is provided.
  • An HTTP time-out setting to control how long to wait for a response from the API.
  • A setting to specify a wait period in-between scans to prevent rate exceeded errors.
  • A setting to specify a wait period when a "Twitter scan rate exceeded" error does occur.
  • Extra error messages to explain the result of the scan and any problems with the requests or settings.

The main new feature of 1.0.4 is the new settings panel that controls your scanning behaviour. It allows you to scan through a proxy server, specify a user-agent, set delay periods in-between scans, and handle the "Twitter Scan Rate exceeded limit" error which occurs if you scan too much.

Changing the Scanner Settings

For Search Engine Optimisation (SEO) experts, or just site owners wanting to find out who they should be following and which #hashtags they should be using in their tweets, this application is a cheap and useful tool that helps get your social media campaign off the ground by utilising Twitter's Search API.

You can download the application from the main website www.strictly-software.com.

Saturday, 25 June 2011

Loading Social Media Code Asynchronously

Preventing Slow Page Loads By Loading Widgets Asynchronously

I have noticed on a number of sites that use widgets such as the popular AddThis widget that for some reason many people cause problems for themselves by adding the <SCRIPT> tag that loads the widget in every place that they want the widget to display.

On a news blog with 10 articles this means the same <SCRIPT> could be referenced 10 times. Now I know browsers are clever enough to know what they have already loaded and to utilise caching, but as every user of Google Adsense knows, having to embed <SCRIPT> tags in the DOM at the place where you want the advert to display (instead of just referencing the script once at the bottom of the HTML or loading it in with Javascript) can cause slow page loads, as the browser will hang when it comes across a script until the content has been loaded.

I have personally spent ages trying to hack Google AdSense's code about to utilise the same asynchronous loading that they now use for their Analytics code, but to no avail. Their code loads in multiple iframes and any hacking seems to trigger a flag at their end that probably signifies some kind of fraudulent abuse.

However for other kinds of widgets, including the AddThis widget, there is no need to reference the script multiple times, and I am busy updating some of my sites to utilise another method which can be seen on the following test page >> http://www.strictly-software.com/AddThis_test.htm


Loading addthis.com social media widgets asynchronously

I wanted to keep the example as simple as possible, so in that regard if you use IE it's best to view it in IE 9, as the only cross browser code I have added is a very basic addEvent function and an override for document.getElementsByClassName, which doesn't exist pre IE 9.

Other browsers should work without a problem i.e Chrome, FireFox, Safari, Opera and any other standards compliant browser that supports the DOM 2 Event Model.
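The "very basic addEvent function" mentioned above is not shown on this page; a minimal sketch of that kind of wrapper (for illustration only, the version on the test page may differ) looks like this:

```javascript
// minimal cross-browser event attachment helper: standards browsers (and IE 9+)
// use addEventListener, while older IE falls back to attachEvent with an "on" prefix
function addEvent(el, type, fn) {
	if (el.addEventListener) {
		el.addEventListener(type, fn, false);
	} else if (el.attachEvent) {
		el.attachEvent('on' + type, function (e) {
			// normalise the event object and the "this" value for old IE
			fn.call(el, e || window.event);
		});
	}
}
```

Calling addEvent(element, "click", handler) then behaves the same way in old IE and in standards compliant browsers.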


Specifying where the Social Media widgets will appear

HTML 5 makes use of custom attributes that validate correctly as long as they are prefixed with data-; I have therefore utilised this much needed feature to specify the URL and the Title of the item that is to be bookmarked on the desired Social Media site.

You might have a page with multiple items, blog articles or stories, each with their own Social Media widget, and instead of defaulting to the URL and Title of the current document it is best to specify the details of the article the AddThis widget refers to.

The HTML code for outputting a widget is below:


<div class="addthis_wrapper" data-url="http://www.strictly-software.com/twitter-translator" data-title="Twitter Translator Tool"></div>


Notice how the URL and Title are referenced by the

data-url="http://www.strictly-software.com/twitter-translator" data-title="Twitter Translator Tool"

attributes. You can read more about using custom HTML5 attributes as they are becoming more and more commonly used.

Changing the placeholders into Social Media widgets

Once the placeholder HTML elements are inserted into your DOM where you want the AddThis widget to appear, instead of doing what many Wordpress plugins and coders do and adding a reference to the hosted script next to each DIV, you just need to add the following code in the footer of your file.

You can either wrap the code in an on DOM load event or an on Window load event, or, as I have done, wrap it in a self calling function which means it will run as soon as the browser gets to it.

You can view the code in more detail on my test page, but to keep things simple I have just done enough cross browser tweaks to make it run in most browsers including older IE. There might be some issues with the actual AddThis code that is loaded in from their own site, but I cannot do anything about their dodgy code!

The Javascript code to change the DIV's into Social Media Widgets

(function(){
// this wont be supported in older browsers such as IE 8 or less
var els=document.getElementsByClassName('addthis_wrapper');

if(els && els.length >0){

// create a script tag and insert it into the DOM in the HEAD
var at=document.createElement('script');at.type='text/javascript';

// make sure it loads asynchronously so it doesn't block the DOM loading
at.async=true;
at.src=('https:'==document.location.protocol?'https://':'http://')+'s7.addthis.com/js/250/addthis_widget.js?pub=xa-4a42081245d3f3f5';

// find the first <SCRIPT> element and add it before
var s=document.getElementsByTagName('script')[0];s.parentNode.insertBefore(at,s);

// loop through all elements with the class=addthis_container
for(var x=0;x<els.length;x++){

// store pointer
var el = els[x];

// get our custom attribute values for the URL to bookmark and the title that describes it defaulting to the placeholders that
// will take the values from the page otherwise. By using data-title and data-url we are HTML 5 compliant
var title = els[x].getAttribute("data-title") || "[TITLE]";
var url = els[x].getAttribute("data-url") || "[URL]";

// create an A tag
var a=document.createElement('A');
a.setAttribute('href','http://www.addthis.com/bookmark.php');

// create an IMG tag
var i=document.createElement('IMG');
i.setAttribute('src','http://s7.addthis.com/static/btn/lg-share-en.gif');

// set up your desired image sizes
i.setAttribute('width','125');
i.setAttribute('height','16');
i.setAttribute('alt','Bookmark and Share');
i.style.cssText = 'border:0px;';

// append the image to the A tag
a.appendChild(i);

// append the A tag (and image) to the DIV with the class=addthis_wrapper
el.appendChild(a);

// using the DOM 2 event model to add events to our element - remember if you want to support IE before version 9 you will need a wrapper addEvent
// function that uses attachEvent for old IE and addEventListener for IE 9, Firefox, Opera, Webkit and any other proper browser.
// We wrap the handlers in a closure so each anchor keeps its own url and title -
// otherwise every handler would share (and later lose) the last loop iteration's values
(function(a,url,title){
addEvent(a,"mouseover",function(e){if(!addthis_open(a, '', url, title)){StopEvent(e,a)}});
addEvent(a,"mouseout",function(){addthis_close()});
addEvent(a,"click",function(e){if(!addthis_sendto()){StopEvent(e,a)}});
})(a,url,title);

// cleanup
el=a=i=title=url=null;
}
}
})();


The code is pretty simple and makes use of modern browsers' support for document.getElementsByClassName to find all elements with the class we identified our social media containers with. This can obviously be replaced with a selector engine such as Sizzle if required.

First off, the code builds a SCRIPT element and inserts it into the HEAD section of the page. The important thing to note here is that as this code is at the bottom of the page nothing should block the page from loading, and even if the SCRIPT block were high up in the DOM the widget script is loaded asynchronously, so it would not block rendering anyway.
// create a script tag and insert it into the DOM in the HEAD
var at=document.createElement('script');at.type='text/javascript';

// make sure it loads asynchronously so it doesn't block the DOM loading
at.async=true;
at.src=('https:'==document.location.protocol?'https://':'http://')+'s7.addthis.com/js/250/addthis_widget.js?pub=xa-4a42081245d3f3f5';

// find the first <SCRIPT> element and add it before
var s=document.getElementsByTagName('script')[0];s.parentNode.insertBefore(at,s);




The code then loops through each node that matches, creating an A (anchor) tag and an IMG (image) tag with the correct dimensions and attributes for the title and URL. If no values are supplied the code defaults to document.location.href and document.title, which might be fine if it is the only widget on the page, but otherwise values should be specified.


Events are then added to the A (anchor) tag to fire the popup of the AddThis DIV and to close it again. I have used a basic addEvent wrapper function to do this, along with a StopEvent function to prevent event propagation; these are just basic cross browser functions to handle old cruddy browsers that no-one in their right mind should be using any more. As this is just an example I am not too bothered if the code fails in IE 4 or Netscape, as it is just an example of replacing what is often plugin generated code.
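The addEvent and StopEvent helpers are not listed in the article; minimal hypothetical versions (not the author's exact code) might look like this:

```javascript
// Hypothetical minimal cross browser event helpers: addEventListener for
// standards browsers, attachEvent for IE 8 and below, with a property
// assignment as a last-ditch fallback.
function addEvent(el, type, fn) {
  if (el.addEventListener) {
    el.addEventListener(type, fn, false);
  } else if (el.attachEvent) {
    // old IE prefixes the type with "on" and loses "this", so wrap the handler
    el.attachEvent('on' + type, function () { fn.call(el, window.event); });
  } else {
    el['on' + type] = fn;
  }
}

// stop the event from doing its default action and from bubbling;
// the second argument matches the StopEvent(e, a) call signature but
// is not needed in this sketch
function StopEvent(e, el) {
  e = e || window.event;
  if (e.preventDefault) {
    e.preventDefault();
    e.stopPropagation();
  } else {
    e.returnValue = false;  // old IE equivalents
    e.cancelBubble = true;
  }
}
```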


You can see an example of the code here >>

http://www.strictly-software.com/addthis_test.htm

This methodology is being used more and more by developers, but there are still many plugins available for Wordpress and Joomla that insert remote loading SCRIPTs all throughout the DOM, as well as using document.write to insert remote scripts. These methods should be avoided if at all possible, especially if you find that your pages hang when loading, and as you can see from the example code it is pretty simple to convert to your favourite framework if required.
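The non-blocking loader pattern used throughout this article can be boiled down to a small helper; `loadScriptAsync` is my own name for the pattern, not an AddThis API:

```javascript
// The pattern to avoid: document.write of a remote script tag blocks the
// HTML parser while the third-party server responds, so a slow server
// hangs the entire page load.

// The non-blocking alternative: create a SCRIPT element, flag it async so
// it downloads without blocking DOM parsing, and insert it before the
// first existing script tag on the page.
function loadScriptAsync(src) {
  var s = document.createElement('script');
  s.type = 'text/javascript';
  s.async = true; // do not block the parser while the file downloads
  s.src = src;
  var first = document.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(s, first);
  return s;
}
```

Called with the protocol-relative AddThis URL built in the example, this reproduces the loader at the top of the widget code.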