Sunday, 30 October 2022

What do you do when you have multiple blogs under one Google Blogger Account?

Changing The Names Of Replies And Posts

By Strictly-Software

If you are a blogger user and you have multiple blogs each with different names then you might run into the problem that when you post, or reply to a message using your Google account it will put your main Blogger account name in as the author, or comment author.

Now the way I have got round this is to use JavaScript in the footer of the blog. Add a JavaScript / HTML gadget to the footer of your blog and then you can fix this issue.

I will show you two ways you can fix this using old JavaScript and newer JavaScript .

To fix the name of post authors you can use this code. Say your blogger author name is Dave Williams but you are using pseudo names for your blog, say "Strictly-Software" for example then you don't want Dave Williams posted at the end of each post, you want Strictly-Software to appear.

Wrap these in SCRIPT blocks, whether a short <script>....</script> block or a full old style <script type="text/javascript"><!-- ..... //--></script> tags. Up to you.

This looks for all the spans with a class of "fn" which appears under the post where it says "Posted by ..... at 9:58pm" etc. So we loop through all the fn classes and change the innerHTML to the value we want.
 
(function(){
var x = document.getElementsByClassName("fn");
var i;
for (i = 0; i < x.length; i++) {
    x[i].innerHTML = "Strictly-Software";
}
})()

Now what about comments? 

If you are replying to someone and you don't want to use the Name/URL option to put your blogs name in, then you can use the Google option so it knows you own the comment, but then you can use some code to identify the links which contain the Google Authors name and if found update them to the name you want.

Same as before with the SCRIPT tags, although I put both together one after the other in one script block at the bottom of my layout.

Now, this code uses a more modern document.querySelectorAll method to find all elements that match the selector we pass in which will be an anchor tag. This is because all comments have the name of the person commenting wrapped in a comment tag IF you use a Google Account to post it. 

Obviously, Anonymous and Name/URL comments can't as there is nowhere to link to.

(function(){
	var x = document.querySelectorAll("a");
	var i;
	for (i = 0; i < x.length; i++) {		
		if(x[i].innerHTML=="Dave Williams"){
			x[i].innerHTML="Strictly-Software"
		}		
	}
})();

As you can see I am looping through all the elements that the querySelectorAll("a"); returns. 

If we had used just querySelector it would have only found the first A tag on the page and we need to find and replace all comments on the page which have Dave Williams as the current contents and if found, we replace it with Strictly-Software by using the innerHTML of the element.

I just thought I would post this as I had to do this tonight on one of my other blogs as I noticed the comment author was wrong. You could use the 2nd method for getting the author name as well if you wanted but I thought I would show you two ways of doing this.

Hope this helps if you have the same issue.

By Strictly-Software

Thursday, 11 August 2022

Testing For Internet Connectivity

Setup Measures For A Big API System

By Strictly-Software

I have a BOT that runs nightly, and in the SetUp() methods to ensure everything is okay, it runs a number of tests before logging to an API table in my database. This is so I know if the API I am using is available, whether I needed to login to it, when the last job was started and stopped, and when the last time I did log into the API. 

It just helps in the backend to see if there are any issues with the API without debugging it. Therefore the SetUp job has to be quite comprehensive.

The things I need to find out, which are logged into the logfile.log I keep building until the job is finished with major Method Input Params, Return Params, Handled Errors, Changed System Behaviour, SQL Statements that return an error e.g timeout, an unexpected divide by zero error and the like are the following from my SetUp job.

1. Whether I can access the Internet, this is done with a test to a page that just returns an IPv4 or IPv6 address. The major HTTP Status code I look for there is a 200 status code. If I get nowhere but get a status code of 1 that is a WebExceptionStatus.NameResolutionFailure, which means the DNS could not find the location and it's probably due to either your WIFI button being turned off (Flight Mode), or your Network Adapter having issues.

I test for web access with an obvious simple HTTP test to an IP page that returns my address, with 2 simple regular expressions that can ID if it's IPv4 or IPv6.


public string TestIPAddress(bool ipv4=true)
{
    string IPAddress = "";
    // if we are using an IPv6 address the normal page will return it if we want to force
    // get an IPV4 address then we go to the IPv4 page which is the default behaviour due to
    // VPNS and Proxies using IPV4 usually
    string remotePageURL = ((ipv4) ? "https://ipv4.icanhazip.com/" : "https://icanhazip.com/");
   
    this.UserAgent = this.BrainiacUserAgent; // use our BOT UserAgent as no need to spoof
    this.Referer = "LOCALHOST"; // use our localhost as we are running from our PC
    this.StoreResponse = true; // store the reponse from the page
    this.Timeout = (30000); // ms 30 seconds            

    // a simple BOT that just does a GET request I am sure you can find a C# BOT example on my blog or tons on the web
    this.MakeHTTPRequest(remotePageURL);            

    // if status code is between 200 & 300 we should have data
    if (this.StatusCode >= 200 && this.StatusCode < 300)
    {
		IPAddress = this.ResponseContent; // get the IP Address

		// quick test for IPAddress Match, IPv4 have 3 dots
		if(IPAddress.CountChars('.')==3)
		{
		   // An IpV4 address can replace dots and if all numbers then it is IPv4
		   string ip2 = IPAddress.Replace(".", "");
		   if(Regex.IsMatch(ip2, @"^\d+$"))
		   {
				this.HelperLib.LogMsg("IP Address is IPv4: " + IPAddress + ";","MEDIUM"); // log to log file
		   }
		}
		// otherwise if it contains : semi colons (and nos and certain HEX letters)
		else if (IPAddress.Contains(':'))
		{
		   // could be an IpV6 address check only numbers and letters when colons are removed                    
		   if (Regex.IsMatch(IPAddress, @"^[:0-9A-F]+$"))
		   {
				this.HelperLib.LogMsg("IP Address is IPv6: " + IPAddress + ";","MEDIUM"); // log to log file
		   }
		}
    }
    else
    {
		// no response, flight mode button enabled?
		this.HelperLib.LogMsg("We could not access the Internet URL: " + remotePageURL + "; Maybe Flight Mode is on or Adapter is broken and needs a refresh", "LOW");

		IPAddress = "No IP Address returned; HTTP Errror: " + this.StatusCode.ToString() + "; " + this.StatusDesc + ";";

		// call my IsNetworkOn function to test we have a network
		if(this.IsNetworkOn())
		{
		   this.LastErrorMessage = "The Internet Network Is On but we cannot access the Internet";
		}
		else
		{
		   this.LastErrorMessage = "The Internet Network Is Off and we cannot access the Internet";
		}

		this.HelperLib.LogMsg(this.LastErrorMessage,"LOW"); // log error

		// throw the error will be caught in a lower method and stop the script
		throw new System.Exception(this.LastErrorMessage);
    }
}


If that fails then I run this method to see if the Network is available. You can see in the code above that I call it if there is no 200-300 HTTP status response or IP address returned. 

I test it by running the console script with and without the flight mode button on my laptop.

public bool IsNetworkOn()
{
bool IsNetworkOn = System.Net.NetworkInformation;

NetworkInterface.GetIsNetworkAvailable();

return IsNetworkOn;
}
I also have a series of Proxy addresses I can use to get round blocks on IP addresses although I do try and use Karmic Scraping e.g no hammering of the site, caching pages I need to prevent re-getting them, gaps between retries if there is a temporary error where I change the referer, user-agent and proxy (if required), before trying again after so many seconds.

Also before I turn on a global setting in my HelperLib class which is the main class all other objects refer to and pass from one to another, I test to see if the computer I am on is using a global proxy or not. I do an HTTP request to a page that returns ISP info, and I have a list of ISP's which I can check

If my ISP comes back with the right ISP for my home, I know I am not on a global proxy, also it tells me whether the IP address from earlier is IPv4 or IPv6, then I know I am not using a global proxy. If Sky or Virgin is returned I know I have my VPN enabled on my laptop and that I am using a global proxy.

If I am using a global proxy all the tests and checks for "getting a new random proxy:port" when doing an HTTP request are canceled as I don't need to go through a VPN and then a proxy. If the Global Proxy is turned off I might choose to get a random proxy before each HTTP call. 

If that setting is enabled in my Settings class. As you can't have config page in console scripts that hook into a DLL, my code in the DLL for my settings is just a class called Settings with a big Switch Statement where I get the setting I need or set it.

Once I have checked my Internet access I then do an HTTP call to the Betfair API Operational page which tells me if the API is working or not. If it isn't then I cannot log in into it. If it is working I do a Login test to ensure I can login this involves:
  • Using 1 of 2 JSON methods to login to the 2 endpoints that give me API access with a hashed API code, and my username & password.
  • If I can log in, I update the API table in the DB with the time I logged in and that I am logged in. There may be times I don't need to login to the API such as when I am just scraping all the Races and Runner info.
  • I then do a Session re-get, I hold my session in a file so I don't have to re-get a new one each time but on a run I like to get a new Session and extract the session code from the JSON return to pass back in JSON calls to get data etc.
  • I then do a database test using my default connection string I just SELECT TOP(1) FROM SYS.OBJECTS and ensure I get a response.

Once I know the API is operational, the DB connectivity is working and that I can scrape pages outside the Betfair API I can set all the properties such as whether to use a random proxy/referer/user-agent if one of my calls errors (with a 5 second wait), if I am using a global proxy I don't bother with the proxies.

Believe it or not that is a lot of code across 5 classes just to ensure everything is okay on running the job from the console or Windows Service

I throw my own errors if I cannot HTTP scrape, not just log them, so I can see what needs fixing quickly, and I create a little report with the Time, BOT version, Whether it is live, testing, placing real bets, running from a service or console and other important info I might need to read quickly.

So this is just a quick article on how I go about ensuring a project using a big DLL (with all the HTTP, HelperLib, API etc) code is working when I run it.

Who knows you might want to do similar checks on your large automated systems. 

Who knows. I am just making bank from my AutoBOT which TRADES on Betfair for me!

© 2022 Strictly-Software

Wednesday, 3 August 2022

Extending Try/Catch To Handle Custom Exceptions - Part 2

Extending Core C# Object Try/Catch To Display Custom Regular Expression Exceptions - Part 2

By Strictly-Software

This is the second part of our custom Try/Catch overwrite program that is aimed to catch Regular Expression errors that would normally be thrown up by !IsMatch (doesn't match) and so on.
 
So why would you want to do this when the C# .NET RegEx functions already return a true or false or 0 matches etc which you could use for error detection?
 
Well, you can read the first article above this one for info however you may have a case where you have a BOT that is automatically running every day all day on certain sites where the HTML has been worked out and expression especially made to match them. 

If for example the HTML source changes and the piece of data you are collecting suddenly does not get returned anymore as your expression does not match anything instead of just logging to a file which means scanning and checking for errors once you notice your fully automated 24/7/365 system has stopped working 2 months later. 

Therefore instead of just logging the results of IsMatch you may want to throw up an error and stop the program so that a new expression can be made to match the HTML ASAP without lots of data being missed.
 
If you want more info on the Try/Catch class, read the first example article. Here we build just a basic BOT that is going to utilise the RegExException class we made last time. It is not a perfect BOT, you may get some warnings come up, I did it in VS 2021, and it's a specific BOT as its aims are to get the META Title, Description, and the full source of the URL that is passed into it. 

Any errors with the regular expressions not matching anymore are thrown up using our RegExException TryCatch cobject so that they detail the expression, text, or whichever of the 3 methods we made. If you want to use it as the basis of your own BOT you can but has been specifically designed to demonstrate this RegEx TryCatch example so you may want to edit or remove a lot of content first before building back on top of it if you want to make your own BOT in C#.
 
So here is the BOT

public class HTML
{
	public string URL  // property
	{ get; set; }

	public string HTMLSource  
	{ get; set; }

	public string HTMLTitle
	{ get; set; }

	public string HTMLDesc
	{ get; set; }

	public void GetHTML()
	{
	    string HTMLSource = "";
	    string HTMLTitle = "";
	    string HTMLDesc = "";

	    // if no URL then all properties are blank
	    if (String.IsNullOrWhiteSpace(this.URL))
	    {
			HTMLTitle=HTMLDesc = HTMLSource = "";
			return;
	    }

	    try
	    {
			HttpWebRequest request = (HttpWebRequest)WebRequest.Create(this.URL);
			HttpWebResponse response = (HttpWebResponse)request.GetResponse();

			if(response.StatusCode == HttpStatusCode.Forbidden)
			{
		    	throw new Exception(String.Format("We are not allowed to visit this URL {0} 403 - Forbidden", this.URL));
			}
			else if(response.StatusCode == HttpStatusCode.NotFound)
			{
		    	throw new Exception(String.Format("This URL cannot be found {0} 404 - Not Found", this.URL));
			}
            // 200 = OK, we have a response to analyse
            else if (response.StatusCode == HttpStatusCode.OK)
            {
                Stream receiveStream = response.GetResponseStream();
                StreamReader readStream = null;
                if (response.CharacterSet == null)
                {
                    readStream = new StreamReader(receiveStream);
                }
                else
                {
                    readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));
                }

                string source = this.HTMLSource = readStream.ReadToEnd();

                if (!String.IsNullOrWhiteSpace(source))
                {
                    // extract title with a simple regex                        
                    string reTitle = @"^[\S\s]+?<title>([\S\s]+?)</title>[\S\s]+$";
                    // Get the match for META TITLE if we can using a wrapper method that takes a regex string, source string and group to return
                    string retval = this.GetMatch(reTitle, source, 1);
                    if (!String.IsNullOrWhiteSpace(retval))
                    {
                        this.HTMLTitle = retval;
                    }
                    else // failed to find the <title>Page Title</title> in the HTML source so throw a Regex Exception
                    {
                        throw new RegExException(reTitle, new Exception(String.Format("No match could be found for META Title with {0}", reTitle)));
                    }

                    // META DESCRIPTION
                    string reDesc = @"^[\s\S]+?<meta content=[""']([^>]+?)['""]\s*?name='description'\s*?\/>[\s\S]*$";
                    // Get the match for META DESC if we can using a wrapper method that takes a regex string, source string and group to return
                    retval = this.GetMatch(reDesc, source, 1);
                    if(!String.IsNullOrWhiteSpace(retval))
                    {
                        this.HTMLDesc = retval;
                    }
                    else // failed to find the <title>Page Title</title> in the HTML source so throw a Regex Exception
                    {
                      throw new RegExException(reDesc, new Exception(String.Format("No match could be found for META Description with {0}", reDesc)));
                    }
                }

                response.Close();
                readStream.Close();
            }
	    }
	    catch(WebException ex)
	    {
          	// handle 2 possible errors
          	if(ex.Status==WebExceptionStatus.ProtocolError)
          	{
              Console.WriteLine("Could not access {0} due to a protocol error", this.URL);
          	}
          	else if(ex.Status==WebExceptionStatus.NameResolutionFailure)
          	{
              Console.WriteLine("Could not access {0} due to a bad domain error", this.URL);
          	}
	    } // catch anything we haven't handled and throw so we can code in a way to handle it
	    catch (Exception ex)
	    {
			throw;
	    }
	}


	private string GetMatch(string re,string source,int group)
	{
	    string ret = "";

	    // sanity test
	    if (String.IsNullOrWhiteSpace(re) || String.IsNullOrWhiteSpace(source) || group == 0)
	    {
			return "";
	    }
	    // use common flags these could be passed in to method if need be
	    Regex regex = new Regex(re, RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Multiline);

	    if (regex.IsMatch(source))
	    {
          	MatchCollection matches = regex.Matches(source);

          	Console.WriteLine(String.Format("We have {0} matches", matches.Count.ToString()));

		foreach (Match r in matches)
		{
		    if (r.Groups[group].Success)
		    {
				ret = r.Groups[group].Value.ToString().Trim();
				break;
		    }
		}                
	    }
	    // return matched value
	    return ret;
	}
}
 
So as you can see this HTML Class is my basic BOT. It has 4 properties, URL (which is just to set the URL of the code to get) then: HTMLSource, HTMLTitle, and HTMLDesc which as you can see are all abbreviations of the 3 main bits of the content we want from our BOT, the META Title, META Description, and the whole HTML source. We set these using the code in our HTML classes GetHTML() which does the main work but uses another method GetMatch() to try and get the match of the parameters passed in e.g the expression, the source code to get any match from, and the no of the match to get.
 
For example, there may be bad XHTML/HTML all over the web where people have used ID"s multiple times or put elements that should only exist once in multiple times. With this last parameter, you can ensure you are getting the match you want, usually 1 for the first.
 
Hopefully, you can see what the class is doing and how the results of the IsMatch method can fire off a custom RegExException or not and how it is formatted.
 
In the next article, we will show the content that uses this class to return data and put it all together
 
 
© 2022 By Strictly-Software 

Thursday, 26 May 2022

Extending Try/Catch To Handle Custom Exceptions - Part 1

Extending Core C# Object Try/Catch To Display Custom Regular Expression Exceptions - Part 1

By Strictly-Software

Have you ever wanted to throw custom exception errors whether for business logic such as if a customer isn't found in your database, or whether more complex logic has failed that you don't want to handle but raise so that you know about it.
 
For me, I required this solution due to a BOT I use that collects information from various sites every day and uses regular expressions to find the pieces of data I require and extract them. The problem is that the site often changes its HTML source code and it will break a regular expression I have specifically crafted to extract the data using it.
 
For example, I could have a very simple regular expression that is getting the ordinal listing of a name using the following C# regular expression:
 
string regex = @"<span class=""ordinal_1"">(\d+?)</span>";
Then one day I will run the BOT and it won't return any ordinals for my values and when I look inside the log file I find it will have my own custom message due to finding no match such as "Unable to extract value from HTML source" and when I go and check the source it's because they have changed the HTML to something like this:
<span class="ordinal_1__swrf" class="o1 klmn">"1"<sup>st</sup></span> 
 
This is obviously gibberish many designers add into HTML to stop BOTS crawling their data hoping that the BOT developer has written a very specific regex that will break and return no data when met with such guff HTML. 
 
Obviously, the first expression whilst meeting the original HTML alright is too tight to handle extra guff within the source. 
 
Therefore it required a change in the expression to something a bit more flexible in case they added even more guff into the HTML:
 
string regex = @"<span class=""ordinal_+?[\s\S]+?"">""?(\d+?)""?<sup";
 
As you can see I have made the expression looser, with extra question marks ? in case quotes are or are not wrapped around values, and using non-greedy match any character and non-character expressions like [\s\S]+? to handle the gibberish from the point it appears to where I know it has to end at the closing quote or bracket.
 
So instead of just logging the fact I have missing data from crawls I wanted to raise the errors with TRY/CATCH and make the fact that a specific piece of HTML no longer matches my expression an exception that will get raised so I can see it as soon as it happens. 
 
Well with C# you can extend the base object TRY/CATCH so that your own exceptions based upon your own logic can be used. In future articles, we will build up to a full C# Project with a couple of classes, a simple BOT and some regular expressions we can use to test what happens when trying to extract common values from the HTML source on various URL's to throw exceptions.
 

Creating The TRY CATCH CLASS

First off the TRY / CATCH C# Class where I am extending the base object to call, usual messages but using String,Format so that I can pass in specific messages. I have put numbers at the start of each method so that when running the code later we can see which exceptions get called.
 
I have just created a new solution in Visual Studio 2022 called TRYCATCH and then named my RegExException that extends the Exception object. You can call your solution what you want but for a test, I don't see why just following what I am doing is not okay.

[Serializable]
public class RegExException : Exception
{
public RegExException() { }

public RegExException(string regex, string msg)
   : base(String.Format("1 Regular Expression Exception: Regular Expression: {0}; {1}", regex, msg )) { }

public RegExException(string msg)
    : base(String.Format("2 {0}", msg)) { }

public RegExException(string regex, Exception ex)
    : base(String.Format("3 Regular Exception is no longer working: {0} {1}", regex,ex.Message.ToString())) { }

}
 
As you can see the Class has 3 overloaded methods which either just take a single message, a regular expression string and a message, and a regular expression and an exception these values are placed in specific places within the exception messages.
 
You would cause a Regular Expression Exception to be thrown by placing something like this in your code:
 
throw new RegExException(String.Format("No match could be found with Regex {0} on supplied value '{1}'", re, val));
 
Hopefully, you can see what I am getting at, and we will build on this in a future post.
 
© 2022 By Strictly-Software 
 
 

Wednesday, 16 February 2022

Blogger EU Cookie Message Missing Problem & Solution

My EU Cookie Message Disappeared From My Site - How To Get It Back

By Strictly-Software

I had a bit of a weird experience recently, I found out, only from Google informing me, that for some reason one of my Blogger sites was not showing the EU Cookie Notice that should appear on all Blogger sites if in a European country where "Consent to use Cookies", is required by all website users.

It used to show, and my other blogger sites were still working and in fact, on my own PC, it was still showing the correct message e.g:

"This site uses cookies from Google to deliver its services and to analyse traffic. Your IP address and user agent are shared with Google, together with performance and security metrics, to ensure quality of service, generate usage statistics and to detect and address abuse."

However, when I viewed the site Google had told me about on another PC this was not appearing for some reason. 

I cleared all the cookies, path, domain, and session using the Web Extension called "Web Developer Toolbar", such as this Chrome version. After installing it, a grey cog appears in your toolbar if you fix it to. 

It is really helpful for turning password fields into text, if you want to see what you are typing, or need to see a Browser stored password in the field, or as needed in this instance, for deleting all kinds of Cookies. So after deleting all the Cookies, I refreshed the page and but the EU Cookie Message still didn't show.

Fixing Blogger Cookie Notice Not Showing

If you view the source of a blogger site that is showing the EU Cookie message, then you should find the following code in your source, not generated source, but the standard "View Source" options when you right-click on the page and view the context menu.

The following code should be just above the footer where all your widget scripts are loaded. Notice that I put some HTML comments above my version of the code so I could easily identify it from Bloggers version in the DOM when viewing the source. Why do this if Bloggers code is not in the source anyway? Wait and see.

<!-- this is my code as bloggers only appear in the source sometimes -->
<script defer='' src='/js/cookienotice.js'></script>
<script>
 document.addEventListener('DOMContentLoaded', function(event) {
      window.cookieChoices && cookieChoices.showCookieConsentBar && cookieChoices.showCookieConsentBar(
          (window.cookieOptions && cookieOptions.msg) || 'This site uses cookies from Google to deliver its services and to analyse traffic. Your IP address and user agent are shared with Google, together with performance and security metrics, to ensure quality of service, generate usage statistics and to detect and address abuse.',
          (window.cookieOptions && cookieOptions.close) || 'Ok',
          (window.cookieOptions && cookieOptions.learn) || 'Learn more',
          (window.cookieOptions && cookieOptions.link) || 'https://www.blogger.com/go/blogspot-cookies');
});
</script> 

So to fix the issue, copy the code out of the page and then go to your layout tool in Blogger. Add a widget at the bottom if no JavaScript/Text widget already exists and copy the code into it.

Now the odd thing is that as soon as I saved the widget and then my blogger site. I went to the website having issues and viewed the source code. When I did I saw that not only was my version of the code was in the HTML, but somehow this had put Bloggers own version back into the HTML as well!

Why this would do that I have no idea. However, it meant that I now had two lots of the same script being loaded and 2 lots of the EU Cookie code that shows the DIV and the options for wording appearing in my HTML.

The good thing though, is that this did not cause a problem for my site. I found by adding that code into a widget at the bottom of my page above Bloggers magically re-appearing code, that it did NOT cause the message to appear twice, but also that when I removed my own version of the code, the Blogger version remained.

Also even though I have put in a version of the Blogger code that uses English sentences into the HTML when I use a Proxy or VPN to visit a European country such as Germany, the wording appears in German.

I suspect that my code runs first as it's first in the DOM, then the Google code runs, overwriting my DIV with their DIV and of course the correct wording for the country we are in.

So as I thought everything was working, I removed my own code and saved the site. I then went to it, deleted path, domain, and session cookies, and then refreshed the page again and saw the blogger cookie code running okay. When viewing the source I could see that my code had gone but the blogger code was now still in the HTML, whereas it wasn't before.

However......

After a few hours when I came back to the computer which had not been showing the message and I re-checked by clearing all cookies (path, domain and session), and saw that Bloggers code had disappeared again, and the message was now not showing again!

Why this has happened I do not know as I had not re-saved the Blogger site in question during the time away so I have no idea what caused the Blogger EU Code to disappear again

There really should be an option in settings to force the Cookie compliance code to be inserted but as there isn't the answer seems to be to just leave your version of the cookie code in the HTML source in a widget at the bottom of the layout.

Why this works without causing issues I have no idea and it sounds like a bodge which it is, but as I cannot find any real answers to this problem online, or in Googles KB, I had to come up with a solution that worked to comply with Google's request and this seems to do it.

So the fact that when you have two lots of the same code in your HTML does NOT cause the message to appear twice is a good thing. This means that even if the original code re-appears then you are okay, and if it doesn't then your own code, which is a direct copy of the blogger code, runs instead. 

Also as your code runs first, if it is causing Bloggers code to also re-appear in the HTML then that will run afterwards ensuring the correct European language is shown in the message.

You can view the JavaScript which is loaded in by Blogger by just appending /js/cookienotice.js to any blogger site e.g this one, http://blog.strictly-software.com/js/cookienotice.js. You can then see the functions and HTML they use to show the DIV. You can also see at the top the ID's and Classes they put on the Cookie Message DIV.

So if you want to check which version of the EU Cookie code is running when both sets of JavaScript exist, you could add a bit of code underneath that checks for the Cookie DIV on display, and add some CSS to target cookieChoicesInfo, which is the ID of the DIV that is shown and you could change the background colour of the DIV to see if it is your DIV or Bloggers DIV that appears.

For example, you could put this under your JavaScript code to change the background colour of the DIV with the following code.

<style>
#cookieChoiceInfo{
	background-color: green !important;
}
</style>

Obviously green is a horrible colour for a background, but it easily stands out. When I did this I saw a Green DIV appear with the message in the correct language displayed, despite my EU options having English as the language for all the wording. 

This is because our code to load the script, and the cookie options into the page runs first, before any Blogger code that appears lower down in the HTML / DOM. When that Blogger code does run, it overwrites the DIV and the wording in the correct European language.

If you right-click on the DIV and choose "inspect" then the Developer Console will appear and you will be able to see that your style to change the background colour is being used on the Cookie message DIV. 

As it's a CSS style block with !important after the style, when the Blogger code overwrites the DIV and wording, the style for the background colour of the DIV is still being determined by our CSS Style block.

So the answer if your EU Cookie Compliance Message disappears is to add your own copy of their code into the site through a widget. 

This shouldn't cause any problems due to any duplicate DIV overwriting your DIV and if it disappears again then at least your version remains.

I just don't understand two things.

1. Why did the Cookie code disappear in the first place?

2. Why did the Blogger code re-appear when I added my own version of Bloggers own EU Cookie message code into the HTML and then why did it dissapear again a couple of hours? 

If anyone can answer these questions then please let me know. A search inside Googles Adsense site does not reveal any useful answers at all.

People just suggest adding query strings to your URL to force it to appear which is no good if your site is linked to from various Search Engines and other sites. Or to just delete all the cookies and refresh the page. 

These are two useless suggestions, and the only thing that seems to work for me is the solution I came up with above. So if you have the same problem try this solution.


By Strictly-Software

Thursday, 3 February 2022

Testing For The Brave Browser

How Can We Test For Brave?


By Strictly-Software

The problem with detecting the Brave browser which I use most of the time is that it hides as Chrome and doesn't have its own user agent. It used to, and the hope is that in the future it will again but at the moment it just shows a Chrome user-agent. 

For example, my latest Brave user-agent is:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36

It also apparently used to have a property on the windows object window.Brave which is no longer there anymore. I have read many articles on StackOverflow about detecting Brave and all the various methods that apparently used to work don't anymore due to objects and properties such as window.google and window.googletag no longer exists on the windows object.

As these posts were not written that long ago it seems that there have been a lot of changes on the window object by Chrome / Google / Brave etc. So when writing your own function as I did it is worth outputting all the keys on the window object first to see what is there and what isn't e.g.

const keys = Object.keys(window);
console.log(keys);

This just outputs all the keys such as events, other objects like document and navigator, and properties as well into the developer tools console area that all modern browsers have.

As Brave, Edge, Opera and Chrome are all based on the same Chromium browser they are basically the same standard-compliant browsers.

Remember we should always use feature detection rather than user-agent parsing to choose whether or not to do something in JavaScript but it is amazing when using a user-agent switcher how many modern-day sites break when I change a standards-compliant browsers agent string to IE 6 for example.

If the coder wrote their code properly it wouldn't matter if I changed the latest version of Chromes user-agent to IE 6 or even IE 4 or Netscape Navigator 4, all sites should work perfectly as instead of doing old school tests like

isIe = document.all;
isNet = document.layers;

And then branching off those two as we did in the old days of scripting which led to browsers like Opera which no one catered for, having to support both IE and Netscape event handlers and pretending to not exist. However, they did eventually add the Opera property to the window object so if you really wanted to find it you could just test for:

if(window.opera){
  alert("This is Opera");
}

However, this no longer exists and Opera uses the same WebKit Chromium base as all the other modern browsers apart from Firefox.

Also, remember that the issue with JavaScript is that every object can be overwritten or modified, that is why we have user-agent switchers in the first place because they can, and do overwrite and change the window.navigator object.

Therefore if you are really pedantic there is NO real way to test if anything is real in JavaScript as a plugin or injected script could have added properties to the Window object such as changing the user-agent or adding a property another browser uses to the window so that tests for window.chrome fail in FireFox because you have created a window.chrome property FireFox detects.

There are loads of ifs and buts and we can discuss the best approach all day due to injected code and extensions overwriting objects or accidental setting of objects by forgetting to do == and accidentally setting a variable or object with a single = by mistake, but let's answer the question,

I have read a lot about people trying to detect Brave and a lot of old objects on the Chrome browser like window.google and window.googletag that no longer exist or even window.Brave which apparently existed on the window object at one stage. 

Why might you want to even detect Brave if it is a standards-compliant Browser and you are doing feature detection and not user-agent sniffing anyway you might ask? 

Well good question, if you are writing your code properly you shouldn't ever need to know what Browser the code is running in as it will work whatever the user is running your web page in due to proper use of feature detection. 

However, you may have a site that wants to direct people to the right extension depository or download the correct add-on. Therefore to make the user comfortable in their decision you might want to show that you know what browser the user has so that they are happy downloading the add-on. 

Or there might be a myriad of other reasons such as asking users to update their browser due to an exploit that has been discovered or to ask old people still using IE6 to upgrade to a modern browser. Also if you want them to use one that removes cookies, trackers, and adverts, and is security conscious, you may want to direct non-Brave users to the Brave download page.

Therefore the function I have come up with tries to future proof by hoping Brave does put its name into the user-agent at some point, and it checks for the existence of properties in objects such as the window or navigator object.

You can put all your complaints and ideas for improving this function for detecting Brave in the comments section and maybe we can come up with a better solution.

// pass or don't pass in a user-agent if none is passed in it will get the current navigator agent
function IsBrave(ua){
	
	var isBrave = false;
	ua = (ua === null || ua === undefined) ? ua = window.navigator.userAgent : ua;

	ua = ua.toLowerCase();

	// make sure it's not Mozilla
	if("mozInnerScreenX" in window){		
		return isBrave;
	}else{
		// everything in JavaScript is over writable we could do this to pass the next test 
		// but then where would we stop as every object even window/navigator can be overwritten or changed...?
		// window.chrome="mozilla";
		// window.webkitStorageInfo = "mozilla";

		// make sure its chrome and webkit
		if("chrome" in window && "webkitStorageInfo" in window){
			// it looks like Chrome and has webkit
			if("brave" in navigator && "isBrave" in navigator.brave){
				isBrave = true;
			// test for Brave another way the way the framwwork Prototype checks for it
			}else if(Object.prototype.toString.call(navigator.brave) == '[object Brave]'){
				isBrave = true;			
			// have they put brave back in window object like they used to
			}else if("Brave" in window){
				isBrave = true;			
			// hope one day they put brave into the user-agent
			}else if(ua.indexOf("brave") > -1){
				isBrave = true;
			// make sure there is no mention of Opera or Edge in UA
			}else if(/chrome|crios/.test(ua) && /edge?|opera|opr\//.test(ua) && !navigator.brave){
				isBrave = false;			
			}
		}

		return isBrave;
	}
}

Of course if you really didn't want all these fallback tests and hope that Brave will sort their own user-agent out in the future or re-add a property to the window object as they apparently used to then you could just do something like this:
var w=window,n=w.navigator; // shorten key objects
let isBrave = !("mozInnerScreenX" in w) && ("chrome" in w && "webkitStorageInfo" in w && "brave" in n && "isBrave" in n.brave) ? true : false;

Just to show you that it works here is the "Browser" test page I keep on all my Browser's bookmark bars, written using just JavaScript with some AJAX calls and my own functions which lets me see the following info:
  • My Current IPv4 address by calling a URL that only returns IPv4.
  • My Current IPv6 address by calling a URL that will return one if I am using one, otherwise it converts the IPv4 into an IPv6 address format. Obviously, this is not my real IPv6 address as if I had one it would be returned it is just the IPv4 address converted into IPv6 format.
  • The UserAgent that the browser is showing me, this may be spoofed if a user-agent switcher is being used.
  • The real User-Agent if a spoofed user-agent is being used by a user-agent switcher extension/add-on.
  • The spoofed Browser Name.
  • The real Browser name.
  • Whether a user-agent switcher was detected. I have my own which does both the Request Header as well as overwriting the window.navigator object with different agents details. It creates a copy of the original navigator object so I can output it as well as the spoofed version.
  • An output of the current window.navigator object, if a user-agent switcher was used it will show the overwritten navigator object.
  • An output of the original window.navigator object if my own user-agent switcher was used then I create a copy of the same navigator string displayed for the current navigator object so both the spoofed and real navigator objects are outputted.
  • A script that loads in info about my location based on my IP address e.g: Town, County, Country, ISP, Hostname, Longitude, and Latitude.

Example Browser Test Output

If you look at this image, double click to open bigger if you need to, although it won't show you all the ISP & Location info at the bottom, it will show you the user-agents, both spoofed and real, as well as the spoofed browser and the real browser, which my custom function detects from either just a user-agent string or with the IsBrave() function I outputted above.



You can see here that although the user-agent strings are exactly the same my code that detects the browser has correctly recognised the real browser as Brave by using the function this article is about and for the spoofed Browser name it shows Chrome, which is the browser Brave tries to mask itself as.

This handy little script is on all my browsers bookmark bars for easy access and I find it very helpful for quickly seeing if a user-agent switcher is enabled and what if any, my browser it is pretending to be, as well as my current GEO-IP location in case I am using a global VPN, TOR, Opera or a proxy.

Let me know what you think of this function tested in Firefox, Chrome, Opera, Edge and of course Brave.

By Strictly-Software



Monday, 17 January 2022

Running JavaScript Before Any Other Scripts On A Page

Injecting A Script At The Top Of Your HTML Page


By Strictly-Software

If you are developing an extension for either Firefox, Opera or Chrome, Brave, Edge, and Chromium, then you might come across the need to be able to inject some code into the top of your HTML page so that it runs before any other code.

When developing extensions for Chromium based browsers such as Chrome, Brave and Edge, you will most likely do this from your content.js file which is one of the main files that holds code to be run at certain stages of a pages lifetime.

As the Chrome Knowledge Base says the property values for the "run_at" property include;

  • document_idle, the preferred setting where scripts are guaranteed to run after the DOM is complete and after the window.onload event has loaded all resources which can include other scripts.
  • document_end, content.js scripts are injected immediately after the DOM is complete, but before subresources like images and frames have loaded. So after the DOM is loaded but before window.onload has finished loading external resources.
  • document_start, ensures scripts are injected after any CSS files are loaded but before any other DOM is constructed or any other script is run.  
  • There is of course the "run_at" property in the manifest.json file, which can be set to "document_start", to enable the codes running before other code does. This is especially useful if you need to change header values before a web page loads so that modified values such as the Referer or User-Agent can be modified. However, there may also be a need for you to set up an object or variable that is inserted into the DOM for code within the HTML page to access.

    For example in a User-Agent switcher where you need to both overwrite the Navigator object in JavaScript and the Request header, you may want to create an object or variable that holds the original REAL navigator object or its user-agent so that any page you may create yourself or offer to your users the ability to see the REAL user-agent, browser, language, and other properties if they wanted to.

    For example, I have my own page that I use to show me the current user-agent and list the navigator properties

    However, if they have been modified by my own user-agent switcher extension I also offer up a variable holding the original REAL user-agent so that it can be shown and compared with the spoofed version to see what has changed. I also have a variable that holds the original navigator object in case I want to look at the properties.

    Therefore my HTML page may want to inspect this object if it exists with some code on the page.

    // check to see if I can access the original Navigator object and the user agent string
    if(typeof(origNavigator) !== 'undefined' && origUserAgent !== null)
    {
    	// get the real Browser name with my Detect Browser function using the original Navigator user-agent
    	let realBrowser = Browser.DetectBrowser(origUserAgent);
    
    	// output on the page in a DIV I have
    	G("#RealBrowser").innerHTML = "<b>Real Browser: " + realBrowser "</b>";
    }

    This code just uses a generic Browser detection function for taking a user-agent and finding the Browser name. It even detects Brave by ruling out other browsers and if it is Chrome at the end I check for the Brave properties or mention of the word in the string, which they used to have but newer versions have removed it. 

    However there is hope in the community that they will create a unique user-agent with the word Brave in as at the moment people are having to do object detection which is the better method, and as Brave tries to hide, there are plenty of query strings and other API calls which can be made to find out whether the result indicates Brave rather than Chrome.

    However, at the moment, I am just using a simple detection on the window.navigator object that if TRUE indicates that it is actually Brave NOT Chrome. 

    A later article shows a longer function I developed with fallbacks in case the objects do not exist anymore as there used to be a brave object on the window e.g window.brave that no longer exists, so did many objects for Chrome such as window.google and window.googletag that no longer exist. However, this article explains all that.

    This is just the one line test you can do, it ensures it is not FireFox with a test for a Mozilla only object window.mozInnerScreenX and then checks that it is a Chromium browser with tests for window.chrome and that it's also webkit with a test for window.webkitStorageInfo before some tests for navigator.brave and navigator.brave.isBrave to ensure it's Brave not Chrome e.g:

    // ensure that a Chrome user-agent is not actually Brave by checking some properties that seem to work to identify the browser at the moment anyway...
    let isBrave = !("mozInnerScreenX" in window) && ("chrome" in window && "webkitStorageInfo" in window && "brave" in navigator && "isBrave" in navigator.brave) ? true : false


    However, this article is more about injecting a script into the HEAD of your HTML so that code on the page can access any properties within it.

    As my extension offers an Original Navigator object and a string holding the original/real user-agent before I overwrite it, then I want this code to be the first piece of JavaScript on the page.

    This doesn't have to be limited to extensions and you may have code you want to inject in the HEAD when the DOMLoads before any other code.

    This is a function I wrote that attempts to place a string holding your JavaScript into a new Script block I create on the fly and then insert before any other SCRIPT in the document.head.

    However, if the page is malformed, or has no defined head area it falls back to just appending the script to the document.documentElement object.

    If you pass false in for the 2nd parameter which tells the function whether or not to remove the script after inserting it then if you view the generated source code for the page you will see the injected script code in the DOM.

    The code looks within the HEAD for another script block and if found it inserts it before the first one using insertBefore() however if there is NO script block in the HEAD then the function will just insert the script into the HEAD anyway using the appendChild() method.

    An example of the function in action with a simple bit of JavaScript that stores the original navigator object and user-agent is below. You might find multiple uses for such code in your own work.

    // store the JavaScript in a string
    var code = 'var origNavigator = window.navigator; var origUserAgent = origNavigator.userAgent;";
    
    // now call my function that will append the script in the head before any other and then remove it if required. For testing you may want to not remove it so you can view it in the generated DOM.
    appendScript(code,true);
    
    // function to append a script first in the DOM in the HEAD, with a true/false parameter that determines whether to remove it after sppending it.
    function appendScript(s,r=true){
    
    	// build script element up
    	var script = document.createElement('script');
    	script.type = 'text/javascript';
    	script.textContent = s;
    	
    	// we want our script to run 1st incase the page contains another script e.g we want our code that stores the orig navigator to run before we overwrite it
    
    	// check page has a head as it might be old badly written HTML
    	if(typeof(document.head) !== 'undefined' && document.head !== null)
    	{	
    		// get a reference to the document.head and also to any first script in the head
    		let head = document.head;
    		let scriptone = document.head.getElementsByTagName('script')[0];
    
    		// if a script exists then insert it before 1st one so we dont have code referencing navigator before we can overwrite it		
    		if(typeof(scriptone) !== 'undefined' && scriptone !== null){	
    			// add our script before the first script
    			head.insertBefore(script, scriptone);
    		// if no script exists then we insert it at the end of the head
    		}else{
    			// no script so just append to the HEAD object
    			document.head.appendChild(script);
    		}
    	// no HEAD so fall back to appending the code to the document.documentElement
    	}else{
    		// fallback for old HTML just append at end of document and hope no navigator reference is made before this runs
    		document.documentElement.appendChild(script);
    	}
    	// do we remove the script from the DOM
    	if(r){
    		// if so remove the script from the DOM
    		script.remove();
    	}
    }
    



    I find this function very useful for both writing extensions and also when I need to inject code on the fly and ensure it runs before any other scripts by using a onDOMLoad method.

    Let me know of any uses you find for it.

    Useful Resource: Content Scripts for Chrome Extension Development. 


    By Strictly-Software