Tuesday 29 September 2015

IE Bug with non Protocol Specific URLS

IE Bug with non Protocol Specific URLS

By Strictly-Software

I have recently come across a problem that seems to only affect IE and non protocol specific URLS.

These are becoming more and more common as they prevent warnings about insecure content and ensure that when you click on a link you go to the same protocol as the page you are on.

This can obviously be an issue if the site you are on is on an SSL but the site you want to go to doesn't have an HTTPS domain or vice-versa. However most big sites have both domains and will handle the transfer by redirecting the user and the posted data from one domain to another.

An example would be a PayPal button that's form posts to

action="//www.paypal.com/"

In the old days if you had a page on an SSL e.g https://www.mysite.com and had a link to an image or script from a non secure domain e.g http://www.somethirdpartysite.com/images/myimg.jpg you would get pop ups or warning messages in the browser about "non secure content" and would have to confirm that you wanted to load it.

Nowadays whilst you don't see these popups in modern browsers if you check the console (F12 in most browsers) you will still see JavaScript or Network errors if you are trying to load content cross domain and cross protocol.

S2 Membership Plugin

I came across this problem when a customer who was trying to join one of my WordPress subscription sites (that uses the great free S2 Membership Plugin), complained to say that when he clicked a payment button that I have on many pages, was being taken to PayPal's homepage rather than the standard payment page that details the type of purchase, price and options for payment.

It was working for him in Chrome and FireFox but not IE.

I tested this on IE 11 Win7 and Win8 myself and found that this was indeed true.

After hunting through the network steps in the developer toolbar section (F12) and comparing it to Chrome I found that the problem seemed to be IE doing a 301 redirect from PayPals HTTP domain to their HTTPS one. 

After analysing the response and request headers I suspect it is something to do with the non UTF-8 response that PayPal was returning to IE for some reason, probably because Internet Explorer wasn't requesting it as such for some reason.

Debugging The Problem


For the techies. This is a breakdown of the problem with network steps from both Chrome and IE and relevant headers etc.

First the Paypal button code which is encrypted by S2Member on the page. You get given the button content as standard square bracket short codes which get converted into HTML on output. Looking at the source on the page in both browsers on one button I could see the following.

1. Even though the button outputs the form action as https://www.paypal.com it seems that one of my plugins OR WordPress itself, although I suspect a caching plugin obviously, is changing my links (I haven't been able to narrow it down - or WordPress) is removing any protocols conflicts by using non protocol specific URLS.

So as my site doesn't have an SSL any HREF, SRC or ACTION that points to an HTTPS URL was being replaced with // e.g https://www.paypal.com on my page http://www.mysite.com/join was becoming //www.paypal.com in the source and generated source.

2. Examining the HTML of one of the buttons you can see this in any browser. I have cut short the encrypted button code as it's pointless outputting it all.


<form action="//www.paypal.com/cgi-bin/webscr" method="post">
<input type="hidden" name="cmd" value="_s-xclick">
<input name="encrypted" type="hidden" value="-----BEGIN PKCS7-----MIILQQYJKoZIhvcNAQcEoIILMjCCCy4CAQExgg..."


3. Outputting a test HTML page on my local computer and running it in IE 11 WORKED. This is was probably because I explicitly set the URL to https://www.paypal.com so no redirects were needed.

4. Therefore logically the problem was due to the lack of an HTTPS in the URL.

5. Comparing the network jumps.

1. Chrome

Name - Method - Status    - Type                     - Initiator
webscr - POST - 307      - x-www-form-urlencoded    - Other
webscr - POST - 200 - document             - https://www.paypal.com/cgi-bin/webscr

2. IE

URL         - Protocol  - Method - Result - Type       - Initiator
/cgi-bin/webscr         - HTTP      - POST   - 301       -           - click
/cgi-bin/webscr         - HTTPS    - POST   - 302       - text/html  - click
https://www.paypal.com/home - HTTPS   - POST   - 200       - text/html  - click

Although the titles are slightly different you can see they just are different words for the same thing e.g Status in Chrome or Result in IE both relate to the HTTP Status code the response returned.

As you can see Chrome also had to do a 307 (HTTP 1.1 successor to the 302 temporary redirect) from HTTP to HTTPS however it ended up on the correct page. Whereas in IE when I first clicked on the button it took me to the payment page in HTTP but then did a 301 (permanent) redirect to it in HTTPS and then a 302 (temporary) redirect to their home page.

If you want to know more about these 3 redirect status codes this is a good page to read.

The question was why couldn't IE take me to the correct payment page?

Well when I looked at the actual POST data that was being passed along to PayPal from IE on the first network hop I could see the following problem.

cmd=_s-xclick&encrypted=-----BEGIN----MIILQQYJKoZIhvcNAQcEoIILMjCCCy4CAQExgg...

Notice the Chinese character after the BEGIN where it should say PKCS7?

In Chrome however this data was exactly the same as the form e.g

cmd:_s-xclick
encrypted:-----BEGIN PKCS7-----MIILQQYJKoZIhvcNAQcEoIILMjCCCy4CAQExgg...

Therefore it looked like for some reason the posted data was being misinterpreted by IE whereas in Chrome it was not. Therefore I needed to check what character sets and response was being sent and returned.

Examining Request and Response Headers

When looking at the HTTP Request headers on the first POST to PayPal in IE I could see that the Accept-Language header was only asking for en-GB e.g a basic ASCII character set. Also there was quite a lack of request headers compared to IE. I have just copied the relevant ones that can be compared between browsers.

IE Request Headers

Key         Value
Content-Type: application/x-www-form-urlencoded
Accept-Language: en-GB
Accept-Encoding: gzip, deflate
Referer: http://www.mysite.com/join-now/
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Request: POST /cgi-bin/webscr HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Host:         www.paypal.com

Chrome Request Headers

Key                 Value
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Cache-Control: max-age=0
Connection: keep-alive
Content-Length:4080
Content-Type: application/x-www-form-urlencoded


And the responses for the Content-Type header which I think is key.

IE

Content-Type: text/html

Chrome

Content-Type: text/html; charset=UTF-8


So whilst Chrome is saying it will accept more language sets and gets back a charset of UTF-8 IE is only saying it will accept only en-GB and gets back just text/html.

I even tried seeing if I could add UTF-8 in as a language to accept in IE but there was no option to, so I tried adding Chinese which obviously use extended character sets and was the problematic character.
 However this made no difference even though the Accept-Language header was now:

Accept-Language: en-GB,zh-Hans;q=0.9,zh-CN;q=0.8,zh-SG;q=0.7,zh-Hant;q=0.6,zh-HK;q=0.4,zh-MO;q=0.3,zh-TW;q=0.2,zh;q=0.1

Conclusion


Therefore I came to the conclusion that I could not force IE to change it's behaviour and I doubt any phone calls to IE HQ or even PayPal would solve the issue. Therefore to allow IE users to be able to still pay on my site I needed a workaround.

1. I added to my technical test page which I make all users check before paying a test for IE and a warning about possible problems. This went alongside tests to ensure JavaScript and Cookies were enabled as both are needed for any modern JavaScript site.

2. I added in some JavaScript code in my footer that run on DOM load that looped through all FORM elements and checked the ACTION attribute. Even though when examining the results in the console they showed http://www.paypal.com rather than what I saw in the source //www.paypal.com I added some code in to ensure they always said HTTPS.

The function if you are interested is below and it seems to have fixed the problem in IE. If I view the generated source now I can see all form actions have HTTPS protocols.



// on DOM load loop through all FORM elements on the page
jQuery(document).ready(function () {	
	// get all form elements
	var o,,e=document.getElementsByTagName("FORM");
	for(var i=0,l=e.length;i<l;i++)
	{
		// get the action attribute
		o = e[i].action;

		// if current action is blank then skip
		if(o && o!="")
		{
			// if the start of the action is http (as non protocol specifc domains show up as http)
			// then replace the http with https
			if( /^http:\/\/www.paypal.com/.test(o) )
			{
				e[i].action = o.replace("http:","https:");
			}		
		
		}
	}	
});


So whilst this is a just a workaround for the IE bug it does solve the issue until Internet Explorer sorts itself out. Why they have this problem I have no idea.

I am outputting all my content as a UTF-8 charset and Chrome is obviously handling it correctly (along with Firefox and Safari).

So I can only presume it's an IE bug which isn't helped by an unknown (as yet) plugin (or WordPress) changing cross protocol URLs to the now standard //www.mysite.com format.

Therefore if you come across similar problems with redirects taking you to the wrong place, check your headers, compare browsers and if you spot something strange going on, try a JavaScript workaround to modify the DOM on page load.


© 2015 Strictly-Software

No comments: