Monday, 24 September 2012

Using String Builders to speed up string concatenation

If you are using a modern language, a string builder object like the one in C# is a standard tool for joining strings without the overhead of repeated concatenation, which can be a performance killer.

The reason is simple.

When you do this

a = "hello I am Rob";

a = a + " and I would like to say thank you";

a = a + " and good night";


In a lot of languages strings cannot be changed in place, so every concatenation has to allocate a brand new string and copy the whole string built so far into it.

This means that the longer the string gets, the more memory is used up, as the old copy and the new one have to be held in memory at the same time, and the more time is spent copying on every join.
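To make the cost concrete, here is a minimal sketch that builds the same string both ways and roughly counts how many characters a naive concatenation would copy. The function names (buildByConcat, buildByJoin, charsCopiedByConcat) are made up for this illustration, and the count assumes every concatenation copies the full string built so far. Many modern JavaScript engines optimise concatenation internally, so treat this as a model of the naive case rather than a benchmark.

```javascript
// The three pieces from the example above.
var parts = [
  "hello I am Rob",
  " and I would like to say thank you",
  " and good night"
];

// 1. Plain concatenation: each step may build a whole new string.
function buildByConcat(pieces) {
  var s = "";
  for (var i = 0; i < pieces.length; i++) {
    s = s + pieces[i];
  }
  return s;
}

// 2. String builder style: collect the pieces, join once at the end.
function buildByJoin(pieces) {
  var buffer = [];
  for (var i = 0; i < pieces.length; i++) {
    buffer.push(pieces[i]);
  }
  return buffer.join("");
}

// Rough model (an assumption, not a measurement): every concatenation
// allocates a new string and copies everything built so far into it.
function charsCopiedByConcat(pieces) {
  var total = 0;
  var built = 0;
  for (var i = 0; i < pieces.length; i++) {
    built += pieces[i].length; // length of the new string
    total += built;            // all of it has to be written out again
  }
  return total;
}

console.log(buildByConcat(parts) === buildByJoin(parts)); // same result either way
console.log(charsCopiedByConcat(parts));                  // grows much faster than the final length
```

Under this model the copying grows roughly with the square of the number of pieces, while the join approach only writes each character once at the end.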

I have actually seen ASP classic sites crash with out of memory errors caused by people using string concatenation to build up large RSS feeds.

I mention this because of a comment I received about my popular HTML Encoder object, which handles double encoding, numerical and entity encoding, and decoding of partially and fully encoded strings.

Following the comment from Alex Oss, I have updated the numEncode function to use a basic string builder, which is very easy to do in JavaScript.

You just have an empty array, push the new strings into it (at the end of the array) and then join it together at the end to get the full string out. You can see the new function below.


// Numerically encodes all unicode characters
numEncode : function(s){ 
 if(this.isEmpty(s)) return ""; 

 var a = [],
  l = s.length; 
 
 for (var i=0;i<l;i++){ 
  var c = s.charAt(i); 
  // anything outside the printable ASCII range (space to tilde) gets encoded
  if(c < " " || c > "~"){ 
   a.push("&#"); 
   a.push(c.charCodeAt(0)); //numeric value of the character (UTF-16 code unit) 
   a.push(";"); 
  }else{ 
   a.push(c); 
  } 
 } 
 
 return a.join("");  
}, 
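
If you want to try the idea outside the encoder object, here is a standalone sketch. It assumes the function encodes any character outside the printable ASCII range (space to tilde), and the isEmpty helper here is a simplified stand-in written for this example, not the one from the encoder script.

```javascript
// Simplified stand-in for the encoder's own isEmpty helper (an assumption).
function isEmpty(s) {
  return s === null || s === undefined || s === "";
}

// Standalone version of the array-based numeric encoder.
function numEncode(s) {
  if (isEmpty(s)) return "";

  var a = [];

  for (var i = 0, len = s.length; i < len; i++) {
    var c = s.charAt(i);
    if (c < " " || c > "~") {
      // Outside printable ASCII: push the numeric entity in three pieces.
      a.push("&#", c.charCodeAt(0), ";");
    } else {
      // Printable ASCII goes in untouched.
      a.push(c);
    }
  }

  // One join at the end instead of a concatenation per character.
  return a.join("");
}

// \u00e9 is e-acute (233) and \u00a3 is the pound sign (163).
console.log(numEncode("caf\u00e9 \u00a35")); // caf&#233; &#163;5
```

Note that join converts the numbers returned by charCodeAt to strings for you, so there is no need to convert them before pushing.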

You can download the latest version of my HTML Encoder Script for JavaScript here.

However in old languages like ASP classic you are stuck with either string concatenation or making your own string builder class.

I have made one, the ASP String Builder Class, which can be downloaded from my main website.

You will notice that it ReDim's the array in chunks of 128 (which can be changed) and once 128 elements have been used it then ReDim's by another large chunk.

A counter is kept so we know how many elements we have actually added, and once we want to return the whole string we can either just RTRIM it (if we are joining with a blank space) or ReDim it back down to the right array size before joining it together.

This is just an example of how a string builder class is used and you could make a similar one in JavaScript that lets you access specific elements, the previous or next slot, update specific slots and set the delimiter like this ASP version.
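
As a rough idea of what a JavaScript version might look like, here is a minimal sketch. The StringBuilder name, its methods (append, get, set, toString) and the chunkSize parameter are all made up for this example and are not the API of the ASP class. It copies the chunked growth and the counter purely to mirror the ASP approach; JavaScript arrays grow on their own, so a plain push would do the same job.

```javascript
// Hypothetical string builder sketch, loosely mirroring the ASP class.
function StringBuilder(delimiter, chunkSize) {
  this.delimiter = delimiter === undefined ? "" : delimiter;
  this.chunkSize = chunkSize || 128;      // slots added per resize
  this.items = new Array(this.chunkSize); // pre-sized block of slots
  this.count = 0;                         // slots actually in use
}

// Add a string to the next free slot, growing by a chunk when full.
StringBuilder.prototype.append = function (value) {
  if (this.count === this.items.length) {
    this.items.length += this.chunkSize;
  }
  this.items[this.count++] = String(value);
  return this; // allow chaining
};

// Read a specific slot (undefined if the slot is not in use).
StringBuilder.prototype.get = function (index) {
  return index >= 0 && index < this.count ? this.items[index] : undefined;
};

// Overwrite a slot that has already been filled.
StringBuilder.prototype.set = function (index, value) {
  if (index < 0 || index >= this.count) {
    throw new RangeError("slot " + index + " is not in use");
  }
  this.items[index] = String(value);
  return this;
};

// Cut the array down to the used slots, then join with the delimiter.
StringBuilder.prototype.toString = function () {
  return this.items.slice(0, this.count).join(this.delimiter);
};

// Usage: a chunk size of 2 forces one resize along the way.
var sb = new StringBuilder(" ", 2);
sb.append("hello").append("I").append("am").append("Rob");
sb.set(0, "Hello");
console.log(sb.toString()); // Hello I am Rob
```

Slicing down to the counter before the join plays the same role as the ReDim back down to size in the ASP version, so the unused slots never end up in the output.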

Most modern languages have a String Builder class, but if you are using old languages or scripting languages like PHP or ASP classic then adding strings to an array before joining them together is the way to go for performance's sake.