desalasworks presents:

a selection of works by steven de salas

25 Techniques for Javascript Performance Optimization

These are some of the techniques I use for enhancing the performance of JavaScript. They have mostly been collected over several years of using the language to improve the responsiveness of websites and web applications.

My thanks go out to Marco of zingzing.co.uk for reminding me of the importance of optimizing JavaScript, and for teaching me some of the techniques below.

Most of the techniques involve common sense once you have understood the underlying problem. I’ve categorised performance-enhancement techniques into 5 broad categories, each with an underlying problem and solution as follows:

1. Avoid interaction with host objects

Repeated interaction with host objects will kill your performance.

THE PROBLEM:

Most scripting engines compile native JavaScript into machine code, which offers an incredible performance boost. However, interaction with host (browser) objects outside the native JavaScript environment introduces unpredictability and considerable performance lag, particularly when dealing with screen-rendered DOM objects or objects that cause disk I/O (such as WebSQL).

THE SOLUTION:

You can’t really get away from DOM, but keep your interaction with in-browser objects to an absolute minimum.

THE TECHNIQUES:

  1. Use fast DOM traversal with document.getElementById(). Given the availability of jQuery, it is now easier than ever to produce highly specific selectors based on a combination of tag names, classes and CSS3. You need to be aware that this approach involves several iterations while jQuery loops through each subset of DOM elements and tries to find a match. You can improve DOM traversal speeds by picking nodes by ID.
    // jQuery will need to iterate many times until it finds the right element
    var button = jQuery('body div.dialog > div.close-button:nth-child(2)')[0];
     
    // A far more optimized way is to skip jQuery altogether.
    var button = document.getElementById('dialog-close-button');
     
    // But if you need to use jQuery you can do it this way.
    var button = jQuery('#dialog-close-button')[0];
  2. Store pointer references to in-browser objects. Use this technique to reduce DOM traversal trips by storing references to browser objects during instantiation for later usage. For example, if you are not expecting your DOM to change, you should store a reference to the DOM or jQuery objects you are going to use when your page is created. If you are building a DOM structure such as a dialog window, make sure you store a few handy references to DOM objects inside it during instantiation, so you don’t need to find the same DOM object over and over again when a user clicks on something or drags the dialog window.

    If you haven’t stored a reference to a DOM object, and you need to iterate inside a function, you can create a local variable containing a reference to that DOM object. This will considerably speed up the iteration, as the local variable is stored in the most accessible part of the stack.

  3. Keep your HTML super-lean (get rid of all those useless DIV and SPAN tags). This is extremely important: the time needed to query and modify DOM is directly proportional to the amount and complexity of HTML that needs to be rendered. Using half the amount of HTML will roughly double the DOM speed, and since DOM creates the greatest performance drag on any complex JavaScript app, this can produce a considerable improvement. See ‘Reduce Number of DOM Elements’ guidance in Yahoo YSlow.
  4. Batch your DOM changes, especially when updating styles. When making calls to modify DOM, make sure you batch them up so as to avoid repeated screen rendering, for example when applying styling changes. The ideal approach here is to make many styling changes in one go by adding or removing a class, rather than applying each individual style separately. This is because every DOM change can prompt the browser to re-render the UI using the box model. If you need to move an item across the page using X+Y coordinates, make sure that these two are applied at the same time rather than separately. See these examples in jQuery:
    // This can incur up to 5 screen refreshes
    jQuery('#dialog-window').width(600).height(400).css('position', 'absolute')
                           .css('top', '200px').css('left', '200px');
    // Let jQuery handle the batching
    jQuery('#dialog-window').css({
         width: '600px',
         height: '400px',
         position: 'absolute',
         top: '200px',
         left: '200px'
    });
    // Or even better use a CSS class.
    jQuery('#dialog-window').addClass('mask-aligned-window');
  5. Build DOM separately before adding it to the page. As per the last item, every DOM update can require the screen to be refreshed. You can minimize the impact here by building the DOM for your widget ‘off-line’ and then appending your DOM structure in one go.
  6. Use buffered DOM inside scrollable DIVs. This is an extension of the third point above (Keep HTML super-lean). You can use this technique to remove items from DOM that are not being visually rendered on screen, such as the area outside the viewport of a scrollable DIV, and append the nodes again when they are needed. This will reduce memory usage and DOM traversal times. Using this technique the guys at ExtJS have managed to produce an infinitely scrollable grid that doesn’t grind the browser to a halt.
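
As a rough illustration of technique 5 (building DOM ‘off-line’), the sketch below assembles a list inside a DocumentFragment and returns it, so the page itself is only touched once when the fragment is appended. The function name buildList and its parameters are hypothetical; the document object is passed in as an argument purely so the function can be exercised outside a browser.

```javascript
// Build a <ul> off-line and return it wrapped in a DocumentFragment.
// 'doc' is the document object, 'items' an array of strings (hypothetical inputs).
function buildList(doc, items) {
  var fragment = doc.createDocumentFragment();
  var list = doc.createElement('ul');
  for (var i = 0; i < items.length; i++) {
    var li = doc.createElement('li');
    li.appendChild(doc.createTextNode(items[i]));
    list.appendChild(li); // 'list' is not attached to the page yet
  }
  fragment.appendChild(list);
  return fragment;
}

// Usage in a browser: one append to the live page, however many items there are.
// document.getElementById('menu').appendChild(buildList(document, ['Home', 'About']));
```

Whether this beats appending nodes one at a time depends on the browser and the size of the structure, so it is worth measuring in your own page.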

2. Manage and Actively reduce your Dependencies

Poorly managed JavaScript dependencies degrade user experience.

THE PROBLEM:

On-screen visual rendering and user experience are usually delayed while waiting for script dependencies to load into the browser. This is particularly bad for mobile users who have limited bandwidth capacity.

THE SOLUTION:

Actively manage and reduce dependency payload in your code.

THE TECHNIQUES: 

  1. Write code that reduces library dependencies to an absolute minimum. Use this approach to reduce the number of libraries your code requires to a minimum, ideally to none, thus creating an incredible boost to the loading times required for your page. You can reduce dependency on external libraries by making use of as much in-browser technology as you can, for example you can use document.getElementById('nodeId') instead of jQuery('#nodeId'), or document.getElementsByTagName('INPUT') instead of jQuery('INPUT') which will allow you to get rid of jQuery library dependency.

    If you need complex CSS selectors, use Sizzle.js instead of jQuery, as it is far more lightweight (4kb instead of 80kb+). Also, before adding any new library to the codebase, evaluate whether you really need it. Perhaps you are just after 1 single feature in the whole library? If that’s the case then take the code apart and add the feature separately (but don’t forget to check the license and acknowledge the author if necessary).

  2. Minimize and combine your code into modules. You can bundle distinct components of your application into combined *.js files and pass them through a JavaScript minimizer tool such as Google Closure Compiler or JSMin that gets rid of comments and whitespace. The logic here is that a single minimized request for a 10Kb .js file completes faster than 10 requests for files that are 1-2kb each due to lower bandwidth usage and network latency.
  3. Use a post-load dependency manager for your libraries and modules. Much of your functionality will not need to be implemented until after the page loads. By using a dependency manager (such as RequireJS or head.js) to load your scripts after the page has completed rendering you are giving the user a few extra seconds to familiarise themselves with the layout and options before them. Make sure that your dependency manager can ‘remember’ which dependencies have been loaded so you don’t end up loading the same libraries twice for each module. See guidance for Pre-Loading and Post-loading in Yahoo YSlow, and be mindful about loading only what is necessary at each stage of the user journey.
  4. Maximise use of caching (ETags, .js files, etc). Cache is your best friend when it comes to loading pages faster. Try to maximise the use of cache by applying ETags liberally and putting all your JavaScript into files ending in *.js found in static URI locations (avoid dynamic Java/C# bundle generations ending with *.jsp and *.ashx). This will tell the browser to use the locally cached copy of your scripts for any pages loaded after the initial one.
  5. Move scripts to the end of the page (not recommended). This is the lazy way of handling post-load dependencies. Ideally you should implement a post-load dependency manager, but if you only have one or two scripts to load into the page you can add them at the very end of the HTML document where the browser will start loading them after the page is rendered, giving the user a few extra seconds of interaction.
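
To make the ‘remember which dependencies have been loaded’ point in technique 3 concrete, here is a minimal sketch of a load-once registry. It is not how RequireJS or head.js are implemented; createLoader and fetchScript are hypothetical names, and fetchScript stands in for whatever actually fetches a script (for example by injecting a script tag) and calls back when it is done.

```javascript
// fetchScript(url, done) is a hypothetical function that loads one script
// and calls done() once it has finished loading.
function createLoader(fetchScript) {
  var hasOwn = Object.prototype.hasOwnProperty;
  var state = {}; // url -> { done: boolean, waiting: [callbacks] }

  return function load(url, callback) {
    if (hasOwn.call(state, url)) {
      // Already requested: never fetch the same URL twice.
      var known = state[url];
      if (known.done) {
        callback();
      } else {
        known.waiting.push(callback);
      }
      return;
    }
    var entry = state[url] = { done: false, waiting: [callback] };
    fetchScript(url, function () {
      entry.done = true;
      var waiting = entry.waiting;
      entry.waiting = [];
      for (var i = 0; i < waiting.length; i++) {
        waiting[i](); // release everyone who asked for this script
      }
    });
  };
}
```

Two modules that both ask for the same URL trigger a single fetch, and both callbacks run once it completes.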

3. Be disciplined with event binding

Be a ninja when using event handling.

THE PROBLEM:

Browser and custom event handlers are an incredible tool for improving user experience and reducing the depth of the call stack, but since they are hard to track due to their ‘hidden’ execution they can fire many times repeatedly and quickly get out of hand, causing performance degradation.

THE SOLUTION:

Be mindful and disciplined when creating event handlers.

THE TECHNIQUES:

  1. Use event binding but do it carefully. Event binding is great for creating responsive applications, and it can even improve the performance of your code by reducing the depth of the call stack (so you avoid having a function calling a function which calls another function etc). However, because the flow of event execution cannot be traced easily, it is very important that you use event handlers sparingly, that you walk through the execution and various user journeys to make sure they are not firing double, and that you comment your code well so the next guy (which may be you a few months down the line) can follow what’s going on.
  2. Pay special attention to event handlers that fire in quick succession (ie, ‘mouseover’). Browser events such as ‘mouseover’ and ‘resize’ are executed in quick succession up to several hundred times each second. This means that you need to ensure that an event handler bound to either of these events is coded optimally and can complete in less than 2-3 milliseconds. Any overhead greater than that will create a patchy user experience, especially in browsers such as IE that have poor rendering capabilities.
  3. Remember to unbind events when they are no longer needed. Unbinding events is as important as binding them. When you add a new event handler to your code make sure that you provide for it to stop firing when it is no longer needed, ideally by coding in the unbind behaviour at that point. Be careful when using jQuery to bind and unbind events, and make sure your selector points to a unique node, as a loose selector can create or remove more handlers than you intend.
  4. Learn about event bubbling. Use jQuery.bind() instead of jQuery.live() and jQuery.delegate(). If you are going to use event handlers, it is important that you understand how event bubbling propagates an event up the DOM tree to every ancestor node. You can use this knowledge to limit your dependency on event bubbling with approaches such as jQuery.live() and jQuery.delegate() that require full DOM traversal upon handling each event, or to stop event bubbling for improved performance. See this great post on the subject.
  5. Use ‘mouseup’ instead of ‘click’. Remember that user interaction via the mouse or keyboard fires several events in a specific order. It is useful to remember the order in which these events fire so you can squeeze in your functionality before anything else gets handled, including native browser event handlers.

    A good example of this is to bind your functionality to the ‘mouseup’ event, which fires before the ‘click’ event. This can produce a surprising performance boost in older browsers such as IE, making the difference between handling every interaction or missing some of the action if the user triggers clicks many times in succession.
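
One common way to keep handlers for rapid-fire events such as ‘mouseover’ and ‘resize’ (technique 2) under control is to throttle them. This is a swapped-in technique rather than something described above, and the sketch below is a minimal version of it: throttle is a hypothetical helper, and the optional third argument is an injectable clock that exists only to make the function easy to test.

```javascript
// Wrap 'handler' so that it runs at most once every 'wait' milliseconds.
// 'now' is an optional clock function returning a millisecond timestamp.
function throttle(handler, wait, now) {
  now = now || function () { return new Date().getTime(); };
  var last = null; // timestamp of the last call that was let through
  return function () {
    var t = now();
    if (last === null || t - last >= wait) {
      last = t;
      return handler.apply(this, arguments);
    }
    // Calls arriving inside the wait window are dropped.
  };
}

// Usage in a browser (the handler body is up to you):
// window.onresize = throttle(function () { /* recalculate layout */ }, 100);
```

Note that this version drops the calls it skips, so the last event in a burst may go unhandled; if the final state matters, a variant that schedules a trailing call would be needed.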

4. Maximise the efficiency of your iterations

String concatenation performance becomes critical during long iterations.

THE PROBLEM:

Due to the processing time used, iterations are usually the first places where you can address performance flaws in an application.

THE SOLUTION:

Get rid of unnecessary loops and calls made inside loops.

THE TECHNIQUES:

  1. Use Array.prototype.join() for string concatenation inside IE. Joining strings using the plus sign (ie var ab = 'a' + 'b';) creates performance issues in IE when used within an iteration. This is because, like Java and C#, JavaScript uses immutable strings. Basically, when you concatenate two strings, a third string is constructed for gathering the results with its own object instantiation logic and memory allocation. While other browsers have various compilation tricks around this, IE is particularly bad at it.

    A far better approach is to use an array for carrying out the donkey work, creating an array outside the loop, using push() to add items to the array and then a join() to output the results. See this link for a more in-depth article on the subject.

  2. Harness the indexing power of JavaScript objects. Native JavaScript objects {} can be used as powerful HashMap data structures with quick-lookup indexes to store references to other objects, acting similarly to the way database indexes work for speeding up search operations by preventing needless looping. You can use this notation to remove the need for iterating through a set of results, by simply calling the index as follows:
    var data = {
      index: {
                "joeb": {name: "joe", surname: "bloggs", age: 29 },
                "marys": {name: "mary", surname: "smith", age: 25 }
                // another 1000 records
             },
      get: function(username) {
                return this.index[username];
             }
    }
  3. Harness the power of array structures with push(), pop() and shift(). Array push(), pop() and shift() instructions have minimal processing overhead (roughly 20x less than object manipulation) due to being language constructs closely related to their low-level assembly language counterparts. In addition, using queue and stack data structures can help simplify your code logic and get rid of unnecessary loops. See more on the topic in this article.
  4. Take advantage of reference types. JavaScript, much like other C-based languages, has both primitive and reference value types. Primitive types such as strings, booleans and integers are copied whenever they are passed into a new function, however reference types such as arrays, objects and dates are passed only as a light-weight reference.

    You can use this to get the most performance out of recursive functions, such as by passing a DOM node reference recursively to minimise DOM traversal, or by passing a reference parameter into a function that executes within an iteration. Also, remember that comparing object references is far more efficient than comparing strings.
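
The array-based approach from technique 1 can be sketched as follows. The function name buildListItems and the markup it produces are made up for illustration.

```javascript
// Build one large string by collecting the pieces in an array
// and joining them once at the end, instead of using '+' in the loop.
function buildListItems(items) {
  var parts = []; // created once, outside the loop
  for (var i = 0; i < items.length; i++) {
    parts.push('<li>', items[i], '</li>');
  }
  return parts.join('');
}

var html = buildListItems(['a', 'b']);
// html is '<li>a</li><li>b</li>'
```

The benefit described above is specific to older versions of IE. Current engines optimise the plus operator heavily, so it is worth measuring before assuming join() is faster in a given browser.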

5. Become friends with the JavaScript lexicon

Become a friend of the ECMA Standard and it will make your code faster.

THE PROBLEM:

Due to its loosely-typed and free-for-all nature, JavaScript can be written using a very limited subset of lexical constructs with no discipline or controls applied to its use. Using simple function patterns repetitively often leads to poorly thought-out JavaScript that is inefficient in terms of performance and memory usage.

THE SOLUTION:

Learn when and how to apply the constructs of the ECMAScript language standard to maximise performance.

THE TECHNIQUES:

  1. Shorten the scope chain. In JavaScript, whenever a function is executed, a set of first order variables are instantiated as part of that function. These include the immediate scope of a function (the this variable) with its own scope chain, the arguments of the function and all locally-declared variables. If you try and access a globally-declared variable further up the scope chain, it will take extra effort to traverse up the chain, level by level, until the compiler can wire up the variable you are after. You can thus improve execution by only using the local scope (this), function arguments and locally declared variables inside each function. This article explains the matter further.
  2. Avoid creating functions unnecessarily. For the reasons outlined in the last point, every time a function is created a whole set of objects and variables need to be created and wired up to support function scope and execution. Thus, to improve performance only create functions where it makes sense to encapsulate code in this way.
  3. Make use of ‘this’, by passing correct scope using ‘call’ and ‘apply’. This is particularly useful for writing asynchronous code using callbacks, however it also improves performance because you are not relying on global or closure variables held further up the scope chain. You can get the most out of the scope variable (this) by rewiring it using the special call() and apply() methods that are built into each function. See the example below:
    // A constructor function, so that 'new Person(...)' works
    var Person = function(name) {
       this.name = name;
    };
    Person.prototype.do = function(callback) {
       // Run the callback with this Person instance as 'this'
       callback.apply(this);
    };
    var john = new Person('john');
    john.do(function() {
        alert(this.name); // 'john' gets alerted because we rewired 'this'.
    });
  4. Learn and use native functions and constructs. ECMAScript provides a whole host of native constructs that save you having to write your own algorithms or rely on host objects. Some examples include Math.floor(), Math.round(), (new Date()).getTime() for timestamps, String.prototype.match() and String.prototype.replace() for regexes, parseInt(n, radix) for changing numeral systems, === instead of == for faster type-based comparison, instanceof for checking type up the hierarchy, & and | for bitwise comparisons. And the list goes on and on. Make sure you use all these instead of trying to work out your own algorithms as you will not only be reinventing the wheel but affecting performance.
  5. Use ‘switch’ instead of lengthy ‘if-then-else’ statements. This is because ‘switch’ statements can be optimized more easily during compilation. There is an interesting article in O’Reilly about using this approach with JavaScript.
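
As a small illustration of technique 1 (shortening the scope chain), the sketch below copies the values a loop needs into local variables before the loop starts, so the loop body only touches locals. The settings object and the scaleAll function are hypothetical names invented for this example.

```javascript
// Hypothetical global configuration object.
var settings = { multiplier: 3 };

function scaleAll(values) {
  // Look up the global and the array length once, outside the loop.
  var multiplier = settings.multiplier;
  var length = values.length;
  var result = [];
  for (var i = 0; i < length; i++) {
    result.push(values[i] * multiplier); // only local variables in here
  }
  return result;
}

var scaled = scaleAll([1, 2, 3]);
// scaled is [3, 6, 9]
```

How much this saves depends on the engine. Modern JIT compilers often optimise repeated lookups on their own, so treat it as something to measure rather than a guaranteed win.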

 

WP Simple SpamCheck

This plugin allows WordPress to block over 95% of spam comments using a time-based hash.

This allows for a minimum sanity check and yet should remove almost all spam comments without the need to sign up to any third party APIs.

You are now welcome to install and use this WordPress Plugin I developed out of frustration of having to sign up and pay for a key to the Akismet API services, and yet knowing that a simple time-based input validation could help get rid of the majority of my spam comments.

This is what the plugin looks like once installed.

So far I have been using this plugin myself for the past 12 months and I am very happy with the results. I normally receive around 400-600 spam comments a week and this has cut that down to an average of 1-2 which is far more manageable.

The solution is pretty low-tech. It only took about 2 days to put it together using some time-validation techniques I’ve successfully used in the past for one of my other websites (www.valuetrader.info).

The plugin is pretty effective given the lack of sophistication employed by the majority of spam bots. However, it is not very advanced, and for that reason some spam comments may still make it through.

To install follow these instructions:

1. Download the file wp-simple-spamcheck.zip to your desktop.
2. Open the ‘Plugins’ section of your site
3. Click on ‘Add New’ and then ‘Upload’
4. Select the ‘wp-simple-spamcheck.zip’ file you just saved and press ‘Install Now’.
5. Click ‘Apply’ once the installation has completed.
6. Hopefully, the ‘(Spamcheck Enabled)’ message should appear when entering comments.

Please be aware that some templates may not be able to implement this spam check plugin. If the ‘(Spamcheck Enabled)’ message does not appear then just uninstall and search for a different plugin from the other available options.

If you have installed this plugin and you find it useful, please give it a rating on the WordPress plugin website so that other users can see it.

If you have any other problems just drop me a line here.

Object-Oriented JavaScript Inheritance

This is simple but crucial stuff in JavaScript. It’s easy to forget how to do object-oriented inheritance from scratch when you are dealing with several JS frameworks and each of them has pre-built methods that support this functionality in a slightly different way.

JavaScript is a functional language that uses prototypal inheritance. That means you can use classical inheritance such as that supported by Java and C#, however you need to be disciplined about the way you write code and need to do a couple of things every time you add a class to your type hierarchy.

When creating a new class, we are essentially creating a function and assigning a new instance of another function as the prototype. This is the basis of ‘prototypal’ inheritance, ie. if you are dealing with a ‘Dog’ type that inherits from an ‘Animal’ class, the template for your dog is a newly created instance of animal.

For Example:

// Create a new class called 'Animal'
var Animal = function() {};
 
// Create an instance of the 'Animal'
var anAnimal = new Animal();
 
// Create a class called 'Dog'
var Dog = function() {};
 
// Assign the template
Dog.prototype = anAnimal;
 
// Instantiate the dog
var lassie = new Dog();

Abstracting inheritance

I’ve created a simple static method called ‘extend’ on the Object class. Most frameworks do this in one shape or another (ie ExtJS uses Ext.extend(), John Resig likes to use Class.extend() method, Mootools uses new Class(properties) approach etc).

A ‘static’ method means that only the ‘Object’ class template will have it, so we do not pass it over to our children through inheritance. See the following code:

// Create a static 'extend' method on the Object class
// This allows us to extend existing classes
// for classical object-oriented inheritance
Object.extend = function(superClass, definition) {
    var subClass = function() {};
    subClass.prototype = new superClass();
    for (var prop in definition) {
        subClass.prototype[prop] = definition[prop];
    }
    return subClass;
};

This allows us to simplify the process of using object-oriented inheritance by abstracting the prototype assignment into this separate function call.

// Create an 'Animal' class by extending
// the 'Object' class with our magic method
var Animal = Object.extend(Object, {
    move : function() {alert('moving...');}
});
 
// Create a 'Dog' class that extends 'Animal'
var Dog = Object.extend(Animal, {
    bark : function() {alert('woof');}
});
 
// Instantiate Lassie
var lassie = new Dog();
 
// She can move AND bark!
lassie.move();
lassie.bark();

Adding Constructors

But, we seem to have forgotten one thing. What about constructors?

JavaScript uses the function itself as the constructor for a new object. Thus, when we created the Animal object above we could have simply added some sample code to its constructor as follows:

// Assign the name of the animal when it gets instantiated
var Animal = function(name) {
  this.name = name;
};
 
var lassie = new Animal('Lassie');
alert('My pets name is: ' + lassie.name);

Thus we can augment our class creation methodology using the same idea:

// Create a 'Subclass' with a constructor
var SubClass = Object.extend(SuperClass, {
    constructor: function(parameter) {
        this.x = parameter;
    },
    method1: function() {
        // ...
    }
});
 
// Instantiate it
var subClass = new SubClass('value');
alert(subClass.x);

In order to allow this kind of notation, we can modify our ‘extend’ method to take care of this special ‘constructor’ function. This is done as follows:

// Create a static 'extend' method on the Object class
// This allows us to extend existing classes
// for classical object-oriented inheritance
Object.extend = function(superClass, definition) {
    var subClass = function() {};
    // Our constructor becomes the 'subclass'
    if (definition.constructor !== Object)
        subClass = definition.constructor;
    subClass.prototype = new superClass();
    for (var prop in definition) {
    	if (prop != 'constructor')
            subClass.prototype[prop] = definition[prop];
    }
    return subClass;
};

And put it to use by writing minimal code that is meaningful to read and leaves the object-creation abstraction sitting behind the scenes:

// Create the 'Animal' class by extending
// the 'Object' class with our magic method
// this time using a constructor
var Animal = Object.extend(Object, {
    constructor: function(name) {
        this.name = name;
    },
    move: function() {
        alert('moving...');
    }
});
 
// Instantiate Lassie (as an animal)
var lassie = new Animal('Lassie');
 
// Now lassie has a name which is
// defined inside the object constructor
alert('My pets name is: ' + lassie.name);

Calling constructors through the inheritance chain.

Now we have come up against a small problem, our ‘Animal’ class can have a constructor, but its child class ‘Dog’ has no way to call this constructor so it can take advantage of the logic for instantiating all animals.

In order to do this we are going to add a special ‘superClass’ property to all of our classes automatically, then use the magic ‘call’ function to call this using the context of the dog instance (to put it in plain English, this will make each ‘Dog’ run the logic for an ‘Animal’).

Let’s see, all together now:

// Create a static 'extend' method on the Object class
// This allows us to extend existing classes
// for classical object-oriented inheritance
Object.extend = function(superClass, definition) {
    var subClass = function() {};
    // Our constructor becomes the 'subclass'
    if (definition.constructor !== Object)
        subClass = definition.constructor;
    subClass.prototype = new superClass();
    for (var prop in definition) {
    	if (prop != 'constructor')
            subClass.prototype[prop] = definition[prop];
    }
    // Keep track of the parent class
    // so we can call its constructor too
    subClass.superClass = superClass;
    return subClass;
};
 
// Create the 'Animal' class by extending
// the 'Object' class with our magic method
// this time using a constructor
var Animal = Object.extend(Object, {
    constructor: function(name) {
        this.name = name;
    },
    move: function() {
        alert('moving...');
    }
});
 
// Create a 'Dog' class that inherits from it
var Dog = Object.extend(Animal, {
    constructor: function(name) {
        // Remember to call the super class constructor
        Dog.superClass.call(this, name);
    },
    bark: function() {
        alert('woof');
    }
});
 
// Instantiate Lassie
var lassie = new Dog('Lassie');
 
// She can move AND bark AND has a name!
lassie.move();
lassie.bark();
alert('My pets name is: ' + lassie.name);
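
One side effect of the line subClass.prototype = new superClass() is that the parent constructor runs once at class-definition time, with no arguments, just to produce the prototype. The sketch below is a variation on the code above, not the article's method: it assumes an ES5 environment where Object.create is available, and it is written as a standalone function with the hypothetical name extendClass so that it does not modify Object. The methods return strings instead of calling alert() so the example can run outside a browser.

```javascript
// Variation on extend(): link prototypes with Object.create so the
// parent constructor is NOT invoked while the class is being defined.
function extendClass(superClass, definition) {
  var hasOwn = Object.prototype.hasOwnProperty;
  var subClass = hasOwn.call(definition, 'constructor')
    ? definition.constructor
    : function () {};
  subClass.prototype = Object.create(superClass.prototype, {
    // Point 'constructor' back at the subclass (non-enumerable).
    constructor: { value: subClass, writable: true, configurable: true }
  });
  for (var prop in definition) {
    if (hasOwn.call(definition, prop) && prop !== 'constructor') {
      subClass.prototype[prop] = definition[prop];
    }
  }
  subClass.superClass = superClass;
  return subClass;
}

var constructed = 0; // counts how often the Animal constructor runs
var Animal = extendClass(Object, {
  constructor: function (name) { constructed++; this.name = name; },
  move: function () { return this.name + ' is moving'; }
});
var Dog = extendClass(Animal, {
  constructor: function (name) { Dog.superClass.call(this, name); },
  bark: function () { return 'woof'; }
});
var lassie = new Dog('Lassie');
```

With this version the Animal constructor runs only when an instance is actually created, so after the code above `constructed` is 1 rather than 2.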

Multiple inheritance using Interfaces

Now it’s possible to add support for multiple inheritance to our object creation syntax in JavaScript. In single-inheritance object-oriented languages such as Java and C#, multiple inheritance (inheriting properties and methods from more than one class chain) is done using Interfaces.

The implementation is a bit more crude than in Java or C#. The reason is that JavaScript, unlike these languages, is not compiled, so we can’t throw compilation-time errors. This means that when we implement an interface we are limited to checking that the appropriate members are there at runtime, and if not we throw an error.

So, let’s start first with our intended usage for Interfaces:

var SubClass = Object.extend(SuperClass, {
     method1: function() {
         alert('something');
     }
}).implement(Interface1, Interface2 ...);

Going by this approach, we need to create a method ‘implement’ for every class. Since in JavaScript our classes are instances of the ‘Function’ object, this means adding this method to Function.prototype.

The code does start to be a bit hairy from this point. It’s important to point out that the context of this function call (ie this) is the class that we are testing for implementation of a particular interface.

Function.prototype.implement = function() {
    // Loop through each interface passed in and then check
    // that its members are implemented in the context object (this)
    for(var i = 0; i < arguments.length; i++) {
         var interf = arguments[i];
         // Is the interface a class type?
         if (interf.constructor === Object) {
             for (var prop in interf) {
                 // Check methods and fields vs context object (this)
                 if (interf[prop].constructor === Function) {
                     if (!this.prototype[prop] ||
                          this.prototype[prop].constructor !== Function) {
                          throw new Error('Method [' + prop
                               + '] missing from class definition.');
                     }
                 } else {
                     if (this.prototype[prop] === undefined) {
                          throw new Error('Field [' + prop
                               + '] missing from class definition.');
                     }
                 }
             }
         }
    }
    // Remember to return the class being tested
    return this;
}

And there you have it! It’s important to point out that this is one of many ways to abstract object-orientation in JavaScript. There are different approaches, and as a programmer it helps to pick one you are more comfortable with. Personally, I like this approach because of its simplicity, readability, and ease of use:

// Create a 'Mammal' interface
var Mammal = {
    nurse: function() {}
};
 
// Create the 'Pet' interface
var Pet = {
    do: function(trick) {}
};
 
// Create a 'Dog' class that inherits from 'Animal'
// and implements the 'Mammal' and 'Pet' interfaces
var Dog = Object.extend(Animal, {
     constructor: function(name) {
          Dog.superClass.call(this, name);
     },
     bark: function() {
          alert('woof');
     },
     nurse: function(baby) {
          baby.food = 100;
     },
     do: function(trick) {
          alert(trick + 'ing...');
     }
}).implement(Mammal, Pet);
 
// Instantiate it
var lassie = new Dog('Lassie');
lassie.move();
lassie.bark();
lassie.nurse(new Dog('Baby'));
lassie.do('fetch');
alert('My pets name is: ' + lassie.name);
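
To show what the runtime check does when a class does not satisfy an interface, here is a minimal standalone sketch. It mirrors the logic of implement() above, but as a plain function with the hypothetical name assertImplements so that it does not touch Function.prototype; the Rock class is likewise made up for the example.

```javascript
// Throw if 'cls' is missing any member declared on the 'interf' object.
function assertImplements(cls, interf) {
  for (var prop in interf) {
    var expected = interf[prop];
    var actual = cls.prototype[prop];
    if (typeof expected === 'function') {
      if (typeof actual !== 'function') {
        throw new Error('Method [' + prop + '] missing from class definition.');
      }
    } else if (actual === undefined) {
      throw new Error('Field [' + prop + '] missing from class definition.');
    }
  }
  return cls; // return the class being tested, as implement() does
}

var Pet = { do: function (trick) {} };
function Rock() {} // a class that cannot do tricks

var message = null;
try {
  assertImplements(Rock, Pet);
} catch (e) {
  message = e.message;
}
// message is 'Method [do] missing from class definition.'
```

Because the check only runs when the class is defined, a missing member surfaces as soon as the script loads rather than when the method is first called.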

How to obtain SOAP Request body in C# Web Services

Microsoft left something out when designing web services. Fortunately, there is a nifty way to obtain the original SOAP request within a C# web service.

I’ve written an article on this topic before. It’s possible to obtain the SOAP request body for logging purposes by using SoapExtensions. And that’s all well and good if you want to log the traffic between your SOAP web service and the outside world. But what if you want to change the behaviour of your web services based on the input that comes in?

Say, for example, you want to validate your SOAP request against an XML schema to enforce stricter validation than what comes out of the box with .NET by default.

The process is quite simple: you need to find the Request object and load the contents of a little-known property called ‘InputStream’. You can mine the contents of the SOAP request and load them into an XML document easily as follows:

Creating an ‘Echo’ Soap Request

Using this technique we can create a simple Web Service that performs a simple ‘Echo’ of whatever you send into it. See the following code:

using System;
using System.Collections.Generic;
using System.Web;
using System.Xml;
using System.IO;
using System.Text;
using System.Web.Services;
using System.Web.Services.Protocols;
 
namespace SoapRequestEcho
{
  [WebService(
  Namespace = "http://soap.request.echo.com/",
  Name = "SoapRequestEcho")]
  public class EchoWebService : WebService
  {
    [WebMethod(Description = "Echo Soap Request")]
    public XmlDocument EchoSoapRequest(int input)
    {
      // Initialize soap request XML
      XmlDocument xmlSoapRequest = new XmlDocument();
 
      // Get raw request body
      using (Stream receiveStream = HttpContext.Current.Request.InputStream)
      {
        // Move to beginning of input stream and read
        receiveStream.Position = 0;
        using (StreamReader readStream = 
                               new StreamReader(receiveStream, Encoding.UTF8))
        {
          // Load into XML document
          xmlSoapRequest.Load(readStream);
        }
      }
      // Return
      return xmlSoapRequest;
    }
  }
}

Testing our Soap Request

We can quickly test our SOAP request and check that whatever XML is sent in comes out the other side untouched.

Next Steps: Performing Schema Validation

Now you can do what you want with your xmlSoapRequest object. It’ll contain exactly the same request as was sent into SOAP in the first place.

If you are after schema validation, the next step is to populate the xmlSoapRequest.Schemas property and then call the xmlSoapRequest.Validate() method, passing it a ValidationEventHandler to receive any errors.

Piece of cake.

SQL XML Performance in High-Volume Databases

XML may be a drag, but you can use it within SQL to turn your database server into a high-performance love machine.

Now I know many of you will be wondering: XML, performance and high-volume in the same sentence? Surely you must have gone nuts!

I can promise you I haven’t gone nuts. While I agree that XML in the back-end is bulky, unruly, and more often a cause of performance degradation than the good news you desperately want to hear, there is at least one place where it can make a difference for the better.

Stored Procedures and their Limitations

You see, back when Stored Procedures for relational databases were first created, they quickly became the greatest thing since sliced bread (and boy, were they an improvement over writing SQL code directly into your application). However, there was one little problem with Stored Procedures that remained unsolved for a long time. SQL deals in RecordSets (i.e. tables); that is the essence of the language. Yet the input possibilities for Stored Procedures were always pretty limited: simple data types such as strings, numbers, and booleans. Until recently, there was no parameter of data type RecordSet, so you couldn’t easily pass a list of things as input into a Stored Procedure.

You see, most applications deal with many CRUD (Create, Read, Update, Delete) operations, and of those, Stored Procedures can only handle the Read part (output) many records at a time. The CUD part (Create, Update and Delete) had to be done one record at a time when using simple data inputs. For most applications it remains this way even today.

Sometimes developers come up with a workaround to enter a list of parameters

This has long been a bit of a problem, and over the years many developers have tried to come up with workarounds (like using a long list of pipe-separated values), but the solutions have ranged from the not-so-great to the let’s-hold-our-breath-and-hope-it-doesn’t-fail-spectacularly.

High Volume Inserts and Updates

Entering records one at a time is fine and dandy for most applications, but those requiring high-volume inserts and updates are severely constrained by it. Why, you say? Well, imagine you have an input data feed that needs to insert 10,000 records into a table, then return a message to say how things went. There are 2 ways to do this:

a) You split the records and perform 10,000 separate INSERT operations, or

b) You keep the records together and perform a single INSERT operation with 10,000 records.

Which one do you think will perform faster?

It’s a no-brainer really: calling a stored procedure once and performing a single INSERT operation will perform significantly faster (usually over 1000 times faster) than doing all the individual inserts one at a time, especially when you factor in the network latency between your application server and database server for each repeated procedure call.

Here I made a pretty picture so you get the idea:

I hope you made some coffee, this is going to take a while.

So if you plan to insert one record at a time, the other side will probably have to wait a few minutes or hours to get a response back from you. However if you perform the load as a single INSERT, you can probably get a message back to them within a few seconds.

Now your standard run-of-the-mill developer will say: “Hey, we can thread this out into 100 different concurrent calls to the database!” But the thing here is that the database server can only handle so many concurrent INSERT operations at a given time, not to mention that it might become unresponsive under the sudden overload, and that you are using up a lot of unnecessary bandwidth in the form of additional calls going both ways over the network. Ultimately there is a better solution than the hammer-it-harder approach.

XML Saves the Day

So how does XML feature into this discussion?

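Well, you see, SQL Server (and Oracle) have a handy XML data type that can ALSO BE USED AS INPUT into a Stored Procedure. The technique has been available as far back as SQL Server 2000 (where the XML is passed in as text and shredded with OPENXML; the native xml type arrived with SQL Server 2005), but many developers are not aware of it.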

This way you can get a response back in a few seconds.

It’s quite easy to strip out records from XML input. You can even perform XML Schema validation inside SQL Server, but I’m not going to get into that today.
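On the application side, the only extra work is packing the records into one XML string before calling the procedure. Here is a minimal JavaScript sketch of that step. The function names, the `products`/`product` element names and the attribute-per-field layout are all assumptions made for illustration; the stored procedure on the other end would need to expect the same shape.

```javascript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Serialise an array of flat objects into one XML string:
// one <rowName> element per record, one attribute per field.
// (Hypothetical layout, chosen only for this example.)
function recordsToXml(records, rootName, rowName) {
  var rows = [];
  for (var i = 0; i < records.length; i++) {
    var attrs = [];
    for (var key in records[i]) {
      if (Object.prototype.hasOwnProperty.call(records[i], key)) {
        attrs.push(key + '="' + escapeXml(records[i][key]) + '"');
      }
    }
    rows.push('<' + rowName + ' ' + attrs.join(' ') + '/>');
  }
  return '<' + rootName + '>' + rows.join('') + '</' + rootName + '>';
}

// A made-up feed: 10,000 records become ONE parameter value
// instead of 10,000 separate procedure calls.
var feed = [];
for (var n = 1; n <= 10000; n++) {
  feed.push({ id: n, name: 'Item ' + n });
}
var payload = recordsToXml(feed, 'products', 'product');
```

The resulting `payload` string is what you would hand to the stored procedure as its single XML parameter.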

(I’ll follow up on this a bit later. Just gotta get some stuff done first)

Faster Web Applications with Indexed Views

A short introduction to ‘Indexed Views’, a really handy performance-improvement tool available in SQL Server.

I’ve generally tried to steer clear of traditional (non-indexed) SQL Views, as they severely hinder performance when building applications that query a large set of data.

Traditional SQL Views and the Problems they Cause

Here is what happens when you create a View on a large database. Typically you’ll want to see data from several tables aggregated into just the results you are looking for, and that is indeed what you get. However, the view is a virtual query that takes up no space, so every query you make to the View is passed on to the underlying tables. Worst of all, if you use a View in one of your stored procedures, the view needs to be fully resolved against all underlying records even if you use a WHERE clause outside it to limit the result to a subset of data. The same does not happen if you get rid of the View and use the same SELECT query with a WHERE clause!

SQL Views are slow because a query affects every underlying table

You can imagine that if you are trying to build a ‘dashboard’ on a web application that gives you some totals and gets hit every 2-3 seconds, millions of rows will be traversed over and over again. This can be somewhat mitigated with cached output on stored procedures, but it’s still murder on the database.

Improving Performance with ‘Indexed Views’

Now here comes the exciting bit:

  • What if you could automatically store just the records you need to create your dashboard?

That is exactly what happens when you create an index on one of your views. The data becomes materialized to disk, and the results you are after are available (i.e. ‘cached’) without having to query the underlying tables every time you need some data.

Indexed Views are faster because only the view itself gets queried.

The Downside of Using Indexed Views

Be aware that there are a couple of drawbacks to using this type of construct.

  1. First, the view has to be ‘schema-bound’ to its underlying tables. This means that you can no longer get rid of those tables or change the parts of their structure that the view depends on (altering a column the view uses, for example) without dropping the view first.
  2. Second, any insert or update into the underlying tables will be slowed down because it causes a refresh of the indexed view. This means transactions involving INSERT, DELETE or UPDATE on these tables should ideally be batched (i.e. avoid inserting/updating one row at a time; insert/update many rows at a time instead).

However, in my opinion, the drawbacks may be well worth it, as most applications involve many database reads and few database writes.

More about Indexed Views

Support for Indexed Views in other database systems.

Oracle 8i and upwards have Materialized Views, which are a very similar feature. MySQL, however, is one of those database systems that do not support Materialized (or Indexed) Views.

If you want to have similar functionality in MySQL and you use Stored Procedures for inputting data into your database, you can enhance the Stored Procedures that update/insert data by running an extra calculation at the end of the procedure that updates a summary table which acts as your view. This is essentially doing the same thing as an Indexed View but keeping it updated manually.
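To make the idea concrete, here is a small in-memory analogy written in JavaScript rather than SQL. It is only a sketch of the pattern: `SalesStore`, its fields and the sample products are invented for the example, with an array standing in for the underlying table and a plain object standing in for the summary table.

```javascript
// In-memory stand-ins for a detail table and the summary table
// that plays the role of the view. All names are hypothetical.
function SalesStore() {
  this.rows = [];            // the "underlying table"
  this.totalsByProduct = {}; // the "summary table"
}

// Equivalent of the insert procedure: write the row, then run the
// extra calculation that keeps the summary up to date.
SalesStore.prototype.insert = function (product, amount) {
  this.rows.push({ product: product, amount: amount });
  var current = this.totalsByProduct[product] || { count: 0, total: 0 };
  current.count += 1;
  current.total += amount;
  this.totalsByProduct[product] = current;
};

// Dashboard read: answered from the summary, no scan of this.rows.
SalesStore.prototype.summary = function (product) {
  return this.totalsByProduct[product] || { count: 0, total: 0 };
};

var store = new SalesStore();
store.insert('Chai', 18);
store.insert('Chai', 18);
store.insert('Tofu', 23.25);
```

The trade-off is the same one described above for Indexed Views: every write does a little more work so that every read can do a lot less.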

Hope the explanation was useful.

HTML5 Databases on iPhone

A tiny JavaScript SQLite Client using the latest W3C standards for offline storage.

I’ve just written this neat SQL Client using JavaScript and SQLite.

HTML5 SQL Client running on iPhone

Since the iPhone browser is based on WebKit, which supports the latest W3C standards, it’s possible to create a SQL Client simply using JavaScript.

The SQLite database is hosted by the browser process, which means that it runs even when disconnected from the internet.
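For a flavour of what such a client does under the hood, here is a minimal sketch using the Web SQL Database calls (`openDatabase`, `transaction`, `executeSql`). It is not the code behind the client above: the `runQuery` helper, the `notes` database and its table are made up for the example. Note also that Web SQL was later dropped from the W3C standards track, so it may be missing in newer browsers.

```javascript
// Run one SQL statement and hand the rows back as a plain array.
// `db` is whatever openDatabase() returned. runQuery is a
// hypothetical helper, not part of the Web SQL API itself.
function runQuery(db, sql, params, onRows, onError) {
  db.transaction(function (tx) {
    tx.executeSql(sql, params, function (tx, result) {
      var rows = [];
      for (var i = 0; i < result.rows.length; i++) {
        rows.push(result.rows.item(i));
      }
      onRows(rows);
    }, function (tx, error) {
      if (onError) { onError(error); }
      return true; // returning true rolls the transaction back
    });
  });
}

// Browser-only usage, guarded so nothing runs where Web SQL is missing.
// The database name, version and size below are example values.
if (typeof openDatabase === 'function') {
  var db = openDatabase('notes', '1.0', 'Demo database', 1024 * 1024);
  runQuery(db, 'CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)', [], function () {
    runQuery(db, 'INSERT INTO notes (body) VALUES (?)', ['hello'], function () {
      runQuery(db, 'SELECT * FROM notes', [], function (rows) {
        console.log(rows);
      });
    });
  });
}
```

Everything is asynchronous, which is why each statement is chained from the callback of the previous one.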

The following browsers currently support the HTML5 standards for database storage:

  • Google Chrome
  • Safari
  • Opera

To access this SQL Client on your own iPhone:

  1. Open up Safari
  2. Type html5db.desalasworks.com into the location bar
  3. Press “Go”. That’s it!

Otherwise you can also open it in your desktop browser by clicking the link above.

AJAX database access in C# – The simple way

Today, I’m going to throw the Microsoft textbook out the window and show you a really easy way to get your database records into a JavaScript application. Minimal hassle – maximum bang for your buck.

First, I’m assuming you chose a JavaScript framework such as Ext JS, Dojo, Yahoo UI, jQuery or any other fine library for your front-end widgets. If so, then congratulations, this article is just for you.

The trick is to leave the middle-tier C# layer as thin as possible, implementing only the things your client layer can’t do reliably: Security and Data Access.

In this example I am only showing how to write minimal code for Data Access; Security is too long a topic for a single article.

READING FROM THE DATABASE

Say you have a database with products in it. For this example I am using the Northwind database:

 

Here is some C# code I wrote earlier to open up the database and read the first record. If you want some examples of valid connection strings you can look here and here.

Here is the output when you run this code in your browser:

The magic here happens in lines 32 and 38.

Line 32 uses DataTable.Load() to get the database contents into a .NET data table as follows:

32     table.Load(reader, LoadOption.Upsert);

Line 38 uses the DataTable.WriteXml() method to write the contents of the table in XML format as a HTTP response.

38     table.WriteXml(writer);

Now in order to go one step further, your AJAX application needs to read INDIVIDUAL records, that means one at a time, and show them to the user.

Here are some modifications I made to the earlier code for this purpose:

And if you run this code and insert a “?ID=8” at the end of your request (which you can easily append within JavaScript) you get the following result:

And that’s it.
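Appending the ID on the client side might look like the sketch below. `productUrl` and `loadProduct` are hypothetical helper names; `Product.aspx` and the `ID` parameter are the ones used in this article. It uses plain XMLHttpRequest, though any of the libraries mentioned earlier has its own wrapper for this.

```javascript
// Build the request URL for a single record, e.g. "Product.aspx?ID=8".
function productUrl(baseUrl, id) {
  var separator = baseUrl.indexOf('?') === -1 ? '?' : '&';
  return baseUrl + separator + 'ID=' + encodeURIComponent(id);
}

// Fetch one record and pass the parsed XML document to the callback.
// Browser-only: relies on XMLHttpRequest and its responseXML property.
function loadProduct(id, onXml) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', productUrl('Product.aspx', id), true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      onXml(xhr.responseXML);
    }
  };
  xhr.send(null);
}
```

Calling `loadProduct(8, callback)` requests the same URL as typing “?ID=8” by hand.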

So where is the trick? Is that everything?

Ahh.. For those accustomed to programming ASP.NET I guess it comes as a bit of a surprise that it would be so easy to get XML formatted records out to the client layer.

Surely there has to be a catch somewhere? … A WCF service with implemented data contracts? An Object-Relational Mapping framework operating behind the scenes? Or at least a strongly typed Collection using Generics?

Nope, that’s it. You can do this the hard way, but that’s not why you are reading this article. So now you can take off your C# hat and put on the JavaScript one, because the rest of the logic goes in the client layer so that users can get the most out of their UI experience.

SOURCE CODE

Here is the source code I used in this example. There are 2 files, Product.aspx and Product.aspx.cs (code-behind), so I’ve zipped them up into this archive:

Product.aspx.zip

You are free to copy the code here, but since you are not paying me for it I accept no liability if your site goes topsy-turvy. One more free tip: if you are going to put this into production you may want to use a Generic Handler (.ashx file) instead. It’s more lightweight and you don’t need all the functionality in the Page class.

Please note: I’ve left you the section “WRITING TO THE DATABASE” for a separate article.

Secure HTTP in IIS with SelfSSL

It’s fairly easy to set up HTTPS functionality when you’re running an Apache web server. However, when trying to do the same thing in IIS, you will encounter the problem of generating a certificate and having to rely on a Certification Authority such as Verisign, or your own certification server, to have the certificate authorised.

This problem is easily solved with Microsoft’s SelfSSL utility, which can be downloaded here. It will only install under XP or Windows 2003; however, you can copy the SelfSSL.exe file directly onto a Windows 2000 computer and run it without a problem.

A typical command option you can run (this will give you a certificate valid for 120 days) is:

selfssl.exe /N:CN=(computername) /K:1024 /V:120 /S:1

And that’s it! This will take care of the certificate generation and installation. Watch out if you are running multiple Web Sites under the one server, as you will have to fiddle with the /S option (the site ID) to point the SSL certificate in the right direction.

Logging SOAP Messages In .NET

C# Web Services provide an easy interface to incoming SOAP data because the SOAP message has already been deserialised at the entry point of a WebMethod.

The downside is that sometimes you’ll need access to the full SOAP request (body, headers and everything) for tasks such as diagnosing errors (which message broke the web service) or providing graceful handling of third party web services.

At first you’d think that the Request object can provide this information with methods such as Request.BinaryRead(), but unfortunately when you’re using SOAP all you get there is a querystring.

The solution here is to use the SOAP Extensions. Here is some sample code:

public class SoapMessageLogger : SoapExtension
{
    //…
    public override void ProcessMessage(SoapMessage message)
    {
        switch (message.Stage)
        {
            // On the server: raw incoming request, not yet deserialised
            case SoapMessageStage.BeforeDeserialize:
                LogResponse(message);
                break;

            // On the server: outgoing response, already serialised to XML
            case SoapMessageStage.AfterSerialize:
                LogResponse(message);
                break;

            // Do nothing on other stages
            case SoapMessageStage.AfterDeserialize:
            case SoapMessageStage.BeforeSerialize:
            default:
                break;
        }
    }
    //…
}

More info on this can be found here and here.