=

desalasworks presents:

a selection of works by steven de salas

AutoTrader UK

This was a role working on Interface Development for Trader Media’s online marketplace with over 1 billion page impressions per month.

autotrader_main

I mainly focused on creating rich interfaces and user journeys with Event-driven and Object-Oriented JavaScript, DOM, HTML5, jQuery, AJAX, JSON and Backbone.js MVC framework continuously tested with Jasmine/Rhino.

autotrader_gallery

Our team used Agile development approach with Continuous Integration (CI), pair-programming and short development cycles based on Thoughtworks Extreme Programming (XP) model for Java/JUnit TDD projects.

autotrader_agile_dave

25 Techniques for Javascript Performance Optimization

These are some the techniques I use for enhancing the performance of JavaScript, they have mostly been collected over various years of using the language to improve the interactivity of websites and web applications.

My thanks go out to Marco of zingzing.co.uk for reminding me of the importance of optimizing JavaScript, and for teaching me some of the techniques below.

Most of the techniques involve common sense once you have understood the underlying problem. I’ve categorised them into 5 broad categories, each with an underlying problem and solution as follows:

1. Avoid interaction with host objects

Watch out for these guys. Repeated interaction with them will kill your performance.

THE PROBLEM:

Native JavaScript is compiled into machine code by most scripting engines offering incredible performance boost, however interaction with host (browser) objects outside the javascript native environment raises unpredictability and considerable performance lag, particularly when dealing with screen-rendered DOM objects or objects which cause Disk I/O (such as WebSQL).

THE SOLUTION:

You can’t really get away from them, but keep your interaction with host objects to an absolute minimum.

THE TECHNIQUES:

  1. Use CSS classes instead of JavaScript for DOM animation.

    Its a good habit to try an implement any animation (or DOM interaction) with CSS if you can get away with it. CSS3 Transitions have been around for a while now, so there are few excuses not to use them. You can even use a polyfill if you are worried about older browsers. Think also of hover menus using the :hover pseudo-class, or styling and display of elements using @keyframes, :before and :after, this is because unlike JavaScript, CSS solutions are heavily optimized by the browser, often down to the level of using the GPU for extra processing power.

    I realize this might sound like irony (if you want to optimize JavaScript – avoid using it for animation), but the reality is that this technique is executed from within your JavaScript code, it just involves putting more effort on the CSS classes.

  2. Use fast DOM traversal with document.getElementById().

    Given the availability of jQuery, it is now easier than ever to produce highly specific selectors based on a combination of tag names, classes and CSS3. You need to be aware that this approach involves several iterations while jQuery loops thorough each subset of DOM elements and tries to find a match. You can improve DOM traversal speeds by picking nodes by ID.

    // jQuery will need to iterate many times until it finds the right element
    var button = jQuery('body div.dialog > div.close-button:nth-child(2)')[0];
    
    // A far more optimized way is to skip jQuery altogether.
    var button = document.getElementById('dialog-close-button');
    
    // But if you need to use jQuery you can do it this way.
    var button = jQuery('#dialog-close-button')[0];
    
  3. Store pointer references to in-browser objects.

    Use this technique to reduce DOM traversal trips by storing references to browser objects during instantiation for later usage. For example, if you are not expecting your DOM to change you should store a reference to DOM or jQuery objects you are going to use when your page is created; if you are building a DOM structure such as a dialog window, make sure you store a few handy reference to DOM objects inside it during instantiation, so you dont need to find the same DOM object over an over again when a user clicks on something or drags the dialog window.

    If you haven’t stored a reference to a DOM object, and you need to iterate inside a function, you can create a local variable containing a reference to that DOM object, this will considerably speed up the iteration as the local variable is stored in the most accessible part of the stack.

  4. Keep your HTML super-lean (get rid of all those useless DIV and SPAN tags)

    This is extremely important, the time needed to query and modify DOM is directly proportional the the amount and complexity of HTML that needs to be rendered.

    Using half the amount of HTML will roughly double the DOM speed, and since DOM creates the greatest performance drag on any complex JavaScript app, this can produce a considerable improvement. See ‘Reduce Number of DOM Elements’ guidance in Yahoo YSlow.

  5. Batch your DOM changes, especially when updating styles.

    When making calls to modify DOM make sure you batch them up so as to avoid repeated screen rendering, for example when applying styling changes. The ideal approach here is to make many styling changes in one go by adding or removing a class, rather than apply each individual style separately. This is because every DOM change prompts the browser to re-render the whole UI using the boxing model. If you need to move an item across the page using X+Y coordinates, make sure that these two are applied at the same time rather than separately. See these examples in jQuery:

    // This will incurr 5 screen refreshes
    jQuery('#dialog-window').width(600).height(400).css('position': 'absolute')
                           .css('top', '200px').css('left', '200px');
    // Let jQuery handle the batching
    jQuery('#dialog-window').css({
         width: '600px',
         height: '400px',
         position: 'absolute',
         top: '200px',
         left: '200px'
    );
    // Or even better use a CSS class.
    jQuery('#dialog-window').addClass('mask-aligned-window');
  6. Build DOM separately before adding it to the page.

    As per the last item, every DOM update requires the whole screen to be refreshed, you can minimize the impact here by building DOM for your widget ‘off-line’ and then appending your DOM structure in one go.

  7. Use buffered DOM inside scrollable DIVs.

    This is an extension of the fourth point above (Keep HTML super-lean), you can use this technique to remove items from DOM that are not being visually rendered on screen, such as the area outside the viewport of a scrollable DIV, and append the nodes again when they are needed. This will reduce memory usage and DOM traversal speeds. Using this technique the guys at ExtJS have managed to produce an infinitely scrollable grid that doesn’t grind the browser down to a halt.

2. Manage and Actively reduce your Dependencies

Poorly managed JavaScript dependencies degrade user experience.

THE PROBLEM:

On-screen visual rendering and user experience is usually delayed while waiting for script dependencies load onto the browser. This is particularly bad for mobile users who have limited bandwidth capacity.

THE SOLUTION:

Actively manage and reduce dependency payload in your code.

THE TECHNIQUES: 

  1. Write code that reduces library dependencies to an absolute minimum.

    Use this approach to reduce the number of libraries your code requires to a minimum, ideally to none, thus creating an incredible boost to the loading times required for your page.

    You can reduce dependency on external libraries by making use of as much in-browser technology as you can, for example you can use document.getElementById('nodeId') instead of jQuery('#nodeId'), or document.getElementsByTagName('INPUT') instead of jQuery('INPUT') which will allow you to get rid of jQuery library dependency.

    If you need complex CSS selectors use Sizzle.js instead of jQuery, which is far more lightweight (4kb instead of 80kb+).

    Also, before adding any new library to the codebase, evaluate whether or you really need it. Perhaps you are just after 1 single feature in the whole library? If that’s the case then take the code apart and add the feature separately (but don’t forget to check the license and acknowledge author if necessary).

  2. Minimize and combine your code into modules.

    You can bundle distinct components of your application into combined *.js files and pass them through a javascript minimizer tool such as Google Closures or JsMin that gets rid of comments and whitespacing.

    The logic here is that a single minimized request for a 10Kb .js file completes faster than 10 requests for files that are 1-2kb each due to lower bandwidth usage and network latency.

  3. Use a post-load dependency manager for your libraries and modules.

    Much of your functionality will not need to be implemented until after the page loads. By using a dependency manager (such as RequireJS or Webpack) to load your scripts after the page has completed rendering you are giving the user a few extra seconds to familiarise themselves with the layout and options before them.

    Make sure that your dependency manager can ‘remember’ which dependencies have been loaded so you dont end up loading the same libraries twice for each module. See guidance for Pre-Loading and Post-loading in Yahoo YSLow, and be mindful about loading only what is necessary at each stage of the user journey.

  4. Maximise use of caching (eTags, .js files, etc).

    Cache is your best friend when it comes to loading pages faster. Try to maximise the use of cache by applying ETags liberally and putting all your javascript into files ending in *.js found in static URI locations (avoid dynamic Java/C# bundle generations ending with *.jsp and *.ashx) . This will tell the browser to use the locally cached copy of your scripts for any pages loaded after the initial one.

  5. Move scripts to the end of the page (not recommended).

    This is the lazy way of handling post-load dependencies, ideally you should implement a post-load dependency manager, but if you only have one or two scripts to load into the page you can add them at the very end of the HTML document where the browser will start loading them after the page is rendered, giving the user a few extra seconds of interaction.

3. Be disciplined with event binding

Be a ninja when using event handling.

THE PROBLEM:

Browser and custom event handlers are an incredible tool for improving user experience and reducing the depth of the call stack (so you avoid having a function calling a function which calls another function etc), but since they are hard to track due to their ‘hidden’ execution they can fire many times repeatedly and quickly get out of hand, causing performance degradation.

THE SOLUTION:

Be mindful and disciplined when creating event handlers. Get to know your weapons too, if you are using a framework then find out what’s going on underneath the hood.

THE TECHNIQUES:

  1. Use event binding but do it carefully.

    Event binding is great for creating responsive applications. However, it is important that you walk through the execution and various user journeys to make sure they are not firing multiple times or using up unnecessary resources behind the scenes. Comment your code well so they next guy (which may be you a few months down the line) can follow what’s going on and avoid this issue as well.

    If you are using AngularJS make sure you are getting rid of unnecessary ‘watchers’. These are background events that involve heavy processing and will slow down your app, particularly on mobile devices.

  2. Pay special attention event handlers that fire in quick succession (ie, ‘mousemove’).

    Browser events such as ‘mousemove’ and ‘resize’ are executed in quick succession up to several hundred times each second, this means that you need to ensure that an event handler bound to either of these events is coded optimally and can complete in less than 2-3 milliseconds.

    Any overhead greater than that will create a patchy user experience, specially in browsers such as IE that have poor rendering capabilities.

  3. Remember to unbind events when they are not needed.

    Unbinding events is almost as important as binding them. When you add a new event handler to your code make sure that you provide for it to stop firing when it is no longer needed, ideally using once-off execution constructs like jQuery.one() or coding in the unbind behaviour at that point. This will avoid you having the same handler bound multiple times degrading performance.

    If you are using jQuery to bind and unbind events, make sure your selector points to a unique node, as a loose selector can create or remove more handlers than you intend to.

  4. Learn about event bubbling.

    If you are going to use event handlers, it is important that you understand how event bubbling propagates an event up the DOM tree to every ancestor node. You can use this knowledge to limit your dependency on event bubbling with approaches such as jQuery.live() and jQuery.delegate() that require full DOM traversal upon handling each event, or to  stop event bubbling for improved performance. See this great post on the subject.

  5. Use ‘mouseup’ instead of ‘click’.

    Remember that user interaction via the mouse or keyboard fires several events in a specific order. It is useful to remember the order in which these events fire so you can squeeze in your functionality before anything else gets handled, including native browser event handlers.

    A good example of this is to bind your functionality to the ‘mouseup’ event which fires before the ‘click’ event, this can produce a surprising performance boost in older browsers such as IE, making the difference between handling every interaction or missing some of the action if the user triggers clicks many times in succession.

4. Maximise the efficiency of your iterations

Performance becomes critical during long iterations.

THE PROBLEM:

Due to the processing time used, iterations are usually the first places where you can address performance flaws in an application.

THE SOLUTION:

Get rid of unnecessary loops and calls made inside loops.

THE TECHNIQUES:

  1. Harness the indexing power of JavaScript objects.

    Native JavaScript objects {} can be used as powerful Hashtable data structures with quick-lookup indexes to store references to other objects, acting similarly to the way database indexes work for speeding up search operations by preventing needless looping. 

    So why bother iterating to find something? You can simply use a plain object as an index (think of a phone-book) to get to your desired item quickly and efficiently. Here is an example as follows:

    var data = {
      index: {
                "joeb": {name: "joe", surname: "bloggs", age: 29 },
                "marys": {name: "mary", surname: "smith", age: 25 }
                // another 1000 records
             },
      get: function(username) {
                return this.index[username];
             }
    }
  2. Harness the power of array structures with push() and pop() and shift().

    Array push() pop() and shift() instructions have minimal processing overhead (20x that of object manipulation) due to being language constructs closely related to their low-level assembly language counterparts. In addition, using queue and stack data structures can help simplify your code logic and get rid of unnecessarily loops. See more on the topic in this article.

  3. Take advantage of reference types.

    JavaScript, much like other C-based languages, has both primitive and reference value types. Primitive types such as strings, booleans and integers are copied whenever they are passed into a new function, however reference types such as arrays, objects and dates are passed only as a light-weight reference.You can use this to get the most performance out of recursive functions, such as by passing a DOM node reference recursively to minimise DOM traversal, or by passing a reference parameter into a function that executes within an iteration. Also, remember that comparing object references is far more efficient than comparing strings.

  4. Use Array.prototype.join() for string concatenation.

    PLEASE NOTE: This article was first written in 2012 when string concatenation was a hazard to be aware of. However, these days most JavaScript engines have compilation tricks that have made this issue obsolete. The wording below is only really relevant for historical purposes.

    Joining strings using the plus sign (ie var ab = 'a' + 'b';) creates performance issues in IE when used within an iteration. This is because, like Java and C#, JavaScript uses unmutable strings.

    Basically, when you concatenate two strings, a third string is constructed for gathering the results with its own object instantiation logic and memory allocation. While other browsers have various compilation tricks around this, IE is particularly bad at it.A far better approach is to use an array for carrying out the donkey work, creating an array outside the loop, using push() to add items into to the array and then a join() to output the results. See this link for a more in-depth article on the subject.

5. Become friends with the JavaScript lexicon

Become a friend of the ECMA Standard and it make your code faster.

THE PROBLEM:

Due to its loosely-typed and free-for-all nature, JavaScript can be written using a very limited subset of lexical constructs with no discipline or controls applied to its use. Using simple function patterns repetitively often leads to poorly thought-out ‘spaghetti’ code that is inefficient in terms of resource use.

THE SOLUTION:

Learn when and how to apply the constructs of the ECMAScript language standard to maximise performance.

THE TECHNIQUES:

  1. Shorten the scope chain

    In JavaScript, whenever a function is executed, a set of first order variables are instantiated as part of that function. These include the immediate scope of a function (the this variable) with its own scope chain, the arguments of the function and all locally-declared variables.

    If you try and access a globally-declared variable or a closure further up the scope chain, it will take extra effort to traverse up the chain every level util the compiler can wire up the variable you are after. You can thus improve execution by reducing the depth of the call stack, and by only using the local scope (this), the arguments of the function, as well as locally declared variables. This article explains the matter further.

  2. Make use of ‘this’, by passing correct scope using ‘call’ and ‘apply’.

    This is particularly useful for writing asynchronous code using callbacks, however it also improves performance because you are not relying on global or closure variables held further up the scope chain. You can get the most out of the scope variable (this) by rewiring it using the special call() and apply() methods that are built into each function. See the example below:

    var Person = Object.create({
      init: function(name) {
         this.name = name;
      },
      do: function(callback) {
         callback.apply(this);
      }
    });
    var john = new Person('john');
    john.do(function() {
        alert(this.name); // 'john' gets alerted because we rewired 'this'.
    });
  3. Learn and use native functions and constructs.

    ECMAScript provides a whole host of native constructs that save you having to write your own algorithms or rely on host objects. Some examples include Math.floor(), Math.round(), (new Date()).getTime() for timestamps, String.prototype.match() and String.prototype.replace() for regexes, parseInt(n, radix) for changing numeral systems, === instead of == for faster type-based comparsion, instanceof for checking type up the hierarchy, & and | for bitwise comparisons. And the list goes on and on.

    Make sure you use all these instead of trying to work out your own algorithms as you will not only be reinventing the wheel but affecting performance.

  4. Use ‘switch’ instead of lengthy ‘if-then-else’ statements.

    This is because  ‘switch’ statements can be optimized more easily during compilation. There is an interesting article in O’Reily about using this approach with JavaScript.

 

SQL XML Performance in High-Volume Databases

XML may be a drag, but you can use it within SQL to turn your database server into a high-performance love machine.

Now I know many of you will be wondering: XML, performance and high-volume in the same sentence? Surely you must have gone nuts!

I can promise you I haven’t gone nuts. While I agree that XML in the back-end is bulky, unruly, and often a cause for performance-degradation instead of good news you desperately want to hear, there is at least one place where it can make a difference for the better.

Stored Procedures and their Limitations

You see back when Stored Procedures for relational databases were first created, they quickly became the greatest thing since sliced bread (and boy were they an improvement over writing SQL Code directly into your application), however there was one little problem with Stored Procedures that remained unsolved for a long time. That is, SQL deals in RecordSets (i.e. Tables), it is the essence of the language, however the input possibilities for Stored Procedures were always pretty limited, being simple data types such as strings, numbers, and booleans. Until recently, there was no parameter of data type RecordSet so you couldn’t easily enter a list of things as input into a Stored Procedure.

You see most applications deal with many CRUD (Create, Read, Update, Delete), and out of those Stored Procedures can only output (Read) many records at a time. However the CUD part of it (Create, Update and Delete) had to be done one record at a time when using simple data inputs. It is a fact that for most applications it remains this way even today.

Sometimes developers come up with a workaround to enter a list of parameters

This has long been a bit of a problem, and many developers over the years have tried to come up with workarounds to this problem (like using a long list of pipe-separated values), but the solutions have ranged from the not-so-great to the lets-hold-our-breath-and-hope-it-doesnt-fail-spectacularly.

High Volume Inserts and Updates

Entering records one at a time is fine and dandy for most applications, however those requiring high-volume inserts and updates are severely constrained by this fact. You say why? Well, imagine you have an input data feed that needs to insert 10,000 records to a table, then return a message to say how things went. There are 2 ways to do this:

a) You split the records and perform 10,000 separate INSERT operations, or

b) You keep the records together perform a single INSERT operation with 10,000 records.

Which one do you think will perform faster?

Its a no-brainer really, calling a stored procedure once and performing a single INSERT operation will perform significantly faster (over 1000 times faster usually) than doing all the individual inserts one at a time, specially when you factor in network latency speeds between you application server and database server if you are repeating multiple procedure calls in the database.

Here I made a pretty picture so you get the idea:

I hope you made some coffee, this is going to take a while.

So if you plan to insert one record at a time, the other side will probably have to wait a few minutes or hours to get a response back from you. However if you perform the load as a single INSERT, you can probably get a message back to them within a few seconds.

Now your standard run-off-the-mill developer will say: “Hey, we can thread this out into 100 different concurrent calls to the database!” But the thing here is that the database server can only handle so many concurrent INSERT operations at a given time, not to mention that it might become unresponsive under the sudden overload and that you are using up a lot of unnecessary bandwidth in the form of additional calls coming both ways over the network. Ultimately there is a better solution than the hammer-it-harder approach.

XML Saves the Day

So how does XML feature into this discussion?

Well, you see SQL Server (And Oracle), have a handy XML data type that can ALSO BE USED AS INPUT into a Stored Procedure. This technique has been available as far back as SQL 2000, but many developers are not aware of it.

This way you can get a response back in a few seconds.

Its quite easy to strip out records from XML input. You can even perform XML Schema validation inside SQL Server but I’m not going to get into that today.

(I’ll follow up on this a bit later. Just gotta get some stuff done first)

Faster Web Applications with Indexed Views

A short introduction into ‘Indexed Views’ a really handy performance-improvement tool available in SQL Server.

I’ve generally tried to stay clear of using traditional (non-indexed) SQL Views as they severely hinder performance when building applications that query a large set of data.

Traditional SQL Views and the Problems they Cause

Here is what happens when you create a View on a large database: Typically you’ll want to see data from several tables aggregated into just the results you are looking for, and while it is true that this happens, the view is a virtual query that takes up no space so every query you make to the View will be passed on to the underlying tables. Worst of all, if you try to use View in one of your stored procedures, the view needs to be fully resolved to all underlying records even if you use a WHERE clause outside it to limit a subset of data, however the same does not happen if you get rid of the View and use the same SELECT query with a WHERE clause!

SQL Views are slow because a query affects every underlying table

SQL Views are slow because a query affects every underlying table

You can imagine that if you are trying to build a ‘dashboard’ on a web application that gives you some totals and gets hit every 2-3 seconds, that means that millions of rows will be traversed over and over again. This can be somehow mitigated with cached output on stored procedures but its still murder on the database.

Improving Performance with ‘Indexed Views’

Now here comes the exciting bit:

  • What if you could automatically store just the records you need to create your dashboard?

That is exactly what happens when you create an index in one of your views. The data becomes materialized to disk and the results you are after are available (ie. ‘cached’) without having to query the underlying tables every time you are after some data.

Indexed Views are faster because only the view itself gets queried.

Indexed Views are faster because only the view itself gets queried.

The Downside of Using Indexed Views

Be aware that here are a couple of drawbacks in using this type of construct.

  1. First, your underlying tables become ‘schema-bound’, this means that you can no longer get rid of them or change their structure (add an extra column for example) without dropping the view first.
  2. Second, any insert or update into the underlying tables will be slowed down because they cause a refresh of the indexed view. This means transactions involving INSERT, DELETE or UPDATE into these tables will ideally have to be batched (ie, try to avoid inserting/updating one row at a time, insert/update many rows at a time instead)

However, in my opinion, the drawbacks may be well worth it, as most applications involve many database reads and few database writes.

More about Indexed Views

Support for Indexed Views in other database systems.

Oracle 8i and upwards have Materialized Views which are a very similar feature, MySQL however is one of those database systems that do not support Materialized (or Indexed) Views.

If you want to have similar functionality in MySQL and you use Stored Procedures for inputting data into your database, you can enhance the Stored Procedures that update/insert data by running an extra calculation at the end of the procedure that updates a summary table which acts as your view. This is essentially doing the same thing as an Indexed View but keeping it updated manually.

Hope the explanation was useful.

Nandos’ End of Day Uploads

A daily upload process carried out by each of Nandos’ stores that copies daily Point-of-Sale information into a central database repository for analysis.

Start of Process

  • Nightly process run by store manager
  • Launched as a popup with pass-though authentication (POST variables)
  • Pushes data from local Point-of-Sale system into central database

Nan

Nandos End of Day Completed

  • Wages Information, Cash Registry & Inventory File uploaded from local till system into central database
  • Data insert using BULK INSERT (TRANSACT-SQL)
  • Data parsing is performed on SQL server using stored procedures for faster execution.

Nandos - End of Day Completed