XML may be a drag, but you can use it within SQL to turn your database server into a high-performance love machine.
Now I know many of you will be wondering: XML, performance and high-volume in the same sentence? Surely you must have gone nuts!
I can promise you I haven’t gone nuts. While I agree that XML in the back-end is bulky, unruly, and often a cause for performance-degradation instead of good news you desperately want to hear, there is at least one place where it can make a difference for the better.
Stored Procedures and their Limitations
You see back when Stored Procedures for relational databases were first created, they quickly became the greatest thing since sliced bread (and boy were they an improvement over writing SQL Code directly into your application), however there was one little problem with Stored Procedures that remained unsolved for a long time. That is, SQL deals in RecordSets (i.e. Tables), it is the essence of the language, however the input possibilities for Stored Procedures were always pretty limited, being simple data types such as strings, numbers, and booleans. Until recently, there was no parameter of data type RecordSet so you couldn’t easily enter a list of things as input into a Stored Procedure.
You see most applications deal with many CRUD (Create, Read, Update, Delete), and out of those Stored Procedures can only output (Read) many records at a time. However the CUD part of it (Create, Update and Delete) had to be done one record at a time when using simple data inputs. It is a fact that for most applications it remains this way even today.
Sometimes developers come up with a workaround to enter a list of parameters
This has long been a bit of a problem, and many developers over the years have tried to come up with workarounds to this problem (like using a long list of pipe-separated values), but the solutions have ranged from the not-so-great to the lets-hold-our-breath-and-hope-it-doesnt-fail-spectacularly.
High Volume Inserts and Updates
Entering records one at a time is fine and dandy for most applications, however those requiring high-volume inserts and updates are severely constrained by this fact. You say why? Well, imagine you have an input data feed that needs to insert 10,000 records to a table, then return a message to say how things went. There are 2 ways to do this:
a) You split the records and perform 10,000 separate INSERT operations, or
b) You keep the records together perform a single INSERT operation with 10,000 records.
Which one do you think will perform faster?
Its a no-brainer really, calling a stored procedure once and performing a single INSERT operation will perform significantly faster (over 1000 times faster usually) than doing all the individual inserts one at a time, specially when you factor in network latency speeds between you application server and database server if you are repeating multiple procedure calls in the database.
Here I made a pretty picture so you get the idea:
I hope you made some coffee, this is going to take a while.
So if you plan to insert one record at a time, the other side will probably have to wait a few minutes or hours to get a response back from you. However if you perform the load as a single INSERT, you can probably get a message back to them within a few seconds.
Now your standard run-off-the-mill developer will say: “Hey, we can thread this out into 100 different concurrent calls to the database!” But the thing here is that the database server can only handle so many concurrent INSERT operations at a given time, not to mention that it might become unresponsive under the sudden overload and that you are using up a lot of unnecessary bandwidth in the form of additional calls coming both ways over the network. Ultimately there is a better solution than the hammer-it-harder approach.
XML Saves the Day
So how does XML feature into this discussion?
Well, you see SQL Server (And Oracle), have a handy XML data type that can ALSO BE USED AS INPUT into a Stored Procedure. This technique has been available as far back as SQL 2000, but many developers are not aware of it.
This way you can get a response back in a few seconds.
Its quite easy to strip out records from XML input. You can even perform XML Schema validation inside SQL Server but I’m not going to get into that today.
(I’ll follow up on this a bit later. Just gotta get some stuff done first)