Thursday, 12 May 2016

Azure DocumentDb Go-Faster button

"Premature optimisation is the root of all evil."
That said, if you are using DocumentDb with the .Net client, please do two things in your code right now. They are safe and they will speed up your database access - potentially by an order of magnitude or more. I delayed putting them into my own application because I assumed that, like most of these things, they would only be a minor improvement in certain edge cases. In reality, they increased my sustained write speed by about 15x.

Increase the connection limit

If you do nothing else, at least stick this somewhere in your application startup. The most important bit is the DefaultConnectionLimit. The actual number you set for connection limit depends on your particular scenario. However, AFAIK, the main reason for keeping it low has to do with the risk of trading outbound connections for inbound connections in a web application - but if you have to worry about that, you almost certainly already know about it.
    ServicePointManager.UseNagleAlgorithm = false;
    ServicePointManager.Expect100Continue = false;
    ServicePointManager.DefaultConnectionLimit = 10000;
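
If you want to confirm that the setting actually took effect, one option is to read the limit back from a ServicePoint after setting it. This is only a minimal sketch: the class name, method name and endpoint URL below are made up for illustration, and it assumes the classic .Net Framework networking stack that ServicePointManager controls. Note that the new default only applies to ServicePoint objects created after the change, which is why this belongs in application startup.

```csharp
using System;
using System.Net;

public static class ConnectionLimitCheck
{
    // Sets the process-wide default and returns the limit reported for the
    // given endpoint. A ServicePoint created before the change keeps its
    // old limit, so call this before any requests go out to that host.
    public static int ApplyAndReadBack(int limit, Uri endpoint)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(endpoint);
        return servicePoint.ConnectionLimit;
    }

    public static void Main()
    {
        Console.WriteLine("Default limit before: " + ServicePointManager.DefaultConnectionLimit);

        // Placeholder endpoint, not a real account.
        var endpoint = new Uri("https://example.documents.azure.com/");
        Console.WriteLine("Limit for endpoint after: " + ApplyAndReadBack(10000, endpoint));
    }
}
```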

Use direct connection

Update: In SDK 1.9, Direct is now the default, so you don't have to do this anymore.
When you construct your document client, do it something like this:
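    return new DocumentClient(
            endpointUri,   // placeholder: your account endpoint (Uri)
            authKey,       // placeholder: your account key
            new ConnectionPolicy
            {
                     ConnectionMode = ConnectionMode.Direct,
                     ConnectionProtocol = Protocol.Tcp
            });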


You should only have a single DocumentClient within your app domain - otherwise you may get intermittent socket errors. However, by default, the underlying .Net comms layer will only allow you to have two (2) concurrent connections. So, no matter how much you do in parallel or how many separate web requests you are serving, only two concurrent calls will be made to the database - which will completely kill your performance. Setting the DefaultConnectionLimit to a higher number allows you to have more parallel database connections and is genuinely DocumentDb's go-faster button.
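
One way to keep to a single client per app domain is to hold it in a lazily initialised static. The following is a rough sketch rather than a recommended pattern: the class name and the two environment variable names are invented for the example, and you would normally read the endpoint and key from your own configuration. It assumes the Microsoft.Azure.DocumentDB SDK is referenced.

```csharp
using System;
using Microsoft.Azure.Documents.Client;

public static class DocumentDbClientHolder
{
    // Lazy<T> is thread-safe by default, so the factory runs at most once
    // even if many requests ask for the client at the same time.
    private static readonly Lazy<DocumentClient> LazyClient =
        new Lazy<DocumentClient>(CreateClient);

    // Every caller in the app domain shares this one instance.
    public static DocumentClient Client
    {
        get { return LazyClient.Value; }
    }

    private static DocumentClient CreateClient()
    {
        // Hypothetical setting names, used here only for illustration.
        var endpoint = new Uri(Environment.GetEnvironmentVariable("DOCDB_ENDPOINT"));
        var authKey = Environment.GetEnvironmentVariable("DOCDB_AUTH_KEY");

        return new DocumentClient(
            endpoint,
            authKey,
            new ConnectionPolicy
            {
                ConnectionMode = ConnectionMode.Direct,
                ConnectionProtocol = Protocol.Tcp
            });
    }
}
```

Callers then use DocumentDbClientHolder.Client wherever they need the database, instead of constructing their own DocumentClient.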
The other bits on ServicePointManager are much more marginal but won't hurt and may help with other Azure services, including Table and Blob Storage.

Update: In SDK 1.9 Direct is now the default
Using the Direct connection option when creating your DocumentClient is also pretty much a no-brainer. The default setting, which uses HTTP and a gateway, is just there to make sure it "always works", even behind obscure corporate firewalls. Changing to Direct and TCP is safe - the .Net SDK will handle the work for you so there is no extra code you have to write. I haven't measured how much extra raw performance it gives you and it may be somewhat marginal, but just get it in there while you are at it.

See also an example from Microsoft here: They also increase the minimum number of active threads in the background thread pool. You may want to look at that too, though I haven't tested it.
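
For reference, raising the thread pool minimum could look something like the sketch below. This is not taken from the Microsoft example and is untested against DocumentDb; the class name, method name and the value of 100 are assumptions for illustration. It only ever raises the minimums and leaves them alone if they are already higher.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSetup
{
    // Raises the minimum worker and I/O completion thread counts to at
    // least the requested value. Returns false if the runtime rejected the
    // change, for example because the value exceeds the pool maximum.
    public static bool RaiseMinimumThreads(int minimum)
    {
        int workerThreads;
        int completionPortThreads;
        ThreadPool.GetMinThreads(out workerThreads, out completionPortThreads);

        return ThreadPool.SetMinThreads(
            Math.Max(workerThreads, minimum),
            Math.Max(completionPortThreads, minimum));
    }

    public static void Main()
    {
        // 100 is an arbitrary illustrative value, not a tuned recommendation.
        bool changed = RaiseMinimumThreads(100);
        Console.WriteLine("Minimum threads accepted: " + changed);
    }
}
```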