The purpose of this post is to provide links to various posts about the performance of Azure and, in particular, Azure Storage.
Azure Storage Team
The Azure Storage Team blog – obligatory reading for anyone working with Azure Storage – has a number of posts about performance.
The Windows Azure Storage Abstractions and their Scalability Targets post documents limits for storage capacity and performance targets for Azure blobs, queues and tables. The post describes a scalability target of 500 operations per second for a single partition in an Azure Table and a single Azure Queue. There is an additional scalability target of a “few thousand requests per second” for each Azure storage account. The scalability target for a single blob is “up to 60 MBytes/sec.”
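As a rough illustration of what these targets mean for design, the arithmetic below estimates how many table partitions (or queues) a workload would have to be spread across to stay under the per-partition target. The figure of 500 operations per second comes from the post. The function name and the example workload are hypothetical, the load is assumed to be spread evenly, and the account-wide target still caps the total.

```python
import math

# Per-partition / per-queue scalability target quoted in the post above.
PARTITION_TARGET_OPS_PER_SEC = 500


def min_partitions(required_ops_per_sec: int,
                   per_partition_target: int = PARTITION_TARGET_OPS_PER_SEC) -> int:
    """Smallest number of partitions (or queues) that keeps each one at or
    below its target, assuming the load is spread evenly (an assumption)."""
    if required_ops_per_sec <= 0:
        return 0
    return math.ceil(required_ops_per_sec / per_partition_target)


if __name__ == "__main__":
    # Hypothetical workload of 1,800 operations per second.
    print(min_partitions(1800))  # 4
```

In practice load is rarely perfectly even, so this is a lower bound rather than a sizing recommendation.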
The Nagle’s Algorithm is Not Friendly towards Small Requests post describes issues pertaining to how TCP/IP handles small messages (< 1460 bytes). It transpires that Azure Storage performance may be improved by turning Nagle off. The post shows how to do this.
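The linked post covers the setting for the storage client itself. Purely to show what "turning Nagle off" means at the socket level, here is a minimal generic sketch in Python. It is not the method from that post, and the helper name is hypothetical. It only applies when you control the socket directly, since most HTTP libraries manage their own connections.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the TCP stack to send small segments immediately
    # instead of holding them back to coalesce with later writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_nodelay_socket()
    # A non-zero value means Nagle's algorithm is off for this socket.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    s.close()
```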
Rob Gillen
Rob Gillen (@argodev) has done extensive testing of Azure Storage performance, in particular on maximizing throughput when uploading and downloading Azure Blobs. He has documented this in a series of posts: Part 1, Part 2, Part 3. The most surprising observation is that, when operating entirely inside Windows Azure, it is not worth downloading a blob in parallel because the overhead of reconstructing the blob is too high.
He has another post, External File Upload Optimizations for Windows Azure, that documents his testing of using various block sizes for uploads of blobs from outside an Azure datacenter. He suggests that choosing a 1MB block size may be an appropriate rule-of-thumb choice. These uploads can, of course, be performed in parallel.
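The idea of splitting an upload into 1MB blocks and sending them in parallel can be sketched as follows. This is a minimal illustration, not Rob Gillen's code and not the Azure client API: `upload_block` is a hypothetical callable standing in for whatever call sends one block, and the example uses an in-memory dictionary in place of the blob service. The worker count of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Tuple

BLOCK_SIZE = 1024 * 1024  # 1 MB rule-of-thumb block size from the post


def split_into_blocks(data: bytes,
                      block_size: int = BLOCK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (block_index, block_bytes) pairs covering data in order."""
    for index, offset in enumerate(range(0, len(data), block_size)):
        yield index, data[offset:offset + block_size]


def upload_in_parallel(data: bytes,
                       upload_block: Callable[[int, bytes], None],
                       block_size: int = BLOCK_SIZE,
                       max_workers: int = 4) -> List[int]:
    """Upload every block concurrently and return the block indices in the
    order they should be committed. upload_block is a hypothetical stand-in
    for the real per-block upload call."""
    blocks = list(split_into_blocks(data, block_size))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all uploads and re-raises any upload error.
        list(pool.map(lambda b: upload_block(b[0], b[1]), blocks))
    return [index for index, _ in blocks]


if __name__ == "__main__":
    store = {}  # in-memory stand-in for the blob service

    def fake_upload(index: int, block: bytes) -> None:
        store[index] = block

    payload = bytes(2 * BLOCK_SIZE + 10)
    order = upload_in_parallel(payload, fake_upload)
    rebuilt = b"".join(store[i] for i in order)
    print(len(order), rebuilt == payload)
```

Blocks may finish out of order, which is why the function returns the ordered index list separately from the uploads themselves.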
University of Virginia
A research group at the University of Virginia presented the results of its investigations into the performance of Windows Azure at the 2010 ACM International Symposium on High Performance Distributed Computing. The document is downloadable from the ACM Library or (somewhat cheaper) from the webpage of one of the researchers. There is also a PowerPoint version for those with a short attention span.
The paper covers both Windows Azure and Azure Storage. The researchers used up to 192 instances at a time to investigate both the times taken for instance-management tasks and the maximum storage throughput as the number of instances varied. There is a wealth of performance information which deserves the attention of anyone developing scalable services in Windows Azure. Who would have guessed, for example, that inserting a 1KB entity into an Azure Table is 26 times faster than updating the same entity, while inserting a 64KB entity is only 4 times faster?
This work is credited to: Zach Hill, Jie Li, Ming Mao, Arkaitz Ruiz-Alvarez, and Marty Humphrey – all of the University of Virginia.
Microsoft eXtreme Computing Group
The Microsoft eXtreme Computing Group has a website, Azurescope, that documents benchmarking and guidance for Windows Azure. The Azurescope website has a page describing Best Practices for Developing on Windows Azure and a set of pages with code samples demonstrating various optimal techniques for using Windows Azure Storage services.
ETH Zurich
A group at ETH Zurich investigated the performance-cost ratio for transaction processing in various cloud services including SQL Azure. Donald Kossmann, Tim Kraska and Simon Loesing presented the results at SIGMOD 2010.
PDC 10
In a PDC 10 presentation on Inside Windows Azure Virtual Machines, Hoi Vo gave the following advice: do not trust any documentation about performance. He further advised that guidance is not empirical data. This is good advice since each application has its own performance characteristics.
UPDATE 10/25/2010 – Added section on ETH Zurich
UPDATE 11/4/2010 – Added section on PDC 10