Partitions in Windows Azure Table

The best place to start learning about Windows Azure Table is the eponymous whitepaper in the resources section of the Windows Azure website. The next step is to look at the Windows Azure Storage Services API Reference documentation on MSDN and particularly the Table Service API section. The Windows Azure forum on MSDN is a good resource for posing questions and hopefully getting some useful responses. Microsoft staff have been good at using the forum to provide additional information and clarification not yet available in the regular documentation.

Brad Calder of Microsoft has provided additional information on the use of partitions in Windows Azure Table in several forum threads, and I thought it useful to collate some of that information in a single post. (Any errors in this post are, of course, all mine.) The post addresses the cloud implementation of Windows Azure Table, not the development storage implementation.

Table

In Windows Azure Table, the smallest unit of data that can be stored is an entity, comprising a collection of typed name-value pairs referred to as properties. Each entity belongs to a table, just as a row belongs to a table in a relational database. However, tables in Windows Azure Table do not have a fixed schema, and there is no requirement that different entities in a table be similar in structure or type. (Note that this is not true of development storage, which requires a schema because it stores the data using SQL Server Express.)

There are three special properties that must be present in every entity:

  • PartitionKey
  • RowKey
  • Timestamp

The Timestamp property has a DateTime value maintained by the Windows Azure Table system to facilitate optimistic concurrency. The PartitionKey and RowKey properties both have String values (up to 1KB in size) and together they form a unique primary key for an entity in a table. The developer is responsible for maintaining their values. The only index currently provided on a table is on PartitionKey and RowKey in that order. Any query that does not filter on at least PartitionKey will result in a table scan. Furthermore, every insert, update, and delete operation must specify a PartitionKey.
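
The effect of the single (PartitionKey, RowKey) index can be illustrated with a small in-memory sketch. This is not the service API: the SketchTable class and its method names are invented for illustration, and the sample keys and properties are made up. It only shows why a lookup with both keys is direct, a query with only PartitionKey reads one partition, and anything else has to examine every entity.

```python
from bisect import bisect_left, insort


class SketchTable:
    """Toy in-memory table keyed by (PartitionKey, RowKey); hypothetical, not the real service."""

    def __init__(self):
        self._keys = []      # sorted list of (PartitionKey, RowKey) tuples: the only index
        self._entities = {}  # (PartitionKey, RowKey) -> dict of properties

    def insert(self, entity):
        key = (entity["PartitionKey"], entity["RowKey"])
        if key in self._entities:
            raise KeyError(f"duplicate primary key: {key}")
        insort(self._keys, key)
        self._entities[key] = entity

    def get(self, partition_key, row_key):
        # Both keys known: a direct lookup on the primary key.
        return self._entities.get((partition_key, row_key))

    def query_partition(self, partition_key):
        # Only PartitionKey known: walk the index over that one partition.
        start = bisect_left(self._keys, (partition_key, ""))
        result = []
        for key in self._keys[start:]:
            if key[0] != partition_key:
                break
            result.append(self._entities[key])
        return result

    def scan(self, predicate):
        # No PartitionKey: every entity in the table has to be examined.
        return [e for e in self._entities.values() if predicate(e)]


table = SketchTable()
# Entities in one table need not share any properties beyond the keys.
table.insert({"PartitionKey": "customer-1", "RowKey": "order-001", "Total": 12.5})
table.insert({"PartitionKey": "customer-1", "RowKey": "order-002", "Note": "gift"})
table.insert({"PartitionKey": "customer-2", "RowKey": "order-001", "Total": 40.0})

one_entity = table.get("customer-1", "order-002")
one_partition = table.query_partition("customer-1")
whole_table = table.scan(lambda e: e.get("Total", 0) > 20)
```

The sketch omits the Timestamp property, which the real system maintains itself.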

Partition

Entities with identical values of PartitionKey in a single table form a collection named a partition. There is currently no restriction on partition size, so it is possible to create a table with every entity in its own partition or a table with every entity in a single partition. At commercial launch, tables and partitions will have the same upper limit, expected to be on the order of terabytes. Partitions provide support for two important features of Windows Azure Table: entity group transactions and load balancing.

An entity group transaction is a batch transaction supporting multiple Insert Entity, Update Entity, Merge Entity, and Delete Entity operations on entities in a single partition. A failure of a single operation in an entity group transaction will cause the entire transaction to be rolled back. It is not possible to batch operations against entities in different partitions.
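
The all-or-nothing behaviour can be sketched in a few lines. This is a toy model rather than the service's implementation: the table is assumed to be a plain dict keyed by (PartitionKey, RowKey), and execute_batch, BatchError and the lowercase verb names are hypothetical. Changes are staged on a copy and committed only if every operation succeeds.

```python
class BatchError(Exception):
    """Raised when a batch is rejected; the table is left unchanged."""


def execute_batch(table, operations):
    """Apply every (verb, entity) operation or none of them.

    `table` is a dict mapping (PartitionKey, RowKey) -> dict of properties.
    """
    partitions = {entity["PartitionKey"] for _, entity in operations}
    if len(partitions) != 1:
        raise BatchError("a batch must target exactly one partition")

    # Stage the changes on a copy so that a failure leaves `table` untouched.
    working = {key: dict(props) for key, props in table.items()}
    for verb, entity in operations:
        key = (entity["PartitionKey"], entity["RowKey"])
        if verb == "insert":
            if key in working:
                raise BatchError(f"insert failed, entity exists: {key}")
            working[key] = dict(entity)
        elif verb in ("update", "merge", "delete"):
            if key not in working:
                raise BatchError(f"{verb} failed, entity not found: {key}")
            if verb == "update":
                working[key] = dict(entity)   # replace all properties
            elif verb == "merge":
                working[key].update(entity)   # keep properties not mentioned
            else:
                del working[key]
        else:
            raise BatchError(f"unknown operation: {verb}")

    # Every operation succeeded: commit the staged state.
    table.clear()
    table.update(working)


orders = {("p1", "r1"): {"PartitionKey": "p1", "RowKey": "r1", "Qty": 1}}
try:
    execute_batch(orders, [
        ("insert", {"PartitionKey": "p1", "RowKey": "r2", "Qty": 5}),
        ("delete", {"PartitionKey": "p1", "RowKey": "missing"}),  # fails
    ])
except BatchError:
    pass  # the insert of r2 was rolled back along with the failed delete
```

After the failed batch, orders still contains only the original r1 entity.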

Windows Azure Table is designed to provide massively scalable storage. To facilitate scalability, the system spreads data across many storage nodes, and to maintain performance levels it may move data from one storage node to another. This load balancing is performed at the partition level: when a storage node becomes too hot, entities are migrated off it a whole partition at a time.
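
A rough sketch of what partition-level load balancing means follows. The actual balancing algorithm is not public, so everything here is an assumption made for illustration: the rebalance_once function, the load figures, the threshold, and the rule of moving the busiest partition to the least loaded node. The one point it is meant to convey is that all entities sharing a PartitionKey move together.

```python
def node_load(partitions, loads):
    """Total load on a node: the sum of the loads of the partitions it holds."""
    return sum(loads[p] for p in partitions)


def rebalance_once(nodes, loads, threshold):
    """Move one whole partition off the hottest node if it exceeds `threshold`.

    nodes: dict of node name -> set of partition keys held by that node
    loads: dict of partition key -> request rate (made-up units)
    Returns (partition, from_node, to_node), or None if nothing was moved.
    """
    hot = max(nodes, key=lambda n: node_load(nodes[n], loads))
    if node_load(nodes[hot], loads) <= threshold:
        return None
    if len(nodes[hot]) < 2:
        # A single hot partition cannot be split, so moving it would only
        # relocate the heat to another node.
        return None
    cold = min(nodes, key=lambda n: node_load(nodes[n], loads))
    partition = max(nodes[hot], key=lambda p: loads[p])
    nodes[hot].remove(partition)
    nodes[cold].add(partition)
    return partition, hot, cold


nodes = {"node-a": {"p1", "p2"}, "node-b": {"p3"}}
loads = {"p1": 70, "p2": 50, "p3": 10}
move = rebalance_once(nodes, loads, threshold=100)
```

In this toy model, a table with every entity in one partition gives the balancer nothing to move, which is one reason the choice of PartitionKey matters.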

Partition Server

Since the upper bound on partition size is so high, and Windows Azure Table is a shared system, it is possible that the data in a single partition spans multiple storage nodes. A Partition Server is used to associate a partition with the underlying storage nodes containing its data. All access to the data in a partition is managed through the same Partition Server, which facilitates entity group transactions and imposes an order on transactions affecting entities in the partition.

Some Best Practices

  • Choose an appropriate partitioning scheme
    There is a trade-off: large partitions reduce the ability of the system to load balance, while small partitions reduce the ability to query efficiently and to perform entity group transactions. The best technique is to test the application to ensure performance is satisfactory under realistic loads.
  • Always specify a PartitionKey in a query
    A query that does not specify the PartitionKey is sent to every Partition Server, which may seriously affect performance. A query that specifies the PartitionKey can be sent directly to the Partition Server for that partition. Try to choose a partitioning scheme such that the dominant queries use PartitionKey. A query that specifies only a PartitionKey scans the entire partition. A query that specifies neither the PartitionKey nor the RowKey is guaranteed to scan the whole table.
  • Always assume a query can return a continuation token
    The MSDN documentation suggests that a query can return a continuation token when it returns more than 1,000 entities, fails to complete in a reasonable time, or crosses a partition boundary. However, it appears that a continuation token can be returned by any query that does not specify both the PartitionKey and the RowKey.
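
The last practice amounts to looping until no continuation token comes back, rather than stopping when a page is shorter than expected. The sketch below uses a hypothetical query_page function as a stand-in for a single query request. Its integer token and its habit of ending a page at a partition boundary are assumptions chosen to make the behaviour visible; they are not the service's actual token format.

```python
def query_page(entities, page_size, token=None):
    """Hypothetical stand-in for one query request.

    Returns (page, next_token). The page ends early when the PartitionKey
    changes, imitating a result cut short at a partition boundary.
    """
    start = token or 0
    end = start
    while end < len(entities) and end - start < page_size:
        crossed = entities[end]["PartitionKey"] != entities[start]["PartitionKey"]
        if end > start and crossed:
            break
        end += 1
    next_token = end if end < len(entities) else None
    return entities[start:end], next_token


def query_all(entities, page_size=1000):
    """Keep requesting pages until the response carries no continuation token."""
    results, token = [], None
    while True:
        page, token = query_page(entities, page_size, token)
        results.extend(page)
        if token is None:  # only a missing token means the result is complete
            return results


entities = [
    {"PartitionKey": "a", "RowKey": "1"},
    {"PartitionKey": "a", "RowKey": "2"},
    {"PartitionKey": "a", "RowKey": "3"},
    {"PartitionKey": "b", "RowKey": "1"},
    {"PartitionKey": "b", "RowKey": "2"},
]
everything = query_all(entities, page_size=2)
```

With a page size of 2, the second page in this example holds a single entity yet still carries a token, so code that treated a short page as the end of the results would miss both entities in partition "b".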

UPDATE 1/25/2010

Corrected text of Some Best Practices – Always specify a PartitionKey in a Query.


About Neil Mackenzie

Cloud Solutions Architect. Microsoft

