Introduction to SQL Azure Federations

4/9/14 – UPDATE: Connor Cunningham, of the Azure SQL Database team, has provided in a comment a link to updated guidance on the use of Federations. The short version is that new projects should implement manual sharding, and that existing projects should migrate to manual sharding.

SQL Azure Federations is the managed sharding technology that provides scale-out data and scale-out performance to SQL Azure. A single SQL Azure database can contain up to 150GB of data and is hosted as one of many tenants in a physical server in a Windows Azure datacenter. SQL Azure Federations removes the size and performance limits inherent to the multi-tenant nature of SQL Azure.

Cihan Biyikoglu has done an awesome job priming the pump on SQL Azure Federations by posting an enormous number of posts explaining the technology. Cihan and Scott Klein did a short introduction to SQL Azure Federations during the recent Learn Windows Azure event. I did a post on SQL Azure Federations in the middle of last year. This post is really a narrative for the SQL Azure Federations presentation I did at SQL Saturday 109 at Microsoft Silicon Valley. (Thanks to Mark Ginnebaugh and Ross Mistry for doing a great job organizing the event.)

Motivation – Scalability

The computing industry has long had a history of trying to solve a performance problem by scaling up the performance of the system used to solve it by adding more cores, more memory, more whatever. However, the cost of scaling up a single system becomes prohibitive as it gets built of ever-more specialized and exotic componentry. Essentially, the price-performance curve for scaling up a system becomes the limiting feature in the utility of a scaling-up to solve a performance problem.

It has been recognized for many years that scaling out by using many instances of commodity hardware can be a very cost effective way of improving performance. The supercomputing industry in the 1980s essentially become a battle between companies using the fastest hardware available regardless of cost and other companies using large numbers of inexpensive cores. This battle has clearly been won by the scale-out vendors since the fastest computer in the World today is the Fujitsu K computer with 705,024 SPARC64 2GHz cores.

The rise of Google, and its datacenters filled with inexpensive commodity hardware, demonstrated to everyone how a scale-out system was able to work at internet scale in a way that would have been impossible for a scale-up system. Cloud services like Amazon EC2 and Windows Azure further demonstrate the benefit of scale-out through their ability to provide on-demand compute at low cost.

Scale-out brings another benefit – system resilience. By using many systems to run a system, it becomes resilient to the failure of a small percentage of the systems. Windows Azure takes advantage of this in its use of rolling upgrades, which take up to 20% of the system down, when it performs system upgrades and in-place service upgrades.

Google’s use of scale-out also demonstrated the system resilience to be gained by scaling out. The service could survive the failure of one or more servers in a manner that would not have been possible in a scale-up system.

In a 1992 paper, David DeWitt and Jim Gray asked – Parallel Database Systems: the Future of Database Systems or a passing Fad? This paper discusses the scaling out of data by sharding (or partitioning) it among many systems, each of which manages a part of the data. As well as increasing the total amount of data that can be managed, data sharding improves performance and throughput by scaling out the systems processing the data.

DeWitt and Gray discuss three ways of distributing the data among shards: round robin, has bucket; and range (e.g. a-e, f-k, l-m, n-s, t-z). Range partitioning is easy to understand but can suffer from data and operational skew if care is not taken to ensure the data is distributed evenly across the shards. Foursquare suffered a multi-hour outage when data skew on a user table partitioned by user id caused the failure of a database server. Another problem with sharded data is the routing of database connections to the appropriate shard.

Sharding-as-a-Service

SQL Azure Federations is a managed sharding service for SQL Azure. It provides Transact-SQL support to manage sharded databases and to hide the complexity of routing connections to the appropriate shard. The SQL Azure section of the Windows Azure Portal has an area providing GUI support for many of the SQL Azure Federation management tasks.

In SQL Azure Federations, a federated database comprises a root database and one or federations each of which comprises a set of sharded databases referred to as federation members. There could, for example, be a customer federation with federation members containing customer data and a product federation with federation members containing product data. The root database and each federation member are normal SQL Azure databases which contain federation metadata.

A federation is specified by: the federation name; the distribution key which provides a name to use when specifying the column on which data in a table is to be partitioned; and the data type of that column. SQL Azure Federations supports only range distribution, and it is necessary to identify the range column in each federated table.

When using SQL Azure Federations, all connections are routed to the root database and a new USE FEDERATION statement is used to indicate to the SQL Azure service gateway where to route the connection. An important consequence of this is that the client does not have to open connections to individual federation members and consequently fragment the connection pool. This provides a significant performance benefit over a do-it-yourself sharding approach.

The root database and each federation member can be managed like other SQL Azure databases. For example, they can have different sizes and different schema. They can be accessed directly through SQL Server Management Studio and arbitrary SQL statements invoked in them. However, the full power of SQL Azure Federations is exposed only when the databases are accessed as federation members, via the Transact SQL extension statements, rather than by direct connection.

In each federated table – i.e. a table participating in a federation – the distribution key of the federation must be in the clustered index and each unique index. An implication of this is that it is not possible to ensure that a column is unique across the federation unless that column is the distribution key. Furthermore, non-federated tables in a federation member cannot have a foreign key relationship with a federated table.

A federated database contains three types of table:

federated
reference
common

Federated tables contain the data that is federated across each federation member. Each federated table comprises only the part of the data in the distribution-key range allocated to that federation member. When the federation member is split, this data is distributed among the two new federation members. A federated table is created by appending FEDERATED ON to a CREATE TABLE statement. For example, with a federation with distribution key, CustomerId, and a table in which custId column contains the distribution valued the following defines it to be a federated table:

CREATE TABLE (…)
FEDERATED ON (Customerid = custId)

Reference tables are normal tables existing in each federation member. They would typically be used for small amounts of reference data that does not need to scale with the federation. However, reference tables and their contents are copied to the new federation members formed when an existing federation member is split. Reference tables are created with the regular CREATE TABLE statement.

Common tables are normal tables existing in the root database. They would be typically be used for application data that does not need to be present in each federation member. They are created with the regular CREATE TABLE statement.

When a new federated or reference table is added to a federation it must be added separately to each member of the federation. The benefit of automatic data distribution for a federated table or automatic data copying for a reference table apply only when a federation member is split.

Federation Statements

SQL Azure Federations is supported by a number of new Transact SQL statements:

CREATE FEDERATION
USE FEDERATION
ALTER FEDERATION
DROP FEDERATION

The CREATE FEDERATION statement is invoked in the root database to create a new federation. It has the following syntax:

CREATE FEDERATION federation_name (distribution_name <data_type> RANGE)

The keyword RANGE is needed and serves as a reminder that the federation uses a range distribution. Note that the range of a federation member includes the low of the allocated range and excludes the high of the allocated range. The range is therefore closed on the lower end and open on the high end – i.e. [a, f).

federation_name specifies the name of the federation, distribution_namespecifies the name used to identify the range column in the federation. SQL Azure Federations supports only the following data types for the range columns:

INT
BIGINT
UNIQUEIDENTIFIER
VARBINARY( n ) – with n <= 900

For example:

CREATE FEDERATION CustomerFederation (CustomerId UNIQUEIDENTIFIER RANGE)

Note that the ordering of the GUIDs used in the UNIQUEIDENIFIER is non-obvious. For example, the value marked 8 is the most significant in the following:

00000000-0000-0000-0000-800000000000

The USE FEDERATIONstatement is used on a connection to indicate to the SQL Azure service gateway which database in the federated database subsequent Transact SQL statements should be routed. It comes in two forms.

The first form routes subsequent statements on the connection to the root database:

USE FEDERATION ROOT WITH RESET

The WITH RESET is necessary, and serves as a reminder that the connection is reset so that any existing context is deleted.

The second form routes subsequent statements on the connection to the appropriate federation member:

USE FEDERATION federation_name
(distribution_name = value)
WITH FILTERING={ON|OFF}, RESET

federation_name specifies the federation. ]distribution_name specifies a value identifying a federation member and can contain any value in the range allocated to the federation member. FILTERING=ON is provided to simplify the migration of legacy applications. When it is used, SQL Azure Federations appends a filter, specifying the distribution_name value, to every Transact SQL statement sent subsequently on the connection. This filter restricts the statements to only the federated data associated with that value. FILTERING=OFF allows subsequent statements to be used against any data in the associated federation member.

The ALTER FEDERATIONis used to split a federation member and to drop a federation member.

The ALTER FEDERATION … SPLIT AT statement is used to split a federation member in two and repartition the data in the member into the two new federation members. SQL Azure performs this split asynchronously and on completion it brings the new federation members online and drops the original federation member.

For example, consider a federation two members of which are defined by [100,400) and [400, 500). On completion of the following ALTER FEDERATION statement there will be three members in this range: [100,200), [200, 400) and [400, 500):

ALTER FEDERATION CustomerFederation
SPLIT AT (CustomerId = 200)

The ALTER FEDERATION … DROP AT statement is used to drop a federation member and extend a neighboring range to cover the dropped member. The data of the dropped member is lost. There are two variants one specifying LOW and the other specifying HIGH. The LOW version extends the low end of a range through the federation member with the next lowest range and then drops the lower federation member. The HIGH version extends the high end of the range through the federation member with the next highest range and then drops the higher federation member.

For example, consider a federation with the following federation members for part of its range: [100,200), [200, 400) and [400, 500)

ALTER FEDERATION CustomerFederation
DROP AT (LOW CustomerId = 200)

Leads to the following federation members: [100, 400), [400,500) in which all data in the original federation member [100, 200) has been deleted.

ALTER FEDERATION CustomerFederation
DROP AT (HIGH CustomerId = 200)

Leads to the following federation members: [100, 400), [400,500) in which all data in the original federation member [200, 400) has been deleted.

The DROP FEDERATION statement stops any further connection attempts to the federation and then drops all federation members.

Dynamic Management Views

There are various DMVs to support the management and use of SQL Azure Federations:

The following DMVs provide metadata about the definition of the federations in a federated database

The following DMVs provide information about the history of changes in a federated database:

The following DMVs provide information about the asynchronous operations performed on a federated database:

The following DMVs provide information about errors that occur during the asynchronous operations performed on a federated database:

Of particular interest is the sys.federation_member_distributionDMV. In the root database this provides information on the range distributions for each federation member in the federated database. It has the following structure:

federation_id int
member_id int (database id)
Distribution_name sysname
Range_low sqlvariant
Range_high sqlvariant

sys.federation_member_distribution is important because it can be used in the creation of fan-out queries against a federated database. The DMV can be queried to retrieve the Range_low values for the federation members since the low value is contained in the member. Separate connections can be opened and a USE FEDERATION statement invoked for each value of Range_low. Then the query can be invoked in parallel on each connection.

SQL Azure Federations Migration Wizard

George Huey has an article in MSDN Magazine on Scaling out with SQL Azure Federations. In the article, he introduces a SQL Azure Federations Migration Wizard which can be used to migrate data to a federated database.

2 Responses to Introduction to SQL Azure Federations

markginnebaugh says:

June 14, 2012 at 6:07 am

Really great post and references! This is a great starting point for developers considering Windows Azure SQL Database for highly scalable systems.

Conor Cunningham [MSFT] says:

April 9, 2014 at 12:03 pm

Current guidance on sharding from Microsoft can be found here:
http://msdn.microsoft.com/library/azure/dn495641.aspx