Windows Azure Traffic Manager

The Windows Azure Traffic Manager provides several methods of distributing internet traffic among two or more hosted services, all accessible with the same URL, in one or more Windows Azure datacenters. It uses a heartbeat to detect the availability of a hosted service. The Traffic Manager provides various ways of handling the lack of availability of a hosted service.

The CTP for the Windows Azure Traffic Manager was announced at Mix 2011, and a colorful hands-on lab was introduced in the Windows Azure Platform Training Kit (April 2011). The lab has also been added to the April 2011 refresh of the Windows Azure Platform Training Course – a good, but under-appreciated, resource. In his Mix 11 presentation, David Robinson demonstrates the use of Windows Azure Traffic Manager and SQL Azure Data Sync CTP2 to have a fully distributed application with both hosted services and SQL Azure databases hosted in different Windows Azure datacenters. You can apply to join the CTP on the Beta Programs section of the Windows Azure Portal.

Heartbeat

The Traffic Manager requests a heartbeat web page from the hosted service every 30 seconds. If it does not get a 200 OK response for this heartbeat three consecutive times, the Traffic Manager assumes that the hosted service is unavailable and takes it out of load-balancer rotation. (In fact, on not getting a 200 OK the Traffic Manager immediately issues another request, so a failure requires three consecutive pairs of failed requests, one pair every 30 seconds.)
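The heartbeat rule can be sketched as a small simulation. This is only an illustration of the logic described above, not Traffic Manager code: the function names and the `fetch_status` callable (which stands in for an HTTP request to the heartbeat page) are hypothetical.

```python
from typing import Callable, Optional

HEARTBEAT_INTERVAL_SECONDS = 30  # from the text: one heartbeat every 30 seconds
FAILURES_BEFORE_REMOVAL = 3      # from the text: three consecutive failures


def probe_once(fetch_status: Callable[[], int]) -> bool:
    """One heartbeat: request the page and, on anything other than 200,
    immediately retry once. The heartbeat fails only if both requests fail."""
    if fetch_status() == 200:
        return True
    return fetch_status() == 200


def intervals_until_removed(fetch_status: Callable[[], int],
                            max_intervals: int) -> Optional[int]:
    """Return the heartbeat interval (1-based) at which the service would be
    taken out of rotation, or None if it survives max_intervals heartbeats."""
    consecutive_failures = 0
    for interval in range(1, max_intervals + 1):
        if probe_once(fetch_status):
            consecutive_failures = 0  # any success resets the count
        else:
            consecutive_failures += 1
            if consecutive_failures == FAILURES_BEFORE_REMOVAL:
                return interval
    return None
```

For a service that never returns 200 OK, `intervals_until_removed` returns 3, which corresponds to 3 × 30 = 90 seconds of heartbeats.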

Traffic Manager Policies

The Traffic Manager is configured at the subscription level on the Windows Azure Portal through the creation of one or more Traffic Manager policies. Each policy associates a load-balancing technique with two or more hosted services which are subject to the policy. A hosted service can be in more than one policy at the same time. Policies can be enabled and disabled.

The Traffic Manager supports three load-balancing techniques for allocating traffic to hosted services:

  • failover
  • performance
  • round robin

With failover, all traffic is directed to a single hosted service. When the Traffic Manager detects that the hosted service is not available, it modifies DNS records and directs all traffic to the hosted service configured for failover. This failover hosted service can be in the same or another Windows Azure datacenter. Since it takes 90 seconds (three heartbeat intervals of 30 seconds) for the Traffic Manager to detect the failure, and a minute or two more for DNS propagation, the service will be unavailable for a few minutes.
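As a rough sketch of the failover decision and the outage arithmetic, assuming the policy is modeled as a list of hosted services in priority order: the function names, the service names and the 120-second DNS propagation figure are illustrative assumptions (the text only says "a minute or two").

```python
from typing import Dict, List, Optional


def pick_failover_target(services_in_priority_order: List[str],
                         is_available: Dict[str, bool]) -> Optional[str]:
    """Return the first available hosted service, so all traffic goes to the
    primary unless it is down, and then to the configured failover service."""
    for service in services_in_priority_order:
        if is_available.get(service, False):
            return service
    return None  # nothing in the policy is currently available


def estimated_outage_seconds(heartbeat_interval: int = 30,
                             failures_needed: int = 3,
                             dns_propagation_seconds: int = 120) -> int:
    """Detection time (failed heartbeats) plus an assumed DNS propagation time."""
    return heartbeat_interval * failures_needed + dns_propagation_seconds


# Hypothetical hosted-service names.
policy = ["myapp-us", "myapp-eu"]
print(pick_failover_target(policy, {"myapp-us": False, "myapp-eu": True}))
print(estimated_outage_seconds())
```

With these assumed numbers the estimate is 90 + 120 = 210 seconds, which matches the "few minutes" of unavailability described above.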

In a round robin configuration, the Traffic Manager uses a round-robin algorithm to distribute traffic equally among all hosted services configured in the policy. The Traffic Manager automatically removes from the load-balancer rotation any hosted service it detects as unavailable. The hosted services can be in one or more Windows Azure datacenters.
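The round-robin behavior, including skipping unavailable services, might be modeled as follows. This sketch covers only the selection logic; the class and method names are hypothetical, and it says nothing about how the Traffic Manager implements rotation internally.

```python
from typing import Iterable, Optional


class RoundRobinPolicy:
    """Hands out hosted services in turn, skipping any marked unavailable."""

    def __init__(self, services: Iterable[str]) -> None:
        self._services = list(services)
        self._unavailable = set()
        self._next_index = 0

    def mark(self, service: str, available: bool) -> None:
        """Record the latest heartbeat result for a hosted service."""
        if available:
            self._unavailable.discard(service)
        else:
            self._unavailable.add(service)

    def pick(self) -> Optional[str]:
        """Return the next available service, or None if all are down."""
        count = len(self._services)
        for _ in range(count):  # look at each service at most once
            service = self._services[self._next_index]
            self._next_index = (self._next_index + 1) % count
            if service not in self._unavailable:
                return service
        return None
```

A service marked unavailable is skipped until a later `mark(service, True)` puts it back into rotation, so the remaining services share the traffic equally in the meantime.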

With performance, the Traffic Manager uses information collected about internet latency to direct traffic to the “closest” hosted service. This is useful only if the hosted services are in different Windows Azure datacenters.
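One plausible way to picture the performance technique is as a lookup in a precomputed latency table. The sketch below assumes such a table exists; the region names, datacenter names and latency figures are invented for illustration and do not come from the Traffic Manager.

```python
from typing import Dict, Optional, Set

# Hypothetical table: client region -> datacenter -> measured latency in ms.
LatencyTable = Dict[str, Dict[str, float]]


def pick_closest(latency_ms: LatencyTable,
                 client_region: str,
                 available: Set[str]) -> Optional[str]:
    """Return the available datacenter with the lowest recorded latency for
    the client's region, or None if no available datacenter is in the table."""
    candidates = {
        datacenter: ms
        for datacenter, ms in latency_ms.get(client_region, {}).items()
        if datacenter in available
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


table = {"western-europe": {"dc-north-europe": 20.0, "dc-us-east": 95.0}}
print(pick_closest(table, "western-europe", {"dc-north-europe", "dc-us-east"}))
```

Because the choice depends only on the table and on availability, the "closest" datacenter in this model is the one with the lowest measured latency, which need not be the geographically nearest one.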

Thoughts of Sorts

The Windows Azure Traffic Manager is really easy to use and works as advertised. The configuration is simple with a nice user experience. This is the type of feature that simplifies the task of developing scalable internet services. In particular, a lot of people ask about automated failover when they initially find out about distributed datacenters. And it is a feature I have seen a lot of hand waving over. It looks like the hand waving is about to end.

About Neil Mackenzie

Cloud Solutions Architect, Microsoft

11 Responses to Windows Azure Traffic Manager

  5. How exactly does “performance” work? Does it depend on the total response time of users’ requests, or is it purely based on geography? Can we set it up so that it’s governed by an SLA on user requests (e.g., for the same user, if requests to a datacenter are taking too long, consider sending future requests to other datacenters)?

    Assume you have two deployments in two datacenters (dc1 and dc2). Assume user1 is initially sent to dc1 by Azure GTM because it’s closer to it than dc2. But if user1 starts seeing bad and slow response (because of some performance dip in dc1’s deployment for example), would the GTM consider sending user1 to dc2, though it’s far away compared to dc1?

  6. Mohammad Hajjit –

    I suspect that performance is measured over some period of time so that the likelihood is that a user would always see the same datacenter for a single session. This datacenter may not be the geographically closest one. However, it is presumably possible that some user may be unfortunate enough to be redirected to another datacenter. Depending on the nature of your service you may want to take account of that possibility.

  7. Rob Boucher says:

    Yep – If you check out the docs, it explains this to some extent. Performance is not real-time performance. It doesn’t take into account the load on the services. What happens is that at certain intervals the Windows Azure Platform networking backend tests network performance between different datacenters and IP addresses and basically builds a table (simplified a bit). Traffic Manager uses that table to find the “closest” node. The interval is not clear, but it’s certainly not hourly or daily at this point.
