In a Cloud Services World, Data Center Location is NOT About Latency

There are many factors that determine the best location for a data center, but for most cloud applications, latency is not one of them.

Philip Carden

March 28, 2014


Reliable and low-cost green power, a cool climate, physical security, geographic and political stability, and access to skilled labor. Find those things, plus existing (or buildable) fiber infrastructure providing cost-effective, protected dark fiber or wavelength access to key IXPs (Internet exchange points), and you have yourself a data center location.

OK, it has to be on the right continent, but, apart from that, latency should not be a consideration. For clarity, I'm using the term data center to refer to industrial-scale, dedicated, secure facilities (as distinct from server rooms).

Living in the past
Before we talk protocols, let's talk people: We're all living in the past. About 80 milliseconds (ms) in the past to be exact, which is the time it takes for our brains and nervous systems to synchronize stimuli arriving on different neural paths of different latencies.

If you see a hand clap, you perceive the sound and sight at the same time even though the sound takes longer to arrive and to process. Your brain allows itself 80ms or so to reassemble events correctly. That's why a synchronization delay between video and audio suddenly becomes annoying if it's more than 80ms -- your built-in sensory auto-correct flushes its proverbial buffer.

That provides a bit of perspective -- 10ms just doesn't matter. So we can ignore several often-cited contributors to latency: CPE and network packet-processing times (tens or hundreds of microseconds); serialization delay (about 1ms for a 1,500-byte packet on a 10Mbit/s link); even the user-plane radio latency in LTE (less than 10ms, assuming no radio congestion).

What really matters are three things: server response time; network queuing (radio or IP); and the speed of light in fiber, which is negligible across town, about 60ms round trip across the Atlantic (London to New York), and about 120ms round trip across the Pacific (Sydney to San Jose).
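Those round-trip figures are easy to sanity-check: light in fiber travels at roughly two-thirds of its speed in a vacuum, about 200,000 km/s, so the round trip is just twice the path length divided by that speed. Here's a back-of-envelope sketch; the route distances are rough assumptions, and real fiber paths run longer than great circles:

```python
# Back-of-envelope fiber round-trip times. Light in fiber travels at
# roughly 2/3 of its vacuum speed, i.e. about 200,000 km/s.
FIBER_KM_PER_S = 200_000

def fiber_rtt_ms(path_km: float) -> float:
    """Round-trip time in milliseconds over a fiber path of the given length."""
    return 2 * path_km / FIBER_KM_PER_S * 1000

# Assumed one-way path lengths (illustrative, not measured routes):
print(fiber_rtt_ms(5_600))   # London to New York: ~56 ms round trip
print(fiber_rtt_ms(12_000))  # Sydney to San Jose: ~120 ms round trip
```

The results land close to the 60ms and 120ms figures quoted above, with real routes adding a little more path length.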

Characterizing a cloud application
Behind the fancy jargon, cloud applications are still mostly about browsers or "apps" fetching pages using HTTP or HTTPS over TCP, with each page made up of sub-elements that are described in the main HTML file. There's no such thing as a typical page, but these days there are likely around a hundred sub-elements totaling around 1MB for a transactional app (think software-as-a-service) and more than twice that for media-dense apps (think social networking).

Of the first megabyte, envisage 100k for HTML and CSS (Cascading Style Sheets), 400k for scripts and 500k for images (with vast variation between sites).

For most sites, each of those sub-elements is fetched separately over its own TCP connection from a URI (Uniform Resource Identifier) identified in the main HTML file.

For frequently used pages, many of the elements will already be locally cached, including CSS and scripts, but the HTML page will still need to be retrieved before anything else can start. Once the HTML starts arriving, the browser can begin rendering the page (using the locally cached CSS), but only then do the other requests go out to fetch the meat of the page (mostly dynamic media content, since the big scripts are also normally cached).

A small number of large sites have recently started using the SPDY protocol to optimize this process by multiplexing and compressing HTTP requests and proactively fetching anticipated content. However, this doesn't eliminate the round trips imposed by TCP and SSL, which, as we'll see, are the main offenders in the latency department (at least among protocols).

A page-load walkthrough
Let's walk through what happens without the complications of redirects, encryption, network caching or CDNs (we'll come back to them).

After DNS resolution (which is fast, since cached), we'll need two transpacific round trips before we start receiving the page -- one to establish the TCP connection and another for the first HTTP request.

Since the CSS, layout images, and key scripts will be locally cached, the page will start rendering when the HTML starts arriving, after about 300ms (two round trips, each with 120ms light-delay plus say 30ms of queuing and server time).

We're not close to done -- now that we have the HTML, we need to go back and fetch all the sub-elements that are not locally cached. If we assume a broadband access speed of 10 Mbit/s to be our slowest link, then we can calculate the serialization delay of files arriving -- minimal for the HTML (16ms if it's 20KB) and a few times that for the first content image (say 80ms for a largish image). We'll clock in at 700ms for the first image to start rendering -- 300ms for the HTML fetch, 300ms for the image fetch, and about 100ms of serialization delay for the HTML and first image file.

The sub-elements are not all fetched in parallel, because each browser limits the number of parallel TCP connections to a particular host (typically to six). Once the first wave of data starts arriving, though, the limiting factor often becomes serialization delay in the last mile: if half of the 1MB page is not locally cached, then we have 500KB of data to transfer. So if all goes very well, we could get the page fully rendered in about a second (four round trips at 600ms, plus serialization of 500KB, which is 400ms on a 10Mbit/s link).
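The walkthrough above reduces to a toy model. All the constants come from the text (120ms of transpacific light delay, 30ms of queuing and server time per round trip, a 10Mbit/s last mile); the file sizes are the illustrative ones used above, with the first content image assumed to be around 100KB:

```python
# A toy model of the page-load walkthrough (all figures from the text above).
LIGHT_RTT_MS = 120   # transpacific speed-of-light round trip
OVERHEAD_MS = 30     # queuing plus server time per round trip
RTT_MS = LIGHT_RTT_MS + OVERHEAD_MS  # 150 ms per round trip

LINK_MBIT_S = 10     # assumed broadband bottleneck in the last mile

def serialization_ms(kilobytes: float) -> float:
    """Time to clock a payload through the 10 Mbit/s last mile."""
    return kilobytes * 8 / (LINK_MBIT_S * 1000) * 1000

# Two round trips (TCP handshake, then HTTP request) before HTML arrives:
html_start = 2 * RTT_MS                                            # 300 ms
# Two more round trips for the image fetch, plus serialization of the
# 20 KB HTML and an assumed ~100 KB first image:
first_image = 4 * RTT_MS + serialization_ms(20) + serialization_ms(100)  # ~700 ms
# Full render: four round trips plus the 500 KB of uncached content:
full_page = 4 * RTT_MS + serialization_ms(500)                     # ~1000 ms

print(html_start, first_image, full_page)
```

This is only a sketch (it ignores TCP slow start and connection parallelism), but it reproduces the 300ms, ~700ms, and ~1 second milestones of the walkthrough.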

Moving the data center
Now let's move our data center from Sydney to Melbourne (a 90-minute flight apart). We've added 10ms per round-trip of light delay (assuming the fiber path is still via Sydney). So it's 320ms instead of 300ms before the user starts getting a response, and 740ms instead of 700ms before the images start rendering. No perceptible difference. Not even close.

What if we have congestion or a slow server response? Everything is much slower and the relative impact of the extra distance is further reduced -- so an even less perceptible difference.

What if we have more round trips? For example, what if there's a redirect (one round trip) or if the page uses SSL (one additional round trip if credentials are already cached)? Each only adds 10ms, so there's still no perceptible difference, especially compared with the bigger difference that comes from traversing the ocean extra times. What if the user is local (in Sydney say), or there's a high proportion of network cache or CDN-served content? Everything is much faster, but the difference between the two data center locations is still the same. Again, no perceptible difference.


Really moving the data center
What if we move the data center much further away: to New Zealand instead of Melbourne? That's a three-hour flight away and a 22ms light round trip, about the same distance as London to Moscow.

If we still route via Sydney and assume six round trips before images start arriving, we could almost imagine that we approach a perceptible difference (22ms x 6 = 132ms), until we remember that the impact is split across two cognitive events: first, the page starts rendering (an imperceptible 66ms difference); then the images start rendering (another imperceptible 66ms difference). And that's a data center a three-hour flight away!

This is a good example for another reason -- in practice, the shortest fiber path from New Zealand to California is more direct (via Hawaii) and actually shorter than the Sydney-California route. That's what often happens -- you hit alternative fiber infrastructure well before you've moved a distance that has a human-perceptible impact.

What the future holds
The speed of light in fiber is not going to change. What will change, as protocols improve, is the number of round trips required to start rendering a page -- TCP Fast Open (an Internet Draft) uses a client-side cookie to allow data to be sent on the first round trip for subsequent connections, while Google's QUIC (Quick UDP Internet Connections) protocol provides a TCP alternative over UDP (User Datagram Protocol) that also incorporates encryption (avoiding both the TCP connection and SSL establishment round trips).

We will also likely see wider adoption of SPDY and its derivatives (HTTP/2.0), so that fewer connections are required and there's some additional latency reduction from better anticipation of sub-elements. These developments are significant because they will effectively eliminate the round-trip multiplier on speed-of-light differences. That 40ms difference between Sydney and Melbourne becomes 10ms, and even the transpacific speed-of-light round trip (120ms) moves towards being an insignificant part of the overall user experience. Improving the protocol makes the world smaller.
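The round-trip multiplier is worth making explicit: the relocation penalty is just the extra light delay per round trip times the number of round trips before content arrives. A sketch using the Sydney-to-Melbourne figure from earlier (the round-trip counts are illustrative assumptions):

```python
# Relocation penalty = extra light delay per round trip x number of
# round trips before content arrives (Sydney-Melbourne adds ~10 ms each).
EXTRA_MS_PER_RTT = 10

def relocation_penalty_ms(round_trips: int) -> int:
    """Extra delay from the longer fiber path, multiplied across round trips."""
    return round_trips * EXTRA_MS_PER_RTT

print(relocation_penalty_ms(4))  # assumed today: TCP + SSL + redirect + HTTP -> 40 ms
print(relocation_penalty_ms(1))  # assumed 0-RTT-style protocol stack -> 10 ms
```

Cut the round trips and the penalty shrinks proportionally, which is exactly why protocol improvements make data center location even less latency-sensitive.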

Some apps are more equal than others
A majority of today's cloud applications are well characterized by the example above, but there will be more and more applications that demand much lower latency. Some don't involve humans at all -- such as process or car control, automated trading, and the virtualization of IP or radio network functions.

But there are also some emerging categories where low latency is important to the human experience -- virtual workspaces, remote control of network video, augmented reality and immersive cloud-based gaming. For the first group -- let's call them latency-sensitive machine-to-machine (M2M) -- millisecond latencies can be significant, while the second category (latency-sensitive human interactions) is a little less fussy -- tens of milliseconds can be acceptable.

A little perspective again: At the speed of light in fiber you can assume intra-city is less than 1ms (e.g. Manhattan is 0.1ms long, so 0.2ms round-trip), cities in the same megalopolis are less than 10ms apart (e.g. New York to Boston is a 4ms round-trip), while across the US in a straight path would be a 40ms round-trip.
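The same back-of-envelope conversion works here: at ~200,000 km/s in fiber, round-trip milliseconds are just 2 x kilometers / 200. The path lengths below are assumptions chosen to match the figures in the text, not measured routes:

```python
# Rule-of-thumb round trips at the speed of light in fiber (~200,000 km/s):
# round-trip ms = 2 * km / 200. Path lengths below are assumed, not measured.
paths_km = {
    "Manhattan end-to-end": 21,
    "New York-Boston": 400,
    "US coast-to-coast (straight path)": 4_000,
}

rtt_ms = {name: 2 * km / 200_000 * 1000 for name, km in paths_km.items()}

for name, ms in rtt_ms.items():
    print(f"{name}: {ms:g} ms round trip")
```

That recovers the ~0.2ms, ~4ms, and ~40ms figures above, and makes it easy to plug in your own city pairs.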

Location, location, location
That means that for our latency-sensitive M2M apps, the server may need to be nearby in the same city, and for the latency-sensitive human interaction apps we'll likely need to be in the same cluster of cities.

What that means is that, for this category of applications, we move away from the big industrial-scale data center model towards a much more distributed cloud-node model, in which the virtual machines serving low-latency applications are instantiated close to users -- for example, in repurposed telephone exchanges.

So server location will become increasingly important for a sub-set of applications, while for the majority of cloud applications, the location of data centers is already very flexible, and will become much more so over time, as protocols improve.

— Philip Carden, founding partner, Number Eight Capital

About the Author(s)

Philip Carden

Philip Carden is a founding partner of Number Eight Capital and a well-known figure in the telecommunications industry.  He chaired two of the first telecoms industry conferences on customer experience, and is extensively published on that topic as well as security, telecommunications engineering and operations management.  He was formerly the global head of the Consulting Services business division at Alcatel-Lucent.
