
Facebook Slashes Data Center Power Consumption

Facebook is developing a new traffic management technology called Autoscale to cut energy consumption at its data centers by 10-15%.

Facebook currently uses a traditional round-robin approach to load balancing but found it less than optimal, because servers running at low load use power less efficiently than idle servers or servers running at moderate or greater loads, writes Qiang Wu, Facebook infrastructure software engineer, on the Facebook Code Engineering Blog.

Autoscale is designed to distribute workloads so that servers are either idling or running at medium capacity. It tries to avoid assigning workloads in a way that leaves servers running at low capacity, Wu writes.

An idle server consumes about 60 watts. Power draw jumps to 130 watts when the server moves to low-level CPU utilization, handling a small number of requests per second (RPS). But it rises only slightly further, to 150 watts, when the server goes from low-level to medium-level CPU utilization, Wu writes.

Therefore, from a power-efficiency perspective, we should try to avoid running a server at low RPS and instead try to run at medium RPS.
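The arithmetic behind that conclusion can be sketched with the three wattage figures Wu cites. The cluster size and the assumption that one medium-load server can absorb the traffic of two low-load servers are illustrative only and do not come from Facebook.

```python
# Per-server power figures quoted in the article (watts).
IDLE_W, LOW_W, MEDIUM_W = 60, 130, 150


def cluster_power(idle: int, low: int, medium: int) -> int:
    """Total draw in watts for a cluster with the given server counts."""
    return idle * IDLE_W + low * LOW_W + medium * MEDIUM_W


# ASSUMPTION (not from the article): a four-server cluster in which one
# medium-load server handles the traffic of two low-load servers.
round_robin = cluster_power(idle=0, low=4, medium=0)    # all four at low load
concentrated = cluster_power(idle=2, low=0, medium=2)   # two medium, two idle
saving = 1 - concentrated / round_robin

print(f"{round_robin} W vs {concentrated} W, saving {saving:.0%}")
# prints "520 W vs 420 W, saving 19%"
```

Under these assumed numbers, concentrating the same traffic on half the servers draws about a fifth less power, even though the idle servers stay switched on.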

To tackle this problem and utilize power more efficiently, we changed the way that load is distributed to the different web servers in a cluster. The basic idea of Autoscale is that instead of a purely round-robin approach, the load balancer will concentrate workload to a server until it has at least a medium-level workload. If the overall workload is low (like at around midnight), the load balancer will use only a subset of servers. Other servers can be left running idle or be used for batch-processing workloads.

Though the idea sounds simple, it is a challenging task to implement effectively and robustly for a large-scale system.

Autoscale dynamically adjusts the size of the server pool in use, so that each active server will get at least a medium-level CPU load. Servers not in the active pool don't receive traffic.
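A minimal sketch of the pool-sizing idea, assuming a single hypothetical threshold `medium_rps` that stands in for "medium-level load". The function names and the even split across active servers are illustrative; this is not Facebook's actual decision logic, which also has to protect site performance.

```python
from typing import List


def active_pool_size(total_rps: float, medium_rps: float, n_servers: int) -> int:
    """Largest pool in which each active server still gets at least medium_rps.

    Always keeps at least one server active and never exceeds the cluster size.
    """
    size = int(total_rps // medium_rps)
    return max(1, min(n_servers, size))


def spread_load(total_rps: float, medium_rps: float, n_servers: int) -> List[float]:
    """Per-server load: active servers share traffic evenly, the rest get none."""
    size = active_pool_size(total_rps, medium_rps, n_servers)
    per_server = total_rps / size
    return [per_server] * size + [0.0] * (n_servers - size)


# Hypothetical numbers: 10 servers, medium load defined as 300 RPS.
print(active_pool_size(1000, 300, 10))   # low overall traffic: 3 active servers
print(active_pool_size(10000, 300, 10))  # peak traffic: all 10 servers active
```

At peak the pool grows to the whole cluster, so the scheme degenerates to ordinary even distribution, which is consistent with a saving that shrinks as traffic rises.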


Optimizing both performance and power consumption was key in developing decision logic for traffic management: "On one hand, we want to maximize the energy-saving opportunity. On the other, we don't want to over-concentrate the traffic in a way that could affect site performance."

Results have been promising:

Autoscale led to a 27% power savings around midnight (and, as expected, the power saving was 0% around peak hours). The average power saving over a 24-hour cycle is about 10-15% for different web clusters.

Normalized power consumption for a production web cluster with and without Autoscale. Source: Facebook.

Facebook is driving open source data center hardware design with its own Open Compute project. The project is self-serving -- Facebook runs some of the most massive data centers in the world, and data center cost savings improve Facebook's bottom line. Facebook says it has saved $1.2 billion over three years using the Open Compute hardware designs it champions. (See Open Compute Project Takes on Networking.)

Earlier this week, Facebook bought PrivateCore, a security software company, to beef up its server security. (See Facebook Buys PrivateCore for Server Security.)

-- Mitch Wagner, West Coast Bureau Chief, Light Reading. Got a tip about SDN or NFV? Send it to [email protected]

SachinEE 8/12/2014 | 2:23:27 PM
Re: good news for Facebook With the millions of users that Facebook has bagged over the years, it is good to know that they have found a way to improve their data centers. Facebook is used worldwide and they make a lot of profit, but this new development will no doubt help them to cut costs and probably boost their earnings by a great percentage. It is a wonder that they did not do this sooner.
brooks7 8/11/2014 | 11:44:17 AM
Re: Not been done before ? Dennis,

Note that the applications that companies like Amazon and Google run are diverse. It may not be as easy to eliminate compute power in that environment. Scaling computing and storage in those environments might have very different curves.


mendyk 8/11/2014 | 11:39:05 AM
Re: Not been done before ? Maybe they have but choose not to crow about it. Or maybe they haven't. The point of the story was to highlight what FB is doing to improve its efficiency and margins.
Whatdoyouwant 8/11/2014 | 11:31:47 AM
Re: Not been done before ? I meant by the industry in general. Google, Yahoo, Amazon, etc. None of these guys have figured this out?
mendyk 8/11/2014 | 10:36:51 AM
Re: Not been done before ? When your growth is in triple digits, you don't worry so much about sweating down costs. That comes with maturity. Now that Facebook is a public company, it has to focus more on stuff like margins.
Whatdoyouwant 8/11/2014 | 9:46:42 AM
Not been done before ? Is it me or does this seem like something that would have been done a long time ago?
danielcawrey 8/9/2014 | 6:31:59 PM
Re: Something good from Facebook I did not realize that the optimum power consumption for a server instance was at medium-level capacity. This is really useful - something that hardware designers and cloud service providers should use to optimize machines and cut costs.

More and more services are being powered through the cloud - which means there is a huge opportunity to slash operating expenses.
thebulk 8/9/2014 | 11:36:47 AM
Re: Something good from Facebook Just think of the cost they are cutting with this! It really is impressive. 
mendyk 8/9/2014 | 9:01:31 AM
Something good from Facebook Congratulations to Facebook for figuring out how to make its data centers more efficient. Do you think there's a similar potential for energy cost savings for network operators that deploy virtualization?