Twitter Moves 300 Petabytes to Google Cloud – That's a Lot of Covfefe

Twitter is using tools running on Google Cloud to find out more about how people use its service, including monitoring abuse and mapping how conversations develop.

"We operate at a massive scale with a relatively small team," Twitter CTO Parag Agrawal said during a brief presentation at Google's annual cloud conference in San Francisco in July.

That small team has a lot of data to wrangle: Twitter has migrated more than 300 petabytes to Google Cloud, primarily for cold storage and ad hoc analysis, Agrawal said.

Twitter selected Google Cloud because the service could deliver performance. Google's architecture allows Twitter to scale storage and compute independently, and run multiple data analyses simultaneously.

Google also provides cost efficiency, using custom machine types to keep costs down. And Twitter liked Google's culture, including its open source commitment and focus on security, which gives Twitter confidence in the future of the partnership, Agrawal said.

Twitter made a long-term bet that Google is investing more in big data than Twitter can, said Derek Lyon, Twitter director of engineering and data infrastructure, in a later session at the conference.

The analytics move is part of a larger migration to Google Cloud for multiple Twitter infrastructure platforms, including real-time analytics, NoSQL, messaging, object store, general computing, batch computing and Hadoop, Lyon said.

Twitter began its migration two years ago. Before migrating, it spent two to three months evaluating multiple providers, Lyon said. The evaluation used a ten-year financial horizon -- ten years being the life of a data center. Twitter compared the costs and merits of running analytics on-premises and in several cloud scenarios, factoring in both migration costs and long-term operations.
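To make the shape of such a comparison concrete, here is a minimal sketch in Python. It is not Twitter's model: the scenario names and every figure are hypothetical placeholders, and only the ten-year horizon and the split between one-time migration cost and recurring operating cost come from Lyon's description.

```python
from dataclasses import dataclass

HORIZON_YEARS = 10  # the ten-year horizon Lyon described


@dataclass
class Scenario:
    name: str
    migration_cost: float         # one-time cost (hypothetical units)
    annual_operating_cost: float  # recurring cost per year (hypothetical units)


def total_cost(scenario: Scenario, years: int = HORIZON_YEARS) -> float:
    """One-time migration cost plus operating cost over the horizon."""
    return scenario.migration_cost + scenario.annual_operating_cost * years


def cheapest(scenarios, years: int = HORIZON_YEARS) -> Scenario:
    """Return the scenario with the lowest total cost over the horizon."""
    return min(scenarios, key=lambda s: total_cost(s, years))


# Placeholder numbers for illustration only; none come from the article.
scenarios = [
    Scenario("on-premises", migration_cost=0.0, annual_operating_cost=100.0),
    Scenario("cloud option A", migration_cost=150.0, annual_operating_cost=95.0),
    Scenario("cloud option B", migration_cost=200.0, annual_operating_cost=75.0),
]

for s in scenarios:
    print(f"{s.name}: {total_cost(s):.0f} over {HORIZON_YEARS} years")
print("cheapest over 10 years:", cheapest(scenarios).name)
print("cheapest over 1 year:", cheapest(scenarios, years=1).name)
```

With these made-up inputs the ranking flips between a one-year and a ten-year view, which is one reason the length of the horizon matters in an evaluation like this.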

Over the course of its evaluation, Twitter determined that a go-slow migration approach was best. "An immediate all-in migration at Twitter scale is expensive, distracting and risky," Lyon said.

Photo by Max Pixel

To keep the migration at a manageable scale, Twitter focused on Hadoop. Hadoop comprises a filesystem, distributed applications and runtime, with data processing frameworks. It's distributed, horizontally scalable and operates at large scale -- terabytes and petabytes of data, using tens of thousands of nodes. It's fault-tolerant, self-healing and runs on commodity hardware, Dave Beckett, Twitter Hadoop site reliability engineer, said.

Because Hadoop is standard, open source software, running it in the cloud has precedent, Lyon said. Hadoop also has fewer dependencies on Twitter's production environment, so it can be separated more easily.

Hadoop is strategically important for Twitter. It stores core data, including tweets, users and impressions; logs of requests and clicks; as well as backups. It provides metrics, model training, experiments and ad hoc analysis for insight, Beckett said.

Twitter runs Hadoop on more than 500,000 compute cores, with 300 petabytes of logical storage, and more than 1 trillion messages per day.
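For a rough sense of what those totals mean, the arithmetic below converts them into per-second and per-core averages. It is a back-of-the-envelope sketch only: it treats the "more than" figures as exact lower bounds, assumes decimal petabytes (10^15 bytes), and assumes storage is spread evenly across cores, which the article does not claim.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the article, treated here as lower bounds.
messages_per_day = 10**12
compute_cores = 500_000
logical_storage_bytes = 300 * 10**15  # assumes decimal petabytes

# Average message rate, rounded down to a whole number.
messages_per_second = messages_per_day // SECONDS_PER_DAY

# Average logical storage per core, in decimal gigabytes (an even spread is assumed).
storage_per_core_gb = logical_storage_bytes / compute_cores / 10**9

print(f"at least {messages_per_second:,} messages per second on average")
print(f"about {storage_per_core_gb:.0f} GB of logical storage per core")
```

That works out to roughly 11.5 million messages per second and about 600 GB of logical storage per core, as averages rather than measured values.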

After analysis, Twitter rejected a "lift and shift" approach. Lift and shift was tempting because it leads to the fastest migration. But it would require a major rearchitecture after migration to capture the benefits of cloud. And Twitter was concerned about the overall cost, risk and distraction of lift-and-shift at scale, Lyon said.

Although Twitter plans further cloud migration, production workloads are staying on premises, Lyon said. One reason is that they rely on purpose-built hardware. Another is that focusing on nonproduction workloads limits risk.

Twitter engineers encountered resistance to the cloud migration within the company, Lyon said. Engineers had to overcome internal prejudices and opinions derived without adequate research.

"A lot of people had a lot of prior opinions about what worked well and what didn't work well. There were a lot of anecdotal stories about what other people were doing in the industry," Lyon said.

Additionally, Twitter had concerns about whether Google could serve a company at Twitter's scale. "As a test, we asked for five petabytes of flash [storage] and Google was able to turn it around quickly," Lyon said. "This was a test of Google's capability to fill demand, and Google turned it around."

Mitch Wagner, Executive Editor, Light Reading
