Google & Netflix Launch 'Kayenta' for Cloud-Scale Continuous Software Delivery

Enterprises looking to deploy application updates rapidly without breaking things are getting help from two of the biggest and most successful hyperscale cloud providers.

Google (Nasdaq: GOOG) and Netflix Inc. (Nasdaq: NFLX) are teaming up to launch "Kayenta," an open source tool for continuous software delivery at cloud scale.

Kayenta is designed to help cloud application providers get beyond the old "waterfall" software delivery method, where updates take months or years to come out, and move to a continuous update cycle, where updates are happening all the time.

Cloud-scale applications are complex and easily broken, much like this 1931 device from cartoonist Rube Goldberg.

The problem with continuous updates to software delivered over the cloud -- a.k.a. software-as-a-service, or SaaS -- is that developers find it difficult to test software thoroughly enough before deployment to be sure it won't break production systems. The solution to that problem is to deploy to a small number of users at first, test for problems, and then deploy to a larger number if no problems are found. If problems are found, then roll back, adjust and iterate. This process is called "canary analysis," after the old practice of coal miners bringing canaries to work (great for the coal miners, hard on the canaries).
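The deploy-test-widen loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Kayenta's or Spinnaker's actual interface: the function name `canary_rollout` and its `deploy`, `looks_healthy` and `roll_back` callbacks are hypothetical stand-ins for whatever a real deployment system provides.

```python
from typing import Callable, Sequence


def canary_rollout(
    stages: Sequence[float],
    deploy: Callable[[float], None],
    looks_healthy: Callable[[float], bool],
    roll_back: Callable[[], None],
) -> bool:
    """Widen a rollout stage by stage; roll back at the first unhealthy stage.

    All four parameters are hypothetical hooks for illustration only:
      stages        -- fractions of users to expose, smallest first
      deploy        -- pushes the update to the given fraction of users
      looks_healthy -- reports whether that stage showed no problems
      roll_back     -- reverts the update everywhere
    Returns True if every stage passed, False if the update was rolled back.
    """
    for fraction in stages:
        deploy(fraction)
        if not looks_healthy(fraction):
            roll_back()
            return False
    return True


if __name__ == "__main__":
    log = []
    succeeded = canary_rollout(
        stages=[0.01, 0.10, 1.00],
        deploy=lambda f: log.append(("deploy", f)),
        # Pretend a problem only shows up once 10% of users have the update.
        looks_healthy=lambda f: f < 0.10,
        roll_back=lambda: log.append(("rollback", None)),
    )
    print(succeeded, log)
```

The hard part in practice is the `looks_healthy` check, which the sketch leaves as a black box.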

Using canary analysis, it's easy to detect problems if the update crashes your application, but slight degradations of service can be difficult to detect, and yet extremely harmful if deployed to users on a global scale, Andrew Phillips, Google Cloud product manager, said in an interview.

"As humans, we're bad at detecting small changes, and we're very bad at determining whether a small change is in the statistically expected range of fluctuations," Phillips said. In other words, it can be difficult to determine whether a small change in application performance is due to a code update, or whether the change is just random.
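To make the "random or real?" question concrete, here is a minimal sketch of one generic statistical approach, a permutation test on the difference in mean latency between a baseline group and a canary group. It is an assumption chosen for illustration and is not presented as the method Kayenta uses; the function name `permutation_p_value` and the sample numbers are invented.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(
    baseline: Sequence[float],
    canary: Sequence[float],
    rounds: int = 2000,
    seed: int = 0,
) -> float:
    """Estimate how often random relabeling produces a gap at least as
    large as the observed gap between the canary and baseline means.

    A small value suggests the change is unlikely to be random noise.
    Illustrative only: a generic permutation test, with a fixed seed so
    that repeated runs give the same answer.
    """
    rng = random.Random(seed)
    observed = abs(mean(canary) - mean(baseline))
    pooled = list(baseline) + list(canary)
    n = len(baseline)
    at_least_as_extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[n:]) - mean(pooled[:n]))
        if gap >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (rounds + 1)


if __name__ == "__main__":
    # Invented response times in milliseconds.
    baseline_ms = [100, 101, 99, 102, 98, 100, 101, 99]
    canary_ms = [103, 104, 101, 105, 102, 103, 104, 102]
    print(permutation_p_value(baseline_ms, canary_ms))
```

A rollout pipeline could compare the returned value against a threshold chosen in advance and treat anything below it as a real regression, which is the kind of judgment people are poor at making by eye.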

That's where Kayenta comes in. Kayenta is an open source tool that works with Spinnaker -- an open source continuous deployment tool initially developed by Netflix -- to automate rolling out software updates at small scale, test for small changes, and then either roll the update out at wider scale or roll it back for a bug fix, Phillips said.

"Every organization says, on the one hand, we must move faster, but we also have to stay safe -- can't afford to break all our production applications," Phillips said. Kayenta is designed to help enterprises resolve that paradox.

"Developed jointly by Google and Netflix, Kayenta is an evolution of Netflix's internal canary system, reimagined to be completely open, extensible, and capable of handling more advanced use cases," according to a post on the Google blog Tuesday. "It gives enterprise teams the confidence to quickly push production changes by reducing error-prone, time-intensive, and cumbersome manual or ad-hoc canary analysis."

Kayenta apparently competes with at least one startup. Jyoti Bansal, who founded AppDynamics, which sold to Cisco for $3.7 billion last year, is focused on "continuous application delivery as a service" with his new startup, Harness. Harness is designed to let app developers get new features and upgrades out to users fast, while also ensuring security and application stability. (See AppDynamics Founder Launches 'Harness' for Continuous App Delivery.)

And in a related development, startup Gremlin is looking to make "chaos engineering" widely available. Chaos engineering means taking out components of an Internet application, such as individual servers or connections, on a controlled basis to test whether the system recovers gracefully. (See Gremlin Looks to Bring 'Chaos Engineering' to the Masses.)

Mitch Wagner, Editor, Enterprise Cloud, Light Reading

darkducobu 7/29/2018 | 1:54:42 PM
Re: Full speed Software quality has a cost and necessarily for the subject of windows, the hardware and software configurations must be very different. In many companies, especially the one of the OS editor, between the tests carried out on the internet and by beta testers, as a client we could hope that the first versions would be more stable.
Michelle 4/25/2018 | 10:29:42 PM
Re: Full speed I hadn't thought of it that way, but you're right. It feels as though nobody is paying attention to update results.
mendyk 4/25/2018 | 1:57:12 PM
Re: Full speed It's almost like Microsoft has outsourced its update operations to Homer J. Simpson LLC. No matter how smart tech companies think they are, they are prone to fall into the same efficiency traps as legacy businesses.
Michelle 4/25/2018 | 1:52:00 PM
Re: Full speed I think you may be right. Time spent troubleshooting problems after an update is on the uptick. Not every update is trouble, but the ones that are take up far too much time for users. It's a sad state of affairs, really.
mendyk 4/24/2018 | 2:49:55 PM
Re: Full speed Microsoft gave up worrying about the effects of their dreaded updates on users long ago.
Michelle 4/24/2018 | 2:10:55 PM
Re: Full speed I've had problems with Windows 8 and 10 this month. It really seems like the latest round of updates weren't tested before release. To be fair, I have had problems with Windows 10 updates for the last 2-3 months...
kq4ym 4/23/2018 | 12:02:56 PM
Re: Full speed This would seem to be a solution to the canary method that will "test for problems, and then deploy to a larger number if no problems are found. If problems are found, then roll back, adjust and iterate," and Google/Netflix may very well have come up with a way to speed that up on a larger scale. I've experienced big problems with Windows 10 updates this month, that would seem to beg for their solution. Bears watching to see how effective it becomes.
Mitch Wagner 4/15/2018 | 4:06:30 PM
Re: Full speed No linear relation but same philosophy - incremental changing and testing. 
Michelle 4/14/2018 | 1:10:07 PM
Full speed This is an exciting release. Breaking things at scale is generally discouraged so this is a great solution. Does this have any relation to Netflix's Chaos Monkey? Is it a later iteration or built upon the same rules?