From Virtualization to Cloudification
James Crawshaw, Senior Analyst – OSS/BSS Transformation, Heavy Reading
A recent article in The Economist ("Cloudification will mean upheaval in telecoms") describes the cloudification of telecom as the softwarization of networks. It references the ambition of Alex Choi, SK Telecom's CTO, to make radio the fourth component of cloud, after computing, storage and networking. It also refers to AT&T's ECOMP project, which was recently merged with the Chinese-led OPEN-O initiative to create ONAP. The Economist also speculates on the terrifying (for operators) prospect of an Amazon Telecom Services.
Virtualization and cloudification (or cloud computing) are often used interchangeably, but they are different concepts. Virtualization increases the utilization of hardware resources by running more software on a given amount of physical infrastructure. It can be applied at the operating system level or at the level of network, compute or storage resources.
Cloud computing, on the other hand, refers to the delivery of shared computing resources on demand through the Internet (public cloud) or enterprise private networks (private cloud). The beauty of public cloud services such as AWS is that they are self-serve (a credit card is all you need), highly automated, elastic/scalable and pay-as-you-go (I mentioned the credit card, right?). Cloud computing makes use of virtualization to enable the elasticity and achieve economies of scale.
So, in an NFV context, all equipment vendors need to do is port their applications that currently run on ASICs and FPGAs to run on virtual machines (VMs) running on x86 processors, right? Er, no. Most physical network functions are stateful applications that expect to have local storage and a custom ASIC for packet processing, and are designed to scale up (onto a bigger box) rather than out (across more instances). Cloud-native applications have a clear separation between application processing and the associated data. The application is stateless and its state (data) is stored in "the cloud." Unless we re-architect the application to be stateless, we gain few of the benefits of cloud computing. With enhanced performance of CPU, network and storage I/O, stateless applications can now achieve high performance while being easier to scale out and to recover from failure.
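To make the stateless idea concrete, here is a minimal sketch in Python. The names (StateStore, StatelessWorker, the subscriber ID) are hypothetical and the in-memory store is only a stand-in for an external, replicated data service; it is not how any particular VNF is built.

```python
from dataclasses import dataclass, field


@dataclass
class StateStore:
    """Stand-in for an external data service (assumption: in practice
    this would be a replicated database or cache, not a local dict)."""
    sessions: dict = field(default_factory=dict)

    def get(self, key, default=0):
        return self.sessions.get(key, default)

    def put(self, key, value):
        self.sessions[key] = value


class StatelessWorker:
    """Holds no per-subscriber data itself; every request reads from
    and writes to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle_packet(self, subscriber_id, nbytes):
        used = self.store.get(subscriber_id) + nbytes
        self.store.put(subscriber_id, used)
        return used


store = StateStore()
workers = [StatelessWorker("w1", store), StatelessWorker("w2", store)]

# Scale out: any worker can serve any subscriber.
workers[0].handle_packet("sub-42", 1000)
workers[1].handle_packet("sub-42", 500)

# Failure recovery: drop w1, start a replacement, and the count survives.
workers.pop(0)
workers.append(StatelessWorker("w3", store))
total = workers[-1].handle_packet("sub-42", 250)
print(total)  # 1750
```

Because the workers keep nothing locally, adding or replacing one is just a matter of pointing it at the same store. A stateful design would have lost the subscriber's count along with w1.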
Where these re-architected VNFs run in the network will depend on their latency and reliability requirements. Certain BSS and OSS elements could be centralized in a core data center with strong disaster recovery capability if they are critical but not latency-sensitive. Conversely, video CDN or IoT applications might be deployed closer to the users in edge data centers to ensure low latency.
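The placement logic can be sketched as a simple rule. This is an illustration only: the 20 ms threshold, the workload names and their latency budgets are assumptions made up for the example, not figures from any operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    max_latency_ms: float  # assumed latency budget, illustrative only
    critical: bool         # whether the service needs disaster recovery


# Hypothetical cut-off: tighter latency budgets than this go to the edge.
EDGE_LATENCY_THRESHOLD_MS = 20.0


def place(workload):
    """Return (site, needs_disaster_recovery) for a workload."""
    if workload.max_latency_ms < EDGE_LATENCY_THRESHOLD_MS:
        site = "edge"
    else:
        site = "core"
    return site, workload.critical


billing = Workload("billing", max_latency_ms=500.0, critical=True)
video_cdn = Workload("video-cdn", max_latency_ms=10.0, critical=False)

print(place(billing))    # ('core', True)
print(place(video_cdn))  # ('edge', False)
```

A real placement engine would weigh many more factors (capacity, cost, regulation), but the basic trade-off is the one above.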
Huawei Technologies Co. Ltd. sees four key challenges to building a telco cloud architecture: 1) interoperability; 2) automation; 3) reliability; and 4) adaptability.
Current telecom networks comprise multiple network functions from myriad vendors. Interoperability tests, such as those of the NIA, are important to verify that VNFs work together in a cloud environment. Even if different vendors say their products are based on standard platforms such as OpenStack, they will not necessarily interoperate well with each other. The Tricircle project is an initiative to drive greater interoperability between OpenStack participants with the aim of enabling greater automation.
To be useful, cloudification should increase the efficiency of operations and make telcos more agile in the deployment of new services. A key part of this is a cultural shift to DevOps, which enables more frequent application updates and easy rollback should an update have unexpected consequences. DevOps introduces new tools for IT maintenance, including technologies such as containers, which enable greater utilization of hardware resources and therefore greater operational efficiency.
As much as we all love the cloud, there is still an expectation of high reliability and availability of telco services. This requires reliability design at both the data center (DC) level and the system level, including concepts such as active-standby and active-active disaster recovery (yes, active-active is a thing). A hierarchical solution can ensure every layer has its own availability design. For example, in the VM layer, OpenStack has tools to enable high availability without a single point of failure.
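For readers who have not met the two disaster recovery modes, the difference can be sketched in a few lines of Python. The site names and the modulo-based spreading are assumptions for illustration; real deployments rely on load balancers, health checks and data replication that are not modeled here.

```python
def healthy_sites(sites, healthy):
    """Return the sites that are currently up, in priority order."""
    alive = [s for s in sites if healthy[s]]
    if not alive:
        raise RuntimeError("no healthy site available")
    return alive


def active_standby(request_id, sites, healthy):
    """All traffic goes to the highest-priority healthy site;
    the standby carries traffic only after the primary fails."""
    return healthy_sites(sites, healthy)[0]


def active_active(request_id, sites, healthy):
    """Traffic is spread across every healthy site at the same time."""
    alive = healthy_sites(sites, healthy)
    return alive[request_id % len(alive)]


sites = ["dc-a", "dc-b"]          # hypothetical data centers
healthy = {"dc-a": True, "dc-b": True}

print([active_standby(i, sites, healthy) for i in range(4)])
# ['dc-a', 'dc-a', 'dc-a', 'dc-a']
print([active_active(i, sites, healthy) for i in range(4)])
# ['dc-a', 'dc-b', 'dc-a', 'dc-b']

healthy["dc-a"] = False           # simulate losing the primary site
print(active_standby(0, sites, healthy))  # dc-b
print(active_active(0, sites, healthy))   # dc-b
```

In both modes the surviving site takes over. The difference is that in active-active the second site was already carrying live traffic, so its capacity is not sitting idle before the failure.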
Cloud networks must adapt to the diverse workloads of CSPs, which may demand high forwarding rates, high throughput or low latency. One size does not fit all. For some applications, VMs running on x86 chips are simply not going to meet all requirements, and operators may need to keep physical network appliances such as core routers (hybrid NFV). In some cases, such as DPI, a VNF might run on a VM but offload some of its workload onto an FPGA. Of course, some techniques, like DPDK-accelerated Open vSwitch, can enhance the packet forwarding power of a VM. And the capability of ARM-based servers is increasing all the time. As such, not only do telco cloud networks need to be adaptable, so too do telco cloud architects.
This blog is sponsored by Huawei.