As traditional communications service providers (CSPs) seek to gain the promised benefits of Network Functions Virtualization (NFV), it is also prudent to look at potential downsides.
Replacing today's carrier-class network equipment with unproven software running on commercial servers scares the daylights out of some operators. And yet, new cloud/SaaS/OTT providers such as Facebook, Netflix, Google and Amazon provide a variety of cloud-based services that are not built using traditional telco gear and architectures.
I know this isn't going to be popular, but we need to consider that Facebook et al. may just be the new faces of carrier class. How do they do it? Is there an opportunity to change how telco services are delivered, and how users consume them?
NFV: Bringing the cloud to the world of telco
CSPs know that the cloud brought tremendous advances in low-cost, on-demand compute and storage, enabled by rapid development of software hosted on standard servers. The benefits of the cloud model are apparent in terms of direct user services such as Facebook, Netflix and LinkedIn; online markets such as Amazon and Uber; and compute utilities like Amazon Web Services and Microsoft Azure.
In its simplest incarnation, NFV is the idea of applying these principles to the world of telecom. Today we use closed network appliances (e.g., routers, firewalls and VoIP gateways), as shown in Figure 1.
The appliances may differ in appearance, size and cost, but from a network view they implement a needed network function. For example, Figure 1 shows a deployment of customer edge (CE) and provider edge (PE) routers, which differ dramatically from each other in size and complexity. However, they both look like routers at the network topology level.
With NFV, we replace the appliances with software virtual network functions (VNFs) running on servers, as shown in Figure 2. The network view does not change. The routing function is equivalent, regardless of whether it is implemented by appliances or software running on a server.
The network view of the service elements did not change, but what about the other aspects, such as resilience and uptime? One of the big concerns of CSPs regarding NFV is the low reliability of servers compared to today's highly resilient network elements. How can CSPs get the benefits of NFV without giving up service availability?
Different implementation, different resiliency
Today, CSPs achieve resilience in service delivery by using a network composed of highly available elements connected by redundant communications links, as shown in Figure 3.
At each point in the network there is a highly reliable element, along with a redundant element available to take over in case of a failure:
- At the customer site there are two small routers that coordinate activity using a protocol such as the Hot Standby Router Protocol (HSRP).
- In the metro area there are redundant communication facilities that are diversely routed to provide protection against the typical failures, such as excavations and floods.
- In the central office (CO) or point of presence (POP) there is a chassis-based router with redundant blades.
Note that the addition of resilient elements does not necessarily change the network view, because the redundant elements are held in an inactive state until needed.
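To put rough numbers on what a standby element buys, here is a minimal sketch of the textbook parallel-availability calculation. It assumes failures are independent and that failover is instantaneous and always succeeds, which real networks only approximate. The function names and the sample availability figures are illustrative assumptions, not measurements from any deployment.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def parallel_availability(element_availability: float, copies: int) -> float:
    """Availability of `copies` redundant elements when any one can carry the service.

    Assumes independent failures and perfect, instantaneous failover.
    """
    if not 0.0 <= element_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - element_availability) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year implied by an availability figure."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Illustrative inputs only: a "two nines" and a "three nines" element.
    for single in (0.99, 0.999):
        pair = parallel_availability(single, 2)
        print(
            f"single={single} pair={pair:.6f} "
            f"downtime={downtime_minutes_per_year(pair):.1f} min/yr"
        )
```

Under these assumptions, pairing two modest elements squares the unavailability, which is why redundancy matters more than the reliability of any single box.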
Clouds are implemented in data centers that take a radically different approach to resilience. Data centers comprise a set of unreliable elements that are orchestrated into a reliable service. Cloud service providers allocate a percentage of excess compute and storage elements that are used to provide backup when others fail. The orchestration function provides the command and control to detect these failures and to move any affected functions from the failed unit to an available standby unit.
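The detect-and-move loop described above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular cloud or NFV orchestrator is implemented: `Server`, `Orchestrator` and `reconcile` are hypothetical names, health is a simple flag rather than a real probe, and each failed server is replaced by exactly one spare.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    healthy: bool = True
    functions: list = field(default_factory=list)


class Orchestrator:
    """Toy orchestrator: keeps spare servers and moves functions off failed ones."""

    def __init__(self, active, spares):
        self.active = list(active)
        self.spares = list(spares)

    def reconcile(self):
        """Move functions from failed active servers to healthy spares.

        Returns a list of (function, from_server, to_server) moves.
        """
        moves = []
        # Iterate over a copy because the active list is modified in the loop.
        for server in list(self.active):
            if server.healthy:
                continue
            spare = next((s for s in self.spares if s.healthy), None)
            if spare is None:
                continue  # No spare capacity left; functions stay where they are.
            self.spares.remove(spare)
            for function in server.functions:
                moves.append((function, server.name, spare.name))
            spare.functions.extend(server.functions)
            server.functions.clear()
            # The spare takes the failed server's place in the active set.
            self.active.remove(server)
            self.active.append(spare)
        return moves
```

A real orchestrator would call something like `reconcile` on a timer or in response to health events, and would also have to restore state and redirect traffic, which this sketch leaves out.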
Clearly this approach works. Companies such as Netflix, Google and Amazon, our earlier examples, are able to provide resilient services. How can we apply this model to NFV? An example is shown in Figure 4.
How is resilience implemented in this NFV world?
- In the metro area there are redundant communication facilities as before. You can't virtualize physical pipes!
- At the customer site and CO/POP we have a choice. We could use two servers, or use two blades in a blade server. In either case the router VNFs coordinate activity as before, using a protocol such as HSRP.
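The coordination between a pair of router VNFs can be illustrated with a simplified active/standby election. This is a sketch loosely modeled on the way HSRP prefers the highest priority and breaks ties by the highest IP address. It is not the HSRP state machine: hello timers, preemption settings and the virtual IP are all omitted, and `RouterVNF` and `elect_active` are hypothetical names.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class RouterVNF:
    name: str
    priority: int
    address: IPv4Address
    alive: bool = True


def elect_active(routers):
    """Pick the active router among those still alive.

    Highest priority wins; ties go to the highest IP address.
    Returns None when no router is alive.
    """
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, r.address))


if __name__ == "__main__":
    # Addresses come from the 192.0.2.0/24 documentation range.
    primary = RouterVNF("vnf-a", 110, IPv4Address("192.0.2.2"))
    backup = RouterVNF("vnf-b", 100, IPv4Address("192.0.2.3"))
    print(elect_active([primary, backup]).name)
```

The same election logic applies whether the two VNFs run on separate servers or on two blades of one chassis; only the failure domains differ.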
By moving to a virtualized service, do we automatically gain the cloud benefits of dynamism in terms of scalable and on-demand services? Not necessarily. NFV orchestration is also needed to allocate and control the needed virtual resources. Furthermore, I believe pure-play virtualization is needed to enable the placement of functions where they need to be, based on service requirements and available resources. (See The Case For Pure Play Virtualization.)
So, with NFV and orchestration we can implement a reliable service. Aren't we just reinventing the wheel?
Wait -- don't answer yet! There's more…
Of course not. With cloud technologies such as NFV, SDN and orchestration, we are not only able to recreate today's services, we are able to make them better.
- They can be delivered on demand via customer portals, rather than weeks later with truck rolls.
- They can be scalable to meet time-varying demand.
- They can have innovative pricing models based on usage or time of day, or offered on a try-before-you-buy basis.
- They can have new resiliency models that take advantage of multiple data centers to provide geographic diversity to protect against regional disasters.
That's good, and it gets better. With programmability in the network and sophisticated control systems (orchestration and SDN), it is now possible to rapidly develop and deploy new services that will drive new revenue and new profitability.
— Prayson Pate, CTO, Overture Networks Inc.