The Challenges of Operationalizing NFV
James Crawshaw, Senior Analyst – Service Provider IT and Automation, Heavy Reading
In Building the Network of the Future, AT&T executives explain how they foresee the transition to software-defined networking (SDN), a cornerstone of which is the Open Network Automation Platform (ONAP) that AT&T has been instrumental in founding.
Being an OSS geek, I skipped ahead to chapter 15 -- Network Operations -- penned by Irene Shannon and Jennifer Yates. They write that while NFV provides an opportunity to reduce opex and improve customer experience, it introduces additional layers of operational complexity that "put more onus on the operator to integrate technologies that were traditionally integrated by a vendor."
This chimes with the results of a recent Amdocs survey that asked CSPs about the most significant barriers to implementing open source NFV (as opposed to sourcing a turnkey solution from one supplier). Maturity/stability (35%) was the chief concern, which is no surprise given that many of the open source NFV projects are quite new -- ONAP was formed in February 2017, and OSM's inaugural meeting was in April 2016.
VNF interoperability (29%) was the next most popular response, though this is not necessarily an issue specific to open source. The interoperability problem comes from the disaggregation of network management systems, software, operating systems and underlying hardware (processors). With physical infrastructure this all came pre-integrated; with NFV the operator has the fun job of gluing it all together themselves.
DevOps skills shortage (15%) was the least significant barrier, according to the survey. Familiarity with DevOps tools such as Git, Chef and Ansible may be in short supply in network operations teams today, but there is nothing particularly complicated about them. AT&T's Shannon and Yates see "adoption of agile software development techniques and DevOps principles" as key to rapid development, certification and deployment of NFV/SDN.
Security (20%) is always a popular concern -- no one ever got fired for worrying about security risks. But is NFV inherently less secure than legacy technology? SS7 has well known vulnerabilities, and PBX and IP PBX hacking cost the global telecom industry $7.5 billion in 2016, according to the Communication Fraud Control Association. One of the touted advantages of open source is that it is more secure than proprietary software: although the source code is publicly available, vulnerabilities are spotted quickly and patches are swiftly made available (though you still need to apply them, of course).
Open source -- faster, cheaper, more secure
Everyone likes OSS, right? That's open source software of course, not operations support systems (which are slightly less popular, but equally exciting...). Shannon and Yates note that "standardizing VNF interfaces to provide common mechanisms for deploying, configuring, and managing network functions requires an industry-wide commitment. Standards bodies are the traditional forum … but increasingly service providers are leveraging … open source communities to drive change."
Amdocs's survey asked operators what they think open source can do for NFV. Opex reduction (33%) was the most popular response: Cost saving is a key attraction of open source, though whether this is accounted for as a capex saving (license fees) or opex saving (bearing in mind support requirements) is moot.
Faster/cheaper automation (27%), which is analogous to opex savings, was another popular response.
Faster evolution through greater collaboration (31%) was the second most popular response. Open source projects are often characterized by a fast pace of development, provided that the project reaches a critical mass of suitably motivated contributors.
Preparing existing OSS for the virtualized future
The Amdocs survey asked CSPs about their approach to incorporating NFV into OSS. The most popular response (40%) was that they would extend their existing OSS for hybrid networks and services. Given that legacy physical network functions and devices will continue to be used for the foreseeable future, CSPs will have to manage hybrid (part physical, part virtualized) networks. Hence, existing OSS will need to be extended (in some way) to work together with the new management and orchestration systems that support NFV. Shannon and Yates write that "to be able to relate end-to-end service measurements with VNF and PNFs, we must have detailed end-to-end service paths … these can be nontrivial to obtain."
They go on to note that "change management [and troubleshooting] activities will need to be coordinated across [physical and virtual] domains." In fact, Shannon and Yates see the scope for ONAP to manage PNFs and hence "it can enable legacy OSSs and BSSs to be retired."
Just over a quarter of respondents said they would wait for solutions to mature. The survey sample would appear to be skewed towards early adopters, as most Tier 2 and 3 operators are in a wait-and-see mode, hoping Tier 1s will do the heavy lifting in NFV and allow them to ride their coattails once the operational problems have been ironed out. Around a quarter of respondents said they would develop greenfield OSS for NFV services, presumably a reference to new management and orchestration systems such as OSM and ONAP.
A minority (9%) said they would manage VNFs on existing OSS -- we wish them luck!
The key operational challenges of NFV
Finally, Amdocs asked about the greatest operational challenge for NFV services deployment. There was broadly equal support for assurance (25%), change management for VNFs and associated services (24%), bridging the IT Ops and Net Ops domains (22%), and onboarding VNFs (20%).
Service assurance does indeed seem to be one of the major concerns with operationalizing NFV. How can we tell if a fault is originating in the VNF itself or in the underlying hardware? Shannon and Yates write: "An issue at the hardware layer (cloud servers) that impacts the software layer (VNF) will alarm at both layers. Without appropriate intelligence and automation, the alarming on the hardware and software layers could result in both the hardware operations team and team managing the VNF to simultaneously respond to the issues reported and investigate the root cause in parallel."
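To make the duplicate-alarm problem concrete, here is a minimal Python sketch of cross-layer alarm correlation. It is an illustration only, not how ONAP or any vendor product does it: the `Alarm` and `Incident` types, the `vnf_to_host` placement map and the 60-second window are all assumptions made up for this example. The idea is simply that a VNF alarm raised close in time to a hardware alarm on the server hosting that VNF gets folded into one incident, so only one team is dispatched.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    alarm_id: str
    layer: str        # "hardware" or "vnf" (hypothetical labels)
    resource: str     # server name or VNF name
    timestamp: float  # seconds since some common epoch


@dataclass
class Incident:
    root: Alarm                                    # probable root cause
    symptoms: list = field(default_factory=list)   # alarms folded into it


def correlate(alarms, vnf_to_host, window_s=60.0):
    """Fold VNF alarms into a hardware incident on the hosting server.

    vnf_to_host is an assumed inventory lookup (VNF name -> server name).
    Only the most recent hardware alarm per server is kept as a
    correlation target, which is a simplification.
    """
    ordered = sorted(alarms, key=lambda a: a.timestamp)
    incidents = []
    by_host = {}

    # Every hardware alarm opens its own incident.
    for alarm in ordered:
        if alarm.layer == "hardware":
            incident = Incident(root=alarm)
            by_host[alarm.resource] = incident
            incidents.append(incident)

    # A VNF alarm joins a hardware incident only if the VNF runs on that
    # server and the two alarms fall within the time window.
    for alarm in ordered:
        if alarm.layer != "vnf":
            continue
        parent = by_host.get(vnf_to_host.get(alarm.resource))
        if parent and abs(alarm.timestamp - parent.root.timestamp) <= window_s:
            parent.symptoms.append(alarm)
        else:
            incidents.append(Incident(root=alarm))
    return incidents
```

In practice the placement map would have to come from a live inventory, since VNFs move between servers, and that is exactly the kind of "appropriate intelligence" that is harder to obtain than the correlation logic itself.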
The dynamic nature of NFV means there will be constant change in the network -- presumably this needs to be handled automatically by orchestration systems. Indeed, Shannon and Yates write that "ONAP provides a consistent and systematic approach for change management across all network functions, thereby eliminating the need for ad hoc scripts."
As for bridging IT and networking operations -- there will certainly be greater commonality between the two domains. The demarcation will probably shift to one between those who manage real-time, customer-impacting applications and those who manage the rest. Shannon and Yates write "the operations role becomes more closely aligned to that of a traditional software company, such as Google or Facebook. However, there still remains a very strong role for networking expertise."
As for onboarding VNFs, operators are currently busy investigating lots of different VNFs from established suppliers and new vendors. Systems integrators can help with the interoperability testing and onboarding process, ensuring that VNFs will work in the specific NFV environment an operator has chosen (i.e., its specific combination of computing infrastructure, VIM, VNFM, NFVO, etc.). Over time the number of new VNFs being introduced and trialed by operators is likely to diminish, making the onboarding challenge less of an issue.
A steep learning curve for all
So, have AT&T et al. solved all the problems of NFV with ONAP? Do CSPs just download the code and find a friendly systems integrator to get it working? Probably not. Even Shannon and Yates from AT&T admit that "the effectiveness of ONAP will be governed by the effectiveness of the policies and analytics that define the control loops."
Indeed, they go on to say that "a 'bad' control loop, introduced either accidentally or maliciously, could have a significant negative impact on network and service." Service designers and operators will be responsible for setting appropriate policies and ensuring "guard rails" are in place to prevent policy conflicts leading to unintended negative consequences.
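What might a guard rail look like? The Python sketch below is one hedged interpretation, not ONAP's policy framework: `ScalingPolicy`, `GuardRail`, `find_conflicts` and `clamp_request` are hypothetical names invented for this example, and CPU-based scaling is just a convenient stand-in for any closed loop. It shows two simple checks: spotting a pair of policies whose triggers overlap (so the loop would scale out and in at the same time), and bounding how far a single control-loop decision can move the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    name: str
    target: str       # VNF the policy acts on
    action: str       # "scale_out" or "scale_in"
    threshold: float  # CPU %: scale_out fires above it, scale_in below it


@dataclass(frozen=True)
class GuardRail:
    min_instances: int
    max_instances: int
    max_step: int     # largest change allowed in one decision


def find_conflicts(policies):
    """Return (scale_out, scale_in) name pairs that can fire together.

    For the same target, a scale-out threshold below a scale-in threshold
    leaves a CPU range where both policies trigger at once.
    """
    outs = [p for p in policies if p.action == "scale_out"]
    ins = [p for p in policies if p.action == "scale_in"]
    return [
        (o.name, i.name)
        for o in outs
        for i in ins
        if o.target == i.target and o.threshold < i.threshold
    ]


def clamp_request(current, requested, rail):
    """Limit a requested instance count to the guard rail's bounds."""
    step = max(-rail.max_step, min(rail.max_step, requested - current))
    return max(rail.min_instances, min(rail.max_instances, current + step))
```

With a rail of 2 to 10 instances and a maximum step of 2, a runaway loop asking to go from 4 instances to 0 would be held at 2, limiting the damage while a human works out what went wrong.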
NFV may reduce the need for Tier 1 support rushing to central offices to replace faulty equipment, but it will increase the need for Tier 3 skills such as mapping alarm correlations, scripting policies, training AI agents, troubleshooting VNF errors and debugging software-based problems. It will be a steep learning curve for all but hopefully a fulfilling one. As Henry Ford said, "Anyone who stops learning is old, whether at twenty or eighty. Anyone who keeps learning stays young."
This blog is sponsored by Amdocs.
To see an infographic of the Amdocs survey results follow this link: https://www.amdocs.com/nfvtalk
— James Crawshaw, Senior Analyst, Heavy Reading