QoE Represents a T&M Challenge
Communications service providers are beginning to pay more attention to quality of experience, which represents a challenge for test and measurement. Virtualization is exacerbating the issue.
Evaluating quality of experience (QoE) is complicated by the growing number and variety of applications, in part because nearly every application comes with a different set of dependencies, explained Spirent Communications plc Senior Methodologist Chris Chapman in a recent discussion with Light Reading.
Another issue is that QoE and security -- two endeavors that were once mostly separate -- will be increasingly bound together, Chapman said.
And finally, while quality of service (QoS) can be measured with objective metrics, evaluating QoE requires leaving the OSI stack behind, going beyond layer 7 (applications) to take into account people and their subjective and changing expectations about the quality of the applications they use.
That means communications service providers (CSPs) are going to need to think long and hard about what QoE means as they move forward if they want their test and measurement (T&M) vendors to respond with appropriate products and services, Chapman suggested.
QoE is a value in and of itself, but the process of defining and measuring QoE is going to have a significant additional benefit, Chapman believes. Service providers will be able to use the same layer 7 information they gather for QoE purposes to better assess how efficiently they're using their networks. As a practical matter, Chapman said, service providers will be able to gain a better understanding of how much equipment and capacity they ought to buy.
Simply being able to deliver a packet-based service hasn't been good enough for years; pretty much every CSP is capable of delivering voice, broadband and video in nearly any combination necessary.
The prevailing concern today is how reliably a service provider can deliver these products. Having superior QoS is going to be a competitive advantage. Eventually, however, every company is going to approach limits on how much more they can improve. What's next? Those companies that max out on QoS are going to look to provide superior QoE as the next competitive advantage to pursue.
Meanwhile, consumer expectation of quality is rising all the time. Twenty years ago, just being able to access the World Wide Web or to make a cellular call was a revelation. No more. The "wow" factor is gone, Chapman observed. The expectation of quality is increasing, and soon enough the industry is going to get back to the five-9s level of reliability and quality that characterized the POTS (plain old telephone service) era, Chapman said. "Maybe just one time in my entire life the dial tone doesn't work. You can hear a pin drop on the other side of the connection. We're approaching the point where it just has to work -- a sort of web dial tone," he said.
"Here's what people don't understand about testing," Chapman continued. "If you jump in and use a tester, if you jump in and start configuring things, you've already failed, because you didn't stop to think. That's always the most critical step."
Before you figure out what to test, you have to consider how the people who are using the network perceive quality, Chapman argues. "It's often a simple formula. It might be how long does it take for my page to load? Do I get transaction errors -- 404s or an X where a picture is supposed to be? Do I get this experience day in and day out?"
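As a rough illustration of the "simple formula" Chapman describes, the sketch below turns page load time, transaction errors and day-to-day consistency into a pass/fail check. The names and thresholds (PageSample, a 3-second median load, a 1% error rate) are assumptions made for this example, not figures from Chapman or Spirent.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PageSample:
    load_seconds: float  # time until the page finished loading
    ok: bool             # False for a 404, a broken image or another transaction error


def qoe_pass(samples, max_median_load=3.0, max_error_rate=0.01):
    """Return True if the samples meet the (hypothetical) thresholds."""
    if not samples:
        raise ValueError("need at least one sample")
    median_load = median(s.load_seconds for s in samples)
    error_rate = sum(not s.ok for s in samples) / len(samples)
    return median_load <= max_median_load and error_rate <= max_error_rate


def consistent_qoe(samples_by_day, **thresholds):
    """'Day in and day out': every day must pass on its own."""
    return all(qoe_pass(day, **thresholds) for day in samples_by_day.values())
```

The per-day check is deliberate: a good weekly average can hide one bad day, and that bad day is what the user remembers.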
The problem is that most of the traditional measures cease to apply at the level of personal experience. "So you have a big bandwidth number; why is that even important? I don't know," he continued.
With Skype or Netflix, it might not matter at all. The issue might be latency, or the dependencies between the protocols used by each application. For an application like Skype, testing the HTTP connection isn't enough. There's a voice component and a video component. Every application has dependencies, and it's important to understand what they are before you can improve the QoE of whatever application it is.
"You have to ask a lot of questions, like: What protocols are permitted in my network? For the permitted protocols, which are the critical flows? Is CRM more important than BitTorrent? Of course it is; you might not even want to allow BitTorrent. How do you measure pass/fail?"
And this is where looking at QoE begins to dovetail with loading issues, Chapman notes.
"It's not just an examination of traffic. How do my patterns driven with my loading profile in my network -- how will that actually work? How much can I scale up to? Two years from now, will I have to strip things out of my data centers and replace them?
"And I think that's what is actually driving this -- the move to data center virtualization, because there's a lot of fear out there about moving from bare metal to VMs, and especially hosted VMs," Chapman continued.
He referred to a conversation he had with the CTO of a customer. The old way to do things was to throw a bunch of hardware at the problem to be sure it was 10X deeper than it needed to be in terms of system resources -- cores, memory, whatever. Now, flexibility and saving money require putting some of the load into the cloud. "This CTO was nervous as heck. 'I'm losing control over this,' he told me. 'How can I test so I don't lose my job?' "
You have to measure to tell, Chapman explained, and once you know what the level of quality is, you can tell what you need to handle the load efficiently.
This is the argument for network monitoring. The key is making sure you're monitoring the right things.
"At that point, what you need is something we can't provide customers," Chapman said, "and that's a QoE policy. Every CTO should have a QoE policy, by service. These are the allowed services; of those, these are the priorities. Snapchat, for example, may be allowed as a protocol, but I probably don't want to prioritize that over my SIP traffic. Next I look at my corporate protocols, my corporate services. Now what's my golden measure?
"Now that I have these two things -- a way to measure and a policy -- now I have a yardstick I can use to continuously measure," Chapman continued. "This is what's important about live network monitoring -- you need to do it all the time. You need to see when things are working or not working -- that's the basic function of monitoring. But not just, is it up or down, but is quality degrading over time? Is there a macro event in the shared cloud space that is impacting my QoE every Tuesday and Thursday? I need to be able to collect that."
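A per-service policy and a yardstick of the kind Chapman describes might look something like the sketch below. The policy table, the latency limits used as the "golden measure" and the function name are all hypothetical, chosen only to show how a recurring weekday problem could be surfaced from continuous measurements.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical QoE policy, by service: is it allowed, what is its priority
# (1 = highest), and what golden measure is it judged against.
QOE_POLICY = {
    "sip":        {"allowed": True,  "priority": 1,    "max_latency_ms": 150},
    "crm":        {"allowed": True,  "priority": 2,    "max_latency_ms": 500},
    "snapchat":   {"allowed": True,  "priority": 9,    "max_latency_ms": 2000},
    "bittorrent": {"allowed": False, "priority": None, "max_latency_ms": None},
}


def recurring_bad_weekdays(service, readings, policy=QOE_POLICY):
    """Return the weekdays whose mean latency exceeds the service's limit.

    readings is an iterable of (datetime.date, latency_ms) pairs.
    """
    rule = policy[service]
    if not rule["allowed"]:
        raise ValueError(f"{service} is not an allowed service")
    by_weekday = defaultdict(list)
    for day, latency_ms in readings:
        by_weekday[day.weekday()].append(latency_ms)  # Monday is 0
    return [WEEKDAYS[i] for i in sorted(by_weekday)
            if mean(by_weekday[i]) > rule["max_latency_ms"]]
```

Grouping by weekday rather than looking at a single average is what lets a pattern such as "every Tuesday and Thursday" show up at all.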
Which brings up yet another issue. Once an operator has those capabilities in place, it also has -- perhaps for the first time in some instances -- a way to monitor SLAs, and enforce them. Chapman said some companies are beginning to do that, and some of them save money by going to their partners and negotiating when service falls below agreed-to levels.
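In its simplest form, SLA monitoring of this kind comes down to comparing what was measured with what was agreed. The sketch below assumes each partner's service is probed at regular intervals and that each probe records whether the target was met; the partner names and agreed levels in the usage are invented for illustration.

```python
def availability(probes):
    """Fraction of probe intervals in which the service met its target."""
    if not probes:
        raise ValueError("need at least one probe")
    return sum(probes) / len(probes)


def sla_shortfalls(probes_by_partner, agreed):
    """Return {partner: (measured, agreed)} for partners below their agreed level."""
    shortfalls = {}
    for partner, probes in probes_by_partner.items():
        measured = availability(probes)
        if measured < agreed[partner]:
            shortfalls[partner] = (measured, agreed[partner])
    return shortfalls


# Hypothetical usage: cloud_a missed one interval in a thousand.
probes = {"cloud_a": [True] * 999 + [False], "cloud_b": [True] * 1000}
agreed = {"cloud_a": 0.9999, "cloud_b": 0.999}
below = sla_shortfalls(probes, agreed)  # only cloud_a falls short
```

The shortfall list is the kind of evidence an operator would need in hand before negotiating with a partner.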
— Brian Santo, Senior Editor, Components, T&M, Light Reading