If you attended the recent SCTE Cable-Tec Expo, you probably came across many mentions of latency using a plethora of terms, including "Low Latency," "Low Latency DOCSIS" (or LLD), "Low Latency Xhaul" (or LLX), "5G Low Latency" and others. But what do they all mean, how are they similar, how do they differ, and what's being done about latency?
Latency itself is simply a technical term for how much time it takes for a packet to travel from its source to its destination, particularly over computer networks. Low latency, however, is a widely used term with no specific definition: in essence, it just means that the latency of a given network is lower than it used to be, or that the latency is low enough to support a particular application or use case. In other words, whether something achieves low latency depends entirely upon the context in which you're discussing it.
Even though it's a nebulous term, low latency has been getting a lot of attention lately. Part of the reason is a growing realization that latency is – in many instances – more important to an end user's experience than raw transmission speed. But a lot of it comes from the 5G focus on low latency as a key feature that differentiates 5G from the generally high latencies of prior mobile technologies.
Although the term is just now gaining widespread attention, it's actually something that CableLabs has been working on for years with DOCSIS networks.
Our first effort at managing latency goes all the way back to 2011. At that time, we discovered that some users were experiencing extremely high average latency – on the order of one second when under load – due to what was referred to as "buffer bloat." Cable modem and cable modem termination system (CMTS) designs used large buffers to ensure maximum throughput without regard for latency. To address this issue, CableLabs worked with manufacturers to define controls that allow cable operators to manage the size of a cable modem’s buffer, reducing the average latency under load to around 100 milliseconds (ms) without impacting throughput.
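To see why buffer size matters so much, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple first-in, first-out buffer that drains at the link rate; the rate and buffer sizes are hypothetical values picked only to show the scale of the effect, not figures from any real deployment.

```python
def queuing_delay_ms(buffer_bytes: int, link_rate_bps: float) -> float:
    """Time for the last byte of a full FIFO buffer to drain at the link rate."""
    return buffer_bytes * 8 * 1000 / link_rate_bps


# Hypothetical figures, chosen only to illustrate the scale of the effect.
UPSTREAM_RATE_BPS = 2_000_000   # assumed 2 Mbps upstream rate
LARGE_BUFFER_BYTES = 250_000    # assumed oversized buffer
SMALL_BUFFER_BYTES = 25_000     # assumed buffer after buffer control is applied

large_delay_ms = queuing_delay_ms(LARGE_BUFFER_BYTES, UPSTREAM_RATE_BPS)
small_delay_ms = queuing_delay_ms(SMALL_BUFFER_BYTES, UPSTREAM_RATE_BPS)
print(f"large buffer: {large_delay_ms:.0f} ms, small buffer: {small_delay_ms:.0f} ms")
```

With these assumed numbers, a full large buffer adds about one second of delay and the smaller one about 100 ms, which is the same order-of-magnitude gap described above.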
In 2015, our next effort came in the form of a feature called Active Queue Management (AQM), one of the key new features added in the DOCSIS 3.1 specifications. AQM assures low queue (or buffer) occupancy while maintaining the ability to absorb a momentary traffic burst – unlike the buffer controls, which set a hard limit on buffer size. With AQM, average latency under load is reduced to around 10 ms, a further order-of-magnitude improvement.
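The difference between a hard buffer limit and AQM can be illustrated with a short Python sketch. This is a generic proportional-integral style controller, used here as an assumption for illustration; it is not the algorithm specified for DOCSIS 3.1, and the function name, gains and 10 ms target are hypothetical. Rather than cutting the queue off at a fixed size, the controller nudges a drop probability up or down based on how far the measured queuing delay is from a target and which way it is trending.

```python
def update_drop_probability(
    drop_prob: float,
    delay_s: float,
    prev_delay_s: float,
    target_delay_s: float = 0.010,  # assumed 10 ms latency target
    alpha: float = 0.125,           # assumed gain on the offset from target
    beta: float = 1.25,             # assumed gain on the delay trend
) -> float:
    """Return the new drop probability, clamped to [0, 1]."""
    adjustment = alpha * (delay_s - target_delay_s) + beta * (delay_s - prev_delay_s)
    return min(1.0, max(0.0, drop_prob + adjustment))


# Queue delay rising above target: the controller starts dropping a little.
rising = update_drop_probability(0.0, delay_s=0.050, prev_delay_s=0.030)
# Queue delay below target and falling: the controller backs off again.
falling = update_drop_probability(rising, delay_s=0.005, prev_delay_s=0.050)
print(rising, falling)
```

Because the probability rises gradually, a momentary burst can still be absorbed by the buffer, while a queue that stays long is steadily pushed back toward the target.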
When deployed, these changes reduce the latency for all traffic, enabling significant improvements in the user experience. However, we saw there was still room for further improvement, as well as use cases that would benefit from reduced and/or more consistent latency.
As we looked more closely at the remaining sources of latency and jitter in networks today, it became clear most of the issues were caused by applications that send large bursts of traffic, causing temporary backups in network buffers. AQM was designed to accommodate these bursts, allowing those applications to achieve good throughput performance. But it turns out that other applications were suffering as a result. So, addressing this issue was the primary focus of our Low Latency DOCSIS (LLD) project.
The solution that CableLabs and our partners developed for this is referred to as Dual Queue Coupled AQM. As the name implies, this solution provides two queues: a classic queue for the applications that send large traffic bursts (causing latency and jitter for themselves and others), and a low latency queue for the applications that don't. By volume, most of the traffic today is in the former category – we refer to it as Queue-Building (or QB) – and includes applications like video streaming and file downloads. The latter category – Non-Queue-Building (or NQB) – includes applications like video collaboration and online gaming, many of which are especially sensitive to latency and jitter.
In this approach, any application can identify its NQB traffic via a marking in the packet header. The cable modem or CMTS then uses that information to direct each packet to the appropriate queue. The improvement is dramatic. Our simulations have shown that it may be possible to achieve round-trip latencies for NQB traffic of ~1 ms when combined with a new upstream scheduling service called Proactive Grant Service (or PGS). All without negatively impacting the throughput of QB traffic.
And those are critical points to emphasize: applications identify the type of traffic they're sending, and neither queue is getting bandwidth priority over the other. Rather, both queues share a single pool of bandwidth in a fair manner, with each queue optimized for the specific traffic type identified by the application. As a result, there's no incentive for an application to mismark its traffic, as doing so would actually hurt that application's performance rather than help it (although there is a protection function to safeguard the low latency queue, just in case).
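The queue selection described above can be sketched in a few lines of Python. This is only an illustration of the classification step: the `Packet` and `DualQueue` names are hypothetical, the marking value is an assumption, and the sketch deliberately leaves out the coupling between the queues, the shared scheduler and the protection function.

```python
from collections import deque
from dataclasses import dataclass

# Assumed header marking that identifies NQB traffic (illustrative value).
NQB_DSCP = 45


@dataclass
class Packet:
    dscp: int        # marking set by the sending application
    payload: bytes


class DualQueue:
    """Two queues fed by one classifier; neither has bandwidth priority here."""

    def __init__(self) -> None:
        self.low_latency: deque[Packet] = deque()
        self.classic: deque[Packet] = deque()

    def enqueue(self, pkt: Packet) -> str:
        """Place the packet in a queue based on its marking; return the queue name."""
        if pkt.dscp == NQB_DSCP:
            self.low_latency.append(pkt)
            return "low_latency"
        self.classic.append(pkt)
        return "classic"


queues = DualQueue()
queues.enqueue(Packet(dscp=NQB_DSCP, payload=b"game update"))   # NQB traffic
queues.enqueue(Packet(dscp=0, payload=b"video segment"))        # QB traffic
print(len(queues.low_latency), len(queues.classic))
```

The point of the sketch is that the application's own marking, not the operator's guess about the application, decides which queue a packet joins.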
Another use case that will benefit from reduced, consistent latency is when DOCSIS networks carry mobile traffic. The ubiquity of coax networks – combined with the growing capacity of DOCSIS technology – can make it extremely cost-effective to run mobile traffic over a DOCSIS network, which we refer to as mobile xhaul (meaning backhaul, midhaul and/or fronthaul). A major obstacle to widespread usage is latency, which for many mobile applications needs to remain reliably below 10 ms across the xhaul network.
Unfortunately, the dual-queue solution defined in LLD doesn't help us here, because the main source of latency is different. When a DOCSIS network carries mobile traffic, the traffic must traverse two different request-grant loops: one for each network.
When a mobile phone – or a DOCSIS cable modem – needs to send something, it must request and receive a transmission opportunity before it can actually send that data. That results in what we call a media access delay. When mobile traffic runs across a DOCSIS network, that traffic encounters two separate media access delays, which both add to the overall delay.
CableLabs' Low Latency Xhaul (LLX) project set out to address this issue. LLX technology seeks to essentially bypass the DOCSIS request-grant loop by allowing the mobile scheduler to communicate directly with the CMTS to warn it about upcoming traffic. It does this with a Bandwidth Report (BWR) message, a technique known as pipelining. That allows the CMTS to provide a grant for mobile traffic in advance, reliably keeping the time that mobile traffic needs to cross the DOCSIS network below 5 ms, and often as low as 1-2 ms. That addresses one of the main hurdles to using DOCSIS networks for this service.
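The effect of stacking two request-grant loops, and of overlapping them through pipelining, can be shown with a simplified delay model in Python. Everything here is an assumption made for illustration: the function names are hypothetical, the millisecond values are invented, and the model treats each media access delay as a single fixed number rather than reflecting how real schedulers behave.

```python
def serial_delay_ms(mobile_access_ms: float, docsis_access_ms: float,
                    transit_ms: float) -> float:
    """Without pipelining: the two media access delays simply add up."""
    return mobile_access_ms + docsis_access_ms + transit_ms


def pipelined_delay_ms(mobile_access_ms: float, docsis_access_ms: float,
                       transit_ms: float) -> float:
    """With pipelining: the DOCSIS grant is prepared while the mobile
    request-grant loop is still running, so only the portion of the DOCSIS
    access delay that outlasts the mobile one remains visible."""
    exposed_docsis_ms = max(0.0, docsis_access_ms - mobile_access_ms)
    return mobile_access_ms + exposed_docsis_ms + transit_ms


# Hypothetical delays, for illustration only.
MOBILE_ACCESS_MS = 8.0
DOCSIS_ACCESS_MS = 6.0
TRANSIT_MS = 1.0

serial = serial_delay_ms(MOBILE_ACCESS_MS, DOCSIS_ACCESS_MS, TRANSIT_MS)
pipelined = pipelined_delay_ms(MOBILE_ACCESS_MS, DOCSIS_ACCESS_MS, TRANSIT_MS)
print(f"serial: {serial} ms, pipelined: {pipelined} ms")
```

In this toy model the DOCSIS access delay disappears from the total whenever the advance warning arrives early enough, leaving only the transit time across the DOCSIS network on top of the mobile network's own delay.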
Which brings us to the present. CableLabs has been working extensively with equipment suppliers to incorporate support for LLD into their devices, and we expect that support to become generally available in cable modems in the coming year. And on the LLX front, in addition to demonstrating a proof of concept, we've contributed our work to the O-RAN Alliance, which has standardized it as the Cooperative Transport Interface (CTI) for use over both DOCSIS and PON networks, making it applicable beyond just the cable industry.
Latency is a key part of the user experience, which is why it is one of the pillars of the cable industry's 10G strategy. New technologies such as LLD and LLX will be a significant part of that.
— Matt Schmitt, Principal Architect, CableLabs