Donald Trump is president for the second time, the world is still intact and you've just finished an epic round of gaming on a low-latency network slice. You call up the AI agent on your Vision Pro 3 and ask it to arrange the usual accommodation for your next business trip. It conveys the details to another AI agent at the hotel. For network operators, the billion-dollar question is what happens in between these chatting AIs.
Ever since generative AI stormed the public consciousness in late 2022, the conversation has largely been about the Internet companies, the chipmaker cresting the AI wave and the large language models (LLMs) behind it. Amazon, Google and Microsoft have spent billions of dollars on Nvidia's graphics processing units (GPUs) to train those LLMs and unleash them on the world. It is at the next "AI inference" stage – when live data is fed into fully trained models – that operators like the UK's BT spy a role.
"The latency to create that seamless experience for the customer is super, super important," said Greg McCall, BT's chief networks officer, who provided the example of AI agents arranging accommodation on opposite ends of a network link. Any loss of "packets" in this process could leave you with half an answer about your hotel booking, he said.
The telco wave of generative AI?
BT executives, then, are already starting to think about a second phase of generative AI and what it could mean for their investments at the "edge," a catch-all for network locations that are much closer to end users. One possibility is that an indeterminate number of edge facilities play host to applications running on general-purpose compute, clusters of Nvidia's GPUs and hyperscaler platforms such as AWS Wavelength, said Howard Watson, BT's chief security and networks officer, during a recent press briefing in London.
That obviously throws up concerns about cost and complexity for the operator's most senior technology executive. "What we've got to try to do is navigate a route through that where it doesn't end up being every single solution at the edge," he said. But operators have already installed technologies that would allow them to connect clusters of GPUs spread across multiple data centers.
Today, these GPU clusters are typically connected inside a given data center using a technology called InfiniBand. And practically the only InfiniBand vendor, according to a report by Rosenblatt Securities, is Mellanox, a company Nvidia agreed to buy for $6.9 billion in 2019. "It is a lossless, compute-to-compute, point-to-point connectivity platform that is quite proprietary," explained Watson.
The alternative familiar to the telco industry is Ethernet. With the shift to AI inference, Rosenblatt expects that to become "the predominant transport technology for AI networking," it said in its report issued in August last year. Annual sales of Ethernet products used in AI networking will grow from around $1 billion in 2022 to about $6 billion in 2027, Rosenblatt predicted back then, with InfiniBand revenues increasing from $1.5 billion to $4 billion over this period.
"The rest of the industry is trying to find a way of taking a GPU and using it on Ethernet, which we as a telco industry can provide," said Watson. "And as that power battle evolves, we'll see the extent to which the network, from a telco perspective, can play a critical role."
Superchips toward 6G
What partly intrigues BT about Nvidia's GPUs is the opportunity to use them as hardware accelerators for some network functions. Last year, Nvidia began promoting a "superchip" called Grace Hopper (named after the famous computer scientist and rear admiral) for cloud radio access network (RAN) deployments. The Grace component is a central processing unit based on the blueprints of Arm, a UK chip designer, and it would handle less computationally demanding RAN software. Nvidia's idea is to run the hungriest RAN functions on Hopper, the GPU part, which could also double as an AI chip.
It's a vision that clearly holds some excitement for Mark Henry, BT's head of network strategy. And he evidently does not think the cost and power consumption of Nvidia's GPUs automatically rule them out of use. "The RAN is really expensive," he said at BT's press briefing. "We already have 18,000 basestations with loads of hardware acceleration on them today, but this is in no way disaggregated or abstracted, so we can't use it for anything else."
BT has been looking at Japanese telcos such as SoftBank that have already agreed deals with Nvidia, apparently receptive to its positioning of GPUs for both RAN acceleration and AI. "I think that platform is really interesting," said Henry. "It is probably a little hint of what a 6G basestation might look like."
Even so, introducing Nvidia in this manner would seemingly require BT to make some big changes. The operator today buys purpose-built RAN technology from Ericsson, Huawei and Nokia (although Huawei is being phased out under government orders). According to press reports last week, Nvidia is in talks about developing chips for Ericsson, which named it as a partner on cloud RAN trials in 2019. Yet neither of the Nordic vendors has announced a formal deal with Nvidia.
The US chipmaker is also pitching its own software, branded Aerial, for those hungry "Layer 1" functions. But this would intrude on the activities of Ericsson and Nokia, which have always brought their own Layer 1 software to any deployment. In cloud RAN, Ericsson's strategy is to put Layer 1 software on Intel's chips, while Nokia's equivalent hardware partner is Marvell.
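To make the CPU/GPU division of labor described above slightly more concrete, here is a deliberately toy sketch – not Nvidia's Aerial software, whose APIs are not shown here – in which lightweight scheduling logic stays on a general-purpose CPU (NumPy) while a compute-heavy baseband step, represented by a batched FFT standing in for OFDM demodulation, is offloaded to the GPU via CuPy (an assumed stand-in for this illustration).

```python
# A toy sketch only, not Nvidia's Aerial software: NumPy handles the
# lightweight control logic on the CPU, while CuPy offloads the heavy
# baseband math to the GPU, with a batched FFT standing in for OFDM
# demodulation.
import numpy as np
import cupy as cp


def schedule_users(num_users: int) -> list:
    # "Less computationally demanding" control-plane step on the CPU:
    # a placeholder round-robin scheduler.
    return list(range(num_users))


def demodulate_on_gpu(samples: np.ndarray) -> np.ndarray:
    # "Hungriest" Layer 1-style step on the GPU: copy samples over,
    # run a batched FFT, copy the result back to host memory.
    gpu_samples = cp.asarray(samples)
    freq = cp.fft.fft(gpu_samples, axis=-1)
    return cp.asnumpy(freq)


users = schedule_users(num_users=4)
# One symbol-sized block of complex baseband samples per scheduled user.
iq = (np.random.randn(len(users), 4096)
      + 1j * np.random.randn(len(users), 4096)).astype(np.complex64)
print(demodulate_on_gpu(iq).shape)  # (4, 4096)
```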
Horizontal ambitions
For BT, the ultimate interest is in being able to host its RAN on the same cloud platform as other telco workloads and realize efficiencies. Resisting use of the public cloud, or private cloud alternatives such as Red Hat and VMware, it has built its own telco cloud with the support of Canonical, a UK software company, using Juniper Networks for orchestration on top of Cisco and Dell compute.
This platform already hosts Ericsson-provided core network technology, which supports all but around 1% of network traffic – the share, carried over 2G and 3G systems, still left on an older Huawei core. Under strict government rules, this was supposed to have been switched off by December. But Watson insists authorities have been impressed with BT's effort, saying this has involved 170 million software migrations for 30 million customer SIMs in just 18 months. And he expects Huawei to be gone by the end of March. Besides moving to an Ericsson core, BT is also switching from Huawei technology for managing customers' data bundles to an online charging system developed by Amdocs.
The challenge with the internally built cloud is persuading vendors to produce the "cloud-native" applications that will sit on it. Until now, big developers have tended to focus on building applications for their own rather than third-party clouds. The consequence has been a proliferation of cloud silos. "With the first generation of this, the main telco vendors, Ericsson and Nokia, essentially built a telco cloud platform themselves to essentially try to replicate their vertically integrated world," said Watson. "Getting beyond that was initially hard."
Still, with operators like BT determined to avoid industry-wide platforms such as Microsoft Azure and Red Hat, there is analyst concern that vendors may face an array of different telco clouds, each demanding some customization. Watson appears to believe that is avoidable provided people adhere to Kubernetes, an open-source technology for managing cloud-native workloads.
"The nature of cloud-native is that it can run on any stack," explained Gabriela Styf Sjöman, BT's managing director of research and networks strategy. "I think the challenge for the legacy network equipment providers is that they can't afford to rewrite the code. It has been a long journey for them to first virtualize. To then make it cloud-native is a massive investment."