Japan's SoftBank will begin a commercial rollout of a 5G network developed with Nvidia in 2026, with implications for traditional vendors.
At some point, perhaps when those dreamed-of artificial intelligence (AI) revenues don't materialize, the hyperscaler appetite for buying Nvidia's graphics processing units (GPUs) – the chips needed to train AI's large language models – is bound to weaken. As rapacious as any private-sector entity, Nvidia is naturally thinking about who it can target next. And telcos, with their real-estate portfolios, are high on the list. Nvidia's pitch is that its GPUs can be sprinkled around a country and used to power AI as well as run the 5G (or 6G) radio access network (RAN).
The telco concern is twofold: the sheer cost of those GPUs; and the difficulty of making them pay. One source reckons the latest Blackwell chip costs about $40,000. "I'm still wondering, when we're talking about power optimization on the RAN, how come you have people coming with a 700-watt GPU?" said Gilles Garcia of AMD, a chipmaker that offers a 45-watt field programmable gate array for the same RAN functions. In the hyperscaler world, the latest chatter is about locating energy-ravenous GPU clusters alongside nuclear power plants – which sounds like the prologue to a movie involving a murderous cyborg played by a musclebound Austrian.
Nvidia's response is to argue that its GPUs, when architected correctly in a cloud RAN, can be cost-effective. More importantly, they can earn revenues for telcos from the sale of "inference," when trained AI models are put to work on various tasks. Nvidia reckons that for every $1 invested in what it calls AI-RAN infrastructure, a telco can make $5 from this inferencing business over a five-year period.
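Taken at face value, the headline claim is simple arithmetic. A minimal sketch, using only the $5-per-$1 and five-year figures quoted above; the capex amount is an invented example, and the calculation says nothing about costs:

```python
# Sketch of Nvidia's AI-RAN revenue claim: $5 of inferencing revenue
# over five years for every $1 of infrastructure capex.
# The capex figure below is purely illustrative.

capex = 1_000_000            # hypothetical AI-RAN investment, in dollars
revenue_multiple = 5         # Nvidia's claimed revenue per dollar invested
period_years = 5             # the period over which that revenue accrues

total_revenue = capex * revenue_multiple
annual_revenue = total_revenue / period_years

print(f"Total inferencing revenue over {period_years} years: ${total_revenue:,.0f}")
print(f"Implied average per year: ${annual_revenue:,.0f}")
# Note: revenue is not profit; opex, power and depreciation are excluded here.
```

Even under these friendly assumptions, the figure is gross revenue, not return, which is where much of the later skepticism concentrates.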
Sounds too good to be true? That has been the reaction of some observers to Nvidia's latest updates. But skepticism has not stopped Japan's SoftBank from moving ahead with a 5G deployment where Nvidia has a prominent role. A pilot of the new-look network has already taken place at 20 mobile sites supported by a single Nvidia-powered supercomputer. Next year, SoftBank will deploy hundreds of sites, according to Nvidia, while SoftBank says a rollout will start in 2026 "across its own commercial network."
Scary times for the Nordics
This is potentially the most disruptive and exciting development to have happened in mobile networks for years. If it takes off, it could have a shriveling effect on open RAN and its various underweight performers, like the arrival of the Terminator-era Schwarzenegger in a gym of normal-size men. But it raises lots and lots of questions that will take a long time to answer.
For a start, what does a rollout of this system, which SoftBank is calling AITRAS, mean for existing vendors such as Ericsson and Nokia? Neither features whatsoever in AITRAS. For RAN software, SoftBank appears to have taken Nvidia's Aerial-branded Layer 1, which covers the most demanding baseband tasks, and done some customization of its own. The rest comes from Fujitsu, a Japanese rival to the Nordic vendors. Layer 1 is hosted on Nvidia's Hopper GPU, while Layer 2 sits on Grace, an Nvidia central processing unit that is based on the architecture of Arm, a UK chip designer majority owned by SoftBank.
All that is encased in a supercomputer that can be racked, like a server or appliance, in a data center. An Nvidia-built data processing unit is also included to support the fronthaul connections to radios. But the radio units in the 20-site pilot come from Fujitsu, not Ericsson or Nokia. "[We] will consider including Fujitsu's radio units and those from other vendors in the commercialization of the product, depending on technical specifications and market requirements," said a SoftBank spokesperson by email in response to queries.
Nvidia has previously said Ericsson and Nokia are free to put their Layer 1 software on its GPUs, and both Nordic vendors are collaborating with it through a group called the AI-RAN Alliance and a related project that includes T-Mobile US. But their existing software is not deployable on the compute unified device architecture (CUDA) platform used by Nvidia, according to a reliable source. Anything they do would have to be written from scratch, gobbling resources.
Those resources aren't cheap, of course. Nvidia is currently advertising for a principal engineer to work on AI-RAN and 6G architecture and it expects to pay someone between $272,000 and $419,750 annually to do the job. But Nvidia is currently hamstrung by its lack of radio expertise, according to a source close to the matter. Today, it employs a small number of engineers compared with thousands at Ericsson and Nokia, he said.
Of AI gods and robodogs
Still, it is the revenue and cost numbers attached to this scheme that will seem implausible to many observers. Among Nvidia's claims is that its technology offers a 40% saving on power consumption versus "best-in-class custom RAN-only systems today" and a 60% saving when compared with an x86-based virtual RAN.
"The comparison is done for Watt/Gbps and assumes the same number of cells for each type of solution," said an Nvidia spokesperson by email when pressed for an explanation. Savings are possible, said Nvidia, "due to the very high density of cells achieved per server, which would otherwise require multiple basestations."
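Because the comparison is made in watts per gigabit per second, the claimed percentages are easy to sanity-check with toy numbers. In the sketch below, all power and throughput figures are invented for illustration; only the 40% and 60% savings come from Nvidia's claim, and the hypothetical values are chosen so the arithmetic lands on them:

```python
def watts_per_gbps(power_w: float, throughput_gbps: float) -> float:
    """Energy-efficiency metric used in Nvidia's comparison (lower is better)."""
    return power_w / throughput_gbps

# Hypothetical figures, assuming the same number of cells on each platform.
ai_ran = watts_per_gbps(power_w=600, throughput_gbps=20)       # 30 W/Gbps
custom_ran = watts_per_gbps(power_w=1000, throughput_gbps=20)  # 50 W/Gbps
x86_vran = watts_per_gbps(power_w=1500, throughput_gbps=20)    # 75 W/Gbps

saving_vs_custom = 1 - ai_ran / custom_ran  # matches the claimed 40% saving
saving_vs_x86 = 1 - ai_ran / x86_vran       # matches the claimed 60% saving
print(f"Saving vs custom RAN: {saving_vs_custom:.0%}")
print(f"Saving vs x86 vRAN:   {saving_vs_x86:.0%}")
```

The metric only holds up if both platforms really do carry the same cell count per unit of hardware, which is exactly the density assumption Nvidia's explanation rests on.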
A traditional RAN, of course, requires a dedicated appliance to be installed at each mast site for those Layer 1 and Layer 2 compute functions. SoftBank's pilot instead went for a centralized RAN architecture, whereby RAN compute moves off mast sites and into a smaller number of aggregation points. The expectation is that each Nvidia server will be able to handle 20 5G sites using 100MHz bandwidth (in RAN-only mode).
If feasible, this implies a 95% reduction in the amount of compute hardware needed across the entire network. The trouble is that radio units at mast sites would need connecting to the servers now moved to a different location. SoftBank and other Japanese telcos have already made substantial investments in fiber lines for these fronthaul links. But telcos in many other countries have not, and for them such rearchitecting would be expensive and risky.
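The 95% figure follows directly from the consolidation ratio: one server replacing the dedicated appliances at 20 mast sites removes 19 of every 20 compute boxes. A sketch, where the nationwide site count is an illustrative assumption:

```python
sites_per_server = 20    # SoftBank's expectation per Nvidia server (RAN-only mode)
total_sites = 10_000     # hypothetical nationwide mast-site count

appliances_traditional = total_sites                    # one appliance per mast site
servers_centralized = total_sites // sites_per_server   # aggregated compute

reduction = 1 - servers_centralized / appliances_traditional
print(f"Servers needed instead of {appliances_traditional} appliances: "
      f"{servers_centralized}")
print(f"Reduction in compute hardware: {reduction:.0%}")
```

The saving in boxes is real, but it is bought with fronthaul fiber: every one of those mast sites still needs a low-latency link back to its aggregation point.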
In the meantime, SoftBank has been at work on a serverless API (application programming interface), built with Nvidia AI Enterprise software, to capture those inferencing revenues. The idea is to ferry external cloud workloads via this API to the AI-RAN server and charge for usage. To show how this might work in action, SoftBank somewhat creepily had a robot dog follow a person around. With inferencing in the telco network, robodog moved instantly in response to the human, said the company. When the workloads were hosted in the cloud, it "struggled to keep up."
Viewers of the Black Mirror episode called Metalhead, in which platoons of robodogs wipe out humanity, may prefer to keep AI in the cloud. SoftBank's trial might prove that hosting workloads nearer devices cuts latency, a measure of the journey time for a signal. But a mass-market need for super-responsive robodogs is currently hard to envisage. And without more realistic examples, those revenue projections are hard to believe.
SoftBank thinks it can earn a return of "up to" 219% for every AI-RAN server it puts in the network. But James Crawshaw, a principal analyst with Light Reading sister company Omdia, is amused by the small-print qualification from Nvidia, which says revenues are not guaranteed. "Because obviously if telecom operators could generate $5 of revenue from every $1 of capex, even if that revenue were spread over five years, they would all invest in GPUs and the price of inferencing would fall through the floor," he said in a LinkedIn post.
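Crawshaw's objection is easiest to see by putting the two headline numbers side by side. A sketch of a simple return calculation: only the $5-per-$1 revenue claim and the "up to 219%" ceiling come from the article, while the server cost and the five-year running cost below are invented, and deliberately tuned so the result lands on SoftBank's figure:

```python
def simple_return(revenue: float, total_cost: float) -> float:
    """Return on investment: net gain divided by total cost."""
    return (revenue - total_cost) / total_cost

# Hypothetical per-server economics. The opex figure is invented and
# chosen so the arithmetic hits SoftBank's "up to 219%" ceiling.
capex = 100_000               # server cost (illustrative)
opex = 56_740                 # five-year running cost (illustrative, tuned)
revenue = capex * 5           # Nvidia's $5-per-$1 revenue claim

roi = simple_return(revenue, capex + opex)
print(f"Implied return: {roi:.0%}")
```

The point is not the exact split but the sensitivity: the headline return only materializes if inferencing demand, and pricing, hold up for five years, which is precisely what the small print declines to guarantee.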
"Someone with very limited understanding of economics has allowed this nonsense to be published even with the caveats," he wrote. "Nvidia – you are better than this. You don't need to promise fairy stories. And you don't need to persuade telcos to invest in GPUs. Aren't you capacity-constrained enough already?"