Mellanox, Intel & the Data Center Bottleneck
Intel Corp. (Nasdaq: INTC) and Mellanox Technologies Ltd. (Nasdaq: MLNX) are both making gradual progress in their race to unclog a bottleneck that arises when scaling computer networks up to massive numbers of nodes or processors (CPUs). The problem is a present concern in high-performance computing (HPC), and if left unresolved it threatens to hobble data centers, and not just because they're beginning to rely on HPC for big data analytics.
The basic problem is that the speed gain in each successive generation of processor is increasingly modest. This phenomenon was an important factor in the development of the multicore approach: if you can't get faster, you can do more by using more processors simultaneously. But the resulting increase in processing power does not scale linearly with the number of processors, so this approach, too, is delivering diminishing returns.
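One common way to picture why adding processors yields diminishing returns is Amdahl's law, which the article itself does not name. The short Python sketch below is illustrative only: the function name is made up for this example, and the 95% parallel fraction is an assumed figure, not a measurement from Intel or Mellanox.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup under Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    # The serial part of the job does not get faster with more processors.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumption for illustration: 95% of the work can run in parallel.
    for n in (1, 8, 64, 512, 4096):
        print(f"{n:>5} processors -> {amdahl_speedup(0.95, n):.2f}x")
```

Under that assumption the speedup can never reach 20X, no matter how many processors are added, because the serial 5% of the work sets the ceiling.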
Supercomputers and data centers are different beasts, but what they have in common is a dependence on massive numbers of processors, and a perpetual need for more and faster processing.
But with both faster processors and more processors running out of steam, what's the solution? Intel and Mellanox both propose making interconnect more efficient, but disagree on how. Intel proposes its Omni-Path Architecture (OPA). Mellanox, meanwhile, is pegging its future to intelligent interconnect. (See Mellanox Capitalizes on 25G Transition.)
The bottleneck is subtext for some of the activity in other corners of the server chip market. It is said to be one of the reasons why Google (Nasdaq: GOOG) is looking at alternatives to Intel processors for its data centers. Three years ago, Google entertained the possibility of designing its own data center chips, though it never commercialized anything. Earlier this year, rumors were widespread that Google was looking at Qualcomm's server chips, which to this day Qualcomm Inc. (Nasdaq: QCOM) is showing only to the largest potential customers. Qualcomm is one of the companies looking to compete with Intel with ARM-based server chips. Cavium Inc. (Nasdaq: CAVM) is another. (See Cavium Debuts SoC for Data Center Servers.)
Intel's OPA is an alternative to InfiniBand interconnect that takes a fabric approach to scaling up nodes. The new architecture encompasses PCIe adapters, silicon, switches, cables, and management software. OPA boasts latencies roughly on par with EDR InfiniBand, Intel documents say, but adds other improvements to overall network efficiency, such as link-level error correction and traffic optimization that gives preference to higher-priority traffic.
Intel announced the architecture last year, and earlier this summer launched yet another version of its Xeon processor, the Knights Landing Xeon Phi, which is optimized to work with OPA.
Intel made the Knights Landing Xeon Phi available in India for the first time today (Sept. 26), hoping to attract HPC business in that country. Intel previously announced that several supercomputing centers (e.g., the Texas Advanced Computing Center, the Pittsburgh Supercomputing Center) are using the architecture. More recently, Intel and Fujitsu announced that the latter is building a supercomputer based on Knights Landing and OPA for the Joint Center for Advanced High Performance Computing (JCAHPC) in Japan; that project is scheduled for completion this December.
Mellanox is proposing to make processors more efficient by relieving them of certain housekeeping tasks and -- just as importantly if not more so -- doing some of the calculations away from CPUs.
That of course plays to the company's strengths as a leading supplier of data center interconnect, but Mellanox Vice President of Marketing Gilad Shainer insists Mellanox's approach is the only way to truly solve the bottleneck.
In data centers, almost every server -- itself an aggregation of multicore CPUs -- is perpetually fetching data from multiple others and constantly aggregating and analyzing it.
What Mellanox proposes is to analyze data wherever it is, as the data moves, by putting some processing capability in the interconnect: in the switch network and in NICs. That processing would have to run on the switch ASIC itself; adding a CPU to the switch would just move the problem, not solve it.
The first step that Mellanox is taking, Shainer told Light Reading, is moving algorithms for data aggregation and data reduction off the CPUs.
"Now you can do things much faster because you don't need to wait for data. Also, you minimize the amount of data that has to move," Shainer said.
He said moving those specific algorithms into the network can accelerate them, yielding a 10X improvement.
"You will see a growing number of data algorithms moving away from the CPU. The network will not become a general purpose CPU, because that's what you have a CPU for. But you can move some of these algorithms," he said.
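To make the idea of moving data reduction off the CPU concrete, here is a toy Python model. It is a sketch under stated assumptions, not Mellanox's implementation: the function names, the tree of switches, and the ports-per-switch figure are all hypothetical. It compares how many full vectors reach the host when the host does the summing versus when each switch sums what arrives on its ports and forwards a single result.

```python
from typing import List, Tuple

Vector = List[float]


def vector_sum(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def reduce_on_host(node_data: List[Vector]) -> Tuple[Vector, int]:
    """Every node ships its full vector to one host CPU, which does the sum."""
    vectors_received_by_host = len(node_data)
    return vector_sum(node_data), vectors_received_by_host


def reduce_in_network(node_data: List[Vector],
                      ports_per_switch: int) -> Tuple[Vector, int]:
    """Each (hypothetical) switch sums its inputs and forwards one vector."""
    if ports_per_switch < 2:
        raise ValueError("ports_per_switch must be at least 2")
    level = node_data
    # Collapse one layer of switches at a time until a single switch remains.
    while len(level) > ports_per_switch:
        level = [vector_sum(level[i:i + ports_per_switch])
                 for i in range(0, len(level), ports_per_switch)]
    # The top switch hands the host one already-reduced vector.
    return vector_sum(level), 1


if __name__ == "__main__":
    nodes = [[float(i), 1.0] for i in range(64)]  # made-up sample data
    print(reduce_on_host(nodes))
    print(reduce_in_network(nodes, ports_per_switch=8))
```

Both paths produce the same total, but in this model the host receives one vector instead of 64, which is the effect Shainer describes: less waiting for data and less data moved.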
Another avenue to pursue is to start using in-network memory. "You start having pieces of information so that it's available to all processors," Shainer noted. Commonly used data can be stored in multiple places (perhaps in NICs) so that it is more readily available.
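The in-network memory idea can be pictured as a small cache sitting in the NIC, in front of data that otherwise lives on another server. The Python sketch below is purely illustrative: `RemoteStore` and `NicCache` are hypothetical names, and the least-recently-used eviction policy is an assumption rather than anything Mellanox has described.

```python
from collections import OrderedDict


class RemoteStore:
    """Stand-in for data held on another server; counts network fetches."""

    def __init__(self, data):
        self._data = dict(data)
        self.fetches = 0

    def get(self, key):
        self.fetches += 1
        return self._data[key]


class NicCache:
    """Hypothetical small cache in the NIC, in front of the remote store."""

    def __init__(self, store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._store = store
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key in self._entries:
            # Hit: serve locally and mark the entry as recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Miss: fetch across the network, then keep a local copy.
        value = self._store.get(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value


if __name__ == "__main__":
    store = RemoteStore({"a": 1, "b": 2, "c": 3})
    cache = NicCache(store, capacity=2)
    for key in ["a", "a", "b", "a", "c", "a"]:
        cache.get(key)
    print("requests: 6, network fetches:", store.fetches)
```

In this toy run the commonly requested item is fetched over the network once and served locally afterward, which is the sense in which frequently used data becomes "more readily available."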
Mellanox's approach is not necessarily inimical to Intel's. It can be accomplished without the participation of CPU vendors, though it would work much better if CPU vendors were to provide certain interfaces that would make the process run more smoothly.
— Brian Santo, Senior Editor, Components, T&M, Light Reading