
Mellanox, Intel & the Data Center Bottleneck

Brian Santo
9/26/2016

Intel Corp. (Nasdaq: INTC) and Mellanox Technologies Ltd. (Nasdaq: MLNX) are both making gradual progress in their race to unclog a bottleneck that arises when scaling computer networks up to massive numbers of nodes or processors (CPUs). The problem is a present concern in high-performance computing (HPC), and if left unresolved it threatens to hobble data centers, and not just because they're beginning to rely on HPC for big data analytics.

The basic problem is that the speed gain in each successive generation of processors is increasingly modest. This phenomenon was an important factor in the development of the multicore approach; if you can't get faster, you can do more by using more processors simultaneously. But the resulting increase in processing power does not scale linearly with the number of processors. This approach, too, is delivering diminishing returns.
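One common way to illustrate those diminishing returns is Amdahl's law, which the article itself does not name. The sketch below is illustrative only: the function name and the 95% parallel fraction are assumptions chosen for the example, not figures from Intel or Mellanox.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup under Amdahl's law for a partly parallel workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part never shrinks; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the work can run in parallel.
    for n in (1, 16, 256, 4096):
        print(f"{n:>5} processors -> {amdahl_speedup(0.95, n):.1f}x speedup")
```

With a 5% serial share, the speedup in this model can never exceed 20x, however many processors are added.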

Supercomputers and data centers are different beasts, but what they have in common is a dependence on massive numbers of processors, and a perpetual need for more and faster processing.

But with faster and more both running out of steam, what's the solution? Intel and Mellanox both propose making interconnect more efficient, but disagree on how. Intel proposes its Omni-Path Architecture (OPA). Mellanox, meanwhile, is pegging its future to intelligent interconnect. (See Mellanox Capitalizes on 25G Transition.)

The bottleneck is subtext for some of the activity in other corners of the server chip market. It is said to be one of the reasons why Google (Nasdaq: GOOG) is looking at alternatives to Intel processors for its data centers. Three years ago, Google entertained the possibility of designing its own data center chips, though it never commercialized anything. Earlier this year, rumors were widespread that Google was looking at Qualcomm's server chips, which to this day Qualcomm Inc. (Nasdaq: QCOM) is showing only to the largest potential customers. Qualcomm is one of the companies looking to compete with Intel with ARM-based server chips. Cavium Inc. (Nasdaq: CAVM) is another. (See Cavium Debuts SoC for Data Center Servers.)

Intel's OPA is an alternative to InfiniBand interconnect that takes a fabric approach to scaling up nodes. The new architecture encompasses PCIe adapters, silicon, switches, cables, and management software. OPA boasts latencies roughly on par with EDR InfiniBand, Intel documents say, but adds other improvements to overall network efficiency, such as link-level error correction and traffic optimization that gives preference to higher-priority traffic.

Intel announced the architecture last year, and earlier this summer launched yet another version of its Xeon processor, the Knights Landing Xeon Phi, which is optimized to support OPA.

Intel made the Knights Landing Xeon Phi available in India for the first time today (Sept. 26), hoping to attract HPC business in that country. Intel previously announced that several supercomputing centers (e.g., the Texas Advanced Computing Center, the Pittsburgh Supercomputing Center) are using the architecture. More recently, Intel and Fujitsu announced that the latter is building a supercomputer based on Knights Landing and OPA for the Joint Center for Advanced High Performance Computing (JCAHPC) in Japan; that project is scheduled for completion this December.

Mellanox is proposing to make processors more efficient by relieving them of certain housekeeping tasks and -- just as importantly if not more so -- doing some of the calculations away from CPUs.

That of course plays to the company's strengths as a leading supplier of data center interconnect, but Mellanox Vice President of Marketing Gilad Shainer insists Mellanox's approach is the only way to truly solve the bottleneck.

In data centers, almost every server -- itself an aggregation of multicore CPUs -- is perpetually fetching data from multiple others and constantly aggregating and analyzing it.

What Mellanox proposes is to analyze data wherever it is, as the data moves, by putting some processing capability in the interconnect: in the switch network and in NICs. That processing would have to run on the switch ASIC itself rather than on an added CPU, because adding a CPU just moves the problem instead of solving it.




The first step that Mellanox is taking, Shainer told Light Reading, is moving algorithms for data aggregation and data reduction off the CPUs.

"Now you can do things much faster because you don't need to wait for data. Also, you minimize the amount of data that has to move," Shainer said.

He said offloading those specific algorithms can accelerate them, yielding a 10X improvement.
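The idea of moving aggregation off the CPUs can be sketched with a toy model. This is not Mellanox's implementation: the switch tree, the `radix` parameter, and every function name below are hypothetical, and the code only counts how many vectors must be combined at any single point.

```python
from functools import reduce
from typing import List, Tuple

Vector = List[float]


def add_vectors(a: Vector, b: Vector) -> Vector:
    """Element-wise sum, standing in for any data-reduction operation."""
    return [x + y for x, y in zip(a, b)]


def host_reduce(node_data: List[Vector]) -> Tuple[Vector, int]:
    """Every other node ships its vector to one CPU, which does all the summing.

    Returns the total and the number of vectors that CPU had to receive.
    """
    total = reduce(add_vectors, node_data)
    return total, len(node_data) - 1


def switch_reduce(node_data: List[Vector], radix: int) -> Tuple[Vector, int]:
    """Hypothetical switches each sum up to `radix` inputs and forward one result.

    Returns the total and the largest number of vectors combined at one switch.
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    level = list(node_data)
    max_fan_in = 0
    while len(level) > 1:
        # Group the current level's vectors under the next tier of switches.
        groups = [level[i:i + radix] for i in range(0, len(level), radix)]
        max_fan_in = max(max_fan_in, max(len(g) for g in groups))
        # Each switch forwards a single partial sum upward.
        level = [reduce(add_vectors, g) for g in groups]
    return level[0], max_fan_in


if __name__ == "__main__":
    data = [[float(i), 1.0] for i in range(64)]  # 64 hypothetical nodes
    host_total, host_received = host_reduce(data)
    net_total, net_fan_in = switch_reduce(data, radix=8)
    print("same result:", host_total == net_total)
    print("vectors received by the aggregating CPU:", host_received)
    print("largest fan-in at any switch:", net_fan_in)
```

In this model, with 64 nodes and a radix of 8, the aggregating CPU has to receive 63 vectors, while no switch in the tree combines more than 8 and only one result travels up each link.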

"You will see a growing number of data algorithms moving away from the CPU. The network will not become a general purpose CPU, because that's what you have a CPU for. But you can move some of these algorithms," he said.

Another avenue to pursue is to start using in-network memory. "You start having pieces of information so that it's available to all processors," Shainer noted. Commonly used data can be stored in multiple places (perhaps in NICs) so that it is more readily available.

Mellanox's approach is not necessarily inimical to Intel's. It can be accomplished without the participation of CPU vendors, though it would work much better if CPU vendors were to provide certain interfaces that would make the process run more smoothly.

— Brian Santo, Senior Editor, Components, T&M, Light Reading

HeadIT22387,
User Rank: Light Beer
10/3/2016 | 3:25:30 AM
Data Center Interconnect
Trends and directions in data center interconnect have been changing, and this article was certainly helpful to me. Thanks for this informative article.