Terabit in Action at SC14
The Terabit Demonstrator Project will showcase a 1 Terabit (10 x 100GigE) data path over about 1,000 kilometers at SC14, the Supercomputing Conference, in New Orleans. The project focuses on demonstrating terabit applications that demand extremely high data rates and stringent QoS, along with the application of new SDN approaches and virtualized network functions (VNFs).
The project follows last year's demonstration. In June 2013 at ISC13 in Leipzig, Germany, a collaboration of research institutions and industrial partners led by T-Systems International GmbH demonstrated the feasibility and benefits of a 400Gbit/s data path, including DWDM, routing/switching, computing, storage and applications. (See The 400Gbit/s Demonstrator.)
This activity has been re-launched, with some of the 2013 partners leaving and some new ones, including Brocade, ADVA Optical Networking and Intel Security, joining for the Terabit project.
At the ZIH TU-Dresden booth (#2323), there will be a live demonstration of project results. In addition, multiple project partners -- the High Performance Computing Center Stuttgart (HLRS) of the University of Stuttgart (booth 2749), Juelich Supercomputing Centre (booth 639), DKRZ (booth 603), Brocade (booth 2145), Bull (booth 3331), NEC Corp. (booth 1131), SGI (booth 915), Mellanox (booth 2939) and Intel Corp./Intel Security (booths 1215/1315) -- will be available to discuss aspects of the Terabit Demonstrator Project.
In support of the project, each partner (see partner list at the end) contributed its assets (hardware, software, manpower, and so on) to:
The partners committed to a certain roadmap, including the demonstration at SC14. After the New Orleans event, each partner is free to join for a next round with a new roadmap.
The current project follows two fundamental ideas:
Innovations to be showcased at SC14
Optical/DWDM: The Terabit Demonstrator makes use of the optical infrastructure provided by the SASER (Safe and Secure European Routing) research project (see map below). The test bed architecture follows an optical full mesh approach, using a physical optical ring to provide a logical full mesh between the edge nodes with a 400GHz band (enough for about 8 x 100Gbit/s in a 50GHz optical grid). "Routing" in this context refers to the connection of each node to any other node using a selected optical frequency or "color," something that could be achieved using SDN methods.
SASER is a multi-vendor project (Alcatel-Lucent, Nokia and ADVA Optical Networking) and is partly funded by the German Federal Ministry of Education and Research (BMBF). For the Terabit Demonstrator, Deutsche Telekom FMED and T-Labs provide two 400GHz bands between Stuttgart and Leipzig, plus a third one for test purposes. These bands are transported via access lines to the endpoints at HLRS (High Performance Computing Center, Stuttgart) and Technical University Dresden.
The optical path between these endpoints runs about 1,000 kilometers and is populated by 10 x 100Gbit/s transponders. ADVA contributed the transponders and the optical line amplifiers (OLAs) for the access lines; the required long-haul dark fibers and OLAs were provided by Telekom Deutschland FMED and T-Labs, respectively (using Alcatel-Lucent technology). ADVA has also successfully tested its latest 33GHz-grid 100Gbit/s transponders on the infrastructure; the narrower grid increases the spectral efficiency of DWDM systems.
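The channel figures quoted for the optical layer can be checked with simple arithmetic. The sketch below is only an illustration: it assumes ideal packing with no guard bands, takes the "33GHz" grid at its nominal value, and uses a hypothetical helper name (channels_in_band) that does not come from the project.

```python
# Back-of-the-envelope check of the DWDM channel counts.
# Assumptions (not from the project): ideal packing without guard bands,
# and the nominal grid widths quoted in the text.

def channels_in_band(band_ghz: float, grid_ghz: float) -> int:
    """Number of whole channel slots of width grid_ghz that fit in band_ghz."""
    return int(band_ghz // grid_ghz)

BAND_GHZ = 400.0          # width of one optical band in the test bed
RATE_PER_CHANNEL = 100    # Gbit/s carried by each transponder

for grid_ghz in (50.0, 33.0):
    n = channels_in_band(BAND_GHZ, grid_ghz)
    print(f"{grid_ghz:.0f} GHz grid: {n} channels, "
          f"{n * RATE_PER_CHANNEL} Gbit/s per 400 GHz band")
```

Under these assumptions a 400GHz band holds 8 channels on a 50GHz grid and 12 channels on a 33GHz grid, which is the spectral-efficiency gain the narrower grid is meant to deliver.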
NFV: Security is a must for high-performance computing (HPC) infrastructure, especially in the case of intensive external communication. Because the project follows a linear, modular approach, and modular firewall appliances that scale to Tbit/s are not commercially available (and would be far too expensive in "real life"), the project decided to implement a scalable KVM-based approach.
The result was quite surprising: A virtual firewall from Intel Security (formerly McAfee) running on a single core (out of 40) of a Haswell dual-socket server can fully serve a 40Gbit/s data stream. This has been tested only in an HPC-typical scenario with 8k jumbo frames (which might not be too realistic for enterprise applications, for example).
The Haswell core processor architecture seems to have much better support for this kind of application (virtualization and packet transport) than its predecessor, Ivy Bridge.
All NFV-related activities have been undertaken at HLRS.
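The caveat about 8k jumbo frames matters because a firewall's work tends to scale with the number of packets rather than the number of bytes. The following sketch estimates packet rates at 40Gbit/s for different frame sizes. It rests on assumptions that are not from the project: "8k" is read as an 8,192-byte payload, frames are untagged Ethernet, and the function name is hypothetical.

```python
# Rough packet-rate estimate for a 40 Gbit/s stream at different frame sizes.
# Assumptions: untagged Ethernet, "8k" interpreted as an 8192-byte payload.

# Per-frame overhead on the wire: preamble + SFD (8), MAC header (14),
# frame check sequence (4) and inter-frame gap (12).
ETH_OVERHEAD_BYTES = 38

def packets_per_second(rate_bit_s: float, payload_bytes: int) -> float:
    """Frames per second that fill rate_bit_s with the given payload size."""
    wire_bytes = payload_bytes + ETH_OVERHEAD_BYTES
    return rate_bit_s / (wire_bytes * 8)

RATE = 40e9  # the 40 Gbit/s stream from the firewall test

for label, payload in (("8k jumbo", 8192), ("standard", 1500), ("minimum", 46)):
    mpps = packets_per_second(RATE, payload) / 1e6
    print(f"{label:>9} ({payload} B payload): {mpps:.2f} Mpps")
```

With these numbers, the jumbo-frame case needs roughly 0.6 million packets per second, while standard 1,500-byte frames would need more than five times as many, which is why the single-core result may not carry over directly to enterprise traffic.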
SDN: Of the various aspects of SDN, an orchestration use case has been selected for the Terabit Demonstrator Project. At the Technical University Dresden, a bandwidth management application for the Terabit link between Dresden and Stuttgart has been implemented.
In the chosen setup, users (tenants) can configure their requested bandwidth via a web-based user portal. The requested bandwidth is provided for an individual user/application by automated configuration of related QoS profiles. This approach enables all users to use the Terabit link and share the available bandwidth on a best-effort basis. Any user/application with a need for a specific and guaranteed bandwidth can enter their request via a user portal, which configures a specific QoS profile on the Brocade network nodes and ensures that traffic is prioritized.
The developed application uses the NETCONF interface (based on a YANG data model) on the Brocade network nodes to configure the QoS profiles. It also tracks all bandwidth reservations and avoids over-booking.
The application also contains a traffic monitoring (traffic visualization) capability to showcase the impact of traffic management and to monitor link utilization (see chart below). A live demonstration of these capabilities will be shown at booth 2323 (TU-Dresden).
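The reservation tracking described above can be pictured as a simple admission check. The sketch below is not the project's code: the class name, method names and tenant labels are hypothetical, and the step that would push a QoS profile to the switches via NETCONF is deliberately left out because the Brocade data model is not described here.

```python
# Minimal admission-control sketch for guaranteed-bandwidth reservations.
# All names are hypothetical; the NETCONF configuration step is omitted.
from dataclasses import dataclass, field

@dataclass
class LinkReservations:
    capacity_gbit: float                          # total link capacity
    reserved: dict = field(default_factory=dict)  # tenant -> guaranteed Gbit/s

    def available(self) -> float:
        """Capacity not yet promised to any tenant."""
        return self.capacity_gbit - sum(self.reserved.values())

    def request(self, tenant: str, gbit: float) -> bool:
        """Grant or update a guarantee only if the link is not over-booked."""
        if gbit <= 0:
            raise ValueError("requested bandwidth must be positive")
        current = self.reserved.get(tenant, 0.0)
        if gbit - current > self.available():
            return False                          # would over-book the link
        self.reserved[tenant] = gbit
        return True

    def release(self, tenant: str) -> None:
        """Return a tenant's guarantee to the shared pool."""
        self.reserved.pop(tenant, None)

link = LinkReservations(capacity_gbit=1000.0)
print(link.request("climate", 600.0))     # True: fits into the free capacity
print(link.request("rendering", 500.0))   # False: only 400 Gbit/s remain
print(link.request("rendering", 400.0))   # True: exactly fills the link
```

In a setup like the one described, a granted request would presumably trigger the QoS configuration on the network nodes, while traffic without a reservation keeps sharing whatever is left on a best-effort basis.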
Data Path: The chart below provides an overview of the complete Terabit Demonstrator topology.
Via the SASER optical network, 10 x 100G links are provided. The handover is done via 100G-SR10 optical interfaces and related MPO fiber cabling.
The core data path is established with two Brocade VDX8770 devices (high-performance chassis-based data switches), which have CLI, SNMP, NETCONF and REST API interfaces. For the Terabit Demonstrator project, a 4-slot chassis has been chosen with 2 x 6 port 100G blades and 2 x 18/27 port 40G blades in each node. These switches provide the Ethernet switching and IP routing capabilities. The 10 x 100G links form a Terabit data path using link aggregation (vLAG). Load balancing within the vLAG is flow based.
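Flow-based load balancing generally means that a hash over packet header fields selects the member link, so every packet of a flow takes the same link and stays in order. The sketch below illustrates the idea only: the choice of CRC32, the 5-tuple key and the function name are assumptions, and the hash actually used by the switches is not described in this article.

```python
# Illustration of flow-based link selection in a 10-member link aggregation.
# The hash function and key fields are assumptions, not the switch's algorithm.
import zlib
from collections import Counter

N_LINKS = 10  # the 10 x 100G member links

def pick_link(src_ip, dst_ip, proto, src_port, dst_port, n_links=N_LINKS):
    """Map a flow's 5-tuple to one member link; same flow, same link."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links

# 1,000 hypothetical TCP flows that differ only in their source port.
flows = [("10.0.0.1", "10.0.1.1", 6, 40000 + i, 5001) for i in range(1000)]
per_link = Counter(pick_link(*flow) for flow in flows)

for link in range(N_LINKS):
    print(f"link {link}: {per_link[link]} flows")
```

One consequence of this scheme is that a single flow cannot exceed the capacity of one member link (100Gbit/s here), so filling the full terabit path requires many parallel flows, which is where multistream transfer tools and aggregated applications come in.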
On each side there are 28 servers -- from Bull at TU-Dresden and from NEC at HLRS -- that form the main computing resources of the project. The fastest and most efficient way to connect servers to Ethernet-based infrastructure is using Mellanox's 40GbE HCAs and related active cables, which the company contributed to the project.
An additional motivation for T-Systems to participate in the project is its planned cloud offering for scientific computing. For that purpose, a prototype extension of the T-Systems cloud environment was set up, based on a NUMA (non-uniform memory access) system contributed by SGI and several UCS-managed x86 nodes from Cisco, equipped with Nvidia GPU cards and a storage solution from EMC Corp.
The aim of this extension, using a standard virtualized cloud-environment (in this case based on OpenStack), is to cover a broad variety of scientific applications that require a large amount of memory but that do not scale into the hundreds and thousands of CPU-cores. Using the cloud environment, applications that are not explicitly programmed for HPC can benefit from parallel scaling and large memories, which is very interesting for a vast variety of CAE (computer aided engineering) and Life-Science applications such as genome assembly and analysis, as well as in-memory data-analytics and geo-informatics.
Together with the extremely fast network, totally new business models can be demonstrated. For example, using current approaches in medical research, genome material is sequenced in a lab and the result (about 1TByte of data) is written to a hard disk. The disk is transported via a parcel service and the data is processed in a server environment. The results (assembled genome and/or analysis) are small enough to e-mail back to the customer (hospital, pharmaceutical industry, lab etc.). With a linear-scaling network, the whole workflow can shrink from days to hours or minutes, which enables totally new medical applications and could even save lives. An appropriate cloud infrastructure coupled with an extremely high-speed network is part of the infrastructure required for the personalized medicine of the future.
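To put the genome example in numbers, the sketch below computes the ideal transfer time for about 1TByte at several link speeds. It assumes a decimal terabyte, the full line rate and no protocol or storage bottlenecks, so real transfers would take longer; the function name and the efficiency parameter are illustrative only.

```python
# Ideal transfer time for the ~1 TByte genome data set at various link rates.
# Assumptions: decimal terabyte, full line rate, no protocol or disk limits.

def transfer_seconds(size_bytes: float, rate_bit_s: float,
                     efficiency: float = 1.0) -> float:
    """Seconds needed to move size_bytes at rate_bit_s (scaled by efficiency)."""
    return size_bytes * 8 / (rate_bit_s * efficiency)

GENOME_BYTES = 1e12  # about 1 TByte of sequencing output

for label, rate in (("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10),
                    ("100 Gbit/s", 1e11), ("1 Tbit/s", 1e12)):
    print(f"{label:>10}: {transfer_seconds(GENOME_BYTES, rate):6.0f} s")
```

At 1Gbit/s the data set needs more than two hours on the wire; at 1Tbit/s the ideal figure drops to 8 seconds, which is the kind of change that replaces the parcel service in the workflow described above.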
Applications: In addition to the scenarios already mentioned, the project has incorporated a broad variety of bandwidth-intensive applications that, while not each requiring a sustained 1Tbit/s pipe, could require such a connection once aggregated. These applications all benefit from remote resources such as data, display or specialized analysis hardware and can enable quantum leaps in scientific progress from the linear-scaling bandwidth approach. The SDN-based orchestration approach provides each application with the QoS profile it demands and therefore allows sharing of linear scaling network resources.
For file transfers, UFTP (UNICORE-FTP) software is used. UFTP is developed as part of the GRID-Middleware UNICORE (Uniform Interface to Computing Resources) initiative. Major features of UFTP in this context are multithreading, multistreaming, data encryption and user authentication. The exploitation of UFTP in the context of this project is the result of cooperation between T-Systems SfR GmbH, Juelich Supercomputing Centre (booth 639) and DKRZ Deutsches Klimarechenzentrum (booth 603).
Applications to be demonstrated at SC14 are:
Remote HPC Application Monitoring & Analysis (TU-Dresden, booth 2323)
Remote Rendering (HLRS, booth 2749)
Climate Computing (DKRZ, booth 603)
See you at SC14!
— Eduard Beier, Process Systems Engineer, T-Systems International