EXECUTIVE SUMMARY: FabricPath, using sixteen 10-Gigabit Ethernet links throughout the topology, forwarded 292.8Gbit/s of net traffic in the data center while providing resiliency with sub-200-millisecond outage times and absorbing bursty traffic.
More and more services are moving to the cloud, and more and more tools are enabling them to do so. The New York Times recently reported that, while the economy remains slow, data centers are booming and companies are reporting cloud-related growth each quarter. (See Cisco Sees 12-Fold Cloud Growth.) How will the infrastructure support all this growth? Virtualization is only part of the story. What about the network? Will standard bridging, link aggregation and Spanning Tree do the trick?
Not really. Cisco and other interested parties are contributing to new standardized protocols such as TRILL -- Transparent Interconnection of Lots of Links. In our test bed, we calculated massive traffic requirements to and from the virtual machines. Cisco configured its solution, FabricPath, which incorporates TRILL and other Cisco technology, to scale the number of paths in the network, scale the bandwidth in the data center and lower out-of-service times in case of a failure. Cisco also wanted to quantify the buffering power of its latest "fabric extender," which goes hand in hand with the FabricPath architecture. We looked at each claim in turn.
Given the scale of the test, and since no UCS was included in the setup, we used hardware-based Ixia tools running IxNetwork to emulate all hosts. The Ixia test equipment was directly connected to Nexus 5548 switches. Each of these switches had sixteen physical connections to each upstream end-of-row switch -- the Nexus 7010. Most traffic originated within the data center, as would normally be the case. We ultimately emulated 14,848 hosts spread across 256 VLANs behind the four Nexus 5548 switches, exchanging a total of 273.9Gbit/s of traffic in pairs, while also sending 9.2Gbit/s toward the emulated users located outside the data center (283.1Gbit/s in total), who in return sent back 9.7Gbit/s to emulate requests and uploads. This added up to 292.8Gbit/s traversing the FabricPath setup for ten minutes without a single lost frame.
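As a sanity check, the reported per-flow figures do add up to the aggregate; a quick sketch using only the numbers from the test:

```python
# Back-of-the-envelope check of the aggregate offered load,
# using the figures reported in the test.
east_west = 273.9        # Gbit/s, host-to-host pairs inside the data center
north_south_out = 9.2    # Gbit/s, toward emulated users outside
north_south_in = 9.7     # Gbit/s, requests/uploads coming back in

outbound_total = east_west + north_south_out
total = outbound_total + north_south_in
hosts_per_vlan = 14_848 / 256   # emulated hosts spread across 256 VLANs

print(f"{outbound_total:.1f} Gbit/s")  # 283.1 Gbit/s leaving the hosts
print(f"{total:.1f} Gbit/s")           # 292.8 Gbit/s across FabricPath
print(f"{hosts_per_vlan:.0f} hosts/VLAN")  # 58 hosts per VLAN
```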
The total FabricPath capacity per direction was 320Gbit/s, given the number of links being hashed across. Our traffic could not fill the full 320Gbit/s, but it was still a hefty load. Below we have graphed the latency as well as the load distribution within the network (as reported via Cisco's CLI) to show how evenly the hashing algorithm distributed the load.
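The even distribution comes from hashing each flow's identifiers to pick an uplink, so many independent flows spread statistically across all links. FabricPath's actual hash inputs and function are Cisco-specific; this is only a minimal sketch of the idea, with an assumed SHA-256-based hash and made-up flow tuples:

```python
import hashlib
from collections import Counter

NUM_LINKS = 16  # sixteen equal-cost links, as in this setup

def pick_link(src, dst, src_port, dst_port):
    """Hash the flow identifiers to choose one uplink.
    Illustrative only -- the real FabricPath hash is Cisco-specific."""
    key = f"{src}|{dst}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_LINKS

# Many distinct flows should land roughly evenly across the links.
loads = Counter(
    pick_link(f"10.0.{i % 256}.{i // 256}", "10.1.0.1", 10_000 + i, 80)
    for i in range(50_000)
)
print(min(loads.values()), max(loads.values()))  # both close to 50000/16 = 3125
```

Because the link choice is a pure function of the flow tuple, all frames of one flow stay on one link (preserving ordering), while the aggregate load balances out.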
Now that we had measured performance, what happened upon link failure? Cisco claimed we should see shorter outages than those experienced during failures in Spanning Tree networks. Measuring the out-of-service time caused by a failure in our scenario was far from straightforward. The main testing obstacle was FabricPath's strength -- traffic distribution by hashing -- which makes the path any given flow takes unpredictable from the outside. We therefore created an additional minimal-load traffic flow -- one user -- at 10,000 frames per second and tracked its physical path through the FabricPath domain. Once we found the link, we physically pulled it out and plugged it back in, three times, while traffic was running. The link failure results are shown below. When we reinserted the link, in all three cases, zero frames were lost.
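The tracer-flow method converts a frame-loss count directly into an outage time: at a constant known rate, each lost frame accounts for a fixed slice of time. A minimal sketch -- only the 10,000 frames-per-second rate comes from the test; the loss count below is hypothetical:

```python
# Out-of-service time derived from frame loss on a constant-rate tracer flow.
TRACER_FPS = 10_000  # frames per second on the tracked flow

def outage_ms(lost_frames: int) -> float:
    """Each lost frame represents 1/10,000 s = 0.1 ms of outage."""
    return lost_frames / TRACER_FPS * 1_000

# A hypothetical count of 1,730 lost frames would correspond to 173 ms:
print(outage_ms(1_730))  # 173.0
```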
Finally, we wanted to validate one of Cisco's claims regarding its Fabric Extender, or FEX, a standard part of installations in data centers that retain legacy Gigabit Ethernet links.
Cisco explained that the FEX is an interface card that is not located in the Nexus 5548 chassis but in its own chassis -- in this case the Nexus 2248. This allows the card to be placed at the top of a data center rack, for example, as an extension of the end-of-row switch. When more ports are needed across a long distance, operators need not invest in a new top-of-rack switch, just a new card, thus extending existing top-of-rack or end-of-row switches. Although this card is designed for Gigabit Ethernet-based servers, it is likely to be used in data centers that also run 10-Gigabit Ethernet. Thus, Cisco explained, it was important to design the card with large buffers to accommodate bursty traffic arriving from a 10-Gigabit Ethernet port and destined for a Gigabit Ethernet port.
How bursty could that traffic be in reality?
We connected the appropriate Ixia test equipment as shown in the diagram. We configured different burst sizes on the Ixia equipment until we found the largest burst that just passed through the FEX without loss. We set the inter-burst gap to a large value -- 300 milliseconds -- so we could send bursts continuously without the individual bursts affecting each other. We repeated this procedure twice, once using IMIX (7:70, 4:512, 1:1500) frames and once using 1,500-byte frames. The largest burst size that passed without loss was 28.4 MB for IMIX and 13.4 MB for 1,500-byte frames. Both tests ran for three minutes without any loss. Latency was high, as expected, since the buffers were in use: from 3.0 microseconds to 98.8 milliseconds for the IMIX bursts, and from 3.2 microseconds to 204.5 milliseconds for the bursts of 1,500-byte frames.
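The search procedure itself is a simple bracketing exercise: keep a burst size known to pass and one known to fail, and narrow the gap. The real test was driven manually on Ixia IxNetwork; the sketch below models the FEX as a plain fixed-size buffer and uses a hypothetical `send_burst` stand-in for a tester run:

```python
# Sketch of the burst-size search: find the largest burst that drains
# through the FEX without loss. `send_burst` is a hypothetical stand-in
# for a traffic-generator run; the FEX is modeled as a fixed-size buffer.

def send_burst(burst_bytes: int, buffer_bytes: int) -> bool:
    """Stand-in for a tester run: True if the burst passed without loss."""
    return burst_bytes <= buffer_bytes

def largest_lossless_burst(lo: int, hi: int, buffer_bytes: int) -> int:
    """Binary search between a passing (lo) and failing (hi) burst size."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if send_burst(mid, buffer_bytes):
            lo = mid
        else:
            hi = mid
    return lo

# With an illustrative 13.4 MB effective buffer (the 1,500-byte result):
buf = int(13.4 * 1024 * 1024)
print(largest_lossless_burst(0, 64 * 1024 * 1024, buf) == buf)  # True
```

Halving the interval each run converges in roughly log2 of the search range steps, which is why a handful of tester runs suffices to pin down the buffer limit.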