Virtual Machine Fabric Extender Performance
EXECUTIVE SUMMARY: Cisco UCS's Virtual Machine Fabric Extender (VM-FEX) consistently delivers higher network performance than virtual distributed switch installations.
In a standard Local Area Network, hosts, laptops and PCs typically connect to a Layer 2 switch that aggregates the physical stations before handing them off to a router. Communication between two hosts on the same LAN happens directly, without involving the router. Similarly, a virtual switching instance passes traffic either to a VM sitting on the same hardware or out through the physical port. Virtual switches (such as the Cisco Nexus 1000v or VMware's vNetwork Distributed Switch) do their work in software, and therefore take resources away from the virtual machines hosted on the blade, reducing the amount of resources available to the VMs.
Cisco's Nexus 1000v has a rich set of capabilities such as VLAN aggregation, forwarding policies and security. Cisco, however, found that not all VM installations require these features, and in such cases it makes sense to reclaim the resources consumed by the virtual switch and put them toward customer workloads.
Cisco claimed that its Virtual Machine Fabric Extender (VM-FEX) in VMDirect mode replaces the virtual switch and yields a significant gain in CPU performance for network-intensive applications. VM-FEX, installed on VMware ESXi 5.0, sends all VM traffic directly out the UCS's Virtual Interface Card (VIC). This means more traffic on the blade's physical network interface, but reduced CPU usage, which is typically the VM bottleneck. To verify that VM-FEX really frees up CPU resources, we ran a series of tests comparing a VM-FEX-enabled UCS blade to a Nexus 1000v virtual switch setup. The two UCS blade installations were identical in every respect except that one used VM-FEX and the other the Nexus 1000v.
We began by comparing the performance of the two setups using Ixia's virtual tools. We installed four Ixia IxNetwork VMs on each of the two UCS blades and sent 3,333 Mbit/s of traffic from each of the first three VMs toward the fourth for 120 seconds using 1,500-byte frames. In the VM-FEX case we recorded 2.186 percent frame loss, while in the distributed switch environment we recorded 16.19 percent frame loss.
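The arithmetic behind those loss figures is straightforward. The sketch below reproduces it; the traffic parameters (three senders at 3,333 Mbit/s, 1,500-byte frames, 120 seconds) come from the test description, while the helper functions themselves are our own illustration, not Ixia's API.

```python
# Illustrative helpers reproducing the frame-loss arithmetic of the test.
FRAME_BYTES = 1500
SENDERS = 3
RATE_MBPS = 3333
DURATION_S = 120

def offered_frames(rate_mbps: float, frame_bytes: int, duration_s: int) -> int:
    """Frames a single sender emits at a given line rate."""
    frames_per_second = (rate_mbps * 1_000_000) / (frame_bytes * 8)
    return int(frames_per_second * duration_s)

def loss_percent(sent: int, received: int) -> float:
    """Frame loss as a percentage of the frames offered."""
    return (sent - received) / sent * 100.0

# Roughly 100 million frames offered across the three senders.
total_sent = SENDERS * offered_frames(RATE_MBPS, FRAME_BYTES, DURATION_S)
print(f"frames offered: {total_sent}")
print(f"sample loss: {loss_percent(100000, 97814):.3f}%")
```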
We expected loss in both cases, given the nearly 10 Gbit/s load we were transmitting in the virtual space; a load that high was needed to truly keep the CPU busy. From this initial result we deduced that the VM-FEX environment consumed fewer resources, which is why its frame loss was smaller than that of the virtual distributed switch setup.
For the next test we installed one IxLoad VM on each of the two blades. We configured both IxLoad VMs as HTTP clients requesting traffic from a Web server that Cisco configured. The emulated clients were set up to consume as much bandwidth as possible by repeatedly requesting 10 different objects from 10 URLs. The VM-FEX setup reached 9.87 Gbit/s while the distributed switch reached 7.78 Gbit/s. CPU usage was also significantly higher in the virtual distributed switch setup than in the VM-FEX setup.
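The shape of this test can be sketched in a few lines: a client fetches objects in a tight loop for a fixed duration and the achieved goodput is derived from the bytes received. Everything in the sketch below (a local stand-in server, the 64 KiB object size, the one-second duration) is our own simplification; the real test used IxLoad VMs against Cisco's Web server.

```python
# Minimal single-host sketch of an HTTP goodput test (stand-in for IxLoad).
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OBJECT = b"x" * 65536  # 64 KiB stand-in object

class ObjectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(OBJECT)))
        self.end_headers()
        self.wfile.write(OBJECT)

    def log_message(self, *args):  # keep the output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ObjectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/obj"

def fetch_loop(duration_s: float) -> float:
    """Fetch the object repeatedly; return the achieved rate in Mbit/s."""
    received = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(url) as resp:
            received += len(resp.read())
    return received * 8 / duration_s / 1_000_000

rate = fetch_loop(1.0)
print(f"goodput: {rate:.1f} Mbit/s")
server.shutdown()
```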
Using the Ixia test tools we recorded the performance difference we expected. Cisco then recommended a test that relies more heavily on the Storage Area Network (SAN). For this test, Cisco helped us set up 10 VMs in each of the two setups and install IOmeter on each virtual machine. IOmeter was configured to read blocks from an iSCSI-based SAN as fast as it possibly could. We manually started each of the 20 IOmeter instances and, after 10 minutes, manually stopped each of them. We then looked at three statistics -- Input/Output Operations per Second, data rate, and average response time -- each averaged across the 10 VMs in each setup. VM-FEX performance was indeed higher on all three metrics. The data is shown in the graph below:
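For readers unfamiliar with IOmeter's output, the three statistics are easy to derive from a raw read loop. The sketch below is a rough local stand-in, not IOmeter itself: it reads fixed-size blocks from a temporary file for one second (the real test read from an iSCSI LUN for 10 minutes, with a block size we did not record) and computes IOPS, data rate, and average response time the same way.

```python
# Local stand-in for an IOmeter-style sequential read workload.
import os
import tempfile
import time

BLOCK = 4096  # 4 KiB reads; an assumed block size, not the one used in the test

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(BLOCK * 256))  # 1 MiB scratch file
    path = f.name

ops = 0
total_bytes = 0
latencies = []
start_time = time.monotonic()
deadline = start_time + 1.0
with open(path, "rb") as f:
    while time.monotonic() < deadline:
        t0 = time.monotonic()
        data = f.read(BLOCK)
        if not data:        # wrap around at end of file
            f.seek(0)
            continue
        latencies.append(time.monotonic() - t0)
        ops += 1
        total_bytes += len(data)
elapsed = time.monotonic() - start_time

iops = ops / elapsed
data_rate_mbps = total_bytes / elapsed / 1_000_000
avg_response_ms = sum(latencies) / len(latencies) * 1000
print(f"IOPS={iops:.0f}  rate={data_rate_mbps:.1f} MB/s  avg={avg_response_ms:.4f} ms")
os.remove(path)
```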
We were still curious what the difference would be when running a common task on a single VM. We wrote a script that used the open source program mplayer to encode a DVD image file stored on the SAN into MPEG (for private use, of course). We wrote two versions of the script, one of which performed an additional round of encoding. The results showed that fetching blocks off the network-attached DVD image was not especially resource intensive: the VM-FEX setup required only marginally less time to perform the encoding than the virtual distributed switch setup.
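The timing harness for such a comparison is simple to sketch. The exact encoder command line from our script is not reproduced here, so `ENCODE_CMD` below is a harmless placeholder; only the timing structure (one pass versus an extra round) mirrors the test.

```python
# Sketch of a wall-clock timing harness for the encoding comparison.
import subprocess
import sys
import time

# Placeholder standing in for the actual mplayer/mencoder invocation,
# which is not reproduced here.
ENCODE_CMD = [sys.executable, "-c", "pass"]

def timed_passes(cmd, passes: int) -> float:
    """Run the encoding command `passes` times; return total wall time in seconds."""
    start = time.monotonic()
    for _ in range(passes):
        subprocess.run(cmd, check=True)
    return time.monotonic() - start

one_pass = timed_passes(ENCODE_CMD, 1)  # first script version
two_pass = timed_passes(ENCODE_CMD, 2)  # second version: an additional round
print(f"one pass: {one_pass:.2f} s, two passes: {two_pass:.2f} s")
```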
Perhaps the most interesting metric was not performance but CPU utilization: how much of the CPU was used for the operation, and how much was left over for other operations and other users? As shown below, the VM-FEX setup used far less CPU in all cases. This was expected, since it skipped an entire layer of virtual switching, and this was, after all, exactly what Cisco wanted to demonstrate.