Metro Edge Router Test
Edge Router Test Highlights: * Great VPN Delivery * Great Scaleability * Slight edge to Laurel
December 10, 2002
Who says the Internet boom is over? Carriers may still be caught in a capex crunch, and Wall Street may still be nursing a nuclear hangover, but in networking the pace of innovation never slows down. A new breed of box, the metro edge router, gives providers the technology they’ll need to roll out new services and scale those services to new levels. That’s not just marketing hype: These routers actually work pretty much as promised in delivering services like MPLS VPNs, QOS enforcement, and routing scaleability.
Light Reading, along with its testing partners – Network Test of Westlake Village, Calif., a benchmarking and network design consultancy; and Spirent Communications of Calabasas, Calif., a test equipment supplier – has just completed a massive edge router trial. We pounded the Laurel Networks Inc. ST200 and Redback Networks Inc. (Nasdaq: RBAK) SmartEdge 800 in the most rigorous routing test we’ve ever conducted.
Make that ten tests: Our methodology covered everything from basic throughput to resiliency testing to MPLS Layer 2 VPNs. This sizeable undertaking was more than nine months in the making, and involved a staff of more than 40 engineers and project managers (see Thanks).
The good news? Both vendors passed with flying colors.
Among the high points:
Using Layer 3 MPLS VPNs, both boxes emulated more than 2,400 virtual routers and handled hundreds of routes per customer.
Layer 2 MPLS VPNs scaled even higher, with both products forwarding traffic through nearly 40,000 tunnels. As far as we’re aware, both boxes set new records for public tests of MPLS VPN scaleability.
Routing capacity tests for both boxes produced absurdly high numbers, typically one or even two orders of magnitude beyond today’s levels.
If there’s any bad news to be had, it’s that market leaders Cisco Systems Inc. (Nasdaq: CSCO) and Juniper Networks Inc. (Nasdaq: JNPR) didn’t put their products to the test.
We weren’t too surprised about Cisco: Judging from data sheets and anecdotal evidence, its current products wouldn’t have fared well at all with parts of our methodology. We’ve also heard there are better products in the pipeline.
Juniper had no such excuse: In fact, the data sheet for the vendor’s M40e (e as in edge) served as one of our guides in putting together a test methodology. Even more promising, Juniper bought Unisphere Networks, the leading edge router maker, while we were preparing for this project. We looked forward to testing one and possibly two entries from Junisphere.
Juniper actually did agree to take part, but the deluge began soon after it swallowed Unisphere (or was it the other way around?). First, we heard multiple reports of internal strife. Then we got word that Juniper was officially withdrawing from the test (accompanied, of course, by reassurances that all was well internally). Then followed another cycle of reconsideration, but ultimately Juniper sat this one out. Too bad: Clearly it was politics, not products, that prevented Juniper from putting its best foot forward.
We also gathered an impressive collection of excuses from the more than 20 other vendors that opted out (please see No Shows before asking why this vendor or that one isn’t represented here). Some of these vendors simply didn’t have appropriate product; our requirement for OC48 (2.5 Gbit/s) interfaces ruled out several potential players. Others had reasons of their own.
Whatever shortcomings we uncovered in the Laurel and Redback products (and there were very few), both vendors deserve major credit for their willingness to submit their products to public testing.
If we had to choose between the Laurel and Redback routers, we’d give the edge to Laurel’s ST200, while noting that good arguments can be made for either router. Laurel’s product fared better in the baseline throughput and failover tests, and it scaled much higher in the Layer 3 MPLS VPN tests. For its part, Redback’s SmartEdge router did better in some baseline latency and routing scaleability tests (though not by meaningful margins on the latter, in our view). Both boxes turned in excellent results overall.
A summary of all the tests and results is provided in the table below.
Table 1: Results in a Nutshell
Test | Laurel ST200 | Redback SmartEdge 800 |
---|---|---|
Throughput | ||
Percentage of practical maximum rate forwarded without loss, 40-byte IP packets | 100.00% | 98.47% |
Percentage of practical maximum rate forwarded without loss, Internet mix | 99.99% | 99.12% |
Latency | ||
Average delay in microseconds for varying loads, 40-byte IP packets | 28-31 | 18-32 |
Average delay in microseconds for varying loads, Internet mix | 48-1,759 | 27-69 |
Maximum delay in microseconds for varying loads, 40-byte IP packets | 74-99 | 153-2,435 |
Maximum delay in microseconds for varying loads, Internet mix | 637-8,716 | 554-1,452 |
Resiliency | ||
Failover time for 500,000 routes, in milliseconds | 9.85 | 57.19 |
VPN Scaleability | ||
Layer 3 (RFC 2547bis) MPLS VPNs: maximum number of virtual routing and forwarding (VRF) instances | 2,420* | 2,420* |
Layer 3 (RFC 2547bis) MPLS VPNs: maximum number of table entries per customer when supporting 2,420 virtual routers | 900 | 300 |
Layer 2 (Martini draft) MPLS VPNs: Maximum number of tunnels | 39,232* | 39,232* |
Quality of Service | ||
Rate enforcement | Passed | Passed |
Rate shaping | Passed | Passed |
Routing Performance | ||
BGP Routing Information Base (RIB) table capacity: maximum number of routes learned | 5,414,062 | 4,000,000 |
BGP Forwarding Information Base (FIB) table capacity: maximum number of routes advertised | 849,990 | 1,350,000 |
BGP peering capacity: maximum number of concurrent sessions | 1,012 | 1,408 |
OSPF capacity: maximum number of link state advertisements (LSAs) supported | 2,000,064 | 2,600,070 |
IS-IS capacity: maximum number of link state PDUs (LSPs) | 3,394,050* | 3,394,050* |
Detailed results follow in subsequent pages:
The Metro Edge Router
First Things Fast
Get Stuffed
Dealing With Delay
Failover & Resiliency
MPLS: Scaling the Heights
Martini Time
QOS Matters
BGP Basics
BGP RIB Capacity
A Big FIB
BGP Peering
OSPF Capacity
IS-IS Capacity
– David Newman is president of Network Test, an independent benchmarking and network design consultancy based in Westlake Village, Calif. He likes surf guitar and bicycle racing, and his turnoffs include mean people and anonymous troll posts on Light Reading’s message boards. Newman can be reached at [email protected].
Next page: The Metro Edge Router
Edge routing is necessarily a more complex topic than core routing. With core routers, handling huge routing tables and forwarding packets really fast is pretty much the name of the game. There’s a lot more going on at the edge: In a word, it’s all about services.
In addition to handling the same basic routing functions as core devices, the edge router is also the focal point for provisioning of numerous services. Just to name a few, these may include IP multicast, QOS enforcement, RFC 2547bis and Martini-draft Multiprotocol Label Switching (MPLS) VPNs, IPSec, IPv6-to-IPv4 tunneling, voice over IP, and wireless networking. As the (hopefully) apocryphal product manager once said, edge routers “provide support for a broad array of acronyms.”
Another major distinguishing feature of edge routers is the much larger variety of interfaces on offer (see Table 2). Core routers tend to have similar interfaces: All OC48, say, or all Gigabit Ethernet. It’s far more common to equip edge routers with a mix of low- and high-speed interfaces. In fact, it’s not uncommon to see edge routers with lots of DS1 (1.5 Mbit/s) or DS3 (45 Mbit/s) interfaces, perhaps paired with Ethernet, Asynchronous Transfer Mode (ATM), and Sonet (Synchronous Optical NETwork) and SDH (Synchronous Digital Hierarchy) POS interfaces, all on the same chassis.
In this project, we spec’d a rather large box for an edge router, with 48 interfaces in all: a combination of DS3, Gigabit Ethernet, and OC48. It wasn’t long ago that edge routers had perhaps two, maybe four, interfaces. That’s still the case for most customer gear.
Inside service provider networks, bigger is better. One way for carriers to boost revenue is to provision more services to more customers on fewer boxes. Would the new edge routers fit the bill? That’s the question we sought to answer in all our tests.
Table 2: Edge Routing Features
Feature | Laurel Networks | Redback Networks |
---|---|---|
Phone | 412-809-4200 | 408-750-5000 |
URL | www.laurelnetworks.com | www.redback.com |
Product | ST200 | SmartEdge 800 Router |
Software version tested | 2.4 | 2.3.3.0.105 |
RAM on system tested | 1 Gbyte | 768 Mbytes plus 1 Gbyte microdrive |
INTERFACES | ||
Line cards available | ||
10Base-T | X | |
100Base-T | X | |
Gigabit Ethernet | X | X |
ATM OC-3 | X | X |
ATM OC-12 | X | X |
ATM OC-48 | X | |
POS OC-3 | X | X |
POS OC-12 | X | X |
POS OC-48 | X | X |
POS OC-192 | ||
Serial DS-1 | X | X |
Serial DS-3 | X | X |
Other | All cards listed above support SDH equivalents; OC3/STM1 FR; OC12/STM4 FR; OC48/STM16 FR; OC3/STM1 Ethernet over SONET; OC12/STM4 Ethernet over SONET; OC48/STM16 Ethernet over SONET; OC3/STM1 Any Service Any Port; OC12/STM4 Any Service Any Port; OC48/STM16 Any Service Any Port; OC3/STM1 channelized Any Service Any Port; OC12/STM4 channelized Any Service Any Port; OC48/STM16 channelized Any Service Any Port | ATM DS-3, ChE1, ChSTM1 |
PORT DENSITY | ||
Maximum number of interfaces per 7' telco rack | ||
10Base-T | Not applicable | 576 |
100Base-T | Not applicable | 576 |
Gigabit Ethernet | 256 | 192 |
ATM OC-3 | 512 | 96 |
ATM OC-12 | 128 | 48 |
ATM OC-48 | 32 | |
POS OC-3 | 512 | 384 |
POS OC-12 | 128 | 192 |
POS OC-48 | 32 | 48 |
POS OC-192 | ||
FDDI | ||
T1/E1 | 10,752/8,064 | 16,128/9,072 |
T3/E3 | 1,536 | 1,152 |
Other | 192 ATM DS-3, 144 channelized STM-1 | |
SWITCHING SUPPORT | ||
Capacity of each switch fabric module | 40 Gbit/s | 60 Gbit/s |
Maximum switch fabric capacity per chassis | 80 Gbit/s | 60 Gbit/s |
Does local (same module) traffic cross the router's backplane? | Yes | No |
LAYER 2 SUPPORT | ||
Supported link-layer technologies | ||
802.11a WLAN | ||
802.11b WLAN | ||
ATM AAL5 | X | X |
ATM encapsulations (AAL1/2/5) | X | ATM cell mode |
CR-LDP | ||
Ethernet | X | X |
Ethernet 802.1p/q VLANs | X | X |
FDDI | ||
Frame relay | X | X |
L2TP | X | |
LDP | X | X |
Martini-draft encapsulation | X | X |
PNNI | X | |
PPP (IP-NCP) | X | X |
PPP (IPv6-NCP) | ||
PPPoE | X | |
PPPoEoA | X | |
PPVPNs | X | X |
Serial | X | |
Sonet/SDH | X | X |
xDSL | ||
Other | Ethernet over SONET (X.86), MPLS, RFC1483 routed and bridged | |
LAYER 3 SUPPORT | ||
Supported network-layer technologies | ||
BGP4 | X | X |
BGP4 extensions for IPv6 | ||
DVMRP | ||
GRE | X | |
IGMP | X | X, including IGMPv3 |
IP-in-IP | ||
IPSec AH | ||
IPSec ESP | ||
IPv6 forwarding | X | X |
IS-IS | X | X |
MBGP | X | X |
OSPF for IPv6 | ||
OSPFv2 | X | X |
PIM dense mode | ||
PIM sparse mode | X | X |
RFC 2547 VPNs | X | X |
RIPng | ||
RIPv1 | X | X |
RIPv2 | X | X |
RSVP | X | X |
VRRP | X | |
Other | LDP, draft-martini control plane | SSM |
DEVICE & USER MANAGEMENT | ||
Available management and authentication protocols | ||
CORBA | X | X |
LDAP | ||
SNMPv1 | X | X |
SNMPv2 | X | X |
SNMPv3 | X | |
Radius | X | X |
RMON I | Alarms and events groups | |
RMON II | ||
SSHv1 | X | X |
SSHv2 | X | X |
Telnet | X | X |
TFTP | X | |
DHCP | X | |
Does the router have a monitor or "spy" interface that can capture traffic from one other interface? | Yes | Yes |
Does the router have a monitor or "spy" interface that can capture traffic from multiple interfaces at the same time? | Yes | Yes |
HIGH AVAILABILITY | ||
Redundant devices | ||
Power supplies | X | X |
Switching fabric | X | X |
Line cards | X | X |
Routing module(s) | X | X |
Cooling fans | X | X |
Next page: First Things Fast
The first and arguably most important task in router testing is measuring its basic forwarding and delay characteristics. Baseline measurements have come to be regarded as a fundamental test for all sorts of networking devices, not just edge routers. After all, if a device can’t handle a basic job like forwarding packets at high rates while keeping latency low, its performance likely will be even worse with more advanced tasks.
For the baseline measurements, we set up a test bed representative of the way metro edge routers are deployed in production today (see Figure 1). We asked vendors to supply a device with “customer-side” and “core-side” interfaces. On the customer-facing side, we asked vendors to supply 12 Gigabit Ethernet and 32 DS3 Frame Relay interfaces. On the core-facing side, we asked vendors to outfit their routers with 4 OC48c interfaces.
Using Spirent’s Adtech AX/4000 analyzers, we offered the routers traffic patterns that involved all customer-side interfaces moving traffic to and from all core-side interfaces.
This wasn’t just a few packets here and there. Our test pattern was highly stressful on the routers in a number of ways, including packet size distribution, BGP (border gateway protocol) table size, AS (autonomous system) path length, and prefix length distribution (see Modeling Tomorrow’s Internet).
In the first test, we brought up BGP sessions on all interfaces and advertised nearly 512,000 routes. That’s more than four times the size of the entire Internet routing table today.
Then we offered traffic to every one of those routes, with traffic from all core-side interfaces headed to all customer-side interfaces and vice-versa. Our baseline test used two key metrics: throughput and latency.
Next page: Get Stuffed
In a throughput test, the goal is to find the highest load a device can handle without dropping packets. In theory, the ideal result is line-rate throughput. In practice – especially when DS3 and Sonet interfaces are involved – it may not be possible to get all the way to line rate, though both routers came close.
We used two traffic loads in our baseline tests: 40-byte IP packets, and an “Internet mix” (Imix) of three common packet sizes.
In the tests with 40-byte packets, Laurel’s ST200 had the highest throughput – 42.4 million packets per second (pps), or about 98.13 percent of line rate – compared with 41.8 million pps (about 96.6 percent of line rate) for Redback’s SmartEdge 800 (see Figure 2).
At first glance, it would appear Laurel’s router didn’t achieve true line rate in our tests. That’s only because a factor called “stuffing” made it impossible to reach line rate in our 40-byte tests.
Our test bed involved a mix of DS3, packet over Sonet, and Gigabit Ethernet interfaces. DS3 interfaces often use PPP (point-to-point protocol), and this adds significant overhead in the form of “bit stuffing.”
PPP uses a special flag sequence, 01111110, to mark where one packet ends and another begins. However, long runs of 1s can also occur naturally in other parts of the packet, such as in user data or in IP checksums. To avoid confusion with the true end of the packet, the interface inserts an extra bit set to 0 whenever it sees five consecutive 1s elsewhere. A similar issue exists on the Sonet/SDH interfaces, except in that case the interface inserts an entire byte rather than a bit into the stream.
A bit here and a byte there doesn’t sound like much, but it adds up. In our 40-byte IP packet tests we found that bit- and byte-stuffing reduced the practical maximum throughput by nearly 2 percent. The actual maximum depends on packet contents; for example, a router will bit-stuff like crazy when it sees a pattern of all 1s.
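The bit-stuffing rule described above can be sketched in a few lines of Python. This is only an illustration, assuming HDLC-style framing in which a 0 is inserted after every run of five consecutive 1s; the function names `bit_stuff` and `stuffing_overhead` are ours and do not come from any router or analyzer software.

```python
def bit_stuff(bits: str) -> str:
    """Insert a '0' after every run of five consecutive '1' bits."""
    out = []
    run = 0  # length of the current run of 1s
    for b in bits:
        out.append(b)
        if b == "1":
            run += 1
            if run == 5:
                out.append("0")  # the stuffed bit
                run = 0
        else:
            run = 0
    return "".join(out)


def stuffing_overhead(bits: str) -> float:
    """Fraction of extra bits added by stuffing, relative to the input length."""
    return (len(bit_stuff(bits)) - len(bits)) / len(bits)


# A 40-byte payload is 320 bits. All 1s is the worst case, all 0s the best.
print(stuffing_overhead("1" * 320))
print(stuffing_overhead("0" * 320))
```

With real packet contents the overhead falls somewhere between these two extremes, which is why the practical maximum rate depends on what the packets carry.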
Based on our probability calculations, we believe that Laurel’s throughput of 42.4 million pps represents the practical maximum rate any device could achieve with 40-byte IP packets and the packet contents we used.
Of course, most production networks handle packet sizes other than 40 bytes. Analyses of Internet traffic show an average packet length of around 200 to 400 bytes, with packet sizes tending to cluster around a few key sizes. We picked the top three sizes (40, 1,500, and 576 bytes), and the proportion in which they occur, to create our “Internet mix” (Imix) traffic load.
If the goal of the 40-byte IP packet tests was to find the absolute limit of forwarding performance, then the goal in the Imix tests was to describe the way a router will behave when handling more typical production loads.
Laurel’s ST200 again was the faster box, delivering system throughput of 4.979 million pps with the Imix load, or 100.0 percent of line rate (see Figure 3). There is a slight difference – about 260 pps – between Laurel’s throughput and the theoretical limit, but the amount of variance is so small as to be insignificant. Redback’s SmartEdge router achieved throughput of 4.936 million pps, equivalent to 99.13 percent of line rate.
In troubleshooting the baseline tests, we discovered an interesting quirk in Redback’s router: It doesn’t strip off the Ethernet padding when forwarding packets to non-Ethernet interfaces such as OC48. The minimum Ethernet frame size is 64 bytes. When an IP packet is shorter (they can be as small as 20 bytes), Ethernet interfaces insert extra “padding” bytes to make up the difference. A receiving Ethernet interface usually strips off the padding before processing an IP packet, but not the SmartEdge.
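To put numbers on the padding, here is a small sketch. It assumes untagged Ethernet frames, where the 64-byte minimum frame leaves room for 46 bytes of payload after the 14-byte header and 4-byte frame check sequence; with an 802.1q tag the figure would differ. The helper name `padding_bytes` is ours.

```python
ETH_MIN_PAYLOAD = 46  # bytes: 64-byte minimum frame - 14-byte header - 4-byte FCS


def padding_bytes(ip_packet_len: int) -> int:
    """Pad bytes an untagged Ethernet frame needs to reach the 64-byte minimum."""
    return max(0, ETH_MIN_PAYLOAD - ip_packet_len)


for size in (20, 40, 576, 1500):
    print(f"{size}-byte IP packet: {padding_bytes(size)} pad bytes")
```

Under that assumption, a 40-byte IP packet picks up 6 bytes of padding on Ethernet, so a router that retains the padding forwards 46 bytes rather than 40 onto the next interface.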
Redback says it retains padding to address two different customer problems. In the first, Redback’s routers were unable to establish BGP peering sessions with another vendor’s router because the other router was adding a proprietary signature field to packets and not incrementing the total length value in the IP header. In the other scenario, another vendor’s DSL router using RFC 1483 encapsulation sent packets larger than the expected value.
In our case, it was the Redback router forwarding packets larger than we expected. We thought for sure this practice violated either the router requirements RFC or the Ethernet standard itself, but it doesn’t. We don’t think it’s a great idea, if for no other reason than padding will degrade the SmartEdge’s throughput in some situations (though not in the configuration we used). However, we also understand the imperative of keeping customers happy, even if it means making products compatible with other vendors’ bugs.
Next page: Dealing With Delay
For many applications, latency – the delay added by the system under test – is an even more important consideration than throughput. This is especially true for real-time applications such as voice-over-IP and videoconferencing.
In our tests we measured latency at the throughput level, as mandated by the RFC that deals with such things. But latency often spikes sharply just before the throughput level, and few if any users run their routers at line rate all the time. For this project, we also studied the latency-vs-load curve by measuring delay at 10, 50, 90, and 95 percent of line rate as well as at the throughput level.
Ideally, both average and maximum delay should remain low at any load. That’s what we found in testing, at least with average delay and 40-byte packets (see Figure 4). Both the Laurel and Redback routers were nice and flat for all loads – and at around 30 microseconds, well below the threshold that would degrade the performance of any application.
But we saw big differences between average and maximum delay. In the 40-byte packet tests, maximum latency for Redback’s router spiked to 2.4 milliseconds. (We’ve shown the differences with a log scale in the figure.) That’s still not enough to affect most applications by itself, but it’s definitely getting close.
The human eye can perceive degradation in video quality with delays of as little as 10 ms, and the ear can distinguish changes in audio quality with delays of as little as 50 to 70 ms. Considering that there are numerous other elements involved in a typical video or audio session (multiple routers, plus the delay added by the end-stations), a delay of 2 ms from any one router could be significant.
It was the complete opposite story with the Imix load, with Redback’s router adding substantially lower average and maximum delays than Laurel’s at the throughput level (see Figure 5). This time, it was Laurel’s average and maximum values that were up in the millisecond range, with maximum delay approaching 10 ms. Redback’s router handled the Imix load with far less variation in latency as load increased. Again, we’ve used a log scale here to show the differences between average and maximum.
We don’t believe the differences in delay between the routers are as pronounced as they might appear, for a couple of reasons. First and foremost, Laurel’s throughput level was equivalent to line rate, compared with 99.13 percent of line rate for Redback. We did measure delay at the throughput level for both routers, as RFC 2544 says we should, but comparing a 100 percent result with a 99 percent result is a bit of an apples-and-oranges situation.
Second, with the pipe completely full in Laurel’s case, the effects of bit and byte stuffing come into play. More than half the packets in our Imix load are 40 bytes long, and at that length stuffing adds a very significant amount of overhead. We believe that the extra stuffing caused buffers to fill to the limit, thus increasing delay.
To find the throughput level we used a binary search algorithm that also took measurements for loads other than those stated in the test methodology. We also measured Laurel’s delay at 99.38 percent offered load (slightly higher than Redback’s throughput level of 99.12 percent). At this level, we measured average delay of 101 microseconds and maximum delay of 1,506 microseconds. Both numbers are far lower than the measurements we observed right up at line rate, again probably because of bit and byte stuffing.
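For readers unfamiliar with the technique, a binary search for the throughput level might look roughly like the sketch below. It is not the analyzer's actual algorithm: the callback `loss_free`, the 0.01 percent resolution, and the simulated device are all assumptions made for illustration.

```python
from typing import Callable


def find_throughput(loss_free: Callable[[float], bool],
                    low: float = 0.0, high: float = 100.0,
                    resolution: float = 0.01) -> float:
    """Return the highest offered load (percent of line rate) with zero loss.

    loss_free(load) runs one trial at the given load and reports whether
    every packet was forwarded. The lower bound is assumed to pass.
    """
    if loss_free(high):
        return high  # no loss even at the top of the range
    best = low
    while high - low > resolution:
        mid = (low + high) / 2
        if loss_free(mid):
            best = low = mid   # no loss: search higher
        else:
            high = mid         # loss seen: search lower
    return best


# Hypothetical device that starts dropping packets above 99.12 percent load.
simulated = lambda load: load <= 99.12
print(f"{find_throughput(simulated):.2f} percent of line rate")
```

Each trial halves the remaining interval, so the search also visits intermediate loads along the way, which is how measurements at loads outside the stated methodology can become available.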
All the delay numbers we’ve discussed so far have covered the entire system, which includes a mix of interfaces. Because different interfaces add very different amounts of delay, we’ve also presented all the delay results in tabular form (see table below).
Dynamic Table: Delay by Interface Type
Fields: vendor; traffic load; offered load (percent); and minimum, average, and maximum latency (usec) for the total system and for each path (OC-48 to DS-3, OC-48 to Gigabit Ethernet, Gigabit Ethernet to OC-48, and DS-3 to OC-48).
One final bit of good news from the baseline tests has to do with packet reordering. After all the controversy from the 2001 core router test regarding reordering by Juniper’s OC192 (10 Gbit/s) cards, we made sure to check whether this year’s routers would deliver packets in the correct order. They did: We’re happy to say that of the trillions of packets we threw at both boxes, every single one of them arrived in the same order in which we sent them.
Next page: Failover & Resiliency
For virtually all Internet service providers, reliability is even more important than high throughput or low latency. After all, in the long run, keeping the network up and running remains the single best way to attract and retain customers.
In our tests, we measured the ability of the router to move traffic to a backup interface upon failure of a primary circuit, even when that traffic is destined for a large number of routes.
In our setup, we connected to three interfaces on the routers (see Figure 6). On the core side of the test bed, we configured one interface on the Adtech analyzer to bring up an I-BGP (internal BGP) session with the router, and offered 500,000 routes, or nearly five times the size of today’s full Internet table.
On the core and customer sides of the router, we also brought up OSPF adjacencies on three interfaces. In this case, the optimal route to the Internet was behind our primary core-side interface. Then, we set up the Adtech analyzer on the "customer" side of the test bed to offer traffic to all 500,000 routes at a rate of 100,000 packets per second. At that rate, each lost packet would equate to 10 microseconds of downtime.
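The conversion from lost packets to downtime is simple arithmetic, sketched here. The function name `downtime_ms` is ours, and the loss counts in the example are illustrative inputs, not figures reported from the test.

```python
def downtime_ms(packets_lost: int, offered_pps: int = 100_000) -> float:
    """Downtime in milliseconds implied by a loss count at a constant offered rate."""
    return packets_lost * 1000.0 / offered_pps


# At 100,000 pps one lost packet represents 0.01 ms, or 10 microseconds.
print(downtime_ms(1))

# Illustrative loss counts only (not measured values from the test).
for lost in (100, 1_000, 5_000):
    print(f"{lost} packets lost -> {downtime_ms(lost):.2f} ms")
```

The choice of 100,000 pps sets the resolution of the measurement: a higher offered rate would resolve shorter outages, at the cost of more load on the router during recovery.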
At least 10 seconds into our test, we pulled the cable attached to the primary core-side interface, forcing the router to "fail over" and use a secondary interface.
The failure of a primary interface forces the device to rerun the SPF (shortest path first) algorithm for the OSPF routes and recalculate all the BGP routes before a single packet can be forwarded over the secondary interface. Our test measured how long this took.
The results were impressive. Redback’s router recomputed more than 500,000 routes and redirected traffic to them in less than 60 milliseconds. Laurel was faster still: It took just 9.85 ms to do the job (see Figure 7). That’s over five times faster than the 50ms threshold of Sonet’s automatic protection switching (APS).
In a sense, both vendors’ results are actually much better than the APS cutover time. APS only handles the physical-layer cutover, while our tests showed both routers dealing with failures at both the physical and network layers.
The results also compare favorably with failover times in previous router tests Network Test has conducted. In previous tests, we saw failover times of around 2 seconds when using OSPF’s equal cost multipath (ECMP), and as long as 45 seconds when using Layer 2 spanning tree bridging.
Next page: MPLS: Scaling the Heights
MPLS-based VPNs represent one of the most promising applications for edge routers. The technology is intended to give customers their “own” networks, even though all traffic rides on a common infrastructure. Edge routers are the focal point for MPLS VPNs; this is where things get virtual, with one box serving many customers. Because MPLS VPNs are still relatively new, there are oft-raised questions as to whether the technology can scale to any interesting level.
To find out, we assessed two of the most common methods for building MPLS VPNs. One method employs BGP to give each customer the appearance of a routed IP network, even though the actual transport involves Layer 2 circuit switching. This method is described in RFC 2547 and a follow-on draft known as RFC 2547bis.
The other method, the so-called Martini-draft VPN, uses Layer 2 criteria rather than IP routing to set up virtual tunnels across public networks. The IETF is considering multiple drafts describing signaling, encapsulation, and transport using this method.
Let’s start with RFC 2547bis MPLS VPNs. Our tests sought to determine how many virtual routers – or “virtual routing and forwarding (VRF) instances,” in MPLS-speak – a single router could support. We also sought to measure how large a routing table each VRF instance could hold. In simpler terms, we were looking to measure how many customers a router could support, and how big each customer’s network could grow.
MPLS terminology distinguishes between “customer edge” (CE), “provider edge” (PE), and “provider core” routers (these last are known simply as “P” boxes, perhaps to avoid being politically correct). While the router under test was a PE box, what we really measured was CE-to-CE scaleability – how many customers could talk to how many other customers. There are also situations where it’s desirable to know the number of CE-to-PE connections a router can handle, but that was out of scope for this project.
We began by asking each vendor to “pre-declare” the number of VRF instances it wished to support. We then held the VRF number constant while scaling the number of routes we advertised.
Both Laurel and Redback said their routers could handle 2,420 VRF instances, the maximum our test bed could support. For both vendors, we started by bringing up 2,420 OSPF sessions on the customer-facing interfaces. Because we used Gigabit Ethernet interfaces in this test, we distinguished different customers’ traffic through the use of 802.1q VLAN tags (see Figure 8).
On the core side of the network, we then brought up an OSPF session and one session using LDP (label distribution protocol) – the means by which the router propagates MPLS labels. For these two routing sessions, the Adtech AX/4000 analyzer emulated a P router. The same analyzer also emulated a remote PE router, and to this PE box we brought up a multiprotocol BGP (MP-BGP) session over which IP routing information is forwarded.
Once all the various routing handshaking had been completed, we had the customer-side analyzers advertise 100 OSPF routes into each VRF instance. We expected the router to export the routes into MP-BGP, and then use MP-BGP to propagate the routes to the remote PE and CEs emulated by the Adtech analyzers. On the receiving side, the analyzers ran an analysis of the received routes to validate they were the same ones that had been advertised.
Both vendors’ routers successfully propagated all 242,000 routes from 2,420 virtual routers, so we aimed higher. We flushed the routing tables and tried again, offering progressively larger numbers of OSPF LSAs until the number propagated by the router no longer equaled the number advertised.
Here we saw perhaps the biggest difference of the entire test (see Figure 9). While both the Laurel and Redback devices successfully set up 2,420 VRF instances, Laurel’s ST200 built tables of 900 entries per customer, compared with 300 for Redback.
The actual limit may be even higher for Laurel’s router. While the ST200 device failed to set up 1,000 routes per VRF instance, we did not try any number in between 900 and 1,000.
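The stepwise procedure leaves the true limit bracketed between the last count that passed and the first that failed. A rough sketch follows, assuming steps of 100 routes per VRF instance (the article names only some of the values tried) and a hypothetical callback `propagated_ok` that reports whether the routes received matched the routes advertised.

```python
from typing import Callable, Iterable, Optional, Tuple


def bracket_limit(steps: Iterable[int],
                  propagated_ok: Callable[[int], bool]
                  ) -> Tuple[Optional[int], Optional[int]]:
    """Return (largest passing step, first failing step) for an ascending series."""
    last_pass = None
    for n in steps:
        if propagated_ok(n):
            last_pass = n
        else:
            return last_pass, n  # the real limit lies between these two values
    return last_pass, None       # never failed within the steps tried


# Hypothetical device whose real limit is 950 routes per VRF instance.
print(bracket_limit(range(100, 1100, 100), lambda n: n <= 950))
```

A binary search inside the bracket would narrow the limit further, at the cost of flushing and repopulating the routing tables for every extra trial.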
Adding up all the routes involved yields some impressive numbers for both vendors’ devices. Even Redback’s SE800, with only a third the capacity of the Laurel ST200, managed to put 2,420 virtual routers and 726,000 routes on one edge device. For Laurel, the aggregate number of LSAs is an even more staggering 2.2 million. Considering that the total number of OSPF routes in the entire networks of many Tier 1 providers is well under 100,000, these are impressive feats indeed.
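The aggregate figures follow directly from multiplying the per-customer table size by the number of VRF instances, as this short calculation shows. The constant and function names are ours.

```python
VRF_INSTANCES = 2_420  # virtual routers configured on each device


def aggregate_routes(routes_per_vrf: int, vrfs: int = VRF_INSTANCES) -> int:
    """Total routes held across all VRF instances."""
    return routes_per_vrf * vrfs


for label, per_vrf in (("Initial run", 100), ("Redback", 300), ("Laurel", 900)):
    print(f"{label}: {aggregate_routes(per_vrf):,} routes")
```

The Laurel total works out to 2,178,000, which the text rounds to 2.2 million.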
Next page: Martini Time
Martini-draft VPNs hold special interest for service providers because they support the use of Layer 2 technologies like Ethernet, Frame Relay, and ATM. Martini-draft VPNs (named for the primary author of the IETF drafts, Luca Martini of Level 3 Communications Inc.) give carriers the ability to virtualize their vast Layer 2 infrastructures – and bill customers as if they alone were using that infrastructure.
Our tests had three goals: To determine the maximum number of Martini virtual circuits (VCs) a single router could establish; to set up Layer 2 tunnels between customer interfaces using those VCs; and to verify that we could actually forward traffic over every tunnel.
Unlike the RFC 2547bis tests, where we used only Gigabit Ethernet, we used a combination of Gigabit Ethernet and Frame Relay DS3 interfaces in the Martini tests. This shows off one of Martini’s strengths: Its ability to classify customers with hooks into whatever Layer 2 technology they use. On the Ethernet interfaces, we uniquely identified customers with 802.1q VLAN tags. On the Frame Relay side, we used a different data link connection identifier (DLCI) for each customer.
Another difference from the RFC 2547 tests was that there was no IP routing involved with customers. Instead, we simply established physical links on all the customer-side Gigabit Ethernet and Frame Relay interfaces. Then, on the core side, we brought up one OSPF adjacency and one LDP session on each interface and used LDP to distribute labels to all customers. Once the tunnels were all established, we offered traffic at a relatively light load to verify that all tunnels could be used.
Both the Laurel and Redback routers maxed out at 39,232 Martini-draft tunnels, the maximum our test bed could support (see Figure 10). Further, both vendors passed traffic over all tunnels.
Initially, we intended to run binary searches to find the throughput level for all tunnels, but time constraints prevented us from completing this. As it was, we validated nearly 40,000 tunnels on Laurel’s router using an offered load of 50 percent of line rate, and an offered load of 1 percent with the Redback device. Readers shouldn’t make too much of the difference in offered loads: the Redback box may well have handled a much higher load, but time constraints prevented us from determining this.
Even without the throughput tests, the Martini numbers for both vendors are very impressive. As far as we’re aware, they represent a record for a public demonstration of Martini-draft scaleability. The numbers are also far in excess of the 2,420 tunnels we set up with BGP-based VPNs; and we achieved these results over two different types of Layer 2 transports. The Martini results clearly demonstrate that MPLS VPNs can scale to support large numbers of customers.
Next page: QOS Matters
It’s become a truism that few customers actually use quality-of-service features, but that’s beginning to change. Enterprise customers are beginning to pay providers a premium for QOS offerings. In our tests, we examined two key QOS functions: rate limiting and rate shaping.
Although most routers support at least eight QOS levels (and sometimes many more), in practice few customers need that level of granularity. For our tests, we used just three different traffic classes, which we dubbed Gold, Silver, and – noting the care with which service providers handle best-effort traffic – Particle Board.
Our Silver-class traffic was IP multicast, such as might be used in a streaming video application. The Gold and Particle Board services used unicast traffic.
We ran four QOS tests in all. First, we ran a baseline test to see whether each of 40 “customers” on 10 interfaces would receive the same mix of multicast (Silver) and unicast (Particle Board) traffic, all at the same rates. In this test, we did not overload the customer-side interfaces, so in theory all traffic should have been forwarded without loss.
Second, we assessed the routers’ per-customer queuing capabilities by firing heavier bursts of traffic at select customers. We noted whether the routers limited rates of the bursty traffic so there would be no impact on other customers.
Third, we offered higher amounts of Gold traffic to two sets of premium customers. This test measured the ability of the routers to “shape” bandwidth so that the premium customers would always receive a guaranteed bandwidth allotment, even during periods of congestion.
Finally, as a sanity check, we reran the baseline test to verify that vendors did not simply use fixed bandwidth allocations. Using “nailed-up” bandwidth is a good way to ace QOS tests, but it’s not a very efficient use of the pipe, since other traffic can’t use the reserved bandwidth even when there is no high-priority traffic to send.
In the baseline tests, we configured the core-side Adtech analyzers so that each of 40 customers would join four multicast groups. We then generated enough multicast to use about 10 percent of bandwidth available to each customer. At the same time, we generated enough Particle Board traffic to fill up the remaining 90 percent of the pipe.
We measured forwarding rates for each customer to verify the routers forwarded traffic in a 10:90 ratio. That’s what we saw: Both vendors’ routers hit the target packet rates spot-on. In fact, both the Laurel and Redback boxes came within a single packet (per second) of the theoretical perfect rates.
In the rate limiting tests, we offered the same traffic as in the baselines, but with a twist. This time, we selected four customers out of 40 and fired Particle Board traffic at a high enough rate to consume all available bandwidth. Here the goal was to see whether all customers still received the same 10:90 mix of Silver and Particle Board traffic, even when some customers tried to hog the pipe by attempting to retrieve data at higher rates.
Here again, both the Redback and Laurel routers did a good job (see Figure 11). Both restricted the maximum rates for customers getting the burst of Particle Board traffic. At the same time, all other customers received both Silver and Particle Board traffic at virtually the same rates as the baseline tests. This demonstrates both routers’ ability to buffer traffic and set policies on a per-customer basis.
The next test involved rate shaping, or allocating different amounts of bandwidth to different customers. As in the previous two tests, we generated a mix of two traffic classes for most customers. This time, however, we also defined two groups of premium customers: Each of four “Gold A” customers was to receive a fixed allotment of 30 Mbit/s for Gold traffic, while each of four “Gold B” customers would receive an allotment of 15 Mbit/s for Gold traffic.
To create congestion, we offered Gold traffic to both Gold A and B customers at a rate of 50 Mbit/s per customer. This forced the routers to shape bandwidth so that the premium customers received their desired allotments, while, at the same time, other customers should still have received the same 10:90 mix as before.
That’s exactly what we saw (see Figure 12). Both the Laurel and Redback routers delivered traffic to customers in the Gold A group at a packet rate equivalent to 30 Mbit/s, and to Gold B customers at 15 Mbit/s. Rates for Particle Board traffic headed to other customers were lower than in the baseline tests, but both vendors’ routers ensured that this traffic did use all available bandwidth.
Results from the final “sanity check” were essentially identical to the QOS baselines. This demonstrated that both routers could limit or shape traffic when needed – but then make bandwidth available to other traffic after congestion subsided.
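The behavior seen in the shaping and sanity-check tests can be summarized with a little arithmetic. The sketch below is not either vendor’s algorithm; it simply assumes a hypothetical 100-Mbit/s share of the link and shows Gold traffic held to its allotment while other traffic takes whatever is left.

```python
from typing import Tuple

def shape(link_mbps: float, gold_offered: float,
          gold_allotment: float, other_offered: float) -> Tuple[float, float]:
    """Return (gold_delivered, other_delivered) in Mbit/s.

    Gold traffic is held to its allotment; other traffic may use
    any bandwidth that Gold traffic is not using.
    """
    gold = min(gold_offered, gold_allotment)
    other = min(other_offered, link_mbps - gold)
    return gold, other

LINK = 100.0  # hypothetical link share in Mbit/s, not a figure from the test

gold_a = shape(LINK, 50.0, 30.0, LINK)  # Gold A: 50 offered, 30 allotted
gold_b = shape(LINK, 50.0, 15.0, LINK)  # Gold B: 50 offered, 15 allotted
idle = shape(LINK, 0.0, 30.0, LINK)     # no Gold traffic: nothing stays reserved
```

The last case is the one the sanity check probes: with no Gold traffic on offer, a “nailed-up” allocation would leave 30 Mbit/s idle, while a router that shapes on demand hands the full link back to other traffic.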
Next page: BGP Basics
For ISPs, BGP (border gateway protocol) is easily the single most important networking protocol after IP itself. It is the method by which different ISPs exchange information about how and where their customers can be reached. BGP is also the basis for one of the major forms of MPLS VPNs, and a variation of BGP is essential for scaling multicast services.
Our tests measured three aspects of BGP scaleability: Peering capacity, and the sizes of the BGP routing information and forwarding information tables.
Before we delve into the details of testing BGP, it’s helpful to understand a bit of what BGP does. Before one ISP can exchange traffic with another, one of its routers first brings up a BGP session with a router at the other ISP. These two “peers” are then able to exchange routing information with each other. A “peering session” must exist for any two ISPs to exchange routing data.
Once a peering session is up, the first ISP’s router announces (or “advertises,” in BGP-speak) the various IP networks within the ISP’s administrative or technical control. Each advertisement also contains a single number, called the autonomous system number (ASN), that uniquely identifies the ISP.
The receiving ISP then propagates the first ISP’s advertisements, but not before inserting its own ASN before the first ISP’s. A third ISP would then insert its ASN at the front of the list, and so on. If a route advertisement traversed the boundaries of 20 ISPs, it would already have a string of 19 ASNs inserted. The 20th ISP then uses this string to find its way back to the first one.
Let’s say the 20th ISP has a packet to send to the first ISP. The first thing it will do is consult its routing table, called the “routing information base” (RIB). This table will indicate the path to a network in ISP 1 via ISP 19, then 18, and so on.
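The path-building process described here can be sketched in a few lines. The ASNs are made up (ISP n is given ASN 65000 + n, from the private-use range), and real BGP advertisements carry many more attributes than this.

```python
def propagate(first_isp: int, last_isp: int) -> list:
    """Return the AS path seen by last_isp for a route originated by first_isp.

    Each ISP puts its own ASN at the front of the path before passing
    the advertisement on to the next ISP.
    """
    as_path = []
    for isp in range(first_isp, last_isp):  # ISPs 1 through 19 each advertise onward
        as_path.insert(0, 65000 + isp)      # hypothetical ASN for this ISP
    return as_path

path_at_isp20 = propagate(1, 20)
```

Here `path_at_isp20` holds 19 ASNs, starting with ISP 19’s and ending with ISP 1’s, which is the order in which ISP 20 would work its way back to the originating network.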
But the RIB is only a table of logical correspondences, not physical ones; it doesn’t say “to get to a network at ISP 1 use the first OC48 interface in slot 7 because it is attached to ISP 19.” For that, a second table called the forwarding information base (FIB) is needed. Entries in the FIB do correlate a specific network with a specific interface on that router.
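One way to picture the distinction is as two lookup tables. In this simplified sketch, the prefixes, ASNs, neighbor names, and interface names are all invented for illustration; real routers also choose among multiple candidate paths, which is omitted here.

```python
import ipaddress

# RIB: prefix -> logical path information (who to go through).
rib = {
    ipaddress.ip_network("192.0.2.0/24"): {"as_path": [65019, 65001], "next_hop": "isp19"},
    ipaddress.ip_network("198.51.100.0/24"): {"as_path": [65007], "next_hop": "isp7"},
}

# Which physical interface reaches each neighbor (hypothetical names).
neighbor_interface = {"isp19": "oc48-7/0", "isp7": "ge-3/1"}

# FIB: prefix -> outgoing interface, derived from the RIB.
fib = {prefix: neighbor_interface[entry["next_hop"]] for prefix, entry in rib.items()}

def lookup(destination: str):
    """Longest-prefix match of a destination address against the FIB."""
    addr = ipaddress.ip_address(destination)
    matches = [prefix for prefix in fib if addr in prefix]
    if not matches:
        return None
    return fib[max(matches, key=lambda prefix: prefix.prefixlen)]
```

A call such as `lookup("192.0.2.55")` returns an interface name rather than a path, which is the whole job of the FIB. A linear scan like this one would never keep up with line-rate forwarding; it is used here only to keep the example short.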
This highly simplified description of BGP operations omits a lot, but it does show three critical requirements for testing: RIB capacity, FIB capacity, and peering.
Next page: BGP RIB Capacity
Routing information base capacity is essential, because it’s the bedrock measure of what a router does: As the number of networks attached to the Internet grows, so too must the size of the BGP table, so a router knows how to reach those networks.
Those tables are growing fast. Even with the dotcom bust, the number of Internet-attached networks has shot up from around 65,000 entries in late 1999 to around 115,000 today – a jump of more than 75 percent.
The growth rate has slowed somewhat in 2001 and 2002. Even so, if BGP table size were a publicly traded company, it would have handily outperformed any of the companies in the Light Reading Index (or the Nasdaq or any other major index, for that matter) over the same period.
BGP experts may say a large RIB capacity is far more important in core devices, not edge routers, but that’s only partially true. Certainly a core router must be able to hold the full Internet table, and core routers can use filtering and other techniques to reduce the size of the RIB on edge routers.
But by far the biggest reason RIB capacity is growing is that more and more enterprises are “multihoming” – attaching their private networks to multiple ISPs. And multihoming takes place on edge devices, not at the core.
The other issue to consider for any kind of router is RIB stability. As we saw in the 2001 core router test, some boxes simply stop working when dealing with large numbers of RIB entries. Obviously, customers won’t pay for a service based on a box that dies when the service provider becomes too successful.
To measure RIB capacity, we brought up BGP sessions on two Gigabit Ethernet interfaces of each router. Then we advertised a fixed number of routes to one interface, and listened on the other interface to ensure that the router correctly propagated everything we’d offered. If the test was successful, we’d try again with progressively larger numbers of routes until the number we offered was greater than the number the router could propagate.
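In outline, the procedure resembles the loop below. The `propagates_all` callback stands in for the analyzer step (advertise on one interface, count on the other) and is hypothetical, as are the starting count, step size, and the stand-in router’s limit; the actual increments used in the test are not stated.

```python
from typing import Callable

def find_capacity(propagates_all: Callable[[int], bool],
                  start: int, step: int, limit: int) -> int:
    """Return the largest route count tried that the router fully propagated.

    Returns 0 if even the first attempt failed.
    """
    best = 0
    routes = start
    while routes <= limit:
        if not propagates_all(routes):
            break           # router could not propagate everything offered
        best = routes
        routes += step      # try again with a larger table
    return best

# Stand-in for a device under test; the 1.2 million limit is arbitrary.
def fake_router(routes: int) -> bool:
    return routes <= 1_200_000

capacity = find_capacity(fake_router, start=100_000, step=100_000, limit=5_000_000)
```

The result is only as precise as the step size: a finer step, or a binary search between the last pass and the first failure, would narrow the figure at the cost of more test runs.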
We did hit the limits of the Laurel and Redback boxes – but those limits are far beyond the size of the Internet, not just today but probably for years to come (see Figure 13). The RIB capacity of Laurel’s ST200 topped out at 5.4 million routes, an impressive figure that gives the vendor bragging rights for having the largest RIB capacity we’ve ever measured. But let’s put the numbers in perspective. Redback’s SmartEdge 800, which put “only” 4 million entries into its RIB, would be able to build a RIB around 35 times the size of the entire Internet today.
Both vendors’ achievements are all the more impressive considering these are edge devices and not core routers. The largest RIB number from the 2001 core router test was 2.4 million, achieved by the Juniper M160, at the time the company’s flagship model. Both Laurel and Redback beat that number going away.
Next page: A Big FIB
As numerous Light Reading message boarders reminded us after the core router test, RIB capacity alone does not a BGP router make. Both vendors in this test demonstrated that it’s possible to get huge RIB numbers by equipping a device with big hard disks. But for actually moving packets from place to place, routers also need to build forwarding information base tables; how high would either vendor be able to scale these?
To find out, we added a FIB capacity measurement that tallies the number of routes over which a device can actually send traffic. This new test is a hybrid involving both a router’s control plane (we advertise a large number of routes) and data plane (we send traffic to each route to validate that the route is usable).
FIB tables require very fast access, and that in turn requires very expensive memory. Unlike RIB tables, which can be stored on relatively cheap hard drives, a FIB table may be called on to make as many as 60,000 lookups per second even on a network loaded at just 1 percent utilization.
To understand how fast the FIB memory must work, imagine trying to retrieve 60,000 files from a hard disk every second. And that’s a light load – in our baseline tests, the number of lookups reached more like six million per second on each of four OC48 interfaces. With lookup requirements like this, it’s easy to understand why FIB capacity is a key indicator of a device’s BGP performance.
As in the RIB tests, we brought up peering sessions on two Gigabit Ethernet interfaces of the router under test. Then we advertised a fixed number of routes and verified all of them were propagated. Once the control plane was set up, we offered a stream of 40-byte IP packets destined to each of the routes we advertised. We used an “easy” load of 1 percent utilization for this test – on Gigabit Ethernet, around 15,000 packets per second.
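The 15,000 figure follows from Ethernet framing arithmetic. The sketch below assumes each 40-byte IP packet travels in a minimum-size 64-byte Ethernet frame, plus the standard 8-byte preamble and 12-byte interframe gap; those framing values are standard Ethernet numbers rather than figures stated in the test description.

```python
GIGABIT_BPS = 1_000_000_000
MIN_FRAME_BYTES = 64        # 40-byte IP packet + 14-byte header + 4-byte FCS, padded to 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12

def packets_per_second(utilization: float) -> float:
    """Packets per second on Gigabit Ethernet at the given load (0.0 to 1.0)."""
    bits_per_packet = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return GIGABIT_BPS * utilization / bits_per_packet

line_rate_pps = packets_per_second(1.0)     # roughly 1.49 million
one_percent_pps = packets_per_second(0.01)  # roughly 14,900
```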
If the router passed all traffic without loss, we increased the number of routes advertised, until eventually the router would drop some packets. We noted the highest no-loss routing level as the device’s FIB capacity.
Redback earned the bragging rights in the FIB test (see Figure 14). Its SmartEdge 800 router forwarded traffic to 1.35 million routes, while Laurel’s ST200 moved traffic concurrently to “only” 849,990 routes.
Here again, perspective matters. The current full table of around 115,000 entries is the total number of networks attached directly to the Internet’s core. Even core routers seldom (if ever) move traffic to all 115,000 networks at the same time. Our test is a useful description of the limits of these devices, but in comparing the Laurel and Redback devices it’s fair to say that either one could handle essentially any FIB in use today, or far into the future.
Next page: BGP Peering
The third major metric for BGP routers, after RIB and FIB sizes, is peering capacity – the number of simultaneous BGP sessions one router can carry on.
Peering capacity is clearly important for core routers and at major ISP exchange points, but it’s also an essential measure for edge routers. That’s because large numbers of enterprise customers now use BGP, and the edge router is the focal point of those BGP connections. Enterprises may not advertise a large number of routes over their peering connections, but one edge router may handle BGP sessions from a relatively large number of enterprises.
To determine how many concurrent BGP sessions the routers could handle, we used a total of 12 Gigabit Ethernet interfaces. On the first interface, we brought up one session using I-BGP (interior-BGP, the version used to distribute BGP information within a single ISP’s network). Over this I-BGP session we distributed 100,000 routes, similar to the way a router would distribute the full Internet table.
Then, on each of the remaining 11 interfaces, we used the Adtech analyzers to bring up a fixed number of E-BGP (exterior-BGP) sessions, each representing one customer connection. Over these connections, we advertised 100 routes per customer. We continued to bring up BGP sessions from the customer side until the router would accept no new sessions.
In this test, Redback’s SE800 established a total of 1,408 BGP peers, or 128 peers per Gigabit Ethernet interface (see Figure 15). Laurel’s ST200 came in with 1,012 total peers and 92 peers per interface. The vendor says the 1,012-peer limit is hard-coded in the version we tested.
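The per-interface figures are simple division over the 11 customer-facing interfaces, and the same peer counts imply a customer route load at 100 routes per peer. The snippet below just reproduces that arithmetic from the figures quoted in the text; the customer route totals are derived here, not separately reported results.

```python
CUSTOMER_INTERFACES = 11   # Gigabit Ethernet interfaces carrying E-BGP sessions
ROUTES_PER_PEER = 100      # routes advertised per customer session

total_peers = {"Redback SE800": 1408, "Laurel ST200": 1012}

# For each router: (peers per interface, customer routes implied by the peer count)
summary = {
    router: (peers // CUSTOMER_INTERFACES, peers * ROUTES_PER_PEER)
    for router, peers in total_peers.items()
}

for router, (per_interface, customer_routes) in summary.items():
    print(f"{router}: {per_interface} peers/interface, {customer_routes:,} customer routes")
```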
As with the RIB and FIB tests, some perspective is in order. Even very large exchange points involve at most only 100 to 200 peers (with 25 to 50 a far more common range). And at the edge there aren’t (yet) many providers able to sign up 1,000 customers running BGP off a single router. That’s a sales issue, and not a technical limitation: Both the Laurel and Redback boxes are up to the task.
Next page: OSPF Capacity
We also measured each router’s capacity to handle interior routing protocols like OSPF (open shortest path first) and IS-IS. BGP is a so-called exterior gateway protocol used for getting routing information between ISPs; for moving routing updates within an ISP, an interior gateway protocol like OSPF is needed.
OSPF is the most commonly used interior gateway protocol. It’s a very powerful protocol, with its ability to subdivide networks into multiple “areas” and to build topology databases of all areas and link states in real time. But OSPF’s power comes at a cost; it is a highly complex protocol, and it requires very significant computing power on routers.
To benchmark each router’s OSPF capacity, we modeled OSPF traffic using statistics from a Tier 1 provider. OSPF announces routes using link-state advertisements (LSAs), with different types of LSAs used to denote different entities in the network.
Inside area 0 of our Tier 1 provider, 78.13 percent of all LSAs were so-called Type 3, meaning they summarized routing information from other areas within the same domain. The breakdown also included 0.39 percent Type 1 LSAs (intra-area router links); 1.95 percent Type 2 LSAs (intra-area network links); and 9.77 percent each for Type 4 (external summary links) and Type 5 (external links). Incidentally, the breakdown was roughly the same for the Tier 1 provider’s other OSPF areas. The only major difference in point of presence (POP) areas was that the provider had 41,000 Type 3 LSAs, vs. 40,000 in area 0.
With these percentages in hand, we constructed a capacity test similar to the BGP RIB test: First we’d establish an OSPF session (called an “adjacency” in OSPF terminology) and then use the Adtech analyzers to offer LSAs in the given percentages. If the router managed to correctly propagate all the updates and the routing topology database, we’d repeat the exercise with a larger number of LSAs.
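As an illustration of how such a mix might be generated, the sketch below splits a target LSA count across the five types using the area 0 percentages. The function name, type labels, and rounding approach are assumptions, and the 51,200 total is chosen here only because it yields roughly 40,000 Type 3 LSAs; the analyzers’ actual configuration is not described.

```python
# Area 0 breakdown by LSA type, in percent.
LSA_MIX_PERCENT = {
    "type1": 0.39,
    "type2": 1.95,
    "type3": 78.13,
    "type4": 9.77,
    "type5": 9.77,
}

def lsa_counts(total_lsas: int) -> dict:
    """Split a total LSA count across types according to the mix above."""
    return {
        lsa_type: round(total_lsas * percent / 100)
        for lsa_type, percent in LSA_MIX_PERCENT.items()
    }

counts = lsa_counts(51_200)  # hypothetical total, picked to land near 40,000 Type 3 LSAs
```

Scaling the total up while holding the percentages fixed is what lets each larger trial stay representative of the provider’s real mix.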
As with all three BGP tests, this was again an exercise in excess capacity (see Figure 16). Redback’s SE800 learned and propagated around 25 percent more Type 3 LSAs than Laurel’s ST200, but the distinction is really academic.
Even the largest OSPF networks in the world today are tiny compared with the numbers achieved by both boxes. For example, the Tier 1 provider that supplied the LSA type breakdown reported that its network has around 40,000 Type 3 LSAs – or around 2.5 percent of the number achieved by the Laurel device. Here again, either device will be able to handle even the largest OSPF networks for years to come.
Next page: IS-IS Capacity
In a wonderfully cranky book on Unix, Evi Nemeth jokes that if BGP is used by ISPs and OSPF by enterprises, then IS-IS is found in insane asylums. If so, some of the world’s largest ISPs indeed qualify as houses of bedlam.
IS-IS (intermediate system to intermediate system) was developed in the 1980s as the interior gateway routing component of the OSI protocol suite. Governments and standards bodies loved OSI’s carefully architected, top-down nature, but users shunned its complexity and went with the far simpler TCP/IP protocol suite instead.
Even so, IS-IS gained traction among ISPs. Internet lore holds that the UUNet part of WorldCom (still the world’s largest ISP, even with the financial woes of its parent) adopted IS-IS more than a decade ago because early versions of OSPF were so buggy.
Besides UUNet, major IS-IS users include Cable & Wireless, NTT/Verio, Qwest, Sprint, and Telia, among others. And IS-IS may grow still more: Some network designers say it will be far easier to adapt IS-IS than OSPF for use in IPv6 networks.
At least at a high level, our IS-IS tests sought to answer the same sorts of questions as our OSPF tests: How large a network could the edge router see?
To find out, we used the Adtech analyzer to advertise IS-IS LSPs, or link-state PDUs (protocol data units), to one Gigabit Ethernet interface of the router. We then expected the router to propagate these LSPs to 11 other Gigabit Ethernet interfaces.
Like OSPF, IS-IS has a hierarchy of routes, called “levels”; we used all level-1 LSPs in our test, a topology used by at least one of the major ISPs running IS-IS.
Both the Laurel and Redback devices successfully propagated 3.4 million routes, the maximum we could advertise (see Figure 17).
We could only offer that many routes because that’s all the protocol would allow us to do with 12 interfaces. All the same, it’s important to put that 3.4 million number in perspective: It is two orders of magnitude larger than what even the world’s largest – and craziest – ISPs carry today.