Network Processors

We will now take a look at the main network processor architectures.

Generic RISC processors

These devices are typically based on a core licensed from MIPS Technologies Inc. (Nasdaq: MIPS; OTC: MIPBV) with proprietary enhancements. Ideal for the control plane and real-time operating systems (RTOSs), these devices may also be used for other applications such as high-speed printer engines.

With clock rates of 400MHz to 1GHz, these devices can also be used for Layer 3 forwarding at up to 2 Gbit/s or for more complex functions at lower rates. Newer devices have high-speed packet interfaces, and devices from Broadcom Corp. (Nasdaq: BRCM), such as the BCM1250, include Gigabit MACs (see Broadcom Enhances Evaluation). The main manufacturers are Broadcom, PMC-Sierra Inc. (Nasdaq: PMCS), and SandCraft Inc., with companies like Fulcrum Microsystems Inc. and Intrinsity Inc. developing faster components with clock rates of 2GHz and above.
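To get a feel for what Layer 3 forwarding at 2 Gbit/s demands of such a core, the per-packet cycle budget can be estimated as below. This is a rough sketch only: the article does not state a packet size, so the minimum-size 64-byte Ethernet frame and the 20 bytes of preamble and inter-frame gap are assumptions, and the function names are illustrative.

```python
# Assumptions (not from the article): worst-case traffic of minimum-size
# 64-byte Ethernet frames, each costing 20 extra bytes on the wire
# (preamble plus inter-frame gap).
FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(line_rate_bps: float) -> float:
    """Packet rate the processor must sustain at the given line rate."""
    bits_per_packet = (FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_per_packet


def cycles_per_packet(clock_hz: float, line_rate_bps: float) -> float:
    """Clock cycles available to process each packet on a single core."""
    return clock_hz / packets_per_second(line_rate_bps)


if __name__ == "__main__":
    for clock_hz in (400e6, 1e9):
        budget = cycles_per_packet(clock_hz, 2e9)
        print(f"{clock_hz / 1e6:.0f} MHz core: about {budget:.0f} cycles per packet")
```

Under these assumptions a 1GHz core has roughly 336 cycles per packet at 2 Gbit/s, and a 400MHz core about 134, which is why more complex functions are only practical at lower rates.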

RISC-based network processors

This was the earliest of the network processor architectures.

The process of extracting keys from the header and modifying fields in the packet to be forwarded is a significant part of the work for a network processor – so fast bit-manipulation is a key to performance. A generic RISC core is not ideal for header processing, largely due to the slow bit manipulation; so several companies have taken a generic RISC core design and added hardware for high-speed bit manipulation.
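The kind of header work described above can be sketched in software as follows. This is an illustration, not any vendor's implementation: it assumes a 20-byte IPv4 header without options, treats (source, destination, protocol) as the lookup key, and recomputes the whole checksum rather than updating it incrementally. All function names and the sample addresses are hypothetical.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over 16-bit words.

    Returns the value to store when the checksum field is zero, and 0
    when run over a header whose checksum field is already valid.
    """
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def extract_key(header: bytes) -> tuple:
    """Pull the lookup key (source, destination, protocol) from the header."""
    version = header[0] >> 4                # upper 4 bits of the first byte
    ihl_words = header[0] & 0x0F            # lower 4 bits: header length
    if version != 4 or ihl_words < 5:
        raise ValueError("not a valid IPv4 header")
    protocol = header[9]
    src, dst = struct.unpack_from("!II", header, 12)
    return src, dst, protocol


def forward_rewrite(header: bytes) -> bytes:
    """Decrement the TTL and recompute the header checksum."""
    out = bytearray(header)
    if out[8] <= 1:
        raise ValueError("TTL expired")
    out[8] -= 1
    out[10:12] = b"\x00\x00"                # zero checksum before recomputing
    struct.pack_into("!H", out, 10, ipv4_checksum(bytes(out)))
    return bytes(out)


def make_sample_header(ttl: int = 64) -> bytes:
    """Build a hypothetical TCP/IPv4 header from 10.0.0.1 to 192.168.1.1."""
    hdr = bytearray(20)
    hdr[0] = 0x45                           # version 4, IHL 5 (no options)
    struct.pack_into("!H", hdr, 2, 20)      # total length: header only
    hdr[8] = ttl
    hdr[9] = 6                              # protocol: TCP
    struct.pack_into("!II", hdr, 12, 0x0A000001, 0xC0A80101)
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

Every step here is a shift, mask, or byte insert, which a generic RISC core spends several instructions on; this is the work the added bit-manipulation hardware is meant to accelerate.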

In most designs, the cores are arranged in parallel and a packet header is assigned to a single processor, or thread, which then completes all the processing required on that packet. This model is often referred to as "run to completion." The main exception to this architecture is the Intel IXP1200, where the cores appear to be in parallel but are actually arranged in a pipeline with headers passing through more than one core.
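A loose software analogy for the run-to-completion model is a pool of workers, each of which takes one header and performs every processing step on it. The sketch below assumes headers are plain dictionaries and uses three made-up steps (`classify`, `lookup`, `rewrite`); none of these names or the routing rule come from a real device.

```python
from concurrent.futures import ThreadPoolExecutor


def classify(hdr: dict) -> dict:
    """Tag the header by protocol (0x0800 is the IPv4 EtherType)."""
    return {**hdr, "cls": "ipv4" if hdr["ethertype"] == 0x0800 else "other"}


def lookup(hdr: dict) -> dict:
    """Hypothetical route lookup: 10.x.x.x goes to port 1, the rest to port 0."""
    return {**hdr, "port": 1 if hdr["dst"].startswith("10.") else 0}


def rewrite(hdr: dict) -> dict:
    """Modify the header for forwarding by decrementing the TTL."""
    return {**hdr, "ttl": hdr["ttl"] - 1}


def process_header(hdr: dict) -> dict:
    """Run to completion: one worker performs every step on its header."""
    return rewrite(lookup(classify(hdr)))


def run_to_completion(headers: list, workers: int = 4) -> list:
    """Hand each header to a single worker; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_header, headers))
```

The point of the model is that no header is handed from one core to another partway through its processing, so each worker runs the same complete program.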

The packets are usually stored in an external memory while the header is being processed. In many designs, this packet buffer is shared with the traffic manager.

The RISC cores are easy to program, with mature C compilers available from either the vendor or third-party suppliers.

A major issue with first-generation network processors has been predicting performance. With a random stream of incoming packets and the shared hardware engines for functions such as lookup and statistics, simple changes to the code can have a significant impact on throughput. Many vendors have now addressed this issue by reducing internal dependencies or by simply increasing the clock frequency to allow adequate headroom.

The manufacturers with this basic architecture include: Applied Micro Circuits Corp. (AMCC) (Nasdaq: AMCC), IBM Corp. (NYSE: IBM), Intel Corp. (Nasdaq: INTC), Internet Machines Corp., Motorola, Silicon Access Networks Inc., and Vitesse.

VLIW-based network processors

The third major network processor architecture is based on VLIW (very long instruction word) engines arranged in a pipeline.

In this architecture, the header – and in some cases the whole packet – passes through all the engines in the pipeline. The stages of the pipe may all be the same or they may be quite varied, with engines designed for very specific functions such as TTL (time to live) decrement. With many different engines, this architecture can be difficult to program. Each engine may have its own instructions and its own compiler.
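The flow of headers through such a pipeline can be modeled with a simple tick-by-tick simulation. This is a sketch under assumptions: the three stages (`stage_parse`, `stage_lookup`, `stage_ttl_decrement`) are invented for illustration, every stage is taken to finish in one tick, and headers are plain dictionaries.

```python
def stage_parse(hdr: dict) -> dict:
    """First engine: mark the header as parsed."""
    return {**hdr, "parsed": True}


def stage_lookup(hdr: dict) -> dict:
    """Second engine: hypothetical lookup, 10.x.x.x goes to port 1."""
    return {**hdr, "port": 1 if hdr["dst"].startswith("10.") else 0}


def stage_ttl_decrement(hdr: dict) -> dict:
    """Third engine: a fixed-function TTL decrement."""
    return {**hdr, "ttl": hdr["ttl"] - 1}


STAGES = [stage_parse, stage_lookup, stage_ttl_decrement]


def run_pipeline(headers: list, stages: list) -> tuple:
    """Advance every header exactly one stage per tick.

    slots[i] holds the header waiting at the input of stage i.
    Returns the finished headers and the number of ticks used.
    """
    slots = [None] * len(stages)
    pending = list(headers)
    done = []
    ticks = 0
    while pending or any(slot is not None for slot in slots):
        # The last stage finishes the header it holds.
        if slots[-1] is not None:
            done.append(stages[-1](slots[-1]))
        # Work back to front so no header advances twice in one tick.
        for i in range(len(stages) - 1, 0, -1):
            prev = slots[i - 1]
            slots[i] = stages[i - 1](prev) if prev is not None else None
        # A new header enters the first stage, if one is waiting.
        slots[0] = pending.pop(0) if pending else None
        ticks += 1
    return done, ticks
```

Once the pipe is full, one header completes on every tick, and each header visits every engine in the same fixed order.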

Agere Systems (NYSE: AGR) is an exception to this programming issue. Although its processors do not support C, they are programmed in Agere's own classification language, FPL, with a C-like language for non-classification functions.

The main advantage of the VLIW-based architecture is that performance is deterministic. The number of clock cycles is fixed for each stage, and the maximum packet rate is derived from this.
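The derivation is simple arithmetic: if each stage takes a fixed number of cycles, the slowest stage sets the rate at which headers can enter the pipe. A minimal sketch follows; the 250MHz clock and the per-stage cycle counts are hypothetical values chosen for illustration, not figures for any real device.

```python
def max_packet_rate(clock_hz: float, cycles_per_stage: list) -> float:
    """Maximum packets per second for a pipeline with fixed stage times.

    A new header can enter only as often as the slowest stage can accept
    one, so the rate is the clock divided by the largest cycle count.
    """
    return clock_hz / max(cycles_per_stage)


if __name__ == "__main__":
    # Hypothetical device: 250 MHz clock, three stages of 8, 10 and 6 cycles.
    rate = max_packet_rate(250e6, [8, 10, 6])
    print(f"maximum rate: {rate / 1e6:.1f} million packets per second")
```

Because none of the inputs depend on the traffic mix, the result holds for any packet stream, which is what makes the performance deterministic.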

The major disadvantage is that because the device has a fixed number of pipeline stages and specific engines at each stage, the system is less flexible than with a fully programmable RISC-based network processor.

The main manufacturers with this architecture are Agere, Bay Microsystems Inc. (see Bay Joins the Big Leagues), EZchip Technologies (see EZchip Sallies Fourth and EZchip Redoes It), Sandburst (see Intel Backs Another Switch Chip), and Xelerated AB (see Xelerated Touts 40-Gig Toolbox and Swedes Claim Processor Advance).

dlazar 12/4/2012 | 9:58:50 PM
re: Network Processors In June, The Tolly Group teamed up with IBM to educate systems developers about the emerging market for network processors and how they stand to radically alter the R&D landscape.

The test results and an hour-long audiocast with PowerPoint slide show are available online at The Tolly Group Web site: http://www.tolly.com/networkpr...

Scientist 12/4/2012 | 9:58:01 PM
re: Network Processors Assuring wire speeds at OC-48c, OC-192c and beyond is not casual in nature, nor are the design parameters. These are microwaves, and as such require talents that aren't realized until after the second or third spin of circuitpacks and backplane. Such costs are prohibitive to start-ups.
I recently looked at the task of breaking out a 10Gbps cml flipchip 1mm BGA to traverse circuitpack, conn., backplane, conn., circuitpack. There is a limit! Your MPLS system may not be possible. There is a maximum distance in the best coax. PCBs reduce this distance even more. By adding in (x)jitter/inch, rise time, atten., and losses, there goes your signal, eye pattern and SONET spec.
The talent domain must be generated/expanded. The NP mfgr doesn't make money until the customers' systems are working. Setting up Telecom/Datacom or its vendors for a hard lesson at this time is not advisable.
mrcasual 12/4/2012 | 9:57:46 PM
re: Network Processors Assuring wire speeds at OC-48c, OC-192c and beyond is not casual in nature, nor are the design parameters. These are microwaves, and as such require talents that aren't realized until after the second or third spin of circuitpacks and backplane. Such costs are prohibitive to start-ups.
I recently looked at the task of breaking out a 10Gbps cml flipchip 1mm BGA to traverse circuitpack, conn., backplane, conn., circuitpack. There is a limit! Your MPLS system may not be possible. There is a maximum distance in the best coax. PCBs reduce this distance even more. By adding in (x)jitter/inch, rise time, atten., and losses, there goes your signal, eye pattern and SONET spec.


I'm assuming here that you are referring to carrying a 10Gbps stream on a single trace (or differential pair). This isn't really an issue.

All NPUs/framers/MACs/etc in the 10G space use either SPI4.2 or SPI4.1/CSIX interfaces (or their SFI cousins) to go between chips. Refer to page 4 of the article.

These interfaces are all parallel interfaces. In the case of SPI4.2 specifically it is 16 data lines (differential) each running at ~700Mbps.

While not trivial to design, this is relatively stable technology and is even available in FPGAs today as both Xilinx and Altera offer SPI4.2 cores.

The next generation of NPUs, assuming anyone wants them, that targets 40Gbps rates will use SPI5/SFI5 as the data interfaces. Again these are parallel interfaces with the individual bit lanes carrying data at ~3Gbps.
