A guide to chips that might power tomorrow’s routers * What they are * Why they’re hot * Who makes them

August 8, 2002

Network Processors

Every new technology seems to go through a rollercoaster ride in its early years. It’s hailed as a huge breakthrough; big disappointments follow; and then, sometimes, second- or third-generation products emerge, and serious interest – from engineers rather than marketeers – begins to build.

Network processors – chips that aim to make it easier for system vendors to develop packet-processing equipment like routers and application-aware switches – have now arrived at the interest-from-engineers stage. The first network processors, which operated at speeds of 1 and 2.5 Gbit/s, met with mixed success. But lessons have been learned, and the next generation of network processors is now arriving – chips that solve many of the early problems and also operate at faster speeds, up to 10 Gbit/s. As a result, they’re starting to gain some traction with systems manufacturers.

Scoping out this emerging market isn’t easy, however, because there’s a wide assortment of network processors targeting various applications. And there’s a swarm of players – most of the heavyweight manufacturers of communications chips, plus plenty of startups, although some of those have already been absorbed by the big vendors.

This report aims to make it easier to understand what’s going on. It first sets the scene by reviewing what network processors are and the purposes they’re used for in telecom equipment. It then identifies the main types of network processor and co-processor, pinpointing the key features to look for. Finally, it reviews the network processor market and compares and contrasts the characteristics of network processors from leading chip vendors for each application area.

Here’s a hyperlinked summary:

  • Where to Use Network Processors

  • What Is a Network Processor?

  • Network Processor Chipsets

  • Network Processor Architectures

  • The Network Processor Market

  • Enterprise and Ethernet Access Network Processors

  • Multiservice Access Network Processors

  • Carrier-Class Metro and Core Router Network Processors

  • General Network Processors

  • Network and Packet Processor Tables

A preview of this report, in the form of an online presentation by the author, Simon Stanley, Principal Consultant at Earlswood Marketing Ltd., can be downloaded free of charge from Light Reading’s Webinar archive.

Introduction by Peter Heywood, Founding Editor, Light Reading
http://www.lightreading.com


In small networks running a single protocol, classifying and forwarding packets or cells is simple. With large networks and multiple protocols, including Internet Protocol (IP) – IPv4 and IPv6 – and Multiprotocol Label Switching (MPLS), as well as Asynchronous Transfer Mode (ATM) and legacy protocols such as Frame Relay, the packet processing function is much more complex and requires a flexible solution.

With data rates continuing to increase exponentially, this packet processing function needs to be fast, with data rates of 2.5 Gbit/s to 10 Gbit/s and above. The network processor is the key to implementing the fast, flexible packet processing function that is required throughout these networks.

The key application areas are:

  • Enterprise and Ethernet Access Systems

  • Multiservice Access Network Boxes

  • Carrier-Class Metro and Core Routers

Towards the end of this report we will examine the requirements for these three application areas and list the network processors optimized for each.

A further area for network processors is expected to be storage networks, with Adaptec Inc. (Nasdaq: ADPT), Silverback Systems Inc., and Trebia Networks Inc. each working to be first with a specialized storage processor.

Most networking boxes fit in one of two categories:

  • Chassis-based systems

  • Integrated systems or “Pizza Boxes”

Chassis-based systems

In a chassis-based system we have a number of line cards plugged into the chassis, with either a distributed or a centralized switch fabric. The physical ports are mounted on the front edge of the line card.

Here there are one or more network processors on each line card. The aggregate throughput of the box could be in the Terabit range, with the throughput per line card growing from 10 Gbit/s to 40 Gbit/s in the next couple of years.

Integrated systems

The other extreme is the integrated system or “Pizza Box”:

Here we have one or more network processors on the main board. If there is a switch fabric here, it will be very small and simple. The physical ports are either mounted on the main board or on small physical-layer cards plugged into the main board. The throughput of the box is in the range of 10 Gbit/s to 80 Gbit/s.

A network processor is a programmable device that processes packets or cells at rates of 1 Gbit/s and above.

The very first routers used generic processors to classify packets based on their headers and then forward them to the appropriate destinations.

As speeds increased, companies used hardwired ASICs (application-specific integrated circuits) to implement the simple classification and forwarding functions. Generic processors were retained to handle exceptions and to manage the routing tables – a function now referred to as the “Control Plane.” The ASIC works the “Data Plane” or fast path.

Network processors combine the programmability of the generic processor with the speed of an ASIC.
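
As a rough illustration of the control plane and data plane split described above, the following Python sketch models a fast path that does a simple table lookup and hands anything it cannot resolve to a slower exception path. The class name `Router`, its fields, and the example addresses are hypothetical and are not taken from any vendor's software; a real fast path runs in hardware or microcode, not in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    # Forwarding table consulted by the data plane: destination -> output port.
    fast_table: dict = field(default_factory=dict)
    # Full routing knowledge held by the control plane (hypothetical contents).
    routes: dict = field(default_factory=dict)
    # Count of packets handed to the control plane as exceptions.
    punted: int = 0

    def slow_path(self, dest):
        """Control plane: handle the exception and update the fast-path table."""
        self.punted += 1
        port = self.routes.get(dest)
        if port is not None:
            self.fast_table[dest] = port  # later packets stay on the fast path
        return port

    def forward(self, dest):
        """Data plane: simple lookup; unknown destinations are punted."""
        port = self.fast_table.get(dest)
        if port is None:
            return self.slow_path(dest)
        return port
```

In this sketch only the first packet to a destination reaches the control plane; later packets to the same destination are resolved by the fast-path table alone.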

Why use network processors?

The main reasons for using a network processor are time to market (TTM) and time in market (TIM). First, by using off-the-shelf network processors you can rapidly develop a working system without the time and expense required to develop complex ASICs, thereby speeding your TTM.

Second, by having a programmable device you can rapidly respond to changed customer demands while the system is in development, once the product is shipping, or when it’s deployed in the field. This dramatically increases the TIM for the product, enhancing your company’s competitiveness.

Network processors can provide the best of both worlds: the flexibility of a processor and the speed of hardwired ASIC solutions.

Network processor functions

The main network processor functions are: header classification; deep packet analysis; packet processing; policing and statistics; and traffic management.

  • Header classification

    • To extract bits from the packet headers and do a lookup in the routing tables to determine the requisite actions. There may be several nested headers requiring a number of lookups. Some network processors use relatively expensive and power-hungry TCAM (ternary content addressable memory). Others use hierarchical data structures in SRAM (static random access memory) or even DRAM (dynamic RAM).

      Third-party classifiers and TCAMs are available from several companies, including Integrated Device Technology Inc. (IDT) (Nasdaq: IDTI) (see IDT Intros ICs), SiberCore Technologies (see SiberCore Intensifies Searches), and Solidum Systems Corp. (see Solidum Ships 2.5-Gig Processor and Solidum Debuts Processor Family).

  • Deep packet analysis

    • If you want to route or queue packets based on Layer 7 (application) information, you need to search for strings within the packet body. For example, you may need to search for particular “http://” headers. Several companies, such as Hifn Inc. (Nasdaq: HIFN) (see Agere, Hifn Team on Security) and Raqia Networks Inc. (see Raqia Unveils Deep Packet Processor), are developing deep packet co-processors for use with any network processor chipset.

  • Packet processing

    • Packet forwarding and modification. Once the packet has been classified and the lookup results are obtained, the packet can be forwarded to its destination. The packet may be modified slightly – for example, by adding a tag for the traffic manager or switch fabric – or a more major modification may be required, such as encapsulating the packet inside another to be tunneled to the destination.

  • Policing and statistics

    • You may need to filter packets based on SLAs (service-level agreements) and the classification results. To ensure you can bill and manage your network you may need to collect statistics on a per-flow basis. The per-flow information is stored in TCAM, SRAM, or DRAM.

  • Traffic management

    • This is the scheduling and queuing of packets according to priority. The level of functionality required is heavily dependent on the application. Most traffic managers require DRAM to store the queued packets and SRAM for the flow and queue status. Third-party traffic management devices are available from Azanda Network Devices (see Azanda Flips Its Chips), Vitesse Semiconductor Corp. (Nasdaq: VTSS) (see Vitesse Improves Traffic Manager), and ZettaCom Inc. (see ZettaCom Intros OC192 Solution).
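
The traffic management function at the end of the list above, queuing packets and scheduling them by priority, can be sketched in a few lines of Python. This is a minimal strict-priority scheduler, which is only one of several possible scheduling disciplines; the class name `StrictPriorityScheduler` and the convention that priority 0 is the highest are assumptions made for illustration.

```python
from collections import deque


class StrictPriorityScheduler:
    """One FIFO queue per priority level; priority 0 is served first (assumed)."""

    def __init__(self, num_priorities):
        self.queues = [deque() for _ in range(num_priorities)]

    def enqueue(self, packet, priority):
        # Packets of the same priority keep their arrival order.
        self.queues[priority].append(packet)

    def dequeue(self):
        # Always drain the highest-priority non-empty queue first.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing is waiting
```

A hardware traffic manager would keep the queued packets in DRAM and the queue state in SRAM, as noted above; here both are ordinary Python objects.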

    Most network processor chipsets support all the functions listed above. In some cases they are all integrated into a single device; in many cases the functions are integrated into a number of devices; and in a few cases the designer must develop additional ASICs. Some chipsets need a third-party co-processor for functions such as classification, deep packet analysis, and traffic management, especially if the throughput required is close to the capacity of the network processor. A few network processors also integrate a generic or control plane processor, but all offer the option to use a separate control plane processor if required.

    Routing tables are usually stored in SRAM or TCAM, although one or two network processors also support DRAM. Packets, on the other hand, are either stored internally or in an external DRAM – as the chipset table at the end of the report shows, every standard DRAM type is used by at least one network processor family.
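
To make the role of a TCAM in routing-table lookup more concrete, here is a small Python sketch of ternary (value plus mask) matching used for longest-prefix lookup. It is a software stand-in only: a TCAM compares the key against all entries in parallel, whereas this loop walks them in order. The function names `build_table` and `lookup`, the example routes, and the restriction to IPv4 are assumptions for illustration.

```python
import ipaddress


def build_table(routes):
    """Turn (prefix, next_hop) pairs into value/mask entries, longest prefix first."""
    entries = []
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)  # IPv4 prefixes only in this sketch
        entries.append(
            (int(net.network_address), int(net.netmask), net.prefixlen, next_hop)
        )
    # Sorting by prefix length means the first hit is the most specific route.
    entries.sort(key=lambda entry: entry[2], reverse=True)
    return entries


def lookup(entries, address):
    """Return the next hop of the first entry whose masked bits match the key."""
    key = int(ipaddress.ip_address(address))
    for value, mask, _prefixlen, next_hop in entries:
        if key & mask == value:  # bits outside the mask are "don't care"
            return next_hop
    return None
```

The hierarchical SRAM or DRAM structures mentioned earlier reach the same answer through a tree walk rather than a masked compare, trading lookup steps for cheaper memory.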

    To connect network processors to co-processors, as well as to the framer and switch fabric devices, there are a number of standard interfaces that are supported by most of the devices entering the market now:

    The SPI-3 and SPI-4 interfaces, from the Optical Internetworking Forum (OIF), are supported by most framer devices and several switch fabrics (see OIF Gives 40 Gig a Boost). The Network Processing Forum (NPF) has published the CSIX interface, for connecting network processors to switch fabrics, and a Look-Aside interface, for connecting to co-processors that are not in the main packet flow.

    Most of today’s co-processors connect through simple SRAM interfaces. Control plane processors usually connect through PCI or generic bus interfaces.

    Software model

    In a typical setup, the host network processor, host operating system, higher-layer software, and low-level API (application program interface) run on the control plane processor. The fast-path code runs on the network processor.

    Most network processor vendors supply reference fast-path code and low-level APIs, but this is likely to require a significant rewrite for production systems. A recent development is the announcement from Sandburst Corp. that it will supply firmware for a specific application, removing the opportunity (as well as the cost) for customers to customize the code. Although fast-path code can be a relatively small number of instructions, the availability of a high-level language such as C, or production-ready application code, can significantly reduce development time.

    “You get the performance, flexibility, and time-to-market benefits by using a high-level language,” says Bob Gohn, director of product marketing at Motorola Inc. (NYSE: MOT).

    We will now take a look at the main network processor architectures.

    Generic RISC processors

    These devices are typically based on a core licensed from MIPS Technologies Inc. (Nasdaq: MIPS; OTC: MIPBV) with proprietary enhancements. Ideal for the control plane and real-time operating systems (RTOSs), these devices may also be used for other applications such as high-speed printer engines:

    With clock rates of 400MHz to 1GHz these devices can also be used for Layer 3 forwarding at up to 2 Gbit/s or for more complex functions at lower rates. Newer devices have high-speed packet interfaces; and devices from Broadcom Corp. (Nasdaq: BRCM), such as the BCM1250, include Gigabit MACs (see Broadcom Enhances Evaluation). The main manufacturers are Broadcom, PMC-Sierra Inc. (Nasdaq: PMCS), and SandCraft Inc., with companies like Fulcrum Microsystems Inc. and Intrinsity Inc. developing speedier components with clock rates of 2GHz and above.

    RISC-based network processors

    This was the earliest of the network processor architectures:

    The process of extracting keys from the header and modifying fields in the packet to be forwarded is a significant part of the work for a network processor – so fast bit-manipulation is a key to performance. A generic RISC core is not ideal for header processing, largely due to the slow bit manipulation; so several companies have taken a generic RISC core design and added hardware for high-speed bit manipulation.

    In most designs, the cores are arranged in parallel and a packet header is assigned to a single processor, or thread, which then completes all the processing required on that packet. This model is often referred to as “run to completion.” The main exception to this architecture is the Intel IXP1200, where the cores appear to be in parallel but are actually arranged in a pipeline with headers passing through more than one core.

    The packets are usually stored in an external memory while the header is being processed. In many designs, this packet buffer is shared with the traffic manager.

    The RISC cores are easy to program, with mature C compilers available from either the vendor or third-party suppliers.

    A major issue with first-generation network processors has been predicting performance. With a random stream of incoming packets and the shared hardware engines for functions such as lookup and statistics, simple changes to the code can have a significant impact on throughput. Many vendors have now addressed this issue by reducing internal dependencies or by simply increasing the clock frequency to allow adequate headroom.

    The manufacturers with this basic architecture include: Applied Micro Circuits Corp. (AMCC) (Nasdaq: AMCC), IBM Corp. (NYSE: IBM), Intel Corp. (Nasdaq: INTC), Internet Machines Corp., Motorola, Silicon Access Networks Inc., and Vitesse.

    VLIW-based network processors

    The third major network processor architecture is based on VLIW (very long instruction word) engines arranged in a pipeline:

    In this architecture, the header – and in some cases the whole packet – passes through all the engines in the pipeline. The stages of the pipe may all be the same or they may be quite varied, with engines designed for very specific functions such as TTL (time to live) decrement. With many different engines, this architecture can be difficult to program. Each engine may have its own instructions and its own compiler.

    Agere Systems (NYSE: AGR) is an exception to this programming issue. Although its processors do not support C, they do have their own classification language called FPL. For non-classification functions they have a C-like language.

    The main advantage of the VLIW-based architecture is that performance is deterministic. The number of clock cycles is fixed for each stage, and the maximum packet rate is derived from this.
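
The arithmetic behind that deterministic packet rate can be sketched as follows. The clock frequency and per-stage cycle counts used in the note below are purely illustrative and do not describe any vendor's device; the 20 bytes of per-frame overhead are the standard Ethernet preamble and inter-frame gap. The function names are hypothetical.

```python
def pipeline_packet_rate(clock_hz, cycles_per_stage):
    """Packets per second a fixed pipeline can accept.

    A new packet can enter only as fast as the slowest stage finishes,
    so the stage with the most clock cycles sets the rate.
    """
    return clock_hz / max(cycles_per_stage)


def line_packet_rate(line_rate_bps, packet_bytes, overhead_bytes=0):
    """Packets per second arriving on a fully loaded link of the given rate."""
    return line_rate_bps / ((packet_bytes + overhead_bytes) * 8)
```

For example, a hypothetical 200MHz pipeline whose slowest stage takes eight cycles accepts 25 million packets per second. A 10-Gbit/s Ethernet link carrying minimum-size 64-byte frames delivers about 14.9 million packets per second, so under these assumed figures the pipeline keeps up with the worst case, and the comparison can be made before any code is written.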

    The major disadvantage is that because the device has a fixed number of pipeline stages and specific engines at each stage, the system is less flexible than with a fully programmable RISC-based network processor.

    The main manufacturers with this architecture are Agere, Bay Microsystems Inc. (see Bay Joins the Big Leagues), EZchip Technologies (see EZchip Sallies Fourth and EZchip Redoes It ), Sandburst (see Intel Backs Another Switch Chip), and Xelerated AB (see Xelerated Touts 40-Gig Toolbox and Swedes Claim Processor Advance).

    The network processor market is starting to mature, with over 600 design wins. Current NPU shipments are dominated by the big six: AMCC, IBM, Intel, Agere, Motorola, and Vitesse. All these companies have second-generation devices sampling or in development, and most will soon have both 2.5-Gbit/s and 10-Gbit/s chipsets (see OC192 Processors: Who's First? and RHK Reports Packet Silicon Revenues).

    A couple of established startups, such as Silicon Access and EZchip, are finally bringing their complete chipsets to the market to give designers new alternatives to the incumbents. Several newer startups, such as Bay Microsystems, Cognigine Corp. (see Startup Spins Novel Network Processor and Cognigine Unveils Network Processor), Internet Machines (see Internet Machines Takes Aim at Zettacom), Sandburst, and Xelerated, have new and innovative architectures that may take some time to be adopted by developers.

    “If you go out and talk to customers, then the question was: ‘Well, what is a network processor, and why do I need one?’ Now, I think that question doesn’t come up any more. So now the question is: ‘Why do I need your network processor, as opposed to someone else’s?’ ” says Bill Mello, marketing programs manager at Intel.

    Going forward, generic RISC processors will get faster and be used increasingly for low-bandwidth fast-path applications of 1 Gbit/s or less, in addition to control plane applications. Network processors will continue to develop, integrating more functionality and reducing the system cost and power consumption. The performance of network processors will increase, supporting up to four 10-Gbit/s interfaces on a line card or system. The market is likely to split into two, between those offering low-cost, upgradeable functionality in firmware and those providing fully programmable solutions.

    With so many competitors, especially in the multiservice access area, there is likely to be significant consolidation as the market matures. We have already seen this happening, with several startups winding up and Vitesse calling a halt to further network processor development.

    The least demanding application area for network processors is the enterprise and Ethernet access market. Here low cost is vital, so highly integrated single-chip solutions in the $100 to $200 range (excluding memory) for 1 Gbit/s and 2.5 Gbit/s are the key to success. Ten-Gbit/s chipsets are currently about $1,000 excluding memory.

    There is no requirement for advanced traffic management, but integrated control plane processing and integrated Ethernet MACs are a major advantage. Power could be a major issue for 40- to 80-Gbit/s integrated systems.

    EZchip Technologies

    The NP1 from EZchip is a full duplex, 7-Layer, 10-Gbit/s network processor based on a four-stage VLIW pipeline. At each stage there are multiple processors in parallel. The device integrates the classification search engines on-chip and uses on-chip memory for small lookup tables, extendable to large tables using external DRAM.

    The device is sampling, and a traffic management device, the QX-1, is expected to be available at the end of the third quarter this year. The device includes eight 1-Gbit/s and one 10-Gbit/s Ethernet MACs to support eight 1-Gbit/s interfaces, a single 10-Gbit/s interface, or a single SPI4.2 interface.

    Intel Corp.

    The 200MHz IXP1200 is a basic network processor capable of 1 to 2 Gbit/s. The device does not support standard interfaces but uses Intel’s proprietary IXP bus and integrates a StrongARM generic processor. The device is also available in 133MHz and 266MHz versions. With the full weight of Intel behind it, the team has built up an impressive array of third-party suppliers to support the architecture (see Intel: The Prince of Processors?).

    The IXP1200 is in production, and Intel has announced two further devices taking advantage of Intel’s silicon technology. The IXP-2400 will be a 2.5-Gbit/s device for edge applications, with the IXP-2800 for 10-Gbit/s edge aggregation and core applications. These new devices will run at 600MHz and 1GHz respectively, with 400MHz and 1.4GHz versions planned.

    Paion

    Paion is a startup company spun out of Samsung Electronics. The GEP2C02 packet processor, expected to sample in the fourth quarter of 2002, has integrated Gigabit Ethernet MACs supporting two Gigabit channels or 16 10/100-Mbit/s channels. The initial device can only be used with the Paion switch fabric, but future devices may support standardized interfaces.

    Sandburst Corp.

    Announced in June this year, the 10-Gbit/s HiBeam chipset – including packet processor, traffic management, and switch fabric functions – takes a slightly different approach to network processors. The FE-1000, expected to sample in the fourth quarter, is supplied with fast-path firmware optimized for VPN (virtual private network) applications. Traffic management is implemented in the QE-1000 queuing engine, which is part of the HiBeam switch fabric chipset.

    In the multiservice access application, many ports are aggregated to a single network processor. Low power and the integration of Layer 2 functionality – including ATM SAR, Ethernet MAC, Packet over Sonet, and Frame Relay – are key advantages. These network processors cover a broad range of cost/performance, running from $250 to more than $1,000.

    By including policing and per-flow statistics, as well as fine-grained traffic management with up to 64,000 queues, the right network processor can provide the core of a very competitive and flexible multiservice solution.
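
As an illustration of what policing with per-flow statistics involves, the sketch below implements a token-bucket policer in Python. The token bucket is one common policing technique; the report does not say which algorithm any particular network processor uses, and the class name `TokenBucketPolicer`, its parameters, and the counter names are assumptions for illustration. A device would hold one such state record per flow.

```python
class TokenBucketPolicer:
    """Admit packets while a flow stays within its contracted rate and burst."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = burst_bytes  # the bucket starts full
        self.last_time = 0.0
        # Per-flow statistics, of the kind needed for billing and management.
        self.stats = {"conform_packets": 0, "conform_bytes": 0,
                      "drop_packets": 0, "drop_bytes": 0}

    def offer(self, now, packet_bytes):
        """Return True if the packet conforms, False if it should be dropped."""
        # Refill tokens for the time elapsed, capped at the burst size.
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            self.stats["conform_packets"] += 1
            self.stats["conform_bytes"] += packet_bytes
            return True
        self.stats["drop_packets"] += 1
        self.stats["drop_bytes"] += packet_bytes
        return False
```

Keeping a dictionary that maps a flow identifier to its own `TokenBucketPolicer` gives per-flow policing; with tens of thousands of flows, this state is what ends up in the TCAM, SRAM, or DRAM mentioned earlier.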

    Applied Micro Circuits Corp. (AMCC)

    Having been the first to use the term “network processor,” AMCC continues to be the company with the largest shipments of them. The np7250 and np7510 network processors are developed from the packet and ATM switching technology acquired when AMCC purchased MMC. Based on an in-house RISC core optimized for header processing, the devices use external TCAMs for classification.

    The 2.5-Gbit/s np7250 and the 10-Gbit/s np7510 both interface to the 10-Gbit/s nPX5710/20 traffic management chipset. This two-device chipset can be used with two np7250 devices, giving a full duplex 5-Gbit/s solution; or two chipsets (four devices) can be used to support a full duplex 10-Gbit/s system with two np7510 network processors. The nPX5710/20 also includes a virtual SAR (segmentation and reassembly) function.

    The np7250 is in full production, with the np7510 currently sampling. AMCC's roadmap includes a single-chip, full-duplex, second-generation, 10-Gbit/s NPU.

    IBM Corp.

    The IBM NP4GS3 is the only network processor developed in-house by a major semiconductor company. The 4-Gbit/s NP4GS3 is in production, as is a less expensive 2-Gbit/s version, the NP2G. IBM is also developing an enhanced version, the NP4GS4, with standardized interfaces and double the performance; it is expected to sample in the first quarter of 2003.

    The NP4GS3 consists of an input block, an output block with traffic manager, and a processing block. The processing block contains 16 RISC-based processing engines, each of which processes a complete packet header. The packets are stored in external DRAM. The size of the header is programmable, so each processor could process complete packets. For a 10-Gbit/s chipset planned for the end of 2003, IBM will put the three blocks in separate devices. The devices support 2,000 queues internally. This can be increased on the NP4GS4 to 256k with an external scheduler.

    Motorola Inc.

    With integrated Layer 2 controllers, the C-5 2.5-Gbit/s processor is ideal for access applications. The 16 RISC cores can be configured in parallel or in a pipeline. The device is in production and will support concatenated interfaces up to OC12c (622 Mbit/s) and a single OC48c (2.488 Gbit/s) interface when used with the M-5 channel adapter sampling soon. The C-5 includes basic policing and traffic management, supporting up to 512 queues.

    Motorola has just announced the C-5e and C-3e, due to sample in the third quarter 2002. The C-5e is an enhanced version of the C-5 with lower power and higher performance, supporting a CSIX switch interface and up to four Gigabit Ethernet interfaces. The C-5e traffic management can be extended to support 256k queues, using the Q5 traffic management co-processor. For lower bandwidth applications Motorola will also be sampling the C-3e network processor and traffic management co-processor (see Motorola Has a Roadmap).

    Vitesse Semiconductor Corp.

    The IQ2000 2.5-Gbit/s network processor, now in production, integrates two Gigabit Ethernet MACs with policing and traffic management. An enhanced version, the IQ2200, is sampling now, with 400MHz RISC cores and power consumption typically reduced to 4W. The IQ2200 supports both the proprietary interfaces on the IQ2000 and a CSIX switch interface.

    Vitesse has recently announced that it’s stopping development of both future network processors and their PaceMaker traffic management devices (see Vitesse Drops Some Packets).

    For the carrier-class metro and core router market, high bandwidth and high availability are key requirements. By supporting redundant interfaces and error correction on memory interfaces, these network processors are suitable for use in carrier-class systems requiring 99.999 percent availability.

    With the need for large routing tables – with at least 1 million entries – and advanced traffic management, this is a very challenging application area. Chipset prices run as high as $3,000.

    Agere Systems

    Agere started out as a network processor company before being purchased by Lucent Technologies Inc. (NYSE: LU) and then spun out along with the rest of Lucent’s semiconductor business. Agere Systems is now a leader in optical components and integrated circuits for communications networks.

    Agere’s original 2.5-Gbit/s PayloadPlus chipset, now in production, was built around FPL, a high-level classification language. The 133MHz PayloadPlus uses low-cost memory for tables and integrates policing and statistics along with a traffic manager supporting 64,000 queues. Agere also has an AAL2 SAR device called the VPP.

    In July 2002 Agere announced the INP5, a 266MHz single-chip integration that reduces the full duplex chipset from six devices to one, halving the chipset cost and power while increasing the number of queues supported to 256k. “We have the full network processor functionality, including classification, policing, statistics, queuing, scheduling, shaping, traffic management, segmentation, reassembly, modification. All the things you need in a carrier-grade system,” says Rob Munoz, product marketing manager with Agere. The INP5 is expected to sample towards the end of 2002.

    The two-device PayloadPlus 10G chipset and a lower-cost, 133MHz, version of the INP5 will also be available towards the end of 2002.

    Silicon Access Networks Inc.

    The iFlow chipset has been introduced over the last year with all the devices now sampling. Each of the devices can be used individually, or the whole chipset can be used to provide full duplex 10-Gbit/s network processing capability (see Silicon Access Launches Billing Chip).

    The iPP packet processor has 32 300MHz custom RISC cores (Atoms) and can be programmed using assembler or a subset of C developed for the Atoms. A key technology in the chipset is embedded RAMs for tables and counters, with these taking the place of third-party RAMs used in competing products. The chipset, which includes a dedicated policing and statistics device, can be used with third-party traffic managers from companies like ZettaCom.

    The remaining companies covered in this report have devices with attributes for more than one, or even all, of the application areas we have discussed. The services processor from Fast-Chip is not a true network processor but can be used for much of the classification and modification functions required of a network processor. Coupled with a control plane processor, this device could serve as a packet processing device without the cost of developing fast-path code.

    Bay Microsystems Inc.

    Bay’s Montego, currently sampling, is a highly integrated 10-Gbit/s network processor. The device can be used in carrier-class systems from access to long haul and supports 1 to 4,000 channels with an integrated traffic manager handling 64,000 queues. Two devices are required for a full duplex system (see Bay Joins the Big Leagues).

    The Montego architecture is a parallel pipeline of VLIW engines. The incoming packet is split across a number of engines arranged in parallel and is then clocked through the pipeline. At each stage, the parallel engines run specific instructions dependent on the packet type and location in the pipeline. “The machine was developed to be completely deterministic, so it is a fixed execution cycle through the entire pipe,” says Chuck Gershman, senior VP of marketing and sales at Bay Microsystems.

    Broadcom Corp.

    Broadcom supplies a large range of system-on-a-chip solutions for the broadband communications markets. The BCM1250 was developed by SiByte before its acquisition by Broadcom.

    The BCM1250, BCM1125, and BCM1125H are based on licensed cores from MIPS Technologies. The devices have one or two cores scalable from 400MHz to 1GHz. The devices include Gigabit Ethernet MACs and support GMII and HyperTransport interfaces as well as PCI.

    Cognigine Corp.

    The Cognigine CGN16100 is a generic full-duplex 10-Gbit/s network processor with 16 VISC (Variable Instruction Set Communications Architecture) cores connected by an on-chip crosspoint switch. The incoming packets are split into fixed size blocks, which are then processed by the cores. The cores can be programmed for classification and forwarding as well as policing and traffic management. “The efficiency we get is from memory latency and cycle count. We have been developing this architecture for about four years now,” says Nick Kucharewski, president and CEO (see Startup Spins Novel Network Processor).

    Tables are stored in SRAM, DRAM, or TCAM, depending on the performance required. The peak throughput is 25 Mpps (million packets per second) full duplex. Samples are expected in the third quarter of 2003.

    Fast-Chip Inc.

    The PolicyEdge Services Processor is a device that deterministically classifies, edits, and polices packets. Each service operation (one clock) consists of an arbitrary key extraction, search, and multiple actions – which can include packet modification, forwarding, policing, statistics, and branching to further service operations. The RouteExpand companion chip extends the policy database to millions of entries over 128,000 tables. Both 2.5-Gbit/s and 10-Gbit/s PolicyEdge devices are sampling.

    The PolicyEdge devices can be used to extend LAN switch functionality, to offload NPUs, to add new capabilities to hardwired ASICs, or with a third-party traffic manager. “If network processors are going to be successful, the guys designing with ASIC today do not want to replace their 18 months of ASIC design with 18 months of software development. The bottom line is time to functionality,” says Charlie Jenkins, VP for marketing and business development (see Third-Time Lucky for Fast-Chip?).

    Internet Machines Corp.

    Internet Machines is developing a complete 10-Gbit/s chipset, including network processor, traffic manager, and switch fabric, which is expected to sample in the third quarter of 2002. The network processor has 64 RISC cores running at 333MHz, licensed from ARC International plc (ARC Cores) (London: ARK).

    Classification, policing, and statistics all run in software, with external TCAM required for routing tables. Programming is supported through a third-party C compiler and development tools.

    PMC-Sierra Inc.

    PMC-Sierra has been shipping general purpose processors based on a licensed MIPS core since 1998. The devices are primarily used in high-end printers and networking boxes.

    The R9000X2, which has two 1GHz cores, will be sampling in the third quarter. The device has interfaces for HyperTransport and the MIPS SysAd bus. It can be connected through a third-party SysAd controller to a number of standardized interfaces, including PCI.

    SandCraft Inc.

    The SandCraft SR71010A is a 600MHz processor with a licensed MIPS core. The device supports various standardized interfaces, including PCI, through the SysAd bus and third-party controllers.

    Xelerated AB

    The X10 network processor and T10 traffic manager are based on a 200-stage VLIW pipeline, with the program moving along the pipeline together with the packet data. The pipeline is split into 10 sections with a FIFO between each one. At the end of a section the program can issue a co-processor or memory access and the packet waits in the FIFO until the result is returned. The T10 traffic manager supports policing and statistics as well as ATM SAR.
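
The section-and-FIFO arrangement described above can be modeled in a few lines of software. The sketch below is a simplified illustration, not Xelerated's design: it reproduces only the ordering behavior (packets stay in order, and a lookup issued at the end of one section is available in the next), not timing or the individual VLIW stages. The 10 sections come from the text; the `Packet` class, the per-section callables, and the route table are hypothetical.

```python
from collections import deque

SECTIONS = 10  # the text describes a pipeline split into 10 sections


class Packet:
    def __init__(self, data, program):
        self.data = data        # header fields being worked on
        self.program = program  # one callable (or None) per section; travels with the packet
        self.result = None      # reply to the most recent co-processor or memory access


def run_pipeline(packets, lookup):
    """Pass packets through every section in order, with a FIFO after each one."""
    fifos = [deque() for _ in range(SECTIONS + 1)]
    fifos[0].extend(packets)
    for s in range(SECTIONS):
        while fifos[s]:
            pkt = fifos[s].popleft()
            step = pkt.program[s]
            request = step(pkt) if step is not None else None
            if request is not None:
                # In hardware the packet would wait in the FIFO until the
                # reply arrives; here the lookup is resolved immediately.
                pkt.result = lookup(request)
            fifos[s + 1].append(pkt)
    return list(fifos[SECTIONS])


def request_route(pkt):
    """Section 0: issue a lookup keyed on the destination address."""
    return pkt.data["dst"]


def apply_route(pkt):
    """Section 1: consume the lookup result returned while the packet waited."""
    pkt.data["next_hop"] = pkt.result
    return None


program = [request_route, apply_route] + [None] * (SECTIONS - 2)
routes = {"10.0.0.1": "port3", "10.0.0.2": "port7"}  # assumed example table

out = run_pipeline(
    [Packet({"dst": "10.0.0.1"}, program), Packet({"dst": "10.0.0.2"}, program)],
    routes.get,
)
print([p.data["next_hop"] for p in out])  # ['port3', 'port7']
```

Because each FIFO is drained in arrival order, packets leave the model in the order they entered, whatever lookups they issue along the way.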

    There will be three versions of the X10, with one, two, or four 10-Gbit/s ports, and two versions of the T10, with one or two ports. The first X10 is expected to sample in the third quarter of 2002, with the T10 following in the first quarter of next year.

    This report concludes with two tables. Where information is undisclosed or unknown this is signified by “?”; an irrelevant entry is shown by ””.

    Network Processor Chipset Table

    This table lists the main network processor chipsets, detailing the number of devices in a full duplex line-rate design, the device required for each function, and the external memory requirement. If the packet processor includes other functions, these are shown as “integrated” in the relevant column. If the memory required for a specific function is on-chip, then this is shown as “internal.”

    This table does not include the RISC processors from Broadcom, PMC-Sierra, and SandCraft. These devices are only shown in the Packet Processor table.

  • Company

    • The semiconductor vendor.

  • Name

    • The name of the chipset or, where this is a single chip solution, the name or number of the network processor.

  • Full Duplex Chipset

    • For a full duplex network processor solution we list:

    • Number of devices: This is the number of devices from the network processor chipset required for a full duplex solution. It does not include general purpose memory devices. The number of memory devices is heavily dependent on the application. For an indication of the number of external memory subsystems required look at the memory details under the individual function columns. For example, the Silicon Access iFlow chipset, which includes a number of specialized memory devices, has a high device count but requires few additional memory devices.

    • Full Duplex Power: This is the power (maximum or typical) for the full chipset (excluding memory). In some cases only the power for part of the chipset has been released.

    • Full Duplex Price: This is the price in production quantities for the full chipset (excluding memory). In some cases only the price for part of the chipset has been released.

    • Sample Availability: The quarter that samples are expected to be available. Where the device is already available we state whether the device is sampling or in production.

  • Throughput

    • The uni-directional throughput of the full duplex chipset (e.g., a device supporting a single, full duplex OC192 interface is 10 Gbit/s, not 20 Gbit/s).

    • I/O Throughput: The bandwidth of the chipset interfaces.

    • Packet Throughput: The theoretical maximum packet processing rate (millions of packets per second). This rate may not be achieved for more complex packet processing.



  • CPU

    • This is the type of Control Plane Processor integrated into the chipset. Not all chipsets include this function.

  • Header Classifier

    • Classifier Device: The device that performs header classification. In many chipsets this function is either integrated into the packet processor or is implemented in software.

    • Classifier Memory: The memory subsystem required to support header classification. This may consist of one or more individual devices. To increase the performance or reduce the cost of some solutions one can use a third-party classification co-processor to replace some or all of the memory.

  • Deep Packet Classifier

    • Deep Packet Device: The device that performs deep packet string searches. In a few chipsets this function is either integrated into the packet processor or is implemented in software.

    • Deep Packet Memory: The memory subsystem required to support deep packet string searches. This may consist of one or more individual devices. With most chipsets you can use a third-party deep packet co-processor.

  • Packet Processor

    • Packet Processor Device: This is the core of the chipset and in many cases is a single-chip network processor solution. The packet processor uses the classification results to modify and forward the packets or cells. The Fast-Chip PolicyEdge chipset does not include a packet processor but does support packet editing in the classifier.

    • Packet Processor Memory: The main memory subsystem for the packet processor. This may consist of one or more individual devices. In some cases the device will only require internal memory for some applications. Some devices require additional memory for instruction store.

  • Policing/Statistics

    • Policing/Statistics Device: Some chipsets support policing and per-flow statistics. This may be a separate device, integrated into the packet processor or traffic manager or implemented in software.

    • Policing/Statistics Memory: The memory subsystem required to support policing and per-flow statistics. This may consist of one or more individual devices.

  • Traffic Manager

    • Traffic Manager Device: Some chipsets include a traffic manager. This may be one or two devices, integrated into the packet processor or implemented in software.

    • Traffic Manager Memory: The memory subsystem for the traffic manager. This may consist of one or more individual devices. Some devices require both DRAM for packets and SRAM for queue information.

    • Traffic Manager Queues: The number of queues gives a rough guide to the capability of the traffic manager. Some devices such as the IBM NP4GS4 and the Motorola C-5e support a small number of queues internally and a large number with the addition of a traffic management co-processor.



    Dynamic Table: Network Processor Chipsets

    Fields: Company; Name; Number of devices; Full Duplex Power; Full Duplex Price; Sample Availability; I/O throughput; Packet throughput; Control Processor; Classifier Device; Classifier Memory; Deep Packet Device; Deep Packet Memory; Packet Processor Device; Packet Processor Memory; Policing/Statistics Device; Policing/Statistics Memory; Traffic Manager Device; Traffic Manager Memory; Traffic Manager Queues

    Packet Processor Table

    This table provides details of the number, architecture, and types of processing engines, the interfaces, and integrated functions such as Ethernet MACs. Many of these packet processors are full network processors. This table does not include the Fast-Chip PolicyEdge, which does not include a packet processor.

  • Company

    • The semiconductor vendor.

  • Device

    • The name or number of the packet processing device.

  • Processing Engines

    • PE Type: The type of packet processing engines used. These are either RISC or VLIW. Where these are licensed cores this is indicated.

    • PE Speed: The clock frequency of the packet processing engines for the device listed in the chipset table. Other versions may have different clock rates.

    • PE #: The number of packet processing engines.

    • PE Configuration: Are the packet processing engines arranged in parallel, a pipeline, or a parallel pipeline? The Motorola devices can be configured in parallel or a pipeline.

  • General Purpose CPU

    • Several devices include a general purpose CPU. This is used for housekeeping and, if required, for the control plane.

    • CPU Type: The type of general purpose cores used. Where these are licensed cores this is indicated.

    • CPU Speed: The clock frequency of the general purpose cores for the device listed in the chipset table. Other versions may have different clock rates.

    • CPU #: The number of general purpose cores.

  • Host Interface

    • All the devices provide a host interface to connect to an external control plane processor. Most of these interfaces support 32-bit or 64-bit 66MHz PCI.

  • Switch Interface

    • Most of the devices will interface directly to a switch fabric. This section details only the main interface supported. Many devices support several interface types, including standardized and proprietary interfaces. In general, later devices support a superset of the older devices.

    • Switch I/F: The main switch interfaces. Many support the OIF SPI-3 or SPI-4 interfaces or the Network Processing Forum (NPF) CSIX-L1 switch interface. Many companies also have proprietary switch interfaces, supported on older devices, or support the ATM Forum UTOPIA Levels 2 and 3 interfaces (UL2, UL3) on slower devices. Proprietary switch interfaces include AMCC ViX, IBM DASL, and Vitesse FOCUS.

    • Switch B/W: The bandwidth of the main switch interfaces.

  • PHY Interface

    • Most of the devices will interface directly to a framer. This section details only the main interface supported. Many devices support several interface types, including standardized and proprietary interfaces. In general, later devices support a superset of the older devices.

    • PHY Type: The main framer interfaces. Many support the OIF SPI-3 or SPI-4 interfaces. Many companies also have proprietary framer interfaces, supported on older devices, or support the ATM Forum UTOPIA Levels 2 and 3 interfaces (UL2, UL3) on slower devices. Proprietary framer interfaces include AMCC ViX, IBM SELECT Bus, and Vitesse FOCUS.

    • PHY B/W: The bandwidth of the main framer interfaces.

  • Ethernet MAC

    • Does the device include one or more Ethernet MACs? These may be 100 Mbit/s (FE), 1 Gbit/s (1GE), or 10 Gbit/s (10GE).

  • ATM SAR

    • Does the device include an ATM SAR? Xelerated integrates this function into the traffic manager, not the packet processor.

  • High-Level Language

    • What high-level language can be used to program the packet processor? Although this will accelerate development, most devices will run faster if programmed in assembly.



    Dynamic Table: Packet Processors

    Fields: Company; Device; PE Type; PE Speed; PE #; PE Configuration; CPU Type; CPU Speed; CPU #; Host Interface; Switch I/F; Switch B/W; PHY I/F; PHY B/W; Ethernet MAC; ATM SAR; High-Level Language
