Net Search Coprocessor Offloads Search Overheads To Accelerate Systems

Oct. 14, 2002

In large network routers and other equipment that must handle ever-increasing amounts of packet traffic, the ability to perform many large table searches very quickly is key to the system's overall performance.

By offloading the search function from the packet processors, the Vichara 81000 search supervisory coprocessor can streamline system design and significantly improve search performance. Its applications include seven-tuple flow tables for IPv4/v6, L2/L3 lookups with aging support, and basic Internet Protocol routing-table management (stores and lookups), to name just a few.
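
For reference, a seven-tuple flow key is typically the classic five-tuple plus two additional qualifiers. The C sketch below shows one plausible composition; the specific extra fields chosen here (DSCP and ingress port) are an assumption for illustration only, not the device's actual key format.

```c
#include <stdint.h>

/* Illustrative only: one common way a seven-tuple flow key might be
 * composed (the classic five-tuple plus DSCP and ingress port). The
 * actual key layout used by the Vichara/NSE tables is not shown here. */
typedef struct {
    uint8_t  src_ip[16];   /* source address (IPv4-mapped or IPv6)   */
    uint8_t  dst_ip[16];   /* destination address                    */
    uint16_t src_port;     /* L4 source port                         */
    uint16_t dst_port;     /* L4 destination port                    */
    uint8_t  protocol;     /* L4 protocol (TCP, UDP, ...)            */
    uint8_t  dscp;         /* DiffServ code point                    */
    uint16_t ingress_port; /* physical or logical ingress interface  */
} flow_key_t;
```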

Also, the Vichara coprocessor can relieve the bottleneck delays that occur when multiple packet processors simultaneously request searches. The coprocessor coordinates all search requests and runs search-related instructions on-board. It then works with network search engines (NSEs) and SRAM to return only the final result to the packet processors.

To offload the packet processors, the Vichara coprocessor downloads the instruction set from the packet processor and recursively completes the search on-board, communicating with the packet processor(s) only to return the result. The packet processors can then operate at optimal efficiency.
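
As a rough sketch of that division of labor, the C fragment below models a hypothetical host-side call: the packet processor submits a key plus the ID of a search program it downloaded to the coprocessor earlier, then reads back only the final answer. All names and types here are invented for illustration; this is not Cypress' actual API.

```c
#include <stdint.h>

/* Hypothetical host-side view of the offload (invented names): the
 * packet processor hands over a key and a program ID, and all the
 * intermediate NSE and SRAM accesses happen on the coprocessor's
 * side of the bus. */
typedef struct {
    uint32_t program_id;   /* which downloaded search routine to run   */
    uint8_t  key[40];      /* extracted search key, e.g. a seven-tuple */
    uint8_t  key_len;      /* valid bytes in key[]                     */
} vichara_request_t;

typedef struct {
    int      hit;          /* nonzero if a matching entry was found    */
    uint32_t assoc_data;   /* associated data fetched from SRAM        */
} vichara_result_t;

/* Hypothetical blocking driver call: submit the request, wait for the
 * coprocessor to finish the whole search sequence, get the result. */
vichara_result_t vichara_search(const vichara_request_t *req);
```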

Able to operate at 266 MHz, the Vichara chip can handle multiple network packet-processor instruction sets and perform search-key extraction. The coprocessor supports multiple lookups, conditional branching, aging and policing, and reading and manipulating associated data. Maximum latency is just 150 ns per command.
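
The kind of on-board sequence this implies might look like the sketch below, which encodes "look up the flow; on a miss, fall back to a route lookup; on a hit, read the associated data and refresh the entry's age." The opcodes and table names are invented for illustration; the device's real instruction set is not shown here.

```c
#include <stdint.h>

/* Invented opcodes and table IDs, purely to illustrate a chained,
 * conditional search sequence of the sort described above. */
typedef enum {
    OP_LOOKUP,          /* search a table with the current key          */
    OP_BRANCH_IF_MISS,  /* jump to 'target' if the last lookup missed   */
    OP_READ_ASSOC,      /* read associated data for the last hit        */
    OP_UPDATE_AGE,      /* refresh the aging state of the matched entry */
    OP_DONE             /* return the accumulated result                */
} search_op_t;

enum { FLOW_TABLE, ROUTE_TABLE };  /* illustrative table IDs */

typedef struct {
    search_op_t op;
    uint32_t    table;   /* table to operate on (for lookups/aging) */
    uint32_t    target;  /* branch target (for OP_BRANCH_IF_MISS)   */
} search_step_t;

/* Flow lookup with a route-lookup fallback; step indices are noted
 * so the branch target is easy to follow. */
static const search_step_t flow_then_route[] = {
    /* 0 */ { OP_LOOKUP,         FLOW_TABLE,  0 },
    /* 1 */ { OP_BRANCH_IF_MISS, 0,           5 },
    /* 2 */ { OP_READ_ASSOC,     0,           0 },
    /* 3 */ { OP_UPDATE_AGE,     FLOW_TABLE,  0 },
    /* 4 */ { OP_DONE,           0,           0 },
    /* 5 */ { OP_LOOKUP,         ROUTE_TABLE, 0 },
    /* 6 */ { OP_READ_ASSOC,     0,           0 },
    /* 7 */ { OP_DONE,           0,           0 },
};
```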

The search coprocessor works with Cypress' NSEs (the NSE 70000/AYAMA 10000) through a wide, dedicated, 72-bit interface, letting it accelerate multiple lookups and classification. The partitioning pays off because the packet processor offloads instructions and data to Vichara over the narrow network-processor bus (the LA-1 interface, for example, is only 18 bits wide), while Vichara completes the searches over the wider buses tying it to the NSEs and SRAMs.
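
A back-of-the-envelope comparison shows why the wide dedicated bus matters. The short C program below assumes, purely for illustration, one transfer per 266-MHz clock on each interface; the real LA-1 and QDR-II interfaces can transfer data on both clock edges, so treat these as rough relative figures rather than device specifications.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative peak-rate comparison at the chip's 266-MHz clock,
     * assuming one transfer per clock on each bus. */
    const double clock_hz = 266e6;
    const double la1_bits = 18.0;   /* narrow NPU-side LA-1 bus */
    const double nse_bits = 72.0;   /* wide dedicated NSE bus   */

    printf("LA-1 side: ~%.1f Gbit/s raw\n", clock_hz * la1_bits / 1e9);
    printf("NSE side:  ~%.1f Gbit/s raw\n", clock_hz * nse_bits / 1e9);
    return 0;
}
```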

In addition to connecting to the packet processors and managing the search subsystem, Vichara works in tandem with the control-plane processor to provide further headroom to the data-plane packet processor. Since table updates and routing algorithms are run in the control plane, the control-plane CPU is responsible for running flow statistics such as flow metering and advanced functions like aging.

Currently, the control CPU uses the processing power of the packet processor and the data stored in the NSEs and SRAM. At higher line speeds (10 Gbits/s and beyond), packet processors aren't only stretched for processing power. They also add to the latency of flows controlled by the control-plane processor. In such cases, a solution like Vichara can directly connect with the control-plane CPU and SRAMs to support flow statistics and aging.

In applications that require multiple lookups per packet (such as a 32-bit IPv4 lookup, ingress and egress access-control-list (ACL) lookups, ingress and egress quality-of-service (QoS) lookups, and a flow-ID lookup), plus two SRAM operations (an entry read and a write), the coprocessor holds bus usage to 30% in OC-192 systems and to less than 10% in OC-48 systems. The low bus usage frees bandwidth on the shared LA-1/QDR-II memory bus for its typical functions, such as segmentation-and-reassembly pointers, congestion management through WRED tables, and statistics for policing and metering.

Samples of the Vichara CYNCP81000 will be available in the third quarter of 2003. In the first quarter, Cypress will release a cycle-accurate C-model as well as Verilog/VHDL models for designers who want to start a system design.

The chip will be available in two pinout versions. One has a dual network/packet processor interface and is housed in a 484-contact BGA package. The second version, able to handle up to four network/packet processors, comes in an 800-contact BGA package. In lots of 1000, the 484-contact version costs $250 and the 800-contact version costs $300.

Cypress Semiconductor Corp., www.cypress.com; (408) 943-2600.

About the Author

Dave Bursky | Technologist

Dave Bursky is the founder of New Ideas in Communications, a publication website featuring the blog column Chipnastics – the Art and Science of Chip Design. He is also president of PRN Engineering, a technical writing and market-consulting company. Before founding these organizations, he spent about a dozen years as a contributing editor to Chip Design magazine. Concurrent with Chip Design, he was also the technical editorial manager at Maxim Integrated Products, and before Maxim he spent over 35 years as an engineer with the U.S. Army Electronics Command and as an editor with Electronic Design Magazine.
