Data-transfer rates are continually increasing, carrying more data from node to node within a network, or from point to point within a system. As a result, generic field-programmable logic arrays cannot meet the performance demanded by high-speed serial data interfaces.
Standalone serializer/deserializer (SERDES) chips can provide the interface between the multigigabit serial channels and the parallel digital logic. Yet these circuits consume extra board space and power while offering a fixed-function solution. With all of the evolving networking, telecommunications, and data communications standards, designers would opt for a more flexible solution. They would prefer to "tune" the interface at the last possible point, just prior to shipping products.
To meet such demanding requirements, QuickLogic has combined the high-speed programmable-logic architecture of its recently released Eclipse FPGA family with dedicated, on-chip, configurable SERDES blocks. The result of this work is the QuickSD family of antifuse-based FPGAs, which the company refers to as embedded-system platforms (ESPs).
There will be three chips in the initial family: the QL81SD, QL82SD, and QL84SD. The QL81SD has six SERDES channels and 334k system gates, while the QL82SD has eight SERDES ports and 536 kgates. The QL84SD offers eight SERDES ports and 658 kgates. Each SERDES port can operate at data speeds of up to 1 Gbit/s and is OC-12 (622 Mbits/s) compatible for use in SONET applications.
In addition to the high-speed serial ports, the QuickSD chips come with two programmable SERDES clock circuits that can provide high-speed clocks if timing information is not embedded in the data. Also included on the chips are two high-speed programmable PLLs, 24 to 36 blocks (2304 bits/block) of dual-ported SRAM, and 12 to 18 quad-port multiplier-accumulator (QMAC) blocks (Fig. 1). The QMAC blocks are part of the intellectual property QuickLogic developed for its QuickDSP FPGA family. They can greatly accelerate DSP-type computations such as those found in wireless basestations and many other applications.
The resulting aggregate bandwidth on the SERDES interfaces totals 8 Gbits/s. Such high-performance, flexible circuitry can readily tackle the data movement and processing needs of systems performing data transmission, image processing, and other tasks. It can even form the heart of a virtual backplane or crosspoint switch to replace wide, high-speed buses.
To achieve high data rates with minimal noise, the fast, custom-designed serial interfaces employ low-voltage differential signaling. The LVDS interfaces on the SERDES ports can actually reach data rates well in excess of the 622 Mbits/s specified by the OC-12 standard.
At top speed, each LVDS I/O channel can operate at 1 Gbit/s, which is what brings the aggregate throughput of the eight-port devices to 8 Gbits/s. To keep pace with the recovered/transmitted data, the on-chip memory blocks have to be fast. They're specified for operation at access rates of up to 300 MHz.
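The aggregate figures follow from multiplying the port count by the per-port line rate. The short Python sketch below works that out for the eight-port devices at both the OC-12 rate and the 1-Gbit/s maximum. The function and constant names are purely illustrative and do not come from QuickLogic; the OC-12 value is the standard SONET rate of 12 × 51.84 Mbits/s.

```python
# Illustrative arithmetic only; names are hypothetical, rates are from the text.
OC12_RATE_BPS = 12 * 51.84e6   # OC-12 line rate, about 622.08 Mbits/s
MAX_RATE_BPS = 1.0e9           # top per-port SERDES rate quoted in the article


def aggregate_bandwidth_bps(ports: int, rate_bps: float) -> float:
    """Total raw serial bandwidth across all SERDES ports, in bits/s."""
    if ports < 0 or rate_bps < 0:
        raise ValueError("ports and rate must be non-negative")
    return ports * rate_bps


# Eight ports at the two rates discussed in the article.
at_oc12 = aggregate_bandwidth_bps(8, OC12_RATE_BPS)  # roughly 4.98 Gbits/s
at_max = aggregate_bandwidth_bps(8, MAX_RATE_BPS)    # 8 Gbits/s
```

These are raw line rates. Any framing or balancing bits carried on the link would reduce the usable payload rate.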
In a typical system handling multiple data communications channels, the SERDES blocks would talk to the high-speed data channels. As part of the interface between the SERDES blocks and the programmable logic, the dual-ported RAM can be configured to appear as multiple asynchronous FIFO buffers. On the other side of the buffers, the FPGA logic would be used to perform functions such as encoding, decoding, dc balancing, ATM/SDH packet framing, and memory control. It could even be used to implement an interface—PCI, Utopia, 10/100 Ethernet, or a proprietary control/interface block (Fig. 2).
Each SERDES block provides transceiver logic as well as parallel-to-serial and serial-to-parallel conversion logic (Fig. 3). Also included with SERDES is a programmable timing-and-control block along with a PLL to recover incoming clocks and precisely control the clock timing. Test circuits, in the form of JTAG, built-in self-test, and a loopback capability, allow designers to check out the circuit functions before, during, and after all the circuits have been configured.
The SERDES blocks can operate with the included clock and data circuitry to recover clock signals embedded in the data stream, or with a separately transmitted clock. This aids in producing very stable and wide timing margins. Consequently, it's easier for the designer to build the rest of the system. Also included is pre-emphasis equalization and dc balancing to help optimize the interface performance.
With dc balancing, the SERDES helps compensate for positive charge buildup on the data channel when long strings of "1s" are sent. The balancing scheme changes the data polarity so that a long string of "1s" will actually look like a short string of "1s" followed by a similar-sized string of "0s." Another string of "1s" follows and so on, until something breaks the long string of "1s." This breaks up the dc charge buildup caused when only "1s" are sent, which in turn eliminates the chance that a single-bit "0" in a long string of "1s" will go undetected. Without balancing, the single-bit "0" may not be able to drop the accumulated charge below the level that the detector will recognize as a zero.
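The polarity-flipping idea can be illustrated with a small Python sketch. The article does not disclose the actual QuickSD encoding, so this is a generic block-inversion scheme chosen only for illustration. It assumes 4-bit blocks and a one-bit polarity flag in front of each block, and the function names are hypothetical. A block is inverted whenever sending it unchanged would push the running count of ones versus zeros further from zero.

```python
def disparity(bits):
    """Number of ones minus number of zeros in a bit sequence."""
    return sum(1 if b else -1 for b in bits)


def balance(bits, block=4):
    """Encode bits as [flag, block...] groups; flag=1 means the block was inverted."""
    out, running = [], 0
    for i in range(0, len(bits), block):
        chunk = list(bits[i:i + block])
        # Invert only if the chunk leans the same way as the line already does.
        invert = running * disparity(chunk) > 0
        if invert:
            chunk = [1 - b for b in chunk]
        framed = [int(invert)] + chunk
        running += disparity(framed)  # the flag bit counts toward the balance too
        out.extend(framed)
    return out


def unbalance(coded, block=4):
    """Reverse balance(): strip each flag and undo the inversion it signals."""
    out = []
    for i in range(0, len(coded), block + 1):
        flag = coded[i]
        chunk = coded[i + 1:i + 1 + block]
        out.extend([1 - b for b in chunk] if flag else chunk)
    return out
```

With this sketch, sixteen consecutive "1s" leave the encoder as alternating groups of "1s" and "0s" with a net disparity of zero, and the decoder restores the original data. The price is one extra bit per block.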
The LVDS interface can be used in point-to-point, multipoint, and multidrop bus configurations. That enables the system designer to employ the bus configuration that best matches the system. The LVDS bus interface on the FPGAs allows for an embedded clock in the data: each byte of data is framed by a single "1" start bit and a single "0" stop bit. Moreover, the circuitry can lock onto the data stream within 1024 cycles.
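To make the start/stop framing concrete, the Python sketch below wraps each byte in a "1" start bit and a "0" stop bit, producing a 10-bit frame. The function names and the MSB-first bit order are assumptions made for illustration; the article does not specify the order in which data bits are sent.

```python
from typing import List


def frame_byte(value: int) -> List[int]:
    """Wrap one byte in a '1' start bit and a '0' stop bit (10 bits total)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    data = [(value >> i) & 1 for i in range(7, -1, -1)]  # MSB first (assumed)
    return [1] + data + [0]


def deframe(bits: List[int]) -> int:
    """Check the start/stop bits of a 10-bit frame and return the byte inside."""
    if len(bits) != 10 or bits[0] != 1 or bits[-1] != 0:
        raise ValueError("bad frame: expected '1' start bit and '0' stop bit")
    value = 0
    for b in bits[1:9]:
        value = (value << 1) | b
    return value
```

Because every frame ends in "0" and the next begins with "1", back-to-back frames always contain a rising edge at the boundary, which gives a receiver a regular transition to lock onto. The cost is two overhead bits for every eight data bits, so the payload efficiency is 80%.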
Internally, the FPGA side of the SERDES has a programmable parallel interface (configurable for 1 to 20 bits) that can be connected to the FPGA logic and memory. Designers can then configure the system so that the number of parallel bus lines per SERDES can be optimized.
One SERDES channel can be programmed to serialize 1, 4, 7, 8, 10, or 20 parallel bus lines. That permits the SERDES blocks to take large internal buses and "fracture" them to maximize or minimize the use of the on-chip resources. Data on the internal parallel bus can therefore be transferred to an external bus of similar width or to another internal bus on another QuickSD. This virtual-backplane capability has been offered by several component suppliers as a means to reduce the number of pins and wires needed to move large amounts of data quickly.
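As a rough illustration of "fracturing," the Python sketch below splits a wide parallel word into equal-width lanes, one per SERDES channel, and reassembles it at the far end. Only the list of lane widths comes from the article. The function names and the convention that lane 0 carries the least-significant bits are assumptions.

```python
SUPPORTED_WIDTHS = (1, 4, 7, 8, 10, 20)  # per-channel widths listed in the article


def fracture(word: int, bus_width: int, lane_width: int) -> list:
    """Split a bus_width-bit word into lanes of lane_width bits, LSB lane first."""
    if lane_width not in SUPPORTED_WIDTHS:
        raise ValueError("unsupported lane width")
    if bus_width % lane_width != 0:
        raise ValueError("bus width must be a multiple of the lane width")
    if not 0 <= word < (1 << bus_width):
        raise ValueError("word does not fit in the bus")
    mask = (1 << lane_width) - 1
    return [(word >> (i * lane_width)) & mask
            for i in range(bus_width // lane_width)]


def merge(lanes: list, lane_width: int) -> int:
    """Reassemble lanes produced by fracture() into the original word."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= lane << (i * lane_width)
    return word
```

For example, a 40-bit internal bus could be carried over two channels set to 20 bits each or over four channels set to 10 bits each, trading SERDES channels against per-channel serialization depth.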
As mentioned earlier, two programmable PLL-based clock channels are incorporated in the SERDES support system. Each clock channel can operate at up to 500 MHz and employs a programmable timing register, through which it can be programmed to handle any data channel width. Either clock can be associated with any SERDES channel. Furthermore, they provide system timing information for applications in which the data does not contain an embedded clock (such as in digital flat-panel interfaces, which use a separate clock signal).
Two additional programmable PLLs are also available for the FPGA system logic portion of the chip. These PLLs allow clock multiplication and division ratios of 1×, 2×, 4×, 1/2×, and 1/4×. The PLLs have a capture-and-lock range spanning 25 to 250 MHz and can lock to the desired frequency in less than 10 µs. Additionally, the clock signals have a jitter of less than 200 ps. An early clock option is also available to handle applications that require a clock-to-output time (Tco) of less than 2.5 ns. The flexibility of the timing circuits allows the serial interfaces to operate at high speeds while the actual system logic runs at a significantly slower pace.
The PLL clock circuits in the FPGA portion can feed timing signals to nine global clock/control networks that are distributed in the FPGA portion of the chip. One of the clock networks is guaranteed to have a skew of less than 150 ps. This clock can be used for the portion of the system with the tightest timing margins. The remaining eight clock networks are programmable and can be driven by the PLLs as well. They can obtain clocks from either off-chip or on-chip sources.
The clock grid on the FPGA is further divided. Its array has four quadrants, each of which contains five distribution subnets. Additionally, there are 12 I/O clock/control networks (two for each of the six I/O pin banks). This fine-grained clock distribution is key to keeping timing margins tight and maximizing the available bandwidth for the data.
One application that's starting to demand tight timing and interface flexibility is the digital data interface to the latest high-resolution flat-panel displays. The wide cables commonly used in the panels are eliminated and replaced with lower-cost serial interfaces. As a result, displays with XGA resolution typically require a data bandwidth of about 1.56 Gbits/s.
At the other extreme, a QXGA screen with 2048- by 1536-pixel resolution demands a serial bandwidth of about 5 Gbits/s to refresh the screen 60 times/s with 8-bit RGB color. With their 1920- by 1080-pixel images, HDTV displays require a bandwidth of about 3.34 Gbits/s. The QuickSD chips with their 8-Gbit/s aggregate bandwidth can readily handle these display interface requirements.
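The display figures can be sanity-checked with a simple calculation of the active-pixel payload: width × height × bits per pixel × refresh rate. The Python sketch below assumes 24 bits per pixel (8 bits for each of red, green, and blue) and a 60-Hz refresh for all three formats; the function name is illustrative.

```python
def raw_pixel_bandwidth_bps(width: int, height: int,
                            bits_per_pixel: int = 24,
                            refresh_hz: int = 60) -> int:
    """Active-pixel payload in bits/s, ignoring blanking and link overhead."""
    return width * height * bits_per_pixel * refresh_hz


xga = raw_pixel_bandwidth_bps(1024, 768)     # about 1.13 Gbits/s
hdtv = raw_pixel_bandwidth_bps(1920, 1080)   # about 2.99 Gbits/s
qxga = raw_pixel_bandwidth_bps(2048, 1536)   # about 4.53 Gbits/s
```

These raw payload numbers come out lower than the bandwidths quoted above. The likely reason is that a real display link also has to carry blanking intervals and other timing overhead, but the article does not break its figures down, so that explanation is an assumption. Either way, all three formats fit within an 8-Gbit/s aggregate.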
The basic logic cell in the FPGA is very similar to the cell used in previous QuickLogic FPGA families. It has six independent outputs and a very wide fan-in that allows multiple logic functions to be implemented in the cell simultaneously (Fig. 4a).
Furthermore, two output registers allow the cell to handle pipelined and asynchronous operations. An extra multiplexer and a second flip-flop were added to the cell, in comparison to cells used in the company's previous FPGA families. The extra flip-flop supports high-speed register-to-register operations, while the extra multiplexer further enhances the cell's ability to handle wide-input logic functions.
Since the cells are based on the same 0.25-µm, five-layer metal process employed by the Eclipse FPGA family, they have similar performance. Internal register-to-register speeds of up to 600 MHz and chip-to-chip frequencies of over 225 MHz are possible. While internal-logic cell delays are less than 1.05 ns, clock-to-output delays are less than 3 ns. This allows the arrays to implement complex, high-speed functions with minimal tweaking of the logic.
Designers at QuickLogic also borrowed from the company's QuickDSP series when they added QMAC blocks to the array. Each QMAC block provides an integrated multiply, add, and accumulate function, implemented with an 8- by 8-bit multiplier and a 16-bit adder and register (Fig. 4b).
Each block can perform multiplications at 220 MHz and addition/accumulate operations at up to 350 MHz. These speeds are significantly faster than what can be achieved using the basic logic in the FPGA portion of the chip. The high throughput possible with QMAC blocks could greatly accelerate many of the DSP computations that might be encountered if the QuickSDs are used in communications and imaging applications.
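A behavioral model helps show what one such block computes. The Python sketch below models an 8- by 8-bit multiplier feeding a 16-bit accumulator. It assumes unsigned operands and wrap-around on overflow; the article does not say how the hardware treats signs or overflow, and the class and function names are hypothetical.

```python
class Mac8x16:
    """Behavioral model: 8x8-bit multiply feeding a 16-bit accumulate register."""

    def __init__(self) -> None:
        self.acc = 0

    def step(self, a: int, b: int) -> int:
        """Multiply two 8-bit operands and add the product to the accumulator."""
        product = (a & 0xFF) * (b & 0xFF)         # at most 255 * 255 = 65025
        self.acc = (self.acc + product) & 0xFFFF  # wrap at 16 bits (assumed)
        return self.acc


def dot(xs, ys) -> int:
    """Dot product of two equal-length sequences using one modeled MAC block."""
    mac = Mac8x16()
    for a, b in zip(xs, ys):
        mac.step(a, b)
    return mac.acc
```

A dot product like this is the inner loop of an FIR filter or correlator. Note that a single full-scale product nearly fills the 16-bit register, so in this model long accumulations of large operands would wrap unless the data is scaled or several blocks are combined.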
Complementing the logic and math cells are 24 to 36 blocks of SRAM modules. These are integrated in the various FPGA family members, providing the chips with 55 to 83 kbits of RAM. Each block contains 2304 bits. To form larger memories, the blocks can be concatenated.
Individually, each block can be configured to appear as a 128-word by 18-bit, 256-word by 9-bit, 512-word by 4-bit, or 1024-word by 2-bit memory, all with completely independent read and write ports. The memory also can function as single-ported, as dual-ported, as a FIFO, or as ROM.
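The memory figures are easy to verify with a little arithmetic. The Python sketch below computes the total RAM for the smallest and largest block counts and the number of bits each configuration addresses. All numbers are taken from the text; the names are illustrative.

```python
BLOCK_BITS = 2304  # bits per SRAM block, per the article

# (words, bits per word) for each configuration listed above
CONFIGS = [(128, 18), (256, 9), (512, 4), (1024, 2)]


def total_ram_bits(blocks: int) -> int:
    """Total on-chip SRAM for a device with the given number of blocks."""
    return blocks * BLOCK_BITS


def addressed_bits(words: int, width: int) -> int:
    """Bits actually addressable in one block for a given configuration."""
    return words * width


smallest = total_ram_bits(24)  # 55,296 bits, the "55 kbits" figure
largest = total_ram_bits(36)   # 82,944 bits, the "83 kbits" figure
per_config = {cfg: addressed_bits(*cfg) for cfg in CONFIGS}
```

The 18-bit and 9-bit configurations use all 2304 bits of a block, while the 4-bit and 2-bit configurations address 2048 bits, leaving 256 bits of each block unused in those modes.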
Along with the high-performance SERDES I/O blocks on the chip, QuickLogic has added an enhanced digital I/O capability. It comprises six independent I/O banks, each capable of independently being configured for 2.5- or 3.3-V interfaces (separate from the internal logic-array operating voltage). Moreover, a voltage-reference pin lets the I/O lines support differential interfaces.
The I/O pins can be individually configured for LVTTL, LVCMOS, PCI, GTL+, SSTL2, SSTL3, LVPECL, or LVDS interface levels. All I/O buffers include programmable slew-rate control and a programmable weak pull-down to reduce the chance of a pin floating high and providing incorrect data. Each I/O block also contains three registers to support input and output, and to enable control and pipelining.
Design tools in the QuickWorks version 9.1 release can handle system designs for both the QuickSD family and the company's recently released Eclipse FPGA family. The tools can also be used with all previous FPGAs offered by the company.
Price & Availability
The QuickSD family of SERDES/FPGAs will be available in several packaging options. For low pin-count requirements, the arrays can be obtained in 208- and 280-lead PQFPs. For high-I/O applications, the larger versions come in 484- and 672-contact BGA packages. In lots of 10,000 units, prices start at less than $25 apiece for the QL81SD in the 208-lead PQFP. Samples are available now. The QuickWorks 9.1 tool suite sells for $1685.
QuickLogic Corp., 1277 Orleans Dr., Sunnyvale, CA 94089-1138; Kevin Lee, (408) 990-4000; www.quicklogic.com.