For fast learning and recognition with artificial neural networks, the NC3001 VLSI parallel processor is based on an architecture optimized for implementing a learning algorithm that is presented as a competitive alternative to back-propagation and is said to lead to very low-cost, compact VLSI implementations. Its high processing power, low power dissipation and limited chip size make it well suited to embedded neural, fuzzy and general filtering applications. The device provides 32 fixed-point digital multiply-and-accumulate processors (MACs) that operate fully in parallel, with a three-stage pipeline and on-chip weight memory. It also offers a simple chip interface for coprocessor operation in µP systems. Performance is 1000 MOPS with a 30-MHz clock, processing efficiency is 30 MOPS/W, and power consumption is 1 W at 30 MHz. The device comes in a 132-pin PGA package. Evaluation boards are available for various systems.
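
As a rough illustration of what a bank of 32 parallel fixed-point MACs computes, the following is a minimal Python sketch, not the device's actual datapath. The MAC count and clock rate come from the description above. The word widths, the saturation behavior, the one-neuron-per-lane mapping with a broadcast input, and all function names are assumptions made for illustration only. The three-stage pipeline is not modeled.

```python
NUM_MACS = 32            # from the description: 32 parallel MAC processors
CLOCK_HZ = 30_000_000    # from the description: 30-MHz clock

# Assumptions (not stated in the description): signed word widths.
INPUT_BITS = 16
WEIGHT_BITS = 8
ACC_BITS = 32


def saturate(value, bits):
    """Clamp an integer to the signed two's-complement range of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac_step(accumulators, weights, x):
    """One step: every lane multiplies the broadcast input x by its own
    weight and adds the product to its own accumulator (assumed mapping)."""
    x = saturate(x, INPUT_BITS)
    return [
        saturate(acc + saturate(w, WEIGHT_BITS) * x, ACC_BITS)
        for acc, w in zip(accumulators, weights)
    ]


def layer_outputs(weight_rows, inputs):
    """Weighted sums for up to NUM_MACS neurons.

    weight_rows[j][i] is the weight from input i to neuron j.
    """
    if len(weight_rows) > NUM_MACS:
        raise ValueError("more neurons than MAC lanes")
    acc = [0] * len(weight_rows)
    for i, x in enumerate(inputs):
        acc = mac_step(acc, [row[i] for row in weight_rows], x)
    return acc


def peak_macs_per_second(num_macs=NUM_MACS, clock_hz=CLOCK_HZ):
    """Upper bound if every lane completes one MAC per clock cycle."""
    return num_macs * clock_hz
```

Under these assumptions, `peak_macs_per_second()` gives 960 million multiply-accumulates per second, which is of the same order as the quoted 1000 MOPS if each MAC is counted as one operation. How the quoted figure is actually counted is not stated here.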