The 8-bit Intel 8080 was the central component of a multiprocessor system called the Hypercube back in the 1970s. The theory was that multiple processors could tackle large problems using parallel processing.
Massively parallel processing architectures like Thinking Machines' Connection Machine CM-5 also came and went as single-core processors continued to track Moore's Law in terms of performance (see the figure). Unfortunately, it now seems that parallel processing may have to be the answer. Even Intel's latest Teraflop Research Processor hosts 80 processing cores (also known as tiles) interconnected in a 2D mesh. Some form of multicore processing will address the general computing environment, though there's still a question as to when the scaling of the current symmetric multiprocessing (SMP) approach will end.
In any case, the harder challenge is likely to be the software. A great deal of research has been going on in this area for decades, but it has yet to make a dent in the life of the average programmer. In fact, even managers whose multicore platforms have only a few cores have to cope with the average programmer's lack of background and education in parallel programming. This is one reason why tools like Intel's Threading Building Blocks are popular with developers (see "Multiple Threads Make Chunk Change" at www.electronicdesign.com, ED Online 13645). Such tools hide the parallel programming complexity behind libraries.
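To make the idea of hiding parallelism behind a library concrete, here is a minimal sketch in standard C++. It is not Intel's API. The `parallel_for` helper, its signature, and the fixed worker count of four are all assumptions made for illustration. The point is only that the application code supplies a loop body, while thread creation, range splitting, and joining stay inside the helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (an illustration, not a real library call): applies
// body(i) to every index in [begin, end), splitting the range across threads.
void parallel_for(std::size_t begin, std::size_t end, unsigned workers,
                  const std::function<void(std::size_t)>& body) {
    assert(workers > 0 && begin <= end);
    const std::size_t total = end - begin;
    const std::size_t chunk = (total + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = begin + std::min<std::size_t>(total, w * chunk);
        const std::size_t hi = begin + std::min<std::size_t>(total, (w + 1) * chunk);
        if (lo == hi) break;  // no work left for this or any later worker
        pool.emplace_back([lo, hi, &body] {
            for (std::size_t i = lo; i < hi; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();  // wait for every worker before returning
}

// Application code: no threads in sight, only a loop body.
// Each index writes a distinct element of out, so no locking is needed.
std::vector<int> square_all(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    parallel_for(0, in.size(), 4, [&](std::size_t i) { out[i] = in[i] * in[i]; });
    return out;
}
```

A production library would typically add work stealing, a reusable thread pool, and exception handling, none of which this sketch attempts.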
Another low-impact alternative is enhancing a programming language such as C, much like Thinking Machines did with its C* (C Star) implementation. A more recent extension of C++ for current multicore platforms is Codeplay's Sieve C++ Parallel Programming System (see "Go Multicore With Sieve" at ED Online 15744). The extensions give hints to the compiler, allowing optimizations to exploit SIMD-style parallelism.
While the various approaches have been around in academia, not much has flowed into engineering and programming. Few programmers even consider multithreading within applications except when handling very basic things such as user-interface support. In fact, many event-driven frameworks try to minimize the programmer's view of parallelism, sometimes leading to convoluted programming techniques.
It's still unclear whether extending languages like C will provide the kind of programming environment that can properly take advantage of systems where there are thousands of processors, especially when the computing environment is limited and communication is hardwired.
Consider IBM's Cell processor, which lies at the heart of the Sony PlayStation 3. Each of its processing elements is limited to 256 kbytes of local RAM (see "Games Flourish In A Parallel Universe," ED Online 15745). C pointers can address that RAM, but programmers will need to deal with considerations beyond addressing, such as fitting code and data into so small a space. C* is definitely not the answer, but looking at history and current research will help. Software remains the key as well as the limiting factor, even if multicore designs can scale.
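One consideration beyond plain pointer addressing is that a large data set has to be processed in pieces that fit in the small local memory. The sketch below shows that tiling pattern in portable C++. It is not Cell code: the 64-kbyte tile budget, the `scale_in_tiles` name, and the use of `std::memcpy` as a stand-in for an explicit transfer into and out of local memory are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kLocalStoreBytes = 256 * 1024;  // figure cited in the text
constexpr std::size_t kTileBytes = 64 * 1024;         // assumed budget for one data buffer
constexpr std::size_t kTileFloats = kTileBytes / sizeof(float);
static_assert(kTileBytes < kLocalStoreBytes, "tile must leave room for code and stack");

// Scales every element of data by factor, touching only one tile at a time.
void scale_in_tiles(std::vector<float>& data, float factor) {
    std::vector<float> local(kTileFloats);  // stand-in for the small local memory
    for (std::size_t offset = 0; offset < data.size(); offset += kTileFloats) {
        const std::size_t n = std::min(kTileFloats, data.size() - offset);
        // Stand-in for a transfer from main memory into local memory.
        std::memcpy(local.data(), data.data() + offset, n * sizeof(float));
        for (std::size_t i = 0; i < n; ++i) local[i] *= factor;
        // Stand-in for the transfer of results back to main memory.
        std::memcpy(data.data() + offset, local.data(), n * sizeof(float));
    }
}
```

The structure, rather than the arithmetic, is the point: the programmer, not the hardware, decides what is resident in local memory at any moment, which is a burden that a conventional cached SMP system does not impose.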