In the semiconductor industry, the time between new generations of chips is stretching out. Etching ever-smaller transistors onto silicon has grown vastly more difficult, and shrinking them toward atomic limits has introduced problems with keeping chips cool. Manufacturing costs have also stopped falling.
Now, internet giants like Google and Facebook—which are shoveling millions of chips into their data centers—are looking for ways around the industry’s slowing pace. They have taken to adding special chips that better handle certain aspects of computing, like those related to artificial intelligence. That shift has seen standard chips combined with everything from graphics processors to more exotic chips that mimic the human brain.
As companies buy more of these accelerator chips, an alliance of seven chipmakers is trying to ensure that they can all coexist. The goal is to create a special kind of interconnect that can seamlessly share data between processors, accelerators, and memory caches.
The companies working on the project are all trying to build more profitable data center businesses: Advanced Micro Devices, ARM, Huawei, IBM, Mellanox, Qualcomm Technologies, and Xilinx. The companies will pursue a cache-coherent fabric to ensure that everyone’s chips can share the same memory—without stepping on any toes to access that data.
The project, known as the Cache Coherent Interconnect for Accelerators (CCIX), is significant because it will bridge some of the industry's major instruction set architectures, the part of the processor visible to programmers. That includes IBM's Power architecture as well as ARM and x86 designs.
The new interconnect will primarily help to increase the speed at which information moves within data centers. Today's computers share data with accelerators and memory over interconnects not designed for high-bandwidth, low-latency applications. That creates a bottleneck for software that learns to find patterns in large datasets, manages wireless networks in the cloud, or provides internet services like search.
To help mitigate that problem, software engineers are usually forced into some heavy lifting. They depend on extremely complex programming to stitch together different chip architectures in the same system. For data center operators, another option is to simply buy chips from a small set of manufacturers.
Not everyone thinks that last option is healthy for the industry. The “one-size-fits-all architecture approach to data center workloads does not deliver the required performance and efficiency,” said Lakshmi Mandyam, director of server systems at ARM.
Ultimately, the new project aims to replace the PCI Express (PCIe) interconnect upon which most data centers depend. PCIe is the connective tissue that links accelerators with ARM, x86, or Power processors. Engineers have applied that formula for more than a decade, but PCIe was not designed for fast, efficient transfers between accelerators and processors.
In the last year, the practice of using accelerators to improve efficiency has spread all over the technology industry. Last month, for instance, Google revealed its own custom chips that would run machine learning in its search engine and other services.
Facebook has opened up the architecture behind Big Sur, its server built around graphics chips, while China's Baidu and Microsoft have been experimenting with FPGA chips that speed up their cloud services. Nvidia's latest architecture for graphics processing units, Pascal, was designed specifically for machine learning.
Some companies have already tackled the interconnect problem. IBM has developed the Coherent Accelerator Processor Interface, or CAPI, but it is only supported by IBM and a handful of partners. Nvidia has invested in its own technology called NVLink, which provides faster connections between its graphics chips and IBM's Power chips.
Having an interface between different chip architectures could give data center operators a wider range of options for buying chips. It could also prop open the door to greater competition and enable operators to update existing equipment, rather than adding more servers or investing in complex programming.
From that perspective, Intel’s exclusion from the new project speaks volumes, according to industry analysts. The world's largest server chipmaker, Intel has its own alternative to the PCIe standard called the QuickPath Interconnect. Intel is running that technology in server chips that combine its CPUs with FPGA accelerators.
Intel’s data center business has been growing rapidly as the company attempts to transition out of personal computers, which have been on the decline for years. The data center business posted revenue of $4 billion in the first quarter of 2016, up 9% from the first quarter last year. That was nearly 30% of the company’s $13.7 billion in revenue during the first quarter.
The CCIX project is one attempt at putting up a fight against Intel. Qualcomm, the biggest maker of mobile phone chips, has developed a server variant of its Snapdragon processor to rival Intel, which has undercut some of Qualcomm’s mobile business. Like Intel, Qualcomm is experiencing some growing pains, working through a major restructuring that will cut up to 15 percent of its workforce.
ARM, which expects to start challenging Intel in server chips, is best known for the mobile chip designs used by Qualcomm, but its designs have also found their way into AMD's Opteron server chips and graphics accelerators. Huawei, which operates HiSilicon, a fabless semiconductor business, likewise depends on ARM designs.
“There is no doubt that the computing industry will laud their efforts as the only sensible path to provide an alternative to an all-Intel computing world,” wrote Karl Freund, a senior analyst with Moor Insights and Strategy, in a recent Forbes article about the project’s impact.
The group, which only has a one-page website with little information, has yet to release any technical or financial details about the project. Freund is predicting the fruits of the partnership will not ripen until around 2019 or 2020.