Electronicdesign 29755 Carracing 1143851175
Electronicdesign 29755 Carracing 1143851175
Electronicdesign 29755 Carracing 1143851175
Electronicdesign 29755 Carracing 1143851175
Electronicdesign 29755 Carracing 1143851175

Groq’s AI Accelerator Eyes Hyperscalers, Autonomous Vehicles

Nov. 18, 2019
This machine-learning acceleration chip zeros in on applications such as autonomous vehicles, the cloud, and the enterprise.

A machine-learning acceleration chip developed by startup Groq is set to take on a range of high-performance applications from autonomous vehicles to the cloud and the enterprise. Its Tensor Streaming Processor (TSP) is a single-threaded machine that handles integer (INT8) and floating-point models. Among the features is a global shared memory bandwidth of over 60 TB/s using on-chip memory. The three-quarter length PCI Express (PCIe) card has a x16 PCIe Gen 4 interface.

The TSP uses a different compute layout (see figure) designed to deliver high performance. The architecture is designed for compiler-orchestrated execution. This allows for deterministic operation, since there’s no caching system involved. Runtime performance and power requirements are known at compile time. This also avoids potential hacks like Spectre and Meltdown, which take advantage of hardware speculation features in conventional processors that don’t exist in the TSP.

The Tensor Streaming Processor (TSP) provides deterministic performance and eliminates tail latency.

The TSP is optimized for a batch size of one that’s preferable for applications like self-driving cars and financial applications. The system also eliminates tail latencies that can impact enterprise operation in large servers.

The chip is single threaded, but it can quickly switch between models and layers. As a result, the chip can handle multiple models without a lot of data movement that can otherwise reduce performance.

Groq isn’t announcing performance numbers yet, but the company’s collection talent bodes well for a high-end machine. Jonathan Ross is Groq’s technical founder and CEO who also worked on Google’s TPU effort as a 20% project. He designed and implemented the core elements of the original chip. Dinesh Masheshwari is CTO. He worked on multi-threaded multi-processor systems in the mid-1980s. Dinesh served as CVP, CTO of the Memory Division at Cypress Semiconductor. And Michelle Tomasko, Vice President of Engineering, worked Nvidia, Google Consumer HW, and Transmeta.

Sponsored Recommendations

Board-Mount DC/DC Converters in Medical Applications

March 27, 2024
AC/DC or board-mount DC/DC converters provide power for medical devices. This article explains why isolation might be needed and which safety standards apply.

Use Rugged Multiband Antennas to Solve the Mobile Connectivity Challenge

March 27, 2024
Selecting and using antennas for mobile applications requires attention to electrical, mechanical, and environmental characteristics: TE modules can help.

Out-of-the-box Cellular and Wi-Fi connectivity with AWS IoT ExpressLink

March 27, 2024
This demo shows how to enroll LTE-M and Wi-Fi evaluation boards with AWS IoT Core, set up a Connected Health Solution as well as AWS AT commands and AWS IoT ExpressLink security...

How to Quickly Leverage Bluetooth AoA and AoD for Indoor Logistics Tracking

March 27, 2024
Real-time asset tracking is an important aspect of Industry 4.0. Various technologies are available for deploying Real-Time Location.

Comments

To join the conversation, and become an exclusive member of Electronic Design, create an account today!