Since Google announced that it would install custom chips for machine learning in its data centers, the semiconductor industry has wondered how many Nvidia GPUs those chips would replace. While Google still has not answered that question, the company keeps buying the parallel processors, which accelerate training operations in machine learning.
Google recently started to offer Nvidia’s latest product, the Tesla V100, over its cloud on a limited basis before general availability. The chip is based on Nvidia’s Volta architecture, which uses custom tensor cores for accelerating deep neural networks, and succeeds its previous Pascal architecture, which was introduced about two years ago.
Amazon Web Services, the leader in the cloud computing space, also offers the new hardware to customers. For now, Google charges less per chip to rent its custom accelerators over the cloud than Nvidia's Tesla V100. The cost of Google's Cloud TPU, which puts four chips together to provide 180 trillion operations per second, is $6.88 per hour, or $1.72 per chip.
Nvidia's Tesla V100 can handle more operations per second than the discrete chips inside Google’s Cloud TPU, so the price per trillion operations favors Nvidia. The graphics processor – which was introduced by the company's chief executive Jensen Huang last year and can provide 125 trillion operations per second – costs $2.48 per hour on Google’s cloud.
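The price-per-performance claim above can be checked with a quick back-of-the-envelope sketch, using only the hourly rates and throughput figures quoted in this article (the helper function below is illustrative, not any cloud provider's API):

```python
def cost_per_teraflop_hour(price_per_hour, teraflops):
    """Dollars per trillion operations per second, per hour of rental."""
    return price_per_hour / teraflops

# Google Cloud TPU: four chips, 180 trillion operations/s, $6.88 per hour.
tpu = cost_per_teraflop_hour(6.88, 180)

# Nvidia Tesla V100: 125 trillion operations/s, $2.48 per hour on Google's cloud.
v100 = cost_per_teraflop_hour(2.48, 125)

print(f"Cloud TPU:  ${tpu:.4f} per teraflop-hour")   # ~$0.0382
print(f"Tesla V100: ${v100:.4f} per teraflop-hour")  # ~$0.0198
```

By this measure the V100 delivers a trillion operations per second for roughly half of what the Cloud TPU charges, which is the sense in which the price favors Nvidia.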
But without common benchmarks, there are limits to comparing Nvidia's chips with Google's system. Last month, researchers reported that the Cloud TPU could train an image recognition algorithm slightly faster than four Tesla V100s. They added that using the Cloud TPU's four chips is cheaper than renting an equivalent number of Tesla V100s on Amazon's cloud.
Nvidia recently said that researchers had trained ResNet-50, an image recognition model, on ImageNet, a massive collection of images widely used to measure the performance of machine learning hardware and software, in record time using a single Tesla accelerator. They added that the Tesla V100 was four times faster than chips based on Nvidia's Pascal architecture.
The researchers also said that Nvidia's DGX appliance, powered by eight Tesla V100s linked with high-speed interconnects, could train the model in less than three hours, three times faster than Google's Cloud TPU system. Nvidia said that software improvements had almost doubled its performance with ResNet-50 over the last year.
Google continues to promote how its own hardware stacks up against Nvidia's chips on DawnBench, a machine learning benchmark created by Stanford University researchers. And Jeff Dean, head of Google's artificial intelligence unit, said that software improvements have accelerated training on Google's chips by about 45 percent over the last six months.