NVIDIA’s Ampere A100 becomes the fastest GPU ever recorded


While we wait for consumer graphics cards based on the Ampere GPU architecture to launch, NVIDIA’s flagship Ampere chip, the A100, continues to break records. The largest graphics chip built on a 7nm process node was introduced in May and has the specifications and performance figures to back it up. The Ampere A100 Tensor Core accelerator has now become the fastest GPU ever recorded on OctaneBench.

NVIDIA Ampere A100 HPC Tensor Core GPU becomes the fastest GPU ever recorded on OctaneBench, delivers 43% better performance than Turing with RTX disabled

The feat was shared by OTOY CEO Jules Urbach. OTOY is the developer behind OctaneBench, a benchmark tool that lets users evaluate GPU performance using OctaneRender. OctaneRender is a GPU rendering engine that supports NVIDIA RTX ray tracing hardware acceleration to deliver sharp rendered scenes.

According to Urbach, the NVIDIA A100 Tensor Core GPU recorded a score of 446 on OctaneBench. He also claims that this score is on average 43% faster than Turing GPUs in OctaneRender, even with RTX disabled on the A100. The Turing results in this comparison have RTX enabled. Unlike in games, where turning on ray tracing causes a significant drop in frame rate, enabling RTX in OctaneRender improves performance, because scenes finish rendering faster when hardware ray tracing is available.

It is not stated which exact Turing GPU was used for the comparison with the NVIDIA Ampere A100, but the averaged results for all tested cards show some interesting numbers. On average, the Tesla V100, the A100’s predecessor, is roughly 20% slower. The Titan V, however, is only 11% slower, which is surprising considering that the Titan RTX is 38% slower than the A100.

The most likely explanation is that the Titan V uses the same GV100 GPU as the Tesla V100, which could be better optimized for cloud-scale data center workloads and for this benchmark, while Turing GPUs are more optimized for gaming and GPGPU use. Either way, OTOY’s CEO claims this is the fastest GPU ever recorded on this specific workload, which is a great feat for NVIDIA’s A100 GPU accelerator.
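The percentages quoted above mix "faster" and "slower" figures, and the two are not interchangeable. The short Python sketch below converts between them, assuming each percentage is a simple ratio of benchmark scores (an assumption; the source does not say how the averages were computed). The function names are illustrative only.

```python
# Assumption: "A is X% faster than B" means score_A = score_B * (1 + X/100),
# and "B is Y% slower than A" means score_B = score_A * (1 - Y/100).

def slower_from_faster(pct_faster):
    """Given that A is pct_faster% faster than B, return how much slower B is than A (in %)."""
    return (1 - 1 / (1 + pct_faster / 100)) * 100


def faster_from_slower(pct_slower):
    """Given that B is pct_slower% slower than A, return how much faster A is than B (in %)."""
    return (1 / (1 - pct_slower / 100) - 1) * 100


# A card that is 43% faster leaves the other card about 30% slower.
print(round(slower_from_faster(43), 1))  # 30.1

# A card that is 38% slower implies the other card is about 61% faster.
print(round(faster_from_slower(38), 1))  # 61.3
```

Under this reading, a "43% faster" figure and a "38% slower" figure describe different gaps, so they should not be compared directly.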

The NVIDIA A100 is by far the largest 7nm chip produced to date, with a whopping 54 billion transistors packed into a single die. The A100 ships in a heavily cut-down configuration, likely due to early yields, but as with the Tesla V100, we could see a fuller variant with more cores once yields improve, which would further increase performance in this specific benchmark.

The full implementation of the NVIDIA Ampere GA100 GPU includes the following units:

  • 8 GPCs, 8 TPCs/GPC, 2 SMs/TPC, 16 SMs/GPC, 128 SMs per full GPU
  • 64 FP32 CUDA cores/SM, 8192 FP32 CUDA cores per full GPU
  • 4 third-generation Tensor Cores/SM, 512 third-generation Tensor Cores per full GPU
  • 6 HBM2 stacks, 12 512-bit memory controllers

The A100 Tensor Core GPU implementation of the NVIDIA Ampere GA100 GPU includes the following units:

  • 7 GPCs, 7 or 8 TPCs/GPC, 2 SMs/TPC, up to 16 SMs/GPC, 108 SMs
  • 64 FP32 CUDA cores/SM, 6912 FP32 CUDA cores per GPU
  • 4 third-generation Tensor Cores/SM, 432 third-generation Tensor Cores per GPU
  • 5 HBM2 stacks, 10 512-bit memory controllers
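The totals in the two lists follow directly from the per-SM figures. As a quick sanity check, the Python sketch below recomputes them from the numbers listed above. The function and variable names are illustrative, not part of any NVIDIA tool.

```python
# Per-unit figures taken from the lists above.
FP32_CORES_PER_SM = 64
TENSOR_CORES_PER_SM = 4
MEMORY_CONTROLLER_WIDTH_BITS = 512


def totals(sm_count, memory_controllers):
    """Derive chip-wide totals from the SM count and memory controller count."""
    return {
        "fp32_cores": sm_count * FP32_CORES_PER_SM,
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
        "memory_bus_bits": memory_controllers * MEMORY_CONTROLLER_WIDTH_BITS,
    }


# Full GA100: 8 GPCs x 8 TPCs per GPC x 2 SMs per TPC.
full_sms = 8 * 8 * 2
full_ga100 = totals(full_sms, memory_controllers=12)

# A100 as shipped: 108 SMs and 10 memory controllers.
a100_sms = 108
a100 = totals(a100_sms, memory_controllers=10)

# Share of the full die's SMs that are enabled on the A100.
enabled_fraction = a100_sms / full_sms

print(full_sms, full_ga100)
print(a100_sms, a100)
print(f"{enabled_fraction:.1%} of SMs enabled")
```

By this arithmetic the A100 enables 108 of the 128 SMs on the full die, which is why a fuller variant would have room for more cores.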

One can only imagine what the performance figures will look like once Ampere cards with RTX enabled are released. If this specific benchmark is anything to go by, the Ampere GeForce RTX 30 series cards could easily come close to their HPC counterparts.