Today is a busy day of news from Nvidia, as the AI leader takes the wraps off a series of new developments at the annual SIGGRAPH conference.
On the hardware front, one of the biggest developments from the company is the announcement of a new version of the GH200 Grace Hopper platform, powered by next-generation HBM3e memory technology. The GH200 announced today is an update to the existing GH200 chip announced at the Computex show in Taiwan in May.
“We announced Grace Hopper recently, several months ago, and today we’re announcing that we’re going to give it a boost,” Nvidia founder and CEO Jensen Huang said during his keynote at SIGGRAPH.
What’s inside the new GH200
The Grace Hopper Superchip has been a big topic for Nvidia’s CEO since at least 2021, when the company revealed initial details.
The Superchip is based on an Arm architecture, which is widely used in mobile devices and competitive with x86-based silicon from Intel and AMD. Nvidia calls it a “superchip” because it combines the Arm-based Nvidia Grace CPU with the Hopper GPU architecture.
With the new version of the GH200, the Grace Hopper Superchip gets a boost from the world’s fastest memory: HBM3e. According to Nvidia, the HBM3e memory is up to 50% faster than the HBM3 technology inside the current generation of the GH200.
Nvidia also claims that HBM3e memory will allow the next-generation GH200 to run AI models 3.5 times faster than the current model.
“We’re very excited about this new GH200. It will feature 141 gigabytes of HBM3e memory,” Ian Buck, VP and general manager of hyperscale and HPC at Nvidia, said during a meeting with press and analysts. “HBM3e not only increases the capacity and amount of memory attached to our GPUs, but also is much faster.”
Faster silicon means faster, larger AI application inference and training
Nvidia isn’t just making faster silicon; it’s also scaling it in a new server design.
Buck said that Nvidia is developing a new dual-GH200-based Nvidia MGX server system that will integrate two of the next-generation Grace Hopper Superchips. He explained that the new GH200 will be connected with NVLink, Nvidia’s interconnect technology.
With NVLink in the new dual-GH200 server, both CPUs and GPUs in the system will be connected with a fully coherent memory interconnect.
“CPUs can see other CPUs’ memory, GPUs can see other GPUs’ memory, and of course the GPU can see CPU memory,” Buck said. “As a result, the combined supersized super-GPU can operate as one, providing a combined 144 Grace CPU cores, over 8 petaflops of compute performance, with 282 gigabytes of HBM3e memory.” Those figures come from pooling the two Superchips’ resources, with the pair of 141GB HBM3e stacks presented as a single 282GB memory space.
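For a sense of what a fully coherent CPU-GPU address space means to a programmer, here is a minimal, illustrative CUDA sketch. It uses standard CUDA managed memory (cudaMallocManaged) as a stand-in for the hardware-coherent pool Buck describes; the kernel, buffer size and scale factor are hypothetical and not part of Nvidia’s announcement. The point is simply that a single pointer can be touched by both the CPU and the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: the GPU scales a buffer the CPU also reads and writes.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // One allocation, one pointer, valid on both CPU and GPU.
    // Managed memory approximates the coherent CPU-GPU pool described above.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;      // CPU writes the buffer

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);  // GPU reads/writes the same pointer
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);               // CPU reads the GPU's result
    cudaFree(data);
    return 0;
}
```

On systems without hardware coherence, the CUDA runtime migrates managed pages back and forth on demand; on a coherent NVLink-connected design, the same access pattern would not need those migrations.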
While the new Nvidia Grace Hopper Superchip is fast, it will take a bit of time until it’s actually available for production use cases. The next-generation GH200 is expected to be available in the second quarter of 2024.