Nvidia announced its latest generation of artificial intelligence chips and software at its developer’s conference in San Jose on Monday. This move aims to cement the company’s leading position as the preferred supplier for AI-focused businesses.
Nvidia’s share price has risen fivefold and its total sales have more than tripled since the AI boom took off in late 2022 with the release of OpenAI’s ChatGPT. That surge underscores the critical role of Nvidia’s high-end server GPUs in training and deploying large AI models.
Major players like Microsoft and Meta have invested billions in procuring these chips.
The new iteration of AI graphics processors, dubbed Blackwell, debuts with its inaugural chip, the GB200, slated for release later this year. Nvidia is enticing customers with more potent chips to stimulate fresh orders, particularly as demand remains high for the current “Hopper” H100s and similar offerings.
“Hopper is fantastic, but we need larger GPUs,” Nvidia CEO Jensen Huang said at the conference on Monday.
Despite this announcement, Nvidia shares experienced a slight dip of over 1% in extended trading on Monday.
Additionally, the company introduced revenue-generating software named NIM, aimed at simplifying AI deployment, providing another incentive for customers to opt for Nvidia chips amidst a growing pool of competitors.
Nvidia’s executives emphasize a strategic shift from merely providing chips to offering a comprehensive platform akin to industry giants like Microsoft or Apple, fostering an ecosystem where other companies can develop software.
“Blackwell isn’t just a chip; it represents a platform,” noted Huang.
“The GPU was the sellable commercial product, and the software aimed to facilitate its diverse applications,” explained Nvidia enterprise VP Manuvir Das in an interview. “While we continue in that vein, the significant evolution lies in our burgeoning commercial software business.”
Das said Nvidia’s new software makes it easier to run programs on all Nvidia GPUs, including older ones mainly used for regular tasks rather than AI work.
“If you’re a developer with a good model, putting it in a NIM makes sure it works on all our GPUs, reaching as many people as possible,” Das explained.
Blackwell is the successor to Hopper. Its flagship product, the GB200 Grace Blackwell Superchip, pairs two B200 graphics processors with an Arm-based central processor.
Nvidia periodically updates its GPU architecture, delivering significant performance leaps. Many recent AI models were trained on Nvidia’s Hopper architecture, epitomized by chips like the H100, first introduced in 2022.
According to Nvidia, processors based on the Blackwell architecture, such as the GB200, promise a substantial performance boost for AI enterprises, offering 20 petaflops of AI performance compared with 4 petaflops for the H100. Nvidia says this added processing power will let AI companies train bigger and more complex models.
The chip features a specialized “transformer engine” tailored for running transformer-based AI, the foundational technology behind ChatGPT.
The Blackwell GPU, a sizable unit, integrates two separately manufactured dies into a single chip produced by TSMC. It will also be available as a complete server known as the GB200 NVL72, housing 72 Blackwell GPUs along with other Nvidia components tailored for training AI models.
Leading tech giants including Amazon, Google, Microsoft, and Oracle will offer access to the GB200 through their cloud services. The GB200 setup combines two B200 Blackwell GPUs with an Arm-based Grace CPU. Nvidia disclosed that Amazon Web Services plans to construct a server cluster comprising 20,000 GB200 chips.
Nvidia claims that the system can deploy a model with 27 trillion parameters, surpassing even the largest existing models like GPT-4, which reportedly has 1.7 trillion parameters. AI researchers believe that bigger models with more parameters and data could lead to new capabilities.
Nvidia didn’t say how much the new GB200 chip or the systems it’s part of will cost.
Analyst estimates suggest that Nvidia’s Hopper-based H100 ranges from $25,000 to $40,000 per chip, with entire systems priced as high as $200,000.
Nvidia Inference Microservice
Nvidia has introduced a new addition to its Nvidia enterprise software subscription called NIM, short for Nvidia Inference Microservice.
NIM simplifies the utilization of older Nvidia GPUs for inference, the process of executing AI software. This innovation enables companies to leverage their existing vast pool of Nvidia GPUs for ongoing AI operations.
During inference, an AI model requires less computational power compared to the initial training phase. With NIM, companies can deploy their AI models instead of relying on AI services provided by entities like OpenAI.
The overarching strategy is to entice customers who buy Nvidia-based servers to subscribe to Nvidia AI Enterprise, which carries a license fee of $4,500 per GPU per year.
Nvidia will collaborate with AI firms such as Microsoft and Hugging Face to ensure seamless compatibility of their AI models with all compatible Nvidia chips. Subsequently, developers can effortlessly deploy these models on their servers or Nvidia’s cloud-based servers using NIM, eliminating the need for extensive configuration processes.
“Instead of calling OpenAI in my code, I would now just change one line of code to point it to the NIM from Nvidia,” Das explained.
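Das’s description implies an OpenAI-compatible interface, which is how Nvidia positions its LLM NIMs. The sketch below illustrates the kind of one-line change he is talking about, assuming a NIM container is already running locally; the endpoint URL, API key placeholder, and model name are illustrative assumptions, not official values.

```python
# Minimal sketch of the swap Das describes: repoint an existing OpenAI-style
# client at a self-hosted NIM endpoint instead of OpenAI's hosted API.
from openai import OpenAI

# Before: client = OpenAI()  # requests go to OpenAI's hosted service
# After: the same client, pointed at a locally running NIM (hypothetical URL/port).
client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-locally",         # placeholder; a local NIM may not require a key
)

response = client.chat.completions.create(
    model="example/llama-model",  # hypothetical model name served by the NIM
    messages=[{"role": "user", "content": "Summarize Nvidia's Blackwell announcement."}],
)
print(response.choices[0].message.content)
```

The rest of the application code stays the same; only the endpoint the client talks to changes.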
Nvidia anticipates that the software will facilitate AI operations on GPU-equipped laptops, shifting from reliance on cloud-based server infrastructure.