
Huang’s most interesting offering at the show was the new Groq 3 LPX, a custom rack built for inference workloads.
Nvidia says the server, made up of 72 Vera Rubin chips and 256 Language Processing Units (LPUs) from the recently acquired Groq, can handle 700 million tokens per second, or 350 times the throughput of the Hopper platform, The Wall Street Journal reports.
“The inference inflection has arrived,” Huang said. “This is the secret sauce.”
Nvidia’s offerings have so far been geared toward powerful, and power-hungry, chips for model training. By adding an inference-optimized server, built for the workloads that come after training, the company is closing a gap previously filled by custom chips from the likes of Google, Meta, and Amazon.
The Groq 3 LPX should also lay the groundwork for “intelligent agentic swarms,” Nvidia says, by speeding up high-token-volume tasks at low latency thanks to an SRAM bandwidth of a whopping 150 TB/s.
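For context, here is a minimal back-of-the-envelope sketch of why that bandwidth figure is the lever being pulled: autoregressive decoding tends to be memory-bound, since each generated token requires streaming the model’s weights through the memory system, so the per-stream token rate is roughly bandwidth divided by bytes moved per token. The model size and precision below are hypothetical assumptions; only the 150 TB/s figure comes from Nvidia’s announcement.

```python
# Rough roofline estimate for memory-bound autoregressive decoding:
# tokens/s per stream is capped by bandwidth / bytes read per token.
# Only the 150 TB/s SRAM bandwidth is from the announcement; the
# model size and weight precision are illustrative assumptions.

SRAM_BANDWIDTH_BYTES_S = 150e12  # 150 TB/s, per Nvidia's stated figure

params = 70e9          # hypothetical 70B-parameter model (assumption)
bytes_per_param = 1    # 8-bit weights (assumption)
weight_bytes = params * bytes_per_param

# Upper bound on decode steps per second for a single stream,
# ignoring KV-cache traffic, interconnect, and compute limits.
decode_steps_per_s = SRAM_BANDWIDTH_BYTES_S / weight_bytes
print(f"~{decode_steps_per_s:,.0f} tokens/s per stream")  # ~2,143
```

Under these assumptions a single unit sustains on the order of a couple of thousand tokens per second per stream, which is why low-latency, high-token-volume agent workloads are the pitch: the headline 700 million tokens per second would come from running many such streams in parallel across the rack.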
Read more: Nvidia (Groq 3 LPX), The Wall Street Journal, CNBC, and Tom’s Hardware.