Nvidia slaps Groq into new LPX racks for faster AI response
Nvidia’s integration of 256 Groq 3 LPUs with Vera Rubin racks aims to boost large language model inference throughput up to 35× on trillion-parameter models.
- At GTC on Monday, Nvidia announced that it will integrate Groq 3 LPUs into its Vera Rubin NVL72 rack system, saying, "We're in production with the Groq chip."
- To speed decoding, Nvidia pairs Groq 3 LPUs with Rubin GPUs as decode accelerators, so the systems jointly compute every layer for each output token. The design exploits SRAM's far higher bandwidth, but because each chip holds so little memory, it requires deploying many chips.
- Each Groq 3 LPU delivers 1.2 petaFLOPS and carries 500 MB of memory; Nvidia plans LPX racks of 256 LPUs with 128 GB of aggregate on-chip SRAM and 640 TB/s of bandwidth. Ian Buck noted, "The tokens per second per chip is actually quite low."
- Given steep per-chip costs, the systems are likely to be adopted first by major AI companies such as OpenAI, Anthropic, and Meta, while Nvidia wagers inference providers could charge $45 per million tokens.
- Because each LPU holds so little on-chip memory, roughly a thousand of them are needed to serve a 1-trillion-parameter model (see the back-of-envelope sketch below). Nvidia plans to ship the systems later this year, with Samsung manufacturing the LPUs.
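The capacity and chip-count figures above can be sanity-checked with simple arithmetic. Below is a minimal sketch, assuming 4-bit (0.5-byte) weights and a standard bandwidth-roofline bound on decode; neither assumption comes from Nvidia's announcement.

```python
# Back-of-envelope check of the quoted figures (illustrative
# arithmetic, not Nvidia's published methodology).

CHIPS_PER_RACK = 256    # LPX rack size quoted by Nvidia
SRAM_PER_CHIP_GB = 0.5  # 500 MB per Groq 3 LPU
RACK_BW_TBS = 640       # aggregate SRAM bandwidth per rack

# 1. Aggregate on-chip SRAM per rack: 256 x 500 MB = 128 GB,
#    matching the announcement.
rack_sram_gb = CHIPS_PER_RACK * SRAM_PER_CHIP_GB
print(f"SRAM per rack: {rack_sram_gb:.0f} GB")

# 2. Chips needed to hold a 1-trillion-parameter model entirely
#    in SRAM, assuming (our assumption) 4-bit quantized weights.
params = 1e12
bytes_per_param = 0.5  # assumed FP4/INT4
model_gb = params * bytes_per_param / 1e9
chips_needed = model_gb / SRAM_PER_CHIP_GB
print(f"Model: {model_gb:.0f} GB -> {chips_needed:.0f} LPUs "
      f"(~{chips_needed / CHIPS_PER_RACK:.1f} racks)")

# 3. Roofline ceiling on decode: each output token streams all
#    weights once, so tokens/s <= aggregate bandwidth / model bytes.
#    Idealized single-sequence bound; ignores batching, KV cache,
#    and interconnect overhead.
racks = chips_needed / CHIPS_PER_RACK
agg_bw_tbs = racks * RACK_BW_TBS
tokens_per_s = agg_bw_tbs * 1e12 / (model_gb * 1e9)
print(f"Decode ceiling: ~{tokens_per_s:.0f} tokens/s")
```

Under these assumptions a 1-trillion-parameter model occupies 500 GB, which divides out to 1,000 LPUs, consistent with the "about a thousand" figure Nvidia cites.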
15 Articles
Analysis: Is Nvidia's Groq deal the endgame for AI chip startups?
At its 2026 GTC conference, Nvidia not only unveiled its Vera CPU but also officially launched the Groq 3 LPU chip, developed through a prior technology licensing arrangement with Groq and brought into its own ecosystem. Alongside it, Nvidia introduced the Groq 3 LPX platform - a server rack composed of 128 Groq 3 LPUs that can be directly integrated with the Vera Rubin solution. The move signals that Nvidia has successfully absorbed Groq's tech…
Decoding the Future of Inference At NVIDIA: Groq LPUs Join Vera Rubin Platform For Low-Latency Inference
With its upcoming Vera Rubin rack-scale architecture, NVIDIA will integrate LPUs from its Groq acquihire, marking a major expansion beyond GPUs alone for AI inference. (ServeTheHome)
Coverage Details
Bias Distribution
- 67% of the sources are Center