Groq

Groq is an American AI inference company founded in 2016 by former Google TPU engineer Jonathan Ross. It develops the Language Processing Unit (LPU) custom AI accelerator and the GroqCloud inference platform, and was valued at $2.8 billion after its August 2024 Series D.

Groq is an American artificial intelligence inference company headquartered in Mountain View, California, founded in 2016 by Jonathan Ross and Doug Wightman. Ross had been one of the principal engineers on the original Google Tensor Processing Unit (TPU) program, which he started as a 20-percent project and which grew into one of Google's most consequential AI hardware investments. He left Google in 2016 to found Groq on the explicit thesis that a custom processor architecture specialized for AI inference could deliver substantially better latency and throughput than the GPU-based alternatives that have dominated AI compute since the 2012 deep-learning wave. Groq develops the Language Processing Unit (LPU), a custom deterministic-execution AI inference accelerator, alongside the GroqCloud inference platform, which provides API access to leading open-weights foundation models running on LPU infrastructure. The company's commercial emphasis from 2023 to 2026 has shifted from on-premise LPU hardware sales toward GroqCloud API delivery, with the platform reaching over 2 million developers as of late 2024 according to company-reported figures. As of April 2026, Groq is one of the principal commercial AI inference companies globally and one of the few non-NVIDIA AI accelerator companies operating at credible commercial scale.

At a glance

  • Founded: 2016 in Mountain View, California, by Jonathan Ross and Doug Wightman.
  • Status: Private. Series D in August 2024 at an approximately $2.8 billion valuation, with subsequent strategic-partnership financing and reported 2025 rounds at substantially higher valuations.
  • Funding: More than $1 billion in cumulative private capital. Series D of $640 million in August 2024 led by BlackRock, joined by Cisco Investments, KDDI, Samsung Catalyst Fund, and existing investors. Series C of $300 million in April 2021 led by Tiger Global and D1 Capital. Additional strategic-partnership financing tied to the Saudi HUMAIN initiative reported through 2025.
  • CEO: Jonathan Ross, Co-Founder and Chief Executive Officer. Former engineer on the original Google TPU program; left Google in 2016 to found Groq.
  • Other notable leadership: Doug Wightman, Co-Founder. Senior engineering leadership recruited from Google, NVIDIA, and adjacent semiconductor and AI infrastructure organizations between 2017 and 2026.
  • Open weights: N/A. Groq is an inference platform that runs third-party open-weights models (Meta Llama, Mistral, DeepSeek, Qwen, and adjacent open-weights model lines).
  • Flagship products: GroqCloud (developer API platform serving open-weights models on LPU infrastructure); LPU on-premise hardware deployments for enterprise customers; the GroqRack and GroqNode reference-architecture systems. Approximately 2 million developers reported on GroqCloud as of late 2024.

Origins

Groq was founded in 2016 in Mountain View by Jonathan Ross and Doug Wightman. Ross departed Google that year after serving as one of the principal engineers on the original Google TPU program, which he had begun as a 20-percent project and which grew into one of Google's most consequential AI hardware investments by the mid-2010s. His experience on the TPU convinced him that a custom processor architecture specialized for AI inference could deliver substantially better latency and throughput than GPU-based alternatives for inference workloads.

The Groq founding thesis was specifically oriented around inference rather than training. The bet was that AI training would remain GPU-dominated, with NVIDIA's CUDA software ecosystem providing structural advantages that custom-silicon alternatives could not match, while AI inference, where low-latency deterministic execution and throughput matter more than software-ecosystem breadth, presented a structural opening for custom processors. The Language Processing Unit (LPU) architecture that Groq developed reflects this thesis: deterministic-execution silicon optimized for inference workloads with predictable low-latency response characteristics. In Groq's design, the compiler statically schedules instruction execution and data movement, so a compiled model's run time is predictable rather than subject to the runtime scheduling variance typical of GPUs.

The 2017 to 2022 period was the engineering-development and commercial-foundation phase. Groq raised several hundred million dollars in cumulative private capital across seed, Series A, Series B, and Series C rounds, the largest being the April 2021 Series C of $300 million led by Tiger Global and D1 Capital, which provided growth-equity capital for engineering scale-up. The principal commercial focus during this period was on-premise LPU hardware sales to enterprise and government customers, including selected US Department of Defense and intelligence-community deployments.

The 2023 to 2024 period was the commercial inflection. The post-ChatGPT generative-AI wave produced substantial demand for AI inference at scales and latencies that GPU-based infrastructure struggled to deliver economically for high-throughput applications. Groq's 2023 launch of GroqCloud, a developer API platform serving open-weights models on LPU infrastructure, drove rapid adoption, with the company reporting approximately 2 million developers as of late 2024. The August 2024 Series D of $640 million at a $2.8 billion valuation was led by BlackRock, joined by Cisco Investments, KDDI (the Japanese telecommunications operator), Samsung Catalyst Fund, and existing investors, and was characterized in industry coverage as a defining commercial inflection for the non-NVIDIA AI inference category.

2025 brought substantial strategic-partnership development. The May 2025 Saudi-US AI infrastructure announcements included reported Groq commitments under the broader HUMAIN initiative, which industry coverage characterized as one of the larger non-NVIDIA AI compute deployment commitments globally. Continued GroqCloud expansion alongside this partnership work has anchored Groq's positioning as one of the principal commercial alternatives to NVIDIA-based AI inference infrastructure.

Into 2026, the company has continued to expand GroqCloud, iterate the LPU architecture, and develop strategic partnerships across global AI infrastructure deployments.

Mission and strategy

Groq's stated mission is to provide the fastest AI inference globally, with the LPU's deterministic-execution, low-latency design as the principal commercial differentiator against GPU-based alternatives. The strategy combines three threads. First, the LPU custom-silicon architecture, iterated generation over generation. Second, the GroqCloud developer-API platform, which anchors developer adoption and serves as the principal commercial channel. Third, strategic-partnership deployments, including the Saudi HUMAIN cooperation and adjacent global AI infrastructure customers, which deliver commercial deployments at scales beyond what the self-serve developer channel can produce.

The competitive premise is that AI inference is structurally a different commercial problem from AI training: the CUDA software-ecosystem advantages that secure NVIDIA's dominance in training do not translate fully into inference, so custom-silicon alternatives optimized for inference-specific requirements (deterministic low-latency execution, predictable throughput, energy efficiency) can capture material market share in the AI inference category as it scales.

Models and products

  • GroqCloud. Developer API platform serving open-weights models (Meta Llama 3.1 and 3.3, Mistral, DeepSeek, Qwen 2.5, and adjacent open-weights model lines) on LPU infrastructure at low latency. Approximately 2 million developers reported as of late 2024; a usage sketch follows after this list.
  • LPU (Language Processing Unit). Groq's custom deterministic-execution AI inference accelerator architecture, with multiple generations shipped through 2026.
  • GroqRack and GroqNode. Reference-architecture systems for on-premise LPU deployment.
  • Strategic-partnership AI infrastructure deployments. Including the Saudi HUMAIN cooperation under the May 2025 partnership announcements and adjacent global infrastructure customers.
  • Enterprise and government on-premise sales. Including selected US Department of Defense and intelligence-community deployments.

Distribution channels combine the GroqCloud developer-API self-serve platform, direct enterprise sales for on-premise hardware, and strategic-partnership deployments at scale.
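
As a concrete illustration of the GroqCloud developer channel, the sketch below issues a chat completion through the official groq Python SDK. It is a minimal sketch: the model identifier is an assumption for illustration, since the served-model catalog changes over time.

    # pip install groq
    import os

    from groq import Groq

    # The SDK also reads GROQ_API_KEY from the environment by default;
    # it is passed explicitly here for clarity.
    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    # Model id is illustrative; consult the GroqCloud catalog for
    # currently served open-weights models.
    completion = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": "Explain what an LPU is in one sentence."}],
    )

    print(completion.choices[0].message.content)

The request shape mirrors the widely adopted chat-completions convention, which keeps switching costs low for developers porting existing client code to GroqCloud.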

Benchmarks and standing

Groq's evaluation framework centers on inference latency and throughput benchmarks. The company reports, and independent benchmarking efforts have corroborated, substantial latency advantages over GPU-based alternatives on selected open-weights model inference workloads. Industry coverage has consistently placed Groq among the principal commercial AI inference companies globally, citing the LPU architecture, GroqCloud developer adoption, and the strategic-partnership deployments as the principal validating data points.
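
As a minimal sketch of how such latency figures are commonly measured, the snippet below separates time-to-first-token from steady-state generation speed using a streaming request, again via the groq Python SDK with an illustrative model id; streamed chunk counts only approximate token counts, so a tokenizer would be needed for exact throughput.

    import os
    import time

    from groq import Groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    start = time.perf_counter()
    first_token_at = None
    chunks = 0

    # Streaming lets us time the first token separately from the rest.
    stream = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # illustrative id; catalogs change
        messages=[{"role": "user", "content": "Write 200 words on the history of GPUs."}],
        stream=True,
    )

    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            if first_token_at is None:
                first_token_at = time.perf_counter()
            chunks += 1

    total = time.perf_counter() - start
    ttft = first_token_at - start
    # Chunks approximate tokens; generation rate excludes the first-token wait.
    rate = chunks / max(total - ttft, 1e-9)
    print(f"TTFT: {ttft:.3f}s; ~{rate:.1f} chunks/s after first token")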

Industry coverage has grouped Groq with Cerebras (a wafer-scale-engine peer with a related but distinct architectural approach), Fireworks AI (an inference-platform peer running on GPU-based infrastructure), Together AI, SambaNova Systems, Tenstorrent, and adjacent commercial AI inference and accelerator organizations as the principal non-NVIDIA AI compute alternatives. Within that group, Groq's distinguishing positioning is the LPU custom-silicon architecture combined with the self-serve GroqCloud commercial channel.

Leadership

As of April 2026, Groq's senior leadership includes:

  • Jonathan Ross, Co-Founder and Chief Executive Officer. Former Google TPU engineer.
  • Doug Wightman, Co-Founder.
  • Senior engineering leadership recruited from Google, NVIDIA, and adjacent semiconductor and AI infrastructure organizations.

Funding and backers

  • Earlier rounds (2017 to 2021): Multiple seed and Series A through C rounds, culminating in the $300 million Series C in April 2021 led by Tiger Global and D1 Capital.
  • Series D (August 2024): $640 million at $2.8 billion valuation led by BlackRock with Cisco Investments, KDDI, Samsung Catalyst Fund, and existing investors.
  • Subsequent strategic-partnership financing (2025 to 2026): Including reported Saudi cooperation through the HUMAIN initiative.
  • Cumulative capital raised exceeds $1 billion as of April 2026.

Industry position

Groq occupies a distinctive position as one of the principal commercial AI inference companies globally and one of the few non-NVIDIA AI accelerator companies with credible commercial scale. The LPU custom-silicon architecture, the GroqCloud developer-API platform with approximately 2 million developers, and the strategic-partnership deployments underpin its competitive positioning. Industry coverage has consistently characterized Groq as one of the consequential AI infrastructure companies of the post-2022 commercial-inference era.

Two structural risks stand out. First, NVIDIA's continued architectural and software-ecosystem advances have closed some of the inference-latency advantage that the LPU architecture initially established, particularly with NVIDIA's Blackwell-generation inference-optimized configurations. Second, the AI inference category attracted substantial commercial competition from 2024 to 2026: hyperscale-cloud first-party silicon (AWS Trainium and Inferentia, Google TPU, Microsoft Maia), inference-platform peers (Fireworks AI, Together AI, Cerebras), and frontier AI labs' own deployment infrastructure all compete for the same enterprise-AI-inference customer base.

Competitive landscape

  • NVIDIA. Principal commercial competitor across both training and inference; the CUDA software ecosystem and Blackwell-generation inference advances are the principal structural competitive factors.
  • Cerebras. Wafer-scale-engine peer with a related but distinct architectural approach. Direct non-NVIDIA AI accelerator competitor.
  • Fireworks AI, Together AI. AI inference platform peers using GPU-based infrastructure with adjacent commercial positioning.
  • SambaNova Systems, Tenstorrent. Custom-silicon AI accelerator peers with smaller commercial scale.
  • AWS Trainium and Inferentia, Google TPU, Microsoft Maia. Hyperscale-cloud first-party AI accelerator alternatives.
  • CoreWeave, Lambda Labs. AI-specialized cloud peers using GPU-based infrastructure.
  • HUMAIN, Core42. Strategic-partnership customers and adjacent sovereign-AI infrastructure peers.

Outlook

  • Continued GroqCloud commercial expansion and LPU architectural iteration through 2026 and 2027.
  • The competitive dynamic with NVIDIA Blackwell-generation inference-optimized infrastructure.
  • Continued strategic-partnership development including the Saudi HUMAIN cooperation expansion.
  • Potential additional fundraising at higher valuations or a path toward eventual public-market exit.
  • Commercial dynamics of the AI inference category as the broader AI compute build-out continues.
