Fireworks AI

Fireworks AI is an American AI inference platform company founded in 2022 by former leaders of Meta's PyTorch team. It develops GPU-based inference infrastructure for open-weights foundation models and raised a $250 million Series C in July 2025 at a $4.5 billion valuation.

Fireworks AI is an American artificial intelligence inference platform company headquartered in Redwood City, California. It was founded in 2022 by Lin Qiao (former Senior Director of Engineering at Meta AI / FAIR, where she led the engineering organization for the PyTorch open-source machine-learning framework), Dmytro Dzhulgakov (former Meta PyTorch lead and one of the principal PyTorch maintainers), James Reed, and other former Meta engineers. The company operates the Fireworks AI inference platform, which provides fast, cost-efficient inference of open-weights foundation models (Meta Llama, Mistral, DeepSeek, Qwen, and other open-weights model lines) on GPU-based infrastructure with custom inference-stack optimizations, rather than the custom-silicon approach pursued by Groq and Cerebras. Its distinguishing positioning combines the founding team's PyTorch credibility with a GPU-based inference architecture: the bet is that frontier GPU hardware (NVIDIA H100, H200, B200) combined with custom inference-stack optimization can deliver competitive inference economics for open-weights foundation models without custom-silicon investment. As of April 2026, Fireworks AI is one of the principal commercial AI inference platforms globally, with a $250 million Series C in July 2025 at a $4.5 billion valuation, over $200 million in annualized recurring revenue reported in late-2025 industry coverage, and substantial enterprise traction across AI-native startups and frontier-AI-application organizations.

At a glance

  • Founded: 2022 in Redwood City, California, by Lin Qiao, Dmytro Dzhulgakov, James Reed, and other former Meta PyTorch engineering leaders.
  • Status: Private. Series C in July 2025 at $4.5 billion valuation.
  • Funding: Approximately $325 million in cumulative private capital. Series C of $250 million in July 2025 at $4.5 billion valuation led by Sequoia Capital with Index Ventures, Felicis, NVIDIA, MongoDB Ventures, AMD Ventures, and existing investors. Series B of $52 million in July 2024 at $552 million valuation led by Sequoia. Earlier seed and Series A financing.
  • CEO: Lin Qiao, Co-Founder and Chief Executive Officer. PhD in computer science (UC Santa Barbara); former Senior Director of Engineering at Meta AI / FAIR, where she led the PyTorch open-source machine-learning framework engineering organization (2018 to 2022).
  • Other notable leadership: Dmytro Dzhulgakov, Co-Founder and Chief Technology Officer, former Meta PyTorch lead and one of the principal PyTorch maintainers; James Reed, Co-Founder; senior engineering leadership recruited from Meta, Google, NVIDIA, and other ML-infrastructure organizations.
  • Open weights: N/A. Fireworks AI is an inference platform that runs third-party open-weights models. The company contributes open-source inference-stack tooling to the broader PyTorch ecosystem.
  • Flagship products: Fireworks AI inference platform serving over 100 open-weights foundation models, including Meta Llama 3.x and 4 series, Mistral, DeepSeek, Qwen 2.5 and 3, and other open-weights model lines. FireFunction (function-calling models trained for agentic-application use cases). FireOptimizer (custom inference-stack optimization tooling). On-premise enterprise deployment options for customers requiring sovereign-cloud or on-premise inference.

Origins

Fireworks AI was founded in 2022 in Redwood City by Lin Qiao, Dmytro Dzhulgakov, James Reed, and other former Meta PyTorch engineers. From 2018 to 2022 Qiao was Senior Director of Engineering at Meta, leading the engineering organization for the PyTorch open-source machine-learning framework, where she oversaw PyTorch's scale-up from a research-oriented framework into one of the principal industrial machine-learning frameworks globally (alongside Google's TensorFlow and JAX). That PyTorch engineering experience anchored both the founding team's credibility within the ML-infrastructure community and the technical thesis that GPU-based inference combined with custom inference-stack optimization could deliver competitive inference economics for open-weights foundation models without custom-silicon investment.

The 2022 to 2024 founding period built the Fireworks AI inference platform in what was then a nascent commercial category. The post-ChatGPT generative-AI wave produced substantial commercial demand for AI inference, but through 2022 to 2023 the principal distribution channels were either frontier-AI-lab APIs (OpenAI, Anthropic, Google DeepMind) serving closed-weights models or hyperscale-cloud providers offering broader platforms. Fireworks AI's positioning as a developer-focused inference platform for open-weights models, backed by PyTorch engineering credibility, produced substantial early traction with AI-application developers building on Meta Llama, Mistral, and other open-weights model lines.

The July 2024 Series B of $52 million at a $552 million valuation, led by Sequoia Capital, validated the commercial trajectory; industry coverage characterized it as one of the principal AI inference platform Series B rounds of the period. The 2024 to 2025 period saw substantial enterprise-customer expansion alongside continued platform-engineering investment, with industry coverage reporting customer traction across AI-native startups, frontier-AI-application companies, and selected enterprises.

The July 2025 Series C of $250 million at a $4.5 billion valuation, led by Sequoia Capital with Index Ventures, Felicis, NVIDIA, MongoDB Ventures, AMD Ventures, and existing investors, marked a defining commercial inflection. Industry coverage reported approximately $200 million in annualized recurring revenue at the time of the round, representing rapid year-over-year growth, and characterized the financing as confirming Fireworks AI's position as one of the principal commercial AI inference platforms globally and a structurally consequential vehicle for open-weights-model commercial deployment.

The 2025 to 2026 period has seen continued enterprise-customer expansion and platform-engineering investment. Industry coverage has reported substantial customer-base growth and continued product expansion, including FireFunction (function-calling models for agentic-application use cases) and FireOptimizer (custom inference-stack optimization tooling).

Mission and strategy

Fireworks AI's stated mission is to provide the fastest and most cost-efficient inference platform for open-weights foundation models, with developer experience and inference-stack optimization as the principal commercial differentiators. The strategy combines three threads: first, the Fireworks AI inference platform, serving over 100 open-weights foundation models through developer-self-serve API access; second, custom inference-stack optimization (FireOptimizer) and specialized model variants (FireFunction for function calling) that differentiate the platform against direct inference-platform peers; third, on-premise enterprise deployment options for customers requiring sovereign-cloud or on-premise inference.

The competitive premise rests on three claims: that AI inference is a structurally large commercial category as foundation-model-application deployment scales globally; that GPU-based infrastructure with custom inference-stack optimization can deliver competitive inference economics without custom-silicon investment; and that PyTorch-engineering credibility combined with a developer-self-serve commercial structure differentiates the company against both custom-silicon competitors (Groq, Cerebras) and hyperscale-cloud first-party inference services (AWS Bedrock, Google Vertex AI, Azure AI Foundry).

Models and products

  • Fireworks AI inference platform. Developer-self-serve API platform serving over 100 open-weights foundation models, including Meta Llama 3.x and 4 series, Mistral, DeepSeek (V3, R1, V3.1), Qwen (2.5, 3), and other open-weights model lines.
  • FireFunction. Specialized function-calling models trained for agentic-application use cases.
  • FireOptimizer. Custom inference-stack optimization tooling for enterprise customers requiring tuned-throughput characteristics.
  • On-premise enterprise deployment. Customer-managed inference deployment for organizations requiring sovereign-cloud or on-premise inference.
  • Continued PyTorch ecosystem contributions. Ongoing open-source contributions to the broader PyTorch community through the founding team's continued engagement.
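As an illustration of the function-calling pattern that FireFunction-type models are trained for, the sketch below builds an OpenAI-style "tools" request payload. This is a hedged sketch, not Fireworks documentation: the `firefunction-v2` model path and the `get_weather` tool schema are illustrative assumptions.

```python
import json

def build_tool_request(model: str, prompt: str, tools: list) -> dict:
    """Build an OpenAI-compatible chat request carrying tool definitions.

    The model is expected to respond with a structured tool call
    (function name plus JSON arguments) rather than free text.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call a tool
    }

# Illustrative tool schema (assumption, not from Fireworks docs).
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

if __name__ == "__main__":
    # Hypothetical model path; actual identifiers live in the model catalog.
    req = build_tool_request(
        "accounts/fireworks/models/firefunction-v2",
        "What's the weather in Redwood City?",
        [weather_tool],
    )
    print(json.dumps(req, indent=2))
```

The agentic loop then executes the returned tool call locally and feeds the result back to the model as a `tool`-role message.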

Distribution channels include developer-self-serve API access through fireworks.ai, direct enterprise sales for on-premise and customer-managed deployments, and strategic-partner relationships with infrastructure providers including NVIDIA and AMD.
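As a concrete sketch of that developer-self-serve API access, the snippet below posts a chat completion to Fireworks' OpenAI-compatible endpoint. The base URL and the Llama model path are assumptions based on the platform's public API conventions and may differ; a `FIREWORKS_API_KEY` from fireworks.ai is needed to actually send the request, and without one the script just prints the payload it would send.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against current Fireworks docs.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload with bearer-token auth and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Hypothetical model path; actual identifiers live in the model catalog.
    payload = build_request(
        "accounts/fireworks/models/llama-v3p1-8b-instruct",
        "Say hello in one word.",
    )
    key = os.environ.get("FIREWORKS_API_KEY")
    if key:
        out = send(payload, key)
        print(out["choices"][0]["message"]["content"])
    else:
        print(json.dumps(payload, indent=2))
```

Because the request shape follows the OpenAI chat-completions convention, existing OpenAI SDK clients can typically be pointed at the Fireworks base URL with only a key and model-name change.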

Benchmarks and standing

Fireworks AI's evaluation framework focuses on inference latency and throughput benchmarks (the company publishes comparison studies positioning itself competitively against direct inference-platform peers and hyperscale-cloud first-party inference services), commercial metrics (customer count, annualized recurring revenue), and developer-community engagement.

Industry coverage has consistently characterized Fireworks AI as one of the principal commercial AI inference platforms globally, citing the founding team's PyTorch credibility, rapid commercial growth, and the July 2025 Series C valuation as the principal validating data points.

Industry coverage has grouped Fireworks AI with Together AI (the principal direct GPU-based inference platform peer), Anyscale (the Ray-based AI infrastructure peer with adjacent inference offerings), Replicate, OpenRouter, and other commercial AI inference platforms. Within that group, Fireworks AI's distinguishing positioning is the founding team's PyTorch credibility, the developer-self-serve commercial structure, and the integrated inference-stack optimization tooling.

Leadership

As of April 2026, Fireworks AI's senior leadership includes:

  • Lin Qiao, Co-Founder and Chief Executive Officer.
  • Dmytro Dzhulgakov, Co-Founder and Chief Technology Officer.
  • James Reed, Co-Founder.
  • Senior engineering leadership recruited from Meta, Google, NVIDIA, and other ML-infrastructure organizations.

The founding cohort has remained intact through the company's commercial-growth period.

Funding and backers

  • Earlier rounds (2022 to 2023): Approximately $25 million across seed and Series A.
  • Series B (July 2024): $52 million at $552 million valuation led by Sequoia Capital.
  • Series C (July 2025): $250 million at $4.5 billion valuation led by Sequoia Capital with Index Ventures, Felicis, NVIDIA, MongoDB Ventures, AMD Ventures, and existing investors.
  • Cumulative capital approximately $325 million as of April 2026.

Industry position

Fireworks AI occupies a distinctive position as one of the principal commercial AI inference platforms globally, combining the founding team's PyTorch credibility, a rapid commercial-growth trajectory (over $200 million in annualized recurring revenue reported in late 2025), a developer-self-serve commercial structure, and a GPU-based inference architecture with custom inference-stack optimization. Industry coverage has consistently characterized Fireworks AI as one of the structurally consequential AI inference platforms of the post-2022 commercial-inference era.

There are two structural risks. First, the AI inference category has attracted substantial competition through 2024 to 2026, with hyperscale-cloud providers (AWS Bedrock, Google Vertex AI, Azure AI Foundry), inference-platform peers (Together AI, OpenRouter, Replicate), custom-silicon competitors (Groq, Cerebras), and frontier-AI labs' own deployment infrastructure all competing for the same enterprise-AI-inference customer base. Second, the open-weights commercial opportunity depends on the continued availability of competitive open-weights releases from frontier labs (Meta Llama, Mistral, DeepSeek, Qwen, and others); any structural shift toward fewer or less-competitive open-weights releases would directly affect the inference-platform commercial opportunity.

Competitive landscape

  • Together AI. Direct GPU-based AI inference platform peer. Principal direct competitor with similar commercial positioning.
  • Groq, Cerebras. Custom-silicon AI accelerator peers with overlapping commercial positioning but a different architectural approach.
  • Anyscale. Ray-based AI infrastructure peer with adjacent inference offerings.
  • AWS Bedrock, Google Vertex AI, Azure AI Foundry. Hyperscale-cloud first-party AI inference services.
  • Replicate, OpenRouter, Modal, Baseten. Other commercial AI inference platforms with developer-self-serve commercial structures.
  • OpenAI, Anthropic, Google DeepMind, Mistral AI APIs. Frontier-AI-lab APIs offering closed-weights model alternatives.
  • Meta AI / FAIR, DeepSeek, Alibaba Qwen / DAMO. Open-weights model providers whose models Fireworks AI runs.

Outlook

  • Continued enterprise-customer expansion and annualized recurring revenue growth through 2026 to 2027.
  • Continued FireFunction and FireOptimizer commercial-product iteration.
  • Continued competitive dynamics with Together AI, hyperscale-cloud first-party inference services, and custom-silicon competitors.
  • Potential additional fundraising at higher valuations or a path toward eventual public-market exit.
  • Continued cooperation with NVIDIA and AMD as strategic-partner infrastructure providers.

About the author

Nextomoro (AI Research Lab Intelligence) tracks progress for AI research labs, models, and what's next.