Stable Diffusion 3.5

Stable Diffusion 3.5 is Stability AI's latest generation of the Stable Diffusion open-weights image-generation model line, released in October 2024 in three variants: Large (8 billion parameters), Large Turbo, and Medium (2.5 billion parameters). The model generates images from text prompts and is available for download on Hugging Face, through the Stability AI API, and via the broad ecosystem of self-hosting tools including ComfyUI, Automatic1111, and the Hugging Face Diffusers library. As of April 2026, Stable Diffusion 3.5 is the most widely deployed open-weights image-generation model by active installation count, even as it no longer occupies the frontier of open-weights image quality, a position now held by Black Forest Labs' FLUX.2.

At a glance

Lab: Stability AI
Released: October 2024
Modality: Image (text-to-image generation)
Open weights: Yes. Stability AI Community License: free for non-commercial use; free for commercial use for organizations with under $1 million in annual revenue; Enterprise license required for organizations above the $1 million revenue threshold.
Variants: Large (8B parameters), Large Turbo (distilled for faster inference), Medium (2.5B parameters)
Output resolution: Up to 1024x1024 pixels natively; in self-hosted environments, other sizes and aspect ratios can be requested through the pipeline's width and height settings
Pricing: Free for non-commercial use; free for commercial use under $1M annual revenue (Community License); Enterprise license for organizations with over $1M in annual revenue (contact Stability AI); Stability AI API at https://platform.stability.ai charges per-image rates; partner platform pricing varies by provider
Distribution channels: Hugging Face stabilityai organization (https://huggingface.co/stabilityai), Stability AI API (https://platform.stability.ai), GitHub (https://github.com/Stability-AI), Replicate, fal.ai, Together AI, self-hosting via Diffusers / ComfyUI / Automatic1111

Origins

The Stable Diffusion name traces to August 2022, when Stability AI released the first public Stable Diffusion model, an open-weights text-to-image model developed primarily by the Computer Vision and Learning group (CompVis) at Ludwig Maximilian University of Munich by Robin Rombach, Andreas Blattmann, Patrick Esser, Dominik Lorenz, and collaborators, with Stability AI providing compute infrastructure and commercial support. The August 2022 release was a watershed moment: it was the first high-quality image-generation model whose weights were openly available to download and run locally. Within months, a broad ecosystem had formed around it, including LoRA (Low-Rank Adaptation) fine-tuning workflows, custom checkpoints, and interfaces such as Automatic1111's stable-diffusion-webui, which became the dominant self-hosting interface for non-developer users.

Subsequent releases extended the family. Stable Diffusion 1.5 (October 2022) became the de facto foundation checkpoint for the LoRA fine-tuning community, a position it retained years after release. SDXL (2023) scaled the base model to 3.5 billion parameters and substantially improved output quality. Stable Diffusion 3 (early 2024) adopted a multimodal diffusion transformer (MMDiT) architecture in place of the earlier UNet backbone.

Through 2023 and into 2024, Stability AI faced financial difficulties, criticism of founder Emad Mostaque's management, and researcher departures. In March 2024, Mostaque resigned as CEO. Robin Rombach, Andreas Blattmann, Patrick Esser, and Dominik Lorenz, the principal original Stable Diffusion researchers, also left Stability AI, and in August 2024 they launched Black Forest Labs, backed by Andreessen Horowitz. Black Forest Labs released FLUX.1 at launch, immediately establishing a competing open-weights lineage from the same team that built Stable Diffusion.

Stability AI completed a leadership reset in June 2024 with Prem Akkaraju (former WETA Digital CEO) as CEO and Sean Parker (Napster co-founder) as Executive Chairman, paired with approximately $80 million in fresh capital and debt restructuring. SD 3.5, released in October 2024, was the first major model release under the new leadership, representing the post-restructuring continuation of the Stable Diffusion line by a team that no longer included the model's original creators.

Capabilities

Stable Diffusion 3.5 introduces three variant configurations addressing different deployment trade-offs. SD 3.5 Large (8 billion parameters) is the quality-maximizing variant, retaining the MMDiT architecture introduced in SD 3 that jointly processes text and image representations through transformer layers. SD 3.5 Large Turbo applies distillation to reduce sampling steps, trading some quality for lower latency and inference cost. SD 3.5 Medium (2.5 billion parameters) targets consumer GPU hardware (8-16 GB VRAM) and is the most widely used variant for community fine-tuning and LoRA training because its smaller size reduces the compute required for fine-tuning experiments.
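
The link between parameter count and GPU memory can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only: it counts the memory needed to hold the diffusion model's weights at a given numeric precision and ignores the text encoders, the VAE, activations, and framework overhead, all of which add to real-world usage. The function and dictionary names are hypothetical, and only the two parameter counts stated above are used.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: only the diffusion model's own weights are counted; text
# encoders, VAE, activations, and framework overhead are NOT included.

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}

# Parameter counts as stated in this article.
PARAM_COUNTS = {"large": 8.0e9, "medium": 2.5e9}


def weight_memory_gb(param_count: float, dtype: str = "bf16") -> float:
    """Return the approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, count in PARAM_COUNTS.items():
        print(f"{name}: ~{weight_memory_gb(count):.1f} GB of weights in bf16")
```

Under these assumptions, the Large model's weights alone take roughly 16 GB in 16-bit precision while Medium's take about 5 GB, which is consistent with Medium being the variant aimed at 8-16 GB consumer cards.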

Across the family, strengths include prompt adherence for single-subject and moderate-complexity descriptions, stylistic versatility, and compatibility with the extensive SD tooling ecosystem. The Diffusers library, ComfyUI, and Automatic1111 support all three variants through standard checkpoint-loading workflows.

SD 3.5 does not match the image quality of FLUX.2 on most composite evaluation categories. The departure of the original SD team to found Black Forest Labs created a research continuity gap: the researchers who built SDXL and SD 3 are now directing a competing model line, and SD 3.5 represents the work of the remaining Stability AI team. This context is relevant to interpreting the quality trajectory relative to FLUX.

Where SD 3.5 retains distinctive advantages is ecosystem breadth. Years of community development have produced an unmatched tooling inventory: LoRA training infrastructure, ControlNet integrations, custom scheduler implementations, inpainting workflows, and thousands of community-trained fine-tune checkpoints. SD 3.5 variants plug into much of this tooling (the same interfaces, trainers, and workflow patterns) in a way that newer model lines do not yet match, although LoRAs and checkpoints trained for the UNet-based SD 1.5 or SDXL do not load into the MMDiT-based SD 3.5 and must be retrained.

Benchmarks and standing

Image-generation benchmarking is substantially less standardized than text-model benchmarking. There is no widely adopted composite leaderboard equivalent to the Artificial Analysis Intelligence Index. Evaluations combine human-preference side-by-side comparisons, FID (Fréchet Inception Distance) scores, and capability-specific tests covering prompt adherence, text rendering, and photorealism.
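
For readers unfamiliar with FID, the metric fits a Gaussian to feature vectors extracted from real images and another to features from generated images, then measures the Fréchet distance between the two Gaussians. The sketch below is a deliberately simplified illustration, not a reference implementation: it assumes diagonal covariances (each feature dimension treated independently), whereas standard FID uses full covariance matrices over Inception-v3 features. The function name and the toy inputs are hypothetical.

```python
import math
from statistics import fmean, variance


def diagonal_fid(features_a, features_b):
    """Frechet distance between two feature sets under a diagonal-covariance
    assumption (a simplification of standard FID, which uses full covariances).

    Per dimension: (mean_a - mean_b)^2 + var_a + var_b - 2 * sqrt(var_a * var_b)
    """
    dims = len(features_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in features_a]
        col_b = [row[d] for row in features_b]
        mean_a, mean_b = fmean(col_a), fmean(col_b)
        var_a, var_b = variance(col_a), variance(col_b)  # sample variance
        total += (mean_a - mean_b) ** 2
        total += var_a + var_b - 2.0 * math.sqrt(var_a * var_b)
    return total


if __name__ == "__main__":
    real = [[0.0, 1.0], [2.0, 3.0]]  # toy "real image" features
    fake = [[1.0, 1.0], [3.0, 3.0]]  # toy "generated image" features
    print(diagonal_fid(real, real))  # identical distributions
    print(diagonal_fid(real, fake))  # first dimension shifted by 1
```

Lower values indicate that the generated-image feature distribution is closer to the real one; a score of zero means the fitted Gaussians coincide. Because the score depends on the reference image set and the feature extractor, FID values are only comparable within a single evaluation setup.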

The Hugging Face Text-to-Image Leaderboard shows SD 3.5 Large performing competitively on prompt adherence for moderate-complexity prompts, consistent with the MMDiT architecture's text conditioning. On composite quality and photorealism, SD 3.5 trails FLUX.2, Midjourney v7, and Imagen 4. On LMArena's image arena, SD 3.5's placement is below FLUX.2 and Midjourney v7 on most quality dimensions; DALL-E 3 leads SD 3.5 on several task categories.

Text rendering within generated images is a relative weakness. The MMDiT architecture improved text handling compared to SD 1.x and SDXL, but DALL-E 3, FLUX.2, and Ideogram are generally rated stronger on legible text-within-image tasks.

SD 3.5's benchmark position is best understood in two frames simultaneously: mid-field in direct quality comparisons against current systems, and among the most-deployed image-generation models by installation volume owing to the LoRA community's continued investment and the ComfyUI/Automatic1111 tooling compatibility.

Benchmark leadership in image generation is point-in-time and prompt-category-dependent; methodologies are not standardized.

Access and pricing

All three SD 3.5 variants are published through the stabilityai organization on Hugging Face at https://huggingface.co/stabilityai. Downloading requires accepting the Community License terms on the model card. The Community License is free for non-commercial use and free for commercial use by organizations with under $1 million in annual revenue. Organizations above that threshold must contact Stability AI for an Enterprise license.
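
The license rule described above reduces to a small decision function. The following is a minimal sketch for illustration, not legal guidance: the function name is hypothetical, and the treatment of an organization at exactly $1 million (classified here as needing an Enterprise license) is an assumption, since the text only distinguishes "under" and "above" the threshold. The license text itself is authoritative.

```python
# Sketch of the Community License revenue-threshold rule as summarized in
# this article. Not legal advice; consult the actual license terms.

REVENUE_THRESHOLD_USD = 1_000_000


def required_license(annual_revenue_usd: float, commercial_use: bool) -> str:
    """Return "community" or "enterprise" for a given organization and use."""
    if not commercial_use:
        # Non-commercial use is free under the Community License.
        return "community"
    if annual_revenue_usd < REVENUE_THRESHOLD_USD:
        # Commercial use is free below the revenue threshold.
        return "community"
    # Assumption: exactly $1M is treated conservatively as "enterprise".
    return "enterprise"


if __name__ == "__main__":
    print(required_license(250_000, commercial_use=True))
    print(required_license(5_000_000, commercial_use=True))
    print(required_license(5_000_000, commercial_use=False))
```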

Self-hosting is the primary deployment mode. The Hugging Face Diffusers library (https://huggingface.co/docs/diffusers) provides the Python API for running SD 3.5 programmatically. ComfyUI (https://github.com/comfyanonymous/ComfyUI) is the leading node-based workflow interface, with community-built nodes for advanced sampling, ControlNet conditioning, and multi-step pipelines. Automatic1111's stable-diffusion-webui provides a browser-based interface for non-developer users.
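
A minimal Diffusers sketch of programmatic use is shown below. It assumes a CUDA GPU with enough memory, recent `torch` and `diffusers` installations, and a Hugging Face account that has accepted the Community License for the chosen repository (for example via `huggingface-cli login`). The step counts and guidance scales are starting points in line with the examples on the model cards and should be treated as assumptions to tune, not fixed requirements. The `VariantConfig` class, the `VARIANTS` table, and the `generate` helper are illustrative names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    repo_id: str               # Hugging Face repository for the variant
    num_inference_steps: int   # assumed starting point, tune as needed
    guidance_scale: float      # assumed starting point, tune as needed


VARIANTS = {
    "large": VariantConfig("stabilityai/stable-diffusion-3.5-large", 28, 3.5),
    "large-turbo": VariantConfig("stabilityai/stable-diffusion-3.5-large-turbo", 4, 0.0),
    "medium": VariantConfig("stabilityai/stable-diffusion-3.5-medium", 40, 4.5),
}


def generate(prompt: str, variant: str = "medium", out_path: str = "output.png") -> str:
    """Generate one 1024x1024 image and save it; returns the output path."""
    # Heavy imports are kept inside the function so the configuration table
    # above can be inspected on machines without torch or diffusers.
    import torch
    from diffusers import StableDiffusion3Pipeline

    cfg = VARIANTS[variant]
    pipe = StableDiffusion3Pipeline.from_pretrained(
        cfg.repo_id, torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")  # assumes an NVIDIA GPU is available
    image = pipe(
        prompt,
        num_inference_steps=cfg.num_inference_steps,
        guidance_scale=cfg.guidance_scale,
        height=1024,
        width=1024,
    ).images[0]
    image.save(out_path)
    return out_path


if __name__ == "__main__":
    generate("a lighthouse on a rocky coast at dusk, oil painting", variant="medium")
```

The Turbo entry uses few steps and no classifier-free guidance, reflecting its distillation for low-latency sampling; the other two variants trade more steps for quality.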

The Stability AI API at https://platform.stability.ai offers hosted access on a per-image credit basis for developers who prefer a managed endpoint over self-hosting. Partner platforms including Replicate (https://replicate.com), fal.ai (https://fal.ai), and Together AI (https://www.together.ai) also host SD 3.5 variants through their own inference APIs with their own pricing structures. Source code and reference implementations are available on GitHub at https://github.com/Stability-AI.

Comparison

The direct peer set for Stable Diffusion 3.5 in April 2026 consists of the leading text-to-image generation systems:

FLUX.2 (Black Forest Labs). The leading open-weights-derived image-generation system and SD 3.5's most direct structural competitor. FLUX.2 comes from the same researchers who built Stable Diffusion, who left Stability AI and launched Black Forest Labs in August 2024. On most composite quality benchmarks, FLUX.2 leads SD 3.5 on prompt fidelity, photorealism, and text rendering. The FLUX line spans several license tiers: the Schnell variant is Apache-2.0-licensed for commercial use, the Dev variant is non-commercial open weights, and the Pro variant is closed-weights commercial. SD 3.5's advantage relative to FLUX is ecosystem incumbency: the SD tooling ecosystem (ComfyUI nodes, LoRA checkpoints, Automatic1111 plugins) has a multi-year head start on any FLUX-equivalent tooling, though the FLUX community is building toward parity rapidly.
DALL-E 3 (OpenAI). The image-generation model with the largest consumer user base, distributed through ChatGPT and Microsoft Bing Image Creator. DALL-E 3 leads SD 3.5 on text rendering and accessibility for non-technical users; SD 3.5's advantage is that it is freely downloadable and self-hostable under the Community License, with no per-image cost for organizations under the revenue threshold.
Imagen 4 (Google DeepMind). Google's fourth-generation image-generation model, available through Vertex AI and the Gemini app. Imagen 4 Ultra leads SD 3.5 on photorealism and text rendering, with the structural advantage of Google Cloud enterprise distribution and SLA commitments. SD 3.5's advantage is open-weights access without a commercial API requirement for in-budget organizations.
Midjourney v7. The dominant prosumer image-generation system with approximately $500 million in annual revenue. Midjourney v7 is closed-weights, distributed through Discord and the Midjourney web app. Midjourney leads SD 3.5 on aesthetic quality, particularly for stylized, artistic, and creative-professional outputs. SD 3.5's advantage is open-weights access and the self-hosting flexibility that Midjourney, which has no open-weights release, does not offer.

Stable Diffusion 3.5's distinctive position across this peer set is ecosystem incumbency combined with the Community License's revenue-threshold structure. For creators, researchers, and businesses under the $1 million revenue threshold, SD 3.5 provides capable image generation at no license cost with full self-hosting flexibility. The LoRA fine-tuning community, the ComfyUI node ecosystem, and the Automatic1111 plugin library represent a level of tooling investment that no competing open-weights model line has yet replicated.

Outlook

Several open questions shape Stable Diffusion 3.5's trajectory through 2026 and into 2027:

Stable Diffusion 4 timeline and capability profile. Whether Stability AI ships a Stable Diffusion 4 generation, and on what timeline, is the central unknown. The post-restructuring team has not publicly committed to a release schedule. A generational quality step comparable to the SDXL-to-SD3 transition would be needed to close the gap to FLUX.2 on composite quality benchmarks.
Stability AI's commercial sustainability post-Mostaque. The leadership reset under Akkaraju and Parker, together with James Cameron's appointment to the board, stabilized the organization through 2024, but whether Stability AI can generate sufficient enterprise revenue from its creative-industry positioning to fund frontier model development remains open. The Community License's revenue-threshold structure provides a clear monetization mechanism for enterprise customers, but the enterprise pipeline's scale is not publicly characterized.
The SD ecosystem's adaptability to FLUX-based alternatives. Whether the ComfyUI, LoRA, and Automatic1111 communities migrate toward FLUX.2 and future Black Forest Labs releases, or continue to anchor on Stable Diffusion checkpoints, is an open question. If FLUX-compatible LoRA training becomes as accessible as SD-compatible fine-tuning, one of SD 3.5's principal advantages narrows.
Community License enforcement and competitor licensing. The $1 million revenue threshold is an unusual structure in the open-weights space. How Stability AI enforces it in practice, and whether competitors adopt similar revenue-gated structures or more permissive Apache-style licensing, will affect the competitive positioning of SD 3.5 and its successors.
Stability AI's creative-industry strategy. The James Cameron board appointment and the Akkaraju-Parker leadership represent a strategic bet on film, music, and visual-effects customers. Whether Stability AI secures meaningful integrations that generate enterprise license revenue is a key indicator for the lab's post-restructuring trajectory.

Sources

Stability AI: Stable Diffusion 3.5 announcement. Official October 2024 release announcement covering the Large, Large Turbo, and Medium variants.
Stable Diffusion 3.5 Large on Hugging Face. Model card, weights, and Community License terms for SD 3.5 Large.
Stability AI on GitHub. Source code, reference implementations, and documentation.
Stability AI API platform. Hosted access, per-image pricing, and developer documentation.
Wikipedia: Stable Diffusion. Comprehensive model family history from 2022 through the SD 3 and SD 3.5 generations.
Wikipedia: Emad Mostaque. Founder biographical reference and Stability AI organizational history.
VentureBeat: Stable Diffusion creators launch Black Forest Labs. Context on the August 2024 founding team departure from Stability AI.
Deadline: Former WETA Digital CEO Prem Akkaraju, Sean Parker Join Stability AI. June 2024 leadership reset coverage.
