Qwen 3

Qwen 3 is the open-weights large language model family released by Alibaba's Qwen team in 2025, introducing a hybrid thinking architecture across a broad range of sizes and featuring Apache 2.0 licensing for most variants.

Qwen 3 is the third major generation of the Qwen large language model family, developed by Alibaba Qwen / DAMO and released across 2025 with variants ranging from 0.6 billion to 235 billion parameters, the largest of which uses a mixture-of-experts architecture with 22 billion active parameters. Most variants are distributed under the Apache 2.0 open-source license through Hugging Face, ModelScope, and GitHub, with commercial API access available through Alibaba Cloud's DashScope platform. As of April 2026, Qwen 3 represents the broadest open-weights model family from any single lab in terms of size coverage and modality-specific specialization, with distinct sub-lines for text reasoning, vision, audio, coding, and mathematics.

At a glance

  • Lab: Alibaba Qwen / DAMO
  • Released: 2025 (Qwen3-235B-A22B and flagship variants; smaller variants released throughout the year)
  • Modality: Text and multimodal (text reasoning, vision via Qwen-VL, audio via Qwen-Audio, coding via Qwen-Coder, mathematics via Qwen-Math)
  • Open weights: Yes for most variants. Apache 2.0 license covers the standard text and most multimodal variants. Some preview-tier releases have separate licensing; the 2026 Qwen3.6-Max-Preview is closed-weights and API-only, while Qwen 3 proper remains open-weights.
  • Context window: Up to 128,000 tokens for the flagship Qwen3-235B-A22B; varies by variant
  • Pricing: Free for self-hosted open-weights variants; Alibaba Cloud DashScope API pricing applies for managed inference (pricing listed at dashscope.aliyuncs.com)
  • Distribution channels: Hugging Face Qwen organization, ModelScope, GitHub QwenLM, Alibaba Cloud DashScope, Alibaba Cloud Model Studio / Bailian

Origins

Alibaba's DAMO Academy, established in October 2017 as the company's primary fundamental-research organization with an announced $15 billion three-year commitment, had published early pre-trained language models including StructBERT by 2019. The Tongyi Qianwen program -- internationally known as Qwen -- launched in April 2023, with the first open-weights release, Qwen-7B, arriving in August 2023 under the Tongyi Qianwen license; Apache 2.0 became the default license in later generations.

The first-generation Qwen family expanded rapidly across late 2023 and 2024. Qwen-14B, Qwen-72B, Qwen-VL (vision-language), Qwen-Audio, Qwen-Coder, and Qwen-Math followed within months of the initial release, establishing a pattern of parallel development across both scale and modality. By the time Qwen2 was released in mid-2024, the Qwen family had accumulated more than 45 distinct model variants across text, vision, audio, and specialized domains, and had emerged as one of the most-downloaded open-weights model series globally.

Qwen 2.5, released in late 2024, refined the text flagship and brought the coding and math sub-lines to competitive standing with specialist models from other labs. The series was also notable for multilingual coverage: Qwen models were trained on substantially more Chinese-language data than their Western open-weights peers, making them the default option for developers building Chinese-language applications.

Qwen 3 launched in 2025 with a hybrid thinking architecture that allowed the same model to operate in a non-thinking mode for low-latency tasks and an extended-thinking mode for complex multi-step reasoning in mathematics, code, and logic. The flagship Qwen3-235B-A22B uses a mixture-of-experts architecture activating 22 billion of 235 billion parameters per inference step. Dense variants from 0.6B to 32B followed, covering sizes tractable on consumer hardware, and the modality-specific lines (Qwen3-VL for vision, Qwen-Audio, Qwen-Coder, Qwen-Math) received corresponding updates.

The Qwen team was led from its founding by Lin Junyang, who departed in March 2026; oversight then passed to Alibaba Cloud CTO Zhou Jingren. The Qwen 3 series was developed under Lin's direction and reflects the integrated, variant-heavy release philosophy he established.

Capabilities

Qwen 3's core text reasoning capability covers instruction following, multi-turn dialogue, document analysis, code generation, and mathematical problem solving. The hybrid thinking architecture is the defining technical feature: users and developers can select whether a given request triggers extended chain-of-thought reasoning or returns a direct, low-latency response. The switch is implemented as a model-level configuration rather than a separate model, which simplifies deployment and allows the same weights to serve both use cases.
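
In practice the switch surfaces as a chat-template flag. The sketch below follows the pattern documented on Qwen 3 model cards for the Hugging Face transformers workflow; the model ID and prompt are illustrative, not prescriptive.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen3-8B"  # illustrative; other Qwen 3 text variants follow the same pattern
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    messages = [{"role": "user", "content": "If 3x + 7 = 22, what is x?"}]

    # enable_thinking=True reserves a chain-of-thought block before the final
    # answer; enable_thinking=False returns a direct, low-latency response
    # from the same weights.
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=True,
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

The model cards also describe soft switches (/think and /no_think tags inside a prompt) for per-turn control without changing the template flag.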

Multilingual coverage is a distinctive strength. Qwen 3 was trained on data in more than 100 languages, with particular depth in Chinese, English, and other Asian languages. On Chinese-language benchmarks, Qwen 3 consistently leads among open-weights models, including larger Western alternatives. On English-language benchmarks, the flagship Qwen3-235B-A22B is competitive with models several times its parameter count due to the efficiency advantages of the MoE configuration.

The modality-specific sub-lines extend coverage well beyond text. Qwen3-VL handles image and video understanding, spanning document OCR, chart analysis, scene description, and visual question answering. Qwen-Audio supports speech recognition, audio understanding, and voice-based interaction. Qwen-Coder is a code-generation and code-editing specialist, distributed separately for developers who want coding-optimized weights without the overhead of the full general-purpose model. Qwen-Math addresses mathematical reasoning, including competition-level problems and proof assistance.

Tool use and agentic capabilities are part of the Qwen 3 specification. The model supports external API calls, code interpreters, and multi-step agent operation within scaffolding frameworks via Qwen-Agent, a lightweight orchestration library, as sketched below.
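
A minimal sketch of that orchestration, following the Assistant pattern from the QwenLM/Qwen-Agent repository; the model name and DashScope configuration here are assumptions for illustration.

    # Sketch of tool-augmented use via Qwen-Agent's Assistant interface.
    # Assumes `pip install qwen-agent` and DASHSCOPE_API_KEY set in the
    # environment; the model name is illustrative.
    from qwen_agent.agents import Assistant

    bot = Assistant(
        llm={"model": "qwen3-235b-a22b", "model_server": "dashscope"},
        function_list=["code_interpreter"],  # built-in tool; custom tools can be registered
    )

    messages = [{"role": "user", "content": "Plot y = x**2 for x in [-5, 5] and report its minimum."}]

    # run() streams intermediate steps (tool calls, tool results) and ends
    # with the final assistant reply.
    last = None
    for last in bot.run(messages=messages):
        pass
    print(last)  # final message list, including tool activity and the answer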

The flagship, Qwen3-235B-A22B, uses grouped-query attention (GQA) and rotary position embeddings (RoPE). Mixture-of-experts routing selects a subset of specialized feed-forward experts per token, keeping per-inference compute at roughly the level of a 22-billion-parameter dense model while retaining far higher effective capacity.
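
The routing step can be sketched generically. The block below is a standard top-k softmax gate of the kind used in MoE transformers, written in PyTorch with placeholder dimensions and expert counts; it is not Qwen 3's actual routing code.

    import torch
    import torch.nn as nn

    class TopKMoE(nn.Module):
        """Generic top-k mixture-of-experts feed-forward block (illustrative)."""

        def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=8):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(d_model, n_experts, bias=False)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):  # x: (tokens, d_model)
            scores = self.gate(x)                              # router logits per token
            weights, idx = torch.topk(scores, self.k, dim=-1)  # keep only k experts per token
            weights = torch.softmax(weights, dim=-1)           # normalize over the kept experts
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e in idx[:, slot].unique().tolist():       # batch tokens routed to expert e
                    mask = idx[:, slot] == e
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
            return out

Only k of the n_experts feed-forward blocks run for any given token, which is what keeps active compute near the dense-22B level while total capacity stays at 235B.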

Benchmarks and standing

Qwen 3 benchmarks competitively within the open-weights tier and in several categories approaches closed-weights frontier models.

On the Artificial Analysis Intelligence Index, Qwen3-235B-A22B occupies a position in the upper range of open-weights models, behind the closed-weights frontier (GPT-5.5 at 60.24, Claude Opus 4.7 at 57.28, Gemini 3.1 Pro at 57.18) but ahead of the majority of open-weights alternatives at similarly manageable active-parameter counts.

On LMArena's general ELO leaderboard, Qwen3-235B-A22B places in the top tier among open-weights models, with strong multilingual and coding scores.

On GPQA Diamond (graduate-level scientific reasoning), Qwen3-235B-A22B reports scores in the low-to-mid 70s, competitive with the DeepSeek V3 generation and ahead of most sub-100B-parameter models. On AIME 2025 (advanced mathematics competition problems), Qwen3-235B-A22B with extended thinking enabled scores in the 70s, reflecting the benefit of the hybrid reasoning architecture on structured mathematical tasks.

On HumanEval+ (function-completion code generation), the Qwen3-235B-A22B and Qwen-Coder variants score in the high 80s, trailing DeepSeek V4's 91.2 and Claude Opus 4.7's benchmark-leading coding positions, but ahead of most open-weights alternatives at similar or larger parameter counts.

On the ARC-AGI benchmark (abstract reasoning), Qwen 3 with extended thinking shows competitive performance among open-weights models, with the hybrid architecture providing material benefit on that benchmark's reasoning-intensive tasks.

The Qwen team characterizes Qwen 3's strength as breadth and consistency across the portfolio rather than peak leadership on any single benchmark. Benchmark standings are point-in-time; the open-weights space rotates rapidly with each new release cycle.

Access and pricing

Open-weights variants of Qwen 3 are available through the Hugging Face Qwen organization, which hosts model cards, weights, and tokenizer files for all text and multimodal variants. The ModelScope platform, Alibaba's model hub and the primary distribution channel within China, provides the same files alongside Chinese-language documentation. Source code and inference tooling are available through GitHub QwenLM.

Commercial API access runs through Alibaba Cloud's DashScope platform and Bailian, the enterprise model-as-a-service interface. Per-token pricing is published at dashscope.aliyuncs.com and varies by model size; the flagship Qwen3-235B-A22B is priced comparably to mid-tier closed-weights alternatives.
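
DashScope exposes an OpenAI-compatible endpoint, so the standard openai Python client works against it. A minimal sketch, assuming the compatible-mode base URL and a model name that may vary by account and region:

    import os
    from openai import OpenAI

    # The base URL follows Alibaba Cloud's compatible-mode convention;
    # the model name is illustrative and account-dependent.
    client = OpenAI(
        api_key=os.environ["DASHSCOPE_API_KEY"],
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )

    resp = client.chat.completions.create(
        model="qwen3-235b-a22b",
        messages=[{"role": "user", "content": "Summarize the Qwen 3 family in two sentences."}],
    )
    print(resp.choices[0].message.content)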

For self-hosted deployment, the Qwen 3 line is compatible with vLLM, TGI, and llama.cpp. Smaller variants run on consumer GPUs; Qwen3-32B fits on professional single- or dual-GPU setups; Qwen3-235B-A22B requires multi-GPU infrastructure, and while MoE routing keeps per-token compute near dense-22B levels, all 235 billion parameters must still be held in memory.
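
As a self-hosting sketch, offline inference through vLLM looks like the following; the Hugging Face repository ID follows the line's naming convention, and tensor_parallel_size is a placeholder for the reader's hardware.

    # Assumes `pip install vllm` and enough GPU memory for the chosen variant.
    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen3-32B", tensor_parallel_size=2)  # TP degree is hardware-dependent
    params = SamplingParams(temperature=0.6, max_tokens=512)

    outputs = llm.generate(["Explain grouped-query attention in one paragraph."], params)
    print(outputs[0].outputs[0].text)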

Comparison

Direct competitors to Qwen 3 as of April 2026:

  • GPT-5.5 (OpenAI). Closed-weights composite benchmark leader. GPT-5.5 outperforms Qwen3-235B-A22B on the Intelligence Index by a substantial margin. For organizations that need the absolute highest benchmark scores and are not constrained by open-weights requirements or API pricing, GPT-5.5 is the default choice. Qwen 3's advantage is open-weights access, zero per-token cost for self-hosted deployment, and full weight-level fine-tuning control.
  • Claude Opus 4.7 (Anthropic). The strongest closed-weights performer on code (SWE-bench Verified 74.0) and instruction following. Claude Opus 4.7 leads Qwen3-235B-A22B on English-language coding and reasoning benchmarks. Qwen 3 partially closes the gap on multilingual tasks, where Claude's training data is less comprehensive in Asian languages.
  • Gemini 3.1 Pro (Google DeepMind). Closed-weights, with a 2-million-token context window and native Google Search grounding for live web retrieval. The context-window advantage is significant for long-document tasks. Qwen 3's 128K context is competitive within the open-weights space but does not match the longest-context closed options.
  • Llama 4 (Meta AI). The primary Western open-weights competitor. Llama 4 Maverick uses a 109-billion-parameter MoE with 17 billion active parameters; Qwen3-235B-A22B uses 235 billion total with 22 billion active. On most English-language benchmarks, the two models are close, with Llama 4 showing advantages on some instruction-following tasks and Qwen 3 showing advantages on multilingual and math-heavy tasks. The strategic difference is geographic and supply-chain: Llama carries a US-origin certification and is more readily deployable in regulated Western enterprise environments; Qwen offers broader variant coverage and deeper multilingual depth.
  • DeepSeek V4 (DeepSeek). The closest Chinese open-weights peer. DeepSeek V4 Pro uses a 1.6-trillion-parameter MoE with 49 billion active parameters and a 1-million-token context window, and benchmarks above Qwen3-235B-A22B on most English-language categories, particularly coding (HumanEval+ 91.2 vs. Qwen 3's high-80s) and reasoning. Qwen 3's distinctive position relative to DeepSeek is the full variant spectrum, from 0.6B to 235B-A22B, and the multimodal coverage (vision, audio, coding, math specializations) that DeepSeek does not match at comparable breadth.

Qwen 3's distinctive market position is the combination of the broadest variant spectrum in the open-weights space, multimodal coverage across vision and audio, multilingual depth across 100+ languages with particular strength in Chinese, and Alibaba Cloud distribution for managed commercial deployment.

Outlook

Open questions for Qwen 3 and the Alibaba Qwen program over the next 6 to 18 months:

  • Qwen 4 timeline. Qwen3.5 and Qwen3.6 have extended the generation through April 2026; the question is when a Qwen 4 pre-training run arrives and what architectural changes it brings.
  • Closed-weights drift. The decision to gate Qwen3.6-Max-Preview as API-only marks a partial departure from the Apache 2.0 default. If Alibaba extends this to more models, developer-community goodwill is at risk.
  • US export-control implications. Tightening US-China trade controls affect Alibaba's hardware access for training and the addressable market for Alibaba Cloud in Western enterprise accounts.
  • Multimodal consolidation. Whether Qwen 4 integrates vision, audio, and text into a single omni model rather than parallel specialized variants is an architectural decision that will shape competitive positioning against native-multimodal closed-weights alternatives.
  • Qwen Coder and Qwen Math specialization. As coding agents become commercially consequential, the strategic value of separate specialist weights versus a single capable frontier model is an open question.
  • Regulatory pressure on Chinese-origin models. Several Western governments have applied restrictions to Chinese-origin AI on data-residency grounds. How broadly these restrictions expand through 2026 and 2027 will shape Qwen's addressable market in regulated sectors.

About the author

Nextomoro -- AI Research Lab Intelligence: coverage of what's happening at cutting-edge AI research institutions.
