AI Models

Every notable foundation model from the labs covered on Nextomoro. Filter by modality (text, image, video, audio, multimodal, code) or by lab. Each post covers capabilities, benchmark standing, access, and how the model sits against its peers.
Eleven v3

Eleven v3 is ElevenLabs's flagship text-to-speech model, generating highly expressive, natural-sounding speech in 32 languages and supporting voice cloning from short audio samples.

Suno v4

Suno v4 is the fourth-generation music-generation model from Suno, capable of producing complete songs with vocals, instrumentation, and lyrics from a single text prompt.

Whisper

Whisper is OpenAI's open-weights automatic speech recognition system, supporting transcription and translation across approximately 100 languages under the MIT license.

Seedance 2.0

Seedance 2.0 is ByteDance Seed's second-generation text-to-video generation model, distributed through the Doubao consumer app, CapCut, and Volcano Engine, and notable for its integration into ByteDance's billion-user creator and short-video platforms.

Veo 3

Veo 3 is Google DeepMind's third-generation text-to-video generation model, producing high-resolution video with native audio and tight prompt fidelity, available through Vertex AI, the Gemini app, and Google's creative tools.

Sora 2

Sora 2 is OpenAI's second-generation text-to-video generation model, producing high-resolution videos up to two minutes long with improved physical realism and character consistency across frames.

Stable Diffusion 3.5

Stable Diffusion 3.5 is an open-weights text-to-image model released by Stability AI in October 2024, available in Large and Medium variants and distributed through Hugging Face, the Stability AI API, and a wide ecosystem of self-hosting tools.

Midjourney v7

Midjourney v7 is the seventh-generation text-to-image model from Midjourney, the bootstrapped San Francisco lab, distributed through Discord and the Midjourney web app, and widely recognized as the leading image-generation system for creative and aesthetic output.

FLUX.2

FLUX.2 is Black Forest Labs's second-generation flagship text-to-image model, available in closed-weights commercial and open-weights variants, with documented strengths in prompt fidelity, photorealism, text rendering, and multi-image reference synthesis.

Imagen 4

Imagen 4 is Google DeepMind's fourth-generation text-to-image model, released in 2025 in Standard, Ultra, and Fast variants, and distributed through Vertex AI, the Gemini app, and Google AI Studio.

DALL-E 3

DALL-E 3 is OpenAI's third-generation text-to-image model, released in October 2023 and notable for its strong prompt fidelity, improved text rendering inside images, and native integration with ChatGPT.

gpt-oss

gpt-oss is OpenAI's open-weights mixture-of-experts language model family released in August 2025, available in 20B and 117B parameter variants and distributed freely for self-hosted deployment.

Qwen 3

Qwen 3 is the open-weights large language model family released by Alibaba's Qwen team in 2025, introducing a hybrid thinking architecture across a broad range of sizes and featuring Apache 2.0 licensing for most variants.

DeepSeek V4

DeepSeek V4 is a 1.6-trillion-parameter open-weights mixture-of-experts language model released by DeepSeek in April 2026, combining frontier-tier capability with per-token pricing an order of magnitude below that of closed Western frontier models.

Mistral Large 2

Mistral Large 2 is a closed-weights frontier language model from Mistral AI, released in July 2024 as the most capable model in the Mistral Large family and notable for its multilingual range and function-calling capabilities.

Llama 4

Llama 4 is Meta AI's April 2025 open-weights large language model family, available in Maverick and Scout variants with native multimodal vision capabilities and released under the Llama Community License for broad self-hosted and partner deployment.

Grok 4.20

Grok 4.20 is xAI's 2026 flagship large language model, handling text and images and distributed through X, grok.com, Tesla vehicles, and a developer API, trained on the Colossus supercomputer in Memphis, Tennessee.

Gemini 3.1 Pro

Gemini 3.1 Pro is Google DeepMind's April 2026 flagship multimodal model, handling text, images, audio, and video as native first-class inputs across a 2 million-token context window.

Claude Opus 4.7

Claude Opus 4.7 is Anthropic's November 2025 flagship large language model, combining multimodal text and vision capabilities with hybrid extended-thinking inference across a 200,000-token context window.

GPT-5.5

GPT-5.5 is OpenAI's April 2026 flagship large language model, combining multimodal text and vision capabilities with optional deep-reasoning variants derived from the o-series reinforcement learning pipeline.


AI Research Lab Intelligence

nextomoro tracks progress for AI research labs, models, and what's next.
