The robot stack convergence nobody formally announced

Nextomoro now tracks fourteen labs whose primary product is a robot or the software a robot runs on. They span humanoid Insurgents at near-frontier valuations, two robotics-AI foundation-model companies that have explicitly reframed robotics as a model problem, the longest-running AV operators, two defense-autonomy primes, and a Hyundai-owned incumbent that just unveiled a third-party behavior-foundation-model partnership with Toyota Research Institute. The architecture they are converging on is recognisable: a vision-language-action (VLA) backbone for perception and intent, a diffusion or flow-matching policy head for high-frequency control, a reinforcement-learning whole-body controller for balance and contact, and a generative world model that doubles as a training simulator and an evaluator. The capital map of 2025 priced the humanoid hardware. The capital map of 2026 prices the data and the models that go in it.

The robotics cohort on nextomoro has crossed a structural threshold over the last twelve months. The list of labs that ship a robot, an autonomous vehicle, or the embodied AI that runs on one is now long enough, and architecturally similar enough across companies, to read as a coherent industry rather than a collection of moonshots. The humanoid Insurgents are no longer the only story. The foundation-model layer (Physical Intelligence, Skild AI, and the NVIDIA Research GR00T line) has formalised as a distinct commercial category. The autonomous-vehicle cohort (Waymo, Tesla AI FSD, Wayve, Zoox) has split cleanly into modular-stack and end-to-end camps. Defense autonomy (Anduril, Shield AI) has acquired the contract base to fund frontier-grade compute. And the incumbents who never left (Boston Dynamics, Toyota Research Institute, Honda Research Institute) are increasingly the ones supplying the substrate that the Insurgents train on. This essay maps the cohort by form factor, walks the convergent software stack that all of them are now building toward, highlights the demos that matter, and identifies the five signals worth watching through 2026 and into 2027.

The cohort, mapped

The lab list breaks naturally into five buckets. Humanoid Insurgents: Figure AI, 1X, Tesla AI's Optimus program, Boston Dynamics' electric Atlas, Apptronik, Agility Robotics, and Unitree, plus the unprofiled adjacent peers (Sanctuary AI, Fourier Intelligence, Agibot, XPeng's humanoid program). Robotics-AI foundation-model Insurgents: Physical Intelligence and Skild AI. Autonomous vehicles: Waymo, Tesla FSD, Wayve, Zoox, plus the Chinese cohort. Defense and autonomous-edge systems: Anduril and Shield AI. Incumbents and research institutes that supply the underlying research substrate: Boston Dynamics, Toyota Research Institute, Honda Research Institute, NVIDIA Research, and most recently Project Prometheus, Jeff Bezos and Vik Bajaj's $38 billion physical-world AI insurgent founded November 2025.

The capital concentration across that list is heavily skewed. Figure AI is private at a reported $39.5 billion (Series C, February 2025). Project Prometheus closed an April 2026 round at $38 billion post-money. Anduril traded as high as roughly $84.5 billion on the secondary market in late 2025 against a primary $30.5 billion Series G in June 2025. Waymo raised approximately $5.6 billion in October 2024 at a multi-tens-of-billions valuation. 1X was reported to be raising up to $1 billion at a $10 billion valuation in late 2025. Wayve raised $1.05 billion in May 2024, the largest UK AI round at the time. Physical Intelligence sits at $2.4 billion (Series A, November 2024); Skild at approximately $4 billion in late 2025. Shield AI reached approximately $2.7 billion in its May 2024 Series F. The Insurgents in this category have raised aggregate disclosed capital somewhere between $35 billion and $50 billion across the past 24 months, depending on which Project Prometheus and Anduril valuations one counts, with another $10 billion to $20 billion implied across the remaining humanoid peers (Apptronik, Agility, Unitree, Fourier). That number is large enough that the funding-cycle dynamics first identified in the funding vintages essay apply with full force.

[Chart: Robotics cohort latest disclosed valuations, 2024 to 2026]
[Chart: Capital concentration by cohort across the named robotics labs]

Two patterns stand out. First, the Insurgent valuation ladder is unusually steep even by frontier-AI standards: the top three names (Figure AI, Project Prometheus, Anduril) together exceed the next seven combined, and all three crossed $30 billion within an 18-month window. Second, the cohort totals rebalance the public framing. Humanoid Insurgents dominate the press cycle, but the foundation-model bucket already concentrates more capital per company on average than the humanoid bucket does once Figure is set aside, and the defense-autonomy cohort is larger than the entire foundation-model bucket combined. The model-and-data layer has not yet caught up to the hardware layer in headline dollar terms, but the trajectory through 2026 to 2027 is the inversion the rest of this essay traces.

The geographic split is also distinctive. North America dominates on dollars (Figure, Tesla, Boston Dynamics, Physical Intelligence, Skild, Waymo, Zoox, Anduril, Shield AI, Project Prometheus, Apptronik, Agility, NVIDIA Research). Europe contributes the most credible AV2.0 entrant (Wayve, London) and the consumer-humanoid Insurgent (1X, originally Norwegian). Asia owns, by corporate parentage, the incumbents (Honda, Toyota Research Institute, Boston Dynamics under Hyundai) and the Chinese low-cost humanoid cohort (Unitree, Fourier, Agibot, XPeng's humanoid program). The dollar-versus-units gap is enormous: Unitree's G1 humanoid ships at approximately $16,000, already below the $20,000-to-$30,000 NEO consumer pricing 1X has signalled and a small fraction of what an early commercial Figure 02 unit reportedly costs to build.

The software stack has converged

The most consequential development of the past 18 months is not on the hardware side. It is that almost every credible robotics lab is now building the same four-layer software stack, with implementation differences but not architectural ones.

Layer one is a vision-language-action (VLA) foundation model. This is the system that ingests camera frames and a natural-language instruction and outputs either an executable action sequence or a latent that the policy head decodes into actions. Physical Intelligence's π₀ (Pi-Zero) and the successor π₀.₅ established the recipe: a pretrained vision-language model (PaliGemma in π₀'s case) provides perception and language grounding, and a flow-matching action expert generates high-frequency motor commands at 30 to 50 Hz. Figure AI's Helix, released February 2025, is explicitly dual-system: a slower System 2 VLM does scene understanding and intent, and a faster 200 Hz System 1 visuomotor network handles dexterous motion. Google DeepMind's Gemini Robotics 1.5 line follows the same split with an explicit "thinking before acting" trace and an embodied-reasoning sibling model (Gemini Robotics-ER 1.5) for spatial planning. NVIDIA's open-weights Isaac GR00T N1.5 packages a diffusion-transformer action head on a VLM backbone. Skild AI's Skild Brain is the cross-platform variant of the same architecture, trained to operate across humanoids, quadrupeds, and manipulators from a single set of weights.

The architectural consensus is the dual-system VLA. The disagreement is about what to put in the action head (flow matching, diffusion, or autoregressive token prediction), how much sim and how much real teleop to train on, and whether the perception backbone is open-weights (NVIDIA's bet) or closed (Figure, DeepMind, Physical Intelligence's most recent releases).
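The dual-system split can be pictured as a two-rate control loop. What follows is a minimal sketch, not any lab's implementation: `system2_plan` and `system1_act` are hypothetical stand-ins for the slow VLM and the fast visuomotor network, the 8 Hz slow rate is an assumption (the text only says "slower"), and the 200 Hz fast rate is the figure quoted for Helix.

```python
import numpy as np

SLOW_HZ = 8     # assumed System 2 refresh rate (hypothetical)
FAST_HZ = 200   # System 1 control rate quoted for Helix

def system2_plan(image, instruction):
    """Hypothetical stand-in for the slow VLM: returns a latent intent vector."""
    rng = np.random.default_rng(len(instruction))
    return rng.standard_normal(16)

def system1_act(latent, proprio):
    """Hypothetical stand-in for the fast policy: latent + joint state -> joint command."""
    return np.tanh(latent[: proprio.shape[0]] - proprio)

def run(duration_s=1.0, n_joints=7):
    steps = int(duration_s * FAST_HZ)
    refresh_every = FAST_HZ // SLOW_HZ      # fast ticks between slow replans
    proprio = np.zeros(n_joints)
    latent, replans, actions = None, 0, []
    for t in range(steps):
        if t % refresh_every == 0:          # slow loop: refresh the intent latent
            latent = system2_plan(image=None, instruction="put the cup away")
            replans += 1
        action = system1_act(latent, proprio)   # fast loop: runs on every tick
        proprio = proprio + 0.01 * action       # toy joint-state integration
        actions.append(action)
    return replans, np.stack(actions)

replans, actions = run()
print(replans, actions.shape)
```

The point of the sketch is only the rate separation: the fast loop reuses the most recent latent between slow updates, so dexterous control does not wait on the larger model.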

Layer two is the policy head, almost always a diffusion or flow-matching model that outputs chunked action sequences (typically 8-to-50 step horizons) rather than single-step predictions. The diffusion-policy line was established by Toyota Research Institute and Columbia in their 2023 paper, and TRI's subsequent Large Behavior Models scaled it into approximately 450M-parameter diffusion transformers running at 30 Hz, pretrained on Atlas teleoperation data, Atlas Multi-Task Stand upper-body data, TRI Ramen, and the Open X-Embodiment cross-embodiment dataset. The August 2025 demonstration of an Atlas robot autonomously packing and sorting using TRI's LBM on Boston Dynamics hardware was the public state of the art for whole-body loco-manipulation as of the time of writing. The policy-head loss has shifted from L1 or L2 regression (the pre-2023 default for behavior cloning) toward flow matching, which converges faster and produces smoother trajectories.
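A flow-matching policy head and its chunked output can be illustrated with a toy example. This sketch assumes the common straight-line (rectified-flow style) formulation, in which the network regresses the velocity `actions - noise` and sampling integrates that velocity from noise to an action chunk. The chunk size, `toy_velocity`, and `demo_chunk` are illustrative assumptions, not values from any lab's model.

```python
import numpy as np

HORIZON, ACTION_DIM = 16, 7   # assumed chunk: 16 steps for a 7-DoF arm

def flow_matching_loss(velocity_fn, noise, actions, t):
    """Regress the predicted velocity onto the straight-line target (actions - noise)."""
    x_t = (1.0 - t) * noise + t * actions       # point on the noise-to-action path
    target = actions - noise
    return float(np.mean((velocity_fn(x_t, t) - target) ** 2))

def sample_chunk(velocity_fn, rng, n_steps=10):
    """Euler-integrate dx/dt = v(x, t) from Gaussian noise (t=0) toward an action chunk (t=1)."""
    x = rng.standard_normal((HORIZON, ACTION_DIM))
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

# Toy "trained" field: the ideal velocity when every demonstration is the same chunk.
demo_chunk = np.linspace(0.0, 1.0, HORIZON)[:, None] * np.ones(ACTION_DIM)

def toy_velocity(x_t, t):
    return (demo_chunk - x_t) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.standard_normal((HORIZON, ACTION_DIM))
print(flow_matching_loss(toy_velocity, noise, demo_chunk, t=0.3))   # ~0 for the ideal field
print(np.allclose(sample_chunk(toy_velocity, rng), demo_chunk, atol=1e-6))
```

A real policy head conditions the velocity network on images, language, and proprioception; the sketch keeps only the loss and the sampler so the contrast with single-step L1 or L2 regression is visible.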

Layer three is the whole-body controller. Locomotion, contact, and balance are still the domain of model-free reinforcement learning trained in simulation with heavy domain randomisation. Unitree's G1 and H1 demonstrations, Boston Dynamics' electric Atlas dynamic-recovery moves, and the various 2025 sim-to-real demos from academic labs all use this recipe. The state of the art has moved from per-skill RL (a separate policy for each gait or maneuver) toward whole-body behavior foundation models (a single multi-task policy that handles locomotion, recovery, and contact across hundreds of motions). NVIDIA's Eureka uses LLMs to automatically synthesise reward functions, which has eased the hand-engineered reward bottleneck that historically blocked sim-to-real scaling. The emerging hierarchy puts the VLA at the top, the diffusion policy in the middle, and the RL whole-body controller at the bottom, a tripartite stack that is standardising across labs.
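Domain randomisation, the recipe named above, amounts to resampling the simulator's physical parameters for every training episode so the policy cannot overfit to one set of physics. The sketch below is illustrative only: the parameter names and ranges in `RANGES` are hypothetical and not taken from any lab's configuration.

```python
import random
from dataclasses import dataclass

@dataclass
class SimParams:
    friction: float         # ground friction coefficient
    payload_kg: float       # extra mass strapped to the torso
    motor_strength: float   # multiplier on nominal actuator torque
    latency_ms: float       # observation-to-action delay

# Hypothetical randomisation ranges, chosen for illustration.
RANGES = {
    "friction": (0.4, 1.2),
    "payload_kg": (0.0, 5.0),
    "motor_strength": (0.8, 1.2),
    "latency_ms": (0.0, 40.0),
}

def sample_params(rng: random.Random) -> SimParams:
    """Draw one set of physics parameters; call once per training episode."""
    return SimParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})

rng = random.Random(0)
for episode in range(3):
    print(episode, sample_params(rng))
```

In a real pipeline each sampled `SimParams` would be applied to the simulator before the RL rollout for that episode begins.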

Layer four is the world model. This is the most recent addition to the consensus stack and the most strategically consequential. NVIDIA Cosmos, announced in January 2025, is the principal physical-AI world-model platform; its GR00T-Dreams blueprint generated 780,000 training trajectories and 827 hours of synthetic video from an 88-hour teleoperation seed, which were used to train GR00T N1.5 in 36 hours instead of the months of teleop the equivalent dataset would have required. Wayve's GAIA-2, released March 2025 and trained on multi-camera footage from the UK, US, and Germany, is the AV-cohort analog; the December 2025 GAIA-3 release scaled to approximately 15 billion parameters and nine countries. Google DeepMind's Genie 2 and 3 produce playable interactive simulations from prompts. 1X has publicly described a "1X World Model" trained on EVE and NEO operational footage. The shared pattern: train a world model on real footage, use it both as a controllable simulator for policy training and as a counterfactual evaluator (a sim-as-judge that scores hypothetical actions against the world model's learned dynamics) that closes the loop on data scarcity.
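The sim-as-judge idea can be sketched in a few lines: roll each candidate action sequence through a dynamics model and rank the outcomes. In this toy version `toy_dynamics` is a hand-written point-mass stand-in for a learned world model, and the scoring rule (negative distance to a goal) is an assumption made for illustration.

```python
import numpy as np

def rollout(dynamics, state, actions):
    """Roll a candidate action sequence through the (learned) dynamics model."""
    for action in actions:
        state = dynamics(state, action)
    return state

def judge(dynamics, state, candidates, goal):
    """Score each candidate by how close its imagined end state lands to the goal."""
    scores = [-float(np.linalg.norm(rollout(dynamics, state, c) - goal)) for c in candidates]
    return int(np.argmax(scores)), scores

def toy_dynamics(state, action):
    # Stand-in for a learned world model: a point that moves 0.1 units per unit action.
    return state + 0.1 * action

start, goal = np.zeros(2), np.array([0.5, 0.0])
candidates = [
    np.tile([1.0, 0.0], (5, 1)),   # move right for five steps
    np.tile([0.0, 1.0], (5, 1)),   # move up for five steps
    np.zeros((5, 2)),              # stay still
]
best, scores = judge(toy_dynamics, start, candidates, goal)
print(best, scores)   # candidate 0 (move right) scores best
```

The same loop serves both roles described above: as a simulator the imagined rollouts become training data, and as an evaluator the scores rank hypothetical actions without touching hardware.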

The four-layer stack is not yet uniformly deployed. Some labs run only layers one and three (Boston Dynamics historically). Some run only layers one and two (most foundation-model Insurgents). Tesla AI famously runs an end-to-end neural-network policy that collapses layers one through three into a single network for FSD V12 and successor versions. But the direction of travel is unambiguous: the four-layer stack is the architecture the field is converging on, and the implementation choices that distinguish individual labs are increasingly the differentiator that matters.

The humanoid Insurgents

The humanoid cohort is where the public attention concentrates, and 2025 was the year that demonstrations stopped being scripted set pieces and started looking like manipulable policy outputs.

Figure AI released Helix in February 2025 and immediately repositioned the company from "humanoid hardware Insurgent backed by Microsoft and OpenAI" to "vertically integrated VLA-and-humanoid platform". The Helix demo shows two Figure 02 robots collaborating on a kitchen-task sequence with a single shared model coordinating both, a capability that no peer had publicly demonstrated in February 2025 and that has since become a competitive baseline.

The competitive context for Figure is harder than the $39.5 billion valuation suggests. The BMW Spartanburg deployment, the company's principal commercial reference, has been positively but not transformatively covered. The February 2025 exit from the OpenAI collaboration agreement (Brett Adcock cited internal AI breakthroughs as the reason) removed a strategic-investor anchor that the Series B leveraged. And the post-2024 Chinese cohort (Unitree, Fourier, Agibot) ships humanoids at one-tenth Figure's price point, which constrains Figure's path to mass-market commercialisation even with a superior model layer.

1X is the consumer-humanoid bet. The company spent a decade as Halodi Robotics building compliant electric actuators before pivoting toward humanoids in 2022, raised from the OpenAI Startup Fund in March 2023, and through 2025 demonstrated NEO Gamma with consumer pre-orders open against early-access deposits. The 2026 product positioning combines a teleoperation-assisted operating mode (a human supervisor can take control during deployment) with a roadmap toward autonomous operation as the model improves.

The teleoperation transition is the central commercial question for 1X. A consumer-priced humanoid that depends on human teleoperators for most tasks is operating a different cost structure than a fully autonomous one; the speed at which 1X can shift the ratio from majority-teleop to majority-autonomous defines the commercial viability of the NEO consumer product.

Tesla AI's Optimus program is structurally distinct from every peer in that Tesla can deploy Optimus inside its own manufacturing facilities without external customer commitment. The We, Robot event in October 2024 showcased Optimus alongside the Cybercab unveiling, with Tesla projecting consumer pricing in the $20,000 to $30,000 range and production targets that, taken at face value, would dwarf every peer's planned 2027 volume.

The Optimus credibility question, as with Tesla's other AI programs, comes down to delivery against the public timeline. Tesla's $700-billion-plus market capitalisation prices in the Optimus and Cybercab programs at substantially higher confidence than the publicly disclosed milestones currently justify. The 2026 to 2027 commercial-rollout window for both products will resolve that gap one way or the other.

Boston Dynamics' electric Atlas is the engineering-incumbent's bid for the same commercial market. The April 2024 unveiling retired the hydraulic Atlas in favour of an all-electric platform designed for Hyundai manufacturing deployment.

The structurally important Atlas event was not the hardware reveal but the August 2025 demonstration of long-horizon packing-and-sorting with TRI's Large Behavior Model. Boston Dynamics is the most established humanoid builder to have publicly adopted an external partner's behavior foundation model, and the choice signals that even the company with three decades of legged-robotics engineering depth has concluded that the model layer is more efficiently sourced from a research partner than built in-house at the speed the market now demands.

Apptronik and Agility Robotics complete the US humanoid cohort and arguably carry the strongest commercial-deployment evidence of any peer. Apptronik raised a $350 million Series A in February 2025 with Google participation, partnered with Google DeepMind in December 2024 to bring DeepMind's physical-AI models onto the Apollo platform, and has progressed through trials at Mercedes-Benz, Jabil, and GXO. The February 2026 disclosure of approximately $935 million in cumulative Series A capital at a $5.3 billion post-money valuation positioned Apptronik as the principal Google-aligned humanoid Insurgent. Agility Robotics' Digit, the bipedal humanoid optimised for logistics, has logged more than 100,000 totes at GXO Logistics under the industry's first humanoid robot-as-a-service contract, with deployments at Spanx alongside Amazon trial work. The Agility rebrand to simply "Agility" in March 2026 signalled the consolidation of the company's positioning around humanoid automation rather than the broader robotics-research-spinout identity the Oregon State University lineage initially implied.

Unitree is the Chinese humanoid that has redrawn the cost curve. Founded in 2016 in Hangzhou by Wang Xingxing, Unitree began as a quadruped specialist and pivoted into humanoids with the H1 (2023) and G1 (2024). The G1 ships at approximately $16,000, the H1 has appeared in commercial pilot deployments and in Spring Festival Gala performances, and the public demonstrations through 2025 have escalated to a Unitree-versus-Unitree humanoid boxing match that surfaced widely in robotics-research social media. Industry reporting placed Unitree at more than 5,500 humanoid shipments in 2025, with Chinese manufacturers accounting for roughly 90 percent of the global humanoid unit volume that year. The May 2026 launch of a humanoid-robot motion App Store and the GD01 manned mecha at approximately 3.9 million yuan signalled the company's intent to define both the consumer-application platform and the upper end of the form-factor frontier.

The Unitree cost curve is the structural threat the US humanoid Insurgents have not yet fully priced. If the Chinese cohort sustains the order-of-magnitude price advantage through 2026 to 2027, and if Skild Brain or Physical Intelligence's π series provides a credible US-aligned model layer that can run on Chinese hardware, the commercial topology of the humanoid market may end up looking more like the Android-versus-iPhone split than like the closed integrated-platform model that Figure and Tesla are building toward.

The foundation-model layer

The two principal robotics-AI foundation-model Insurgents are structurally distinct from the humanoid hardware cohort. Both Physical Intelligence and Skild AI are explicitly positioned as the software layer that hardware partners integrate, not as vertically integrated platforms.

Physical Intelligence (Karol Hausman, Sergey Levine, Chelsea Finn, Lachy Groom, with founder lineage tracing to Google DeepMind robotics, UC Berkeley BAIR, and Stanford) released π₀ in October 2024 and the more recent π₀.₅ as open-weights research models with detailed technical reports. The choice to release weights, in contrast to Figure's closed Helix and DeepMind's closed Gemini Robotics, has built Pi's research-community standing rapidly.

Skild AI (Deepak Pathak and Abhinav Gupta, both Carnegie Mellon faculty) frames the bet as "one brain, any robot": a single set of weights trained to operate across quadrupeds, humanoids, and manipulators rather than a per-platform specialised model. The cross-platform thesis is unproven at commercial scale, but the public demonstrations across multiple form factors are the most ambitious published cross-embodiment work outside the Open X-Embodiment academic collaboration.

The frontier-lab analogs are Google DeepMind's Gemini Robotics, NVIDIA Research's open-weights Isaac GR00T N1 and N1.5, Toyota Research Institute's Large Behavior Models, and Meta AI's embodied-research line. The competitive dynamic within the foundation-model layer is one of the most actively contested in AI research today, with hardware partners (Figure, 1X, Boston Dynamics, Unitree, Apptronik, Agility, the auto OEMs) as the implicit customers each is competing to serve.

The autonomous-vehicle cohort

The AV cohort has split into a modular-stack camp (Waymo, Zoox, the Chinese robotaxi operators) and an end-to-end camp (Tesla FSD V12 onward, Wayve).

Waymo crossed 200,000 weekly paid robotaxi rides in early 2025 across Phoenix, San Francisco, Los Angeles, and Austin, with Atlanta, Miami, and Washington DC scheduled for additional 2025 to 2026 launches. The October 2024 $5.6 billion capital raise funded the commercial-expansion roadmap. Waymo's modular sensor-fusion approach (lidar, camera, radar) and its quarterly safety reporting through California DMV filings constitute the most rigorous public dataset on autonomous-driving safety performance in commercial operation.

Wayve is the European AV2.0 standard-bearer and the closest architectural analog to Tesla FSD outside of Tesla itself. The end-to-end neural-network approach, the OEM-supplier business model (Wayve sells AI Driver software to automakers rather than operating a robotaxi service), and the GAIA and LINGO research releases position Wayve as the structurally most credible commercial vehicle for the architecture that the rest of the robotics field is converging on. The March 2024 hire of Erez Dagan from Mobileye signalled commercial-execution credibility to OEM customers; the May 2024 $1.05 billion Series C funded the buildout.

Tesla FSD is the largest deployed end-to-end driving policy by fleet size (more than seven million Tesla vehicles), and FSD V12's transition to a fully end-to-end neural network in early 2024 was the field's most consequential validation of the architecture. The Tesla-Waymo safety-record comparison is methodologically inconclusive, but the architectural debate has shifted: as of 2026, the question is no longer whether end-to-end can match modular but how soon.

Zoox, the Amazon subsidiary, launched commercial robotaxi service in Las Vegas in June 2025 with its purpose-built bidirectional vehicle, and is expanding into the SF Bay Area through 2025 to 2026. The closed-loop integration between Amazon's infrastructure and Zoox's commercial deployment is structurally distinct from any other AV operator.

Defense and autonomous edge

The defense-autonomy cohort has scaled into the same capital ranges as the consumer-and-enterprise Insurgents, and the architectural overlap with humanoid and AV robotics is increasing.

Anduril Industries is the structural anchor. The June 2025 Series G of $2.5 billion at $30.5 billion led by Founders Fund, the secondary-market trades implying valuations as high as $84.5 billion by late 2025, the February 2025 takeover of the US Army's IVAS program from Microsoft (a $22 billion, 10-year contract), and the late-2024 OpenAI partnership for Lattice integration all positioned Anduril alongside frontier AI labs in capital terms. The Roadrunner counter-drone platform, the Bolt loitering munition line, the Dive-LD autonomous underwater vehicle, and the Lattice operating system constitute one of the broadest autonomous-systems portfolios in defense.

Shield AI is the autonomy-stack specialist, with the Hivemind software operating in GPS- and communications-denied environments and the V-BAT VTOL fixed-wing platform (acquired through the 2021 Martin UAV deal) deployed across USSOCOM, US Coast Guard, and DoD Replicator selections. The May 2024 Series F at $2.7 billion valuation funded the platform expansion.

The structurally interesting cross-pollination is that Anduril and Shield AI are now competing for the same AI talent as the humanoid Insurgents and the AV cohort, and the autonomy-stack architectures (perception → world-model-based planning → low-level control) are substantially similar across the defense and commercial cohorts. Whether the commercial-side and defense-side software stacks ultimately converge or fork into distinct branches is one of the more underwatched architectural questions in the field.

Quadrupeds, mobile manipulators, and the rest

The non-humanoid form factors have anchored the most reliable commercial-deployment metrics in the field. Boston Dynamics' Spot quadruped, deployed in thousands of units across industrial inspection, security, construction monitoring, and utilities applications, is the largest commercial legged-robot deployment globally.

Agility Robotics' Digit (a bipedal robot with bird-like legs optimised for warehouse logistics, not a traditional humanoid form factor) is the strongest public counterexample to the "humanoids are economically premature" argument; the 100,000-plus totes logged at GXO Logistics in 2024 to 2025 is the most rigorous public deployment data for bipedal robots in commercial operation. The Stretch warehouse-logistics robot from Boston Dynamics, ANYbotics' quadrupeds in industrial inspection, and the broader cohort of mobile-manipulator startups (Dexterity AI, Symbotic, Berkshire Grey, Locus Robotics) round out the commercial-deployment story that the humanoid cohort is still aspiring to.

The form-factor honourable mentions worth flagging: surgical robotics (Intuitive Surgical's da Vinci, Verb Surgical, Vicarious Surgical, Medtronic Hugo, plus newer entrants like Moon Surgical and CMR Surgical) remains the most economically successful robotics category by a wide margin and is now beginning to integrate VLA-style models for assistance; agricultural robotics (John Deere's autonomous tractor line, FarmWise, Carbon Robotics, Iron Ox) is scaling rapidly on a data-flywheel that more closely resembles the AV cohort than the humanoid one; marine and underwater autonomy (Saildrone, Anduril's Dive-LD, Saronic, Sea Machines) sits between defense and commercial. None of these are profiled on nextomoro yet, but they sit inside the same software-stack convergence that the labs that are profiled exemplify.

The data flywheel

The bottleneck in robotics, as in all of contemporary AI, is data. Robotics has the additional constraint that data is physical, slow to collect, and embodiment-specific. Three flywheels are emerging in parallel.

Fleet operation is the cleanest. Waymo's 200,000-plus weekly rides produce continuous real-world data that feeds back into Waymo Driver training. Tesla's seven-million-vehicle fleet does the same at much larger scale (and at substantially noisier label quality). Digit's 100,000 totes at GXO and the NEO consumer units shipping to early-access homes through 2026 are smaller instances of the same loop. The fleet-operation flywheel scales linearly with deployment and is, by orders of magnitude, the most data-efficient lane available.

Teleoperation farms are the explicit substitute when no fleet exists. Physical Intelligence, Figure, 1X, Skild, and TRI all operate dedicated teleoperation centres, with derivatives of the ALOHA, Mobile ALOHA, and UMI handheld-gripper rigs lowering the cost per demonstration-hour by an order of magnitude relative to traditional teleop. The economics are still hard. One hour of teleoperation produces approximately one hour of training data, against industry estimates that generalist robot policies require millions of hours of dexterous contact-rich demonstrations. Teleop will plausibly never be the dominant data source by itself.
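A back-of-envelope calculation shows why the teleop economics are hard. The sketch assumes one million hours as the low end of "millions of hours", a 2,000-hour operator work year, and a hypothetical farm of 100 operators; none of these figures comes from a specific lab.

```python
def operator_years(target_hours, hours_per_operator_year=2000.0, data_hours_per_teleop_hour=1.0):
    """Operator-years of teleoperation needed to collect `target_hours` of demonstrations."""
    return target_hours / (hours_per_operator_year * data_hours_per_teleop_hour)

TARGET_HOURS = 1_000_000   # assumed low end of "millions of hours"
FARM_SIZE = 100            # hypothetical number of full-time operators

years_of_labour = operator_years(TARGET_HOURS)
calendar_years = years_of_labour / FARM_SIZE
print(years_of_labour, calendar_years)   # 500.0 operator-years, 5.0 calendar years
```

Because teleoperation yields roughly one hour of data per hour of labour, the only levers in this model are headcount and time, which is what pushes labs toward fleets and synthetic data.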

Synthetic data from world models is the lane that has scaled most aggressively over the past 18 months. NVIDIA's Cosmos + GR00T-Dreams pipeline turning 88 hours of teleop seed data into 827 hours of synthetic video and 780,000 training trajectories is the cleanest published example. Wayve's GAIA-2 and GAIA-3 do the same for driving. 1X's World Model does it for household robotics. The pattern is consistent: train a generative world model on real footage, use it as a controllable simulator for policy training, use it as a counterfactual evaluator for safety analysis. The remaining open question is fidelity. World models that look photorealistic to humans may still be off-distribution for low-level dynamics, and the field is actively building benchmarks (sim-as-judge agreement with real-world deployment outcomes) to measure that.
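The multiplier implied by the published GR00T-Dreams figures is simple arithmetic on the numbers quoted above; the variable names are illustrative.

```python
SEED_TELEOP_HOURS = 88            # real teleoperation seed, as quoted above
SYNTHETIC_VIDEO_HOURS = 827       # synthetic video generated from that seed
SYNTHETIC_TRAJECTORIES = 780_000  # synthetic training trajectories

hour_multiplier = SYNTHETIC_VIDEO_HOURS / SEED_TELEOP_HOURS
trajectories_per_seed_hour = SYNTHETIC_TRAJECTORIES / SEED_TELEOP_HOURS

print(round(hour_multiplier, 1))           # about 9.4 synthetic hours per real hour
print(round(trajectories_per_seed_hour))   # about 8864 trajectories per seed hour
```

Set against the roughly one-to-one yield of teleoperation, a ninefold-plus expansion in hours is what makes the synthetic lane attractive, provided the fidelity question raised above is answered.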

The data-flywheel competitive dynamic favours, in descending order of structural strength: companies with deployed fleets at scale (Tesla, Waymo, then Boston Dynamics on Spot, then Digit on warehouse), companies with the largest teleop farms and the lowest cost per demonstration (Physical Intelligence, Figure, 1X, Skild, TRI), and companies with the strongest world-model infrastructure (NVIDIA, Wayve, DeepMind, 1X). The labs that combine all three (NVIDIA Research is the leading example; Wayve in the AV cohort) are the structurally best-positioned for the 2027 to 2028 model-quality leap that everyone in the field is now pricing in.

What to watch

Five concrete signals over the next twelve to eighteen months.

The autonomy-to-teleop ratio for consumer-shipping humanoids. 1X's NEO consumer units and Figure's Helix-on-Figure-02 deployments will produce the first public dataset on how often a human operator has to take control. The Waymo analog (miles between disengagements) maps directly. A sustained rate of more than 50 to 100 task completions per human intervention is the threshold that would separate a viable consumer product from a teleop service masquerading as one.
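The metric is straightforward to compute from a deployment log. The sketch below assumes a hypothetical log format (`Episode`) and made-up numbers; the 50-completion threshold is the low end of the range given above.

```python
import math
from dataclasses import dataclass

@dataclass
class Episode:
    completed: bool      # did the robot finish the task?
    interventions: int   # how many times a human took control

def completions_per_intervention(episodes):
    """Task completions per human takeover, the humanoid analog of miles per disengagement."""
    done = sum(e.completed for e in episodes)
    takeovers = sum(e.interventions for e in episodes)
    return math.inf if takeovers == 0 else done / takeovers

# Hypothetical log: 95 clean successes, 3 assisted successes, 2 failures with two takeovers each.
log = [Episode(True, 0)] * 95 + [Episode(True, 1)] * 3 + [Episode(False, 2)] * 2
ratio = completions_per_intervention(log)
print(ratio, ratio >= 50)   # 14.0 False
```

In this made-up log a 98 percent completion rate still yields only 14 completions per intervention, which illustrates why the ratio is a stricter test than headline success rate.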

Dollar-per-task economics in warehouse and manufacturing settings. Apptronik's Apollo at Mercedes and Jabil, Agility's Digit at GXO and Spanx, and Boston Dynamics' Atlas at Hyundai are the cleanest commercial proving grounds. The number to watch is humanoid-hourly opex against the US warehouse minimum-wage-plus-benefits blended cost of approximately $20 to $25 per hour. The first quarterly disclosure that a deployed humanoid operates below that threshold at any meaningful scale will be a structural commercial inflection.
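A simple cost model makes the comparison concrete. Every input below is a hypothetical placeholder (unit price, service life, utilisation, maintenance, supervision); only the $20 to $25 per hour labour benchmark comes from the text.

```python
def hourly_cost(unit_price, service_years, hours_per_year, annual_maintenance, supervision_per_hour):
    """Blended hourly operating cost of one deployed robot, in dollars."""
    capex_per_hour = unit_price / (service_years * hours_per_year)
    maintenance_per_hour = annual_maintenance / hours_per_year
    return capex_per_hour + maintenance_per_hour + supervision_per_hour

LABOUR_BENCHMARK = (20.0, 25.0)   # $/hour range quoted above

cost = hourly_cost(
    unit_price=150_000,          # hypothetical purchase price
    service_years=5,             # hypothetical service life
    hours_per_year=4_000,        # hypothetical utilisation (about two shifts)
    annual_maintenance=15_000,   # hypothetical
    supervision_per_hour=5.0,    # hypothetical share of a remote supervisor
)
print(cost, cost < LABOUR_BENCHMARK[0])   # 16.25 True
```

The result is sensitive to utilisation: halving `hours_per_year` to 2,000 in this example lifts the cost to $27.50 per hour, above the benchmark, which is why disclosed uptime matters as much as unit price.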

Whether Skild Brain and π₀.₅ (or successors) get adopted on hardware platforms outside their developers' direct integration. The cross-platform foundation-model thesis depends on humanoid hardware partners adopting external model layers instead of building in-house. Boston Dynamics' August 2025 adoption of TRI's LBM is the strongest public datapoint in favour. Figure's February 2025 exit from the OpenAI collaboration is the strongest public datapoint against. The 2026 to 2027 partnership announcements will tell which architecture wins.

Project Prometheus's first public artifact. Jeff Bezos and Vik Bajaj's November 2025 launch raised $16.2 billion in two rounds at $38 billion post-money in roughly six months, a capital concentration on a scale that has no near analog in the recent Insurgent cohort. The thesis ("physical-world AI for engineering, manufacturing, aerospace, robotics, drug discovery") is broad enough to be unfalsifiable in 2026 but will resolve through 2027 to 2028 as the company ships its first concrete output. The question is whether Prometheus operates as a vertically integrated Insurgent (a Figure-or-Tesla shape) or as a horizontal research platform (a DeepMind-or-Anthropic shape applied to physical systems).

The geopolitical compute supply chain for robotics-AI training. The compute-rent dynamic identified in the Anthropic-Colossus essay applies with full force to robotics. World-model training, large-scale VLA pretraining, and policy training at GR00T-class scale are now consuming tens of thousands of GPUs per training run. NVIDIA Research's GR00T-Dreams pipeline, the Cosmos training infrastructure, and the implicit reliance on NVIDIA hardware across every Insurgent in the cohort position NVIDIA as the structural platform layer for robotics-AI in the same way it became the structural platform layer for language-model training. Whether US-aligned compute remains the platform of choice for Chinese humanoid manufacturers (Unitree, Fourier, Agibot) is the geopolitical question the next two years will resolve.

The cohort that nextomoro tracks is now large enough and capital-rich enough to constitute a coherent industry. The architectural convergence on the dual-system VLA stack, the cross-cohort competitive dynamics between humanoid Insurgents and foundation-model platforms, the emergence of defense autonomy as a near-frontier compute consumer, and the structural data-flywheel advantages accruing to the labs with deployed fleets all suggest the same conclusion: the 2025 capital cycle priced the hardware. The 2026 capital cycle will price the model layer and the data. The labs that ship working policies on diverse embodiments will be the ones that capture the resulting value.

About the author
Nextomoro

nextomoro tracks progress for AI research labs, models, and what's next.