The robot stack convergence nobody formally announced

Nextomoro now tracks fourteen labs whose primary product is a robot or the software a robot runs on. They span humanoid Insurgents at near-frontier valuations, two robotics-AI foundation-model companies that have explicitly reframed robotics as a model problem, the longest-running AV operators, two defense-autonomy primes, and a Hyundai-owned incumbent that just unveiled a third-party behavior-foundation-model partnership with Toyota Research Institute. The architecture they are converging on is recognisable: a vision-language-action (VLA) backbone for perception and intent, a diffusion or flow-matching policy head for high-frequency control, a reinforcement-learning whole-body controller for balance and contact, and a generative world model that doubles as a training simulator and an evaluator. The capital map of 2025 priced the humanoid hardware. The capital map of 2026 prices the data and the models that go in it.

The robotics cohort on nextomoro has crossed a structural threshold over the last twelve months. The list of labs that ship a robot, an autonomous vehicle, or the embodied AI that runs on one is now long enough, and architecturally similar enough across companies, to read as a coherent industry rather than a collection of moonshots. The humanoid Insurgents are no longer the only story. The foundation-model layer (Physical Intelligence, Skild AI, and the NVIDIA Research GR00T line) has formalised as a distinct commercial category. The autonomous-vehicle cohort (Waymo, Tesla AI FSD, Wayve, Zoox) has split cleanly into modular-stack and end-to-end camps. Defense autonomy (Anduril, Shield AI) has acquired the contract base to fund frontier-grade compute. And the incumbents who never left (Boston Dynamics, Toyota Research Institute, Honda Research Institute) are increasingly the ones supplying the substrate that the Insurgents train on. This essay maps the cohort by form factor, walks the convergent software stack that all of them are now building toward, embeds the demos that matter, and identifies the five signals worth watching through 2026 and into 2027.

The cohort, mapped

The lab list breaks naturally into five buckets. Humanoid Insurgents: Figure AI, 1X, Tesla AI's Optimus program, Boston Dynamics' electric Atlas, Apptronik, Agility Robotics, and Unitree, plus the unprofiled adjacent peers (Sanctuary AI, Fourier Intelligence, Agibot, XPeng's humanoid program). Robotics-AI foundation-model Insurgents: Physical Intelligence and Skild AI. Autonomous vehicles: Waymo, Tesla FSD, Wayve, Zoox, plus the Chinese cohort. Defense and autonomous-edge systems: Anduril and Shield AI. Incumbents and research institutes that supply the underlying research substrate: Boston Dynamics, Toyota Research Institute, Honda Research Institute, NVIDIA Research, and most recently Project Prometheus, Jeff Bezos and Vik Bajaj's $38 billion physical-world AI insurgent founded November 2025.

The capital concentration across that list is heavily skewed. Figure AI is private at a reported $39.5 billion (Series C, February 2025). Project Prometheus closed an April 2026 round at $38 billion post-money. Anduril traded as high as roughly $84.5 billion on the secondary market in late 2025 against a primary $30.5 billion Series G in June 2025. Waymo raised approximately $5.6 billion in October 2024 at a multi-tens-of-billions valuation. 1X was reported to be raising up to $1 billion at a $10 billion valuation in late 2025. Wayve raised $1.05 billion in May 2024, the largest UK AI round at the time. Physical Intelligence sits at $2.4 billion (Series A, November 2024); Skild at approximately $4 billion in late 2025. Shield AI reached approximately $2.7 billion in its May 2024 Series F. The Insurgents in this category have raised aggregate disclosed capital somewhere between $35 billion and $50 billion across the past 24 months, depending on which Project Prometheus and Anduril valuations one counts, with another $10 billion to $20 billion implied across the smaller humanoid peers (Apptronik and its partners, Agility, Unitree, Fourier). That number is large enough that the funding-cycle dynamics first identified in the funding vintages essay apply with full force.

Chart: Robotics cohort latest disclosed valuations, 2024 to 2026

Chart: Capital concentration by cohort across the named robotics labs

Two patterns stand out. First, the Insurgent valuation ladder is unusually steep even by frontier-AI standards: the top three names (Figure AI, Project Prometheus, Anduril) together exceed the next seven combined, and all three crossed $30 billion within an 18-month window. Second, the cohort totals rebalance the public framing. Humanoid Insurgents dominate the press cycle but the foundation-model bucket already concentrates more aggregate capital per company on average than humanoids do once Figure is set aside, and the defense-autonomy cohort is structurally larger than the entire foundation-model bucket combined. The model-and-data layer has not yet caught up to the hardware layer in headline dollar terms, but the trajectory through 2026 to 2027 is the inversion the rest of this essay traces.

The geographic split is also distinctive. North America dominates on dollars (Figure, Tesla, Boston Dynamics, Physical Intelligence, Skild, Waymo, Zoox, Anduril, Shield AI, Project Prometheus, Apptronik, Agility, NVIDIA Research). Europe contributes the most credible AV2.0 entrant (Wayve, London) and the consumer-humanoid Insurgent (1X, originally Norwegian). Asia owns the incumbents (Honda, Toyota Research Institute, Boston Dynamics under Hyundai) and the Chinese low-cost humanoid cohort (Unitree, Fourier, Agibot, XPeng's humanoid program). The dollar-versus-units gap is enormous: Unitree's G1 humanoid ships at approximately $16,000, already below the $20,000-to-$30,000 NEO consumer pricing 1X has signalled and reportedly two orders of magnitude under what an early commercial Figure 02 unit costs to build.

The software stack has converged

The most consequential development of the past 18 months is not on the hardware side. It is that almost every credible robotics lab is now building the same four-layer software stack, with implementation differences but not architectural ones.

Layer one is a vision-language-action (VLA) foundation model. This is the system that ingests camera frames and a natural-language instruction and outputs either an executable action sequence or a latent that the policy head decodes into actions. Physical Intelligence's π₀ (Pi-Zero) and the successor π₀.₅ established the recipe: a pretrained vision-language model (PaliGemma in π₀'s case) provides perception and language grounding, and a flow-matching action expert generates high-frequency motor commands at 30 to 50 Hz. Figure AI's Helix, released February 2025 and demonstrated below, is explicitly dual-system: a slower System 2 VLM does scene understanding and intent, and a faster 200 Hz System 1 visuomotor network handles dexterous motion. Google DeepMind's Gemini Robotics 1.5 line follows the same split with an explicit "thinking before acting" trace and an embodied-reasoning sibling model (Gemini Robotics-ER 1.5) for spatial planning. NVIDIA's open-weights Isaac GR00T N1.5 packages a diffusion-transformer action head on a VLM backbone. Skild AI's Skild Brain is the cross-platform variant of the same architecture, trained to operate across humanoids, quadrupeds, and manipulators from a single set of weights.

The architectural consensus is the dual-system VLA. The disagreement is about what to put in the action head (flow matching, diffusion, or autoregressive token prediction), how much sim and how much real teleop to train on, and whether the perception backbone is open-weights (NVIDIA's bet) or closed (Figure, DeepMind, Physical Intelligence's most recent releases).
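
The dual-system split can be pictured as two loops running at different rates and sharing a latent. The sketch below is a minimal illustration under stated assumptions, not Helix's or any other lab's implementation: the 200 Hz fast rate is the figure quoted for Helix above, while the 10 Hz slow rate, the function names, and the toy "latent" dictionary are hypothetical choices made for illustration.

```python
FAST_HZ = 200  # System 1 rate quoted for Helix in the text
SLOW_HZ = 10   # assumed System 2 rate; illustrative only


def slow_system(tick, instruction):
    """Stand-in for a VLM: returns a 'latent' summarising scene and intent."""
    return {"instruction": instruction, "computed_at": tick}


def fast_system(latent, tick):
    """Stand-in for a visuomotor policy: one motor command per fast tick."""
    return (latent["computed_at"], tick)


def run(seconds, instruction):
    """Run both loops; the fast loop always reuses the most recent latent."""
    ticks_per_update = FAST_HZ // SLOW_HZ
    latent = None
    commands = []
    slow_updates = 0
    for tick in range(seconds * FAST_HZ):
        if tick % ticks_per_update == 0:
            latent = slow_system(tick, instruction)
            slow_updates += 1
        commands.append(fast_system(latent, tick))
    return commands, slow_updates
```

Under these assumptions each latent is reused for 20 consecutive motor commands, which is the property the dual-system design trades on: the slow model never blocks the control loop.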

Layer two is the policy head, almost always a diffusion or flow-matching model that outputs chunked action sequences (typically 8-to-50 step horizons) rather than single-step predictions. The diffusion-policy line was established by Toyota Research Institute and Columbia in their 2023 paper, and TRI's subsequent Large Behavior Models scaled it into approximately 450M-parameter diffusion transformers running at 30 Hz, pretrained on Atlas teleoperation data, Atlas Multi-Task Stand upper-body data, TRI Ramen, and the Open X-Embodiment cross-embodiment dataset. The August 2025 demonstration of an Atlas robot autonomously packing and sorting using TRI's LBM on Boston Dynamics hardware was the public state of the art for whole-body loco-manipulation as of the time of writing. The policy-head loss has shifted from L1 or L2 regression (the pre-2023 default for behavior cloning) toward flow matching, which converges faster and produces smoother trajectories.
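
The shift from regression to flow matching can be made concrete with a small sketch. It assumes the common straight-line formulation, in which the training target is the constant velocity from a noise sample to the demonstrated action chunk; the essay's sources do not say which variant any particular lab uses. The names `flow_matching_loss`, `sample_chunk`, and `oracle_velocity` are hypothetical, the data is made up, and the "network" is replaced by an oracle so the example runs without a training loop.

```python
import random


def interpolate(noise, action, t):
    """Point on the straight path from noise (t=0) to the action chunk (t=1)."""
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]


def flow_matching_loss(predict_velocity, action, noise, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(noise, action, t)
    target = [a - n for a, n in zip(action, noise)]  # constant along the path
    predicted = predict_velocity(x_t, t)
    return sum((p - g) ** 2 for p, g in zip(predicted, target)) / len(target)


def sample_chunk(predict_velocity, noise, steps=10):
    """Euler-integrate the velocity field from noise to an action chunk."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        v = predict_velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy data: a flattened chunk of 8 steps x 2 joints (values are made up).
rng = random.Random(0)
ACTION_CHUNK = [0.1 * i for i in range(16)]
NOISE = [rng.gauss(0.0, 1.0) for _ in ACTION_CHUNK]


def oracle_velocity(x_t, t):
    """Stand-in for a trained network: the exact field for one demonstration."""
    return [(a - xi) / (1.0 - t) for a, xi in zip(ACTION_CHUNK, x_t)]


def zero_velocity(x_t, t):
    """An untrained baseline that predicts no motion."""
    return [0.0 for _ in x_t]
```

Calling `sample_chunk(oracle_velocity, NOISE)` recovers the demonstrated chunk in ten Euler steps, while `flow_matching_loss` is near zero for the oracle and positive for the zero baseline. A real policy would condition the velocity network on camera frames and language, which this sketch omits.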

Layer three is the whole-body controller. Locomotion, contact, and balance are still the domain of model-free reinforcement learning trained in simulation with heavy domain randomisation. Unitree's G1 and H1 demonstrations, Boston Dynamics' electric Atlas dynamic-recovery moves, and the various 2025 sim-to-real demos from academic labs all use this recipe. The state of the art has moved from per-skill RL (a separate policy for each gait or maneuver) toward whole-body behavior foundation models (a single multi-task policy that handles locomotion, recovery, and contact across hundreds of motions). NVIDIA's Eureka uses LLMs to automatically synthesise reward functions, which has eased the hand-engineered reward bottleneck that historically blocked sim-to-real scaling. The emerging hierarchy puts the VLA at the top, the diffusion policy in the middle, and the RL whole-body controller at the bottom, a tripartite control stack that is standardising across labs.
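
Domain randomisation, the ingredient that makes the sim-trained controller survive contact with real hardware, amounts to resampling the simulator's physics for every training episode. The following is a minimal sketch of that idea only; the parameter names and ranges are hypothetical, and real ranges are tuned per robot and per simulator.

```python
import random

# Hypothetical randomisation ranges, for illustration only.
RANGES = {
    "ground_friction": (0.4, 1.2),
    "payload_mass_kg": (0.0, 5.0),
    "motor_strength_scale": (0.8, 1.2),
    "sensor_latency_s": (0.0, 0.04),
}


def sample_episode_params(rng):
    """Draw one set of physics parameters uniformly from each range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


def make_training_episodes(num_episodes, seed=0):
    """Return per-episode parameter sets; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [sample_episode_params(rng) for _ in range(num_episodes)]
```

A policy trained across many such episodes never sees the same friction, payload, or latency twice, so it cannot overfit to one simulator configuration; the real robot then looks like one more sample from the training distribution.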

Layer four is the world model. This is the most recent addition to the consensus stack and the most strategically consequential. NVIDIA Cosmos, announced in January 2025, is the principal physical-AI world-model platform; its GR00T-Dreams blueprint generated 780,000 training trajectories and 827 hours of synthetic video from an 88-hour teleoperation seed, which were used to train GR00T N1.5 in 36 hours instead of the months of teleop the equivalent dataset would have required. Wayve's GAIA-2, released March 2025 and trained on multi-camera footage from the UK, US, and Germany, is the AV-cohort analog; the December 2025 GAIA-3 release scaled to approximately 15 billion parameters and nine countries. Google DeepMind's Genie 2 and 3 produce playable interactive simulations from prompts. 1X has publicly described a "1X World Model" trained on EVE and NEO operational footage. The shared pattern: train a world model on real footage, use it both as a controllable simulator for policy training and as a counterfactual evaluator (a sim-as-judge that scores hypothetical actions against the world model's learned dynamics) that closes the loop on data scarcity.
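
The sim-as-judge role can be sketched in a few lines: roll each candidate action sequence through the learned dynamics and keep the one whose imagined outcome scores best. The example below is a toy under stated assumptions, with a hand-written one-dimensional "world model" standing in for a learned video or latent dynamics model; every function name is hypothetical.

```python
def world_model_step(state, action):
    """Toy stand-in for learned dynamics: position integrates commanded velocity."""
    return state + 0.1 * action


def rollout(state, actions):
    """Imagine the final state after applying a sequence of actions."""
    for action in actions:
        state = world_model_step(state, action)
    return state


def score(state, actions, goal):
    """Higher is better: negative distance to the goal after the imagined rollout."""
    return -abs(rollout(state, actions) - goal)


def pick_best(state, candidates, goal):
    """Sim-as-judge: return the candidate the world model rates highest."""
    return max(candidates, key=lambda actions: score(state, actions, goal))
```

For example, with a start state of 0.0, a goal of 1.0, and candidates `[[1.0] * 10, [0.5] * 10, [-1.0] * 10]`, `pick_best` selects the first sequence without a single real-world trial. The evaluation is only as trustworthy as the world model's dynamics, which is why fidelity to physical behaviour matters so much for this layer.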

The four-layer stack is not yet uniformly deployed. Some labs run only layers one and three (Boston Dynamics historically). Some run only layers one and two (most foundation-model Insurgents). Tesla AI famously runs an end-to-end neural-network policy that collapses layers one through three into a single network for FSD V12 and successor versions. But the direction of travel is unambiguous: the four-layer stack is the architecture the field is converging on, and the implementation choices that distinguish individual labs are increasingly the differentiator that matters.

The humanoid Insurgents

The humanoid cohort is where the public attention concentrates, and 2025 was the year that demonstrations stopped being scripted set pieces and started looking like the outputs of learned, steerable policies.

Figure AI released Helix in February 2025 and immediately repositioned the company from "humanoid hardware Insurgent backed by Microsoft and OpenAI" to "vertically integrated VLA-and-humanoid platform". The Helix demo shows two Figure 02 robots collaborating on a kitchen-task sequence with a single shared model coordinating both, a capability that no peer had publicly demonstrated in February 2025 and that has since become a competitive baseline.

The competitive context for Figure is harder than the $39.5 billion valuation suggests. The BMW Spartanburg deployment, the company's principal commercial reference, has been positively but not transformatively covered. The February 2025 exit from the OpenAI collaboration agreement (Brett Adcock cited internal AI breakthroughs as the reason) removed a strategic-investor anchor that the Series B leveraged. And the post-2024 Chinese cohort (Unitree, Fourier, Agibot) ships humanoids at one-tenth Figure's price point, which constrains Figure's path to mass-market commercialisation even with a superior model layer.

1X is the consumer-humanoid bet. The company spent a decade as Halodi Robotics building compliant electric actuators before pivoting toward humanoids in 2022, raised from the OpenAI Startup Fund in March 2023, and through 2025 demonstrated NEO Gamma with consumer pre-orders open against early-access deposits. The 2026 product positioning combines a teleoperation-assisted operating mode (a human supervisor can take control during deployment) with a roadmap toward autonomous operation as the model improves.

The teleoperation transition is the central commercial question for 1X. A consumer-priced humanoid that depends on human teleoperators for most tasks is operating a different cost structure than a fully autonomous one; the speed at which 1X can shift the ratio from majority-teleop to majority-autonomous defines the commercial viability of the NEO consumer product.

Tesla AI's Optimus program is structurally distinct from every peer in that Tesla can deploy Optimus inside its own manufacturing facilities without external customer commitment. The We, Robot event in October 2024 showcased Optimus alongside the Cybercab unveiling, with Tesla projecting consumer pricing in the $20,000 to $30,000 range and production targets that, taken at face value, would dwarf every peer's planned 2027 volume.

The Optimus credibility question, as with Tesla's other AI programs, comes down to delivery against the public timeline. Tesla's $700-billion-plus market capitalisation prices in the Optimus and Cybercab programs at substantially higher confidence than the publicly disclosed milestones currently justify. The 2026 to 2027 commercial-rollout window for both products will resolve that gap one way or the other.

Boston Dynamics' electric Atlas is the engineering-incumbent's bid for the same commercial market. The April 2024 unveiling retired the hydraulic Atlas in favour of an all-electric platform designed for Hyundai manufacturing deployment.

The structurally important Atlas event was not the hardware reveal but the August 2025 demonstration of long-horizon packing-and-sorting with TRI's Large Behavior Model. Boston Dynamics is the most established humanoid builder to have publicly adopted an external partner's behavior foundation model, and the choice signals that even the company with three decades of legged-robotics engineering depth has concluded that the model layer is more efficiently sourced from a research partner than built in-house at the speed the market now demands.

Apptronik and Agility Robotics complete the US humanoid cohort and arguably carry the strongest commercial-deployment evidence of any peer. Apptronik raised a $350 million Series A in February 2025 with Google participation, partnered with Google DeepMind in December 2024 to bring DeepMind's physical-AI models onto the Apollo platform, and has progressed through trials at Mercedes-Benz, Jabil, and GXO. The February 2026 disclosure of approximately $935 million in cumulative Series A capital at a $5.3 billion post-money valuation positioned Apptronik as the principal Google-aligned humanoid Insurgent. Agility Robotics' Digit, the bipedal humanoid optimised for logistics, has logged more than 100,000 totes at GXO Logistics under the industry's first humanoid robot-as-a-service contract, with deployments at Spanx alongside Amazon trial work. The Agility rebrand to simply "Agility" in March 2026 signalled the consolidation of the company's positioning around humanoid automation rather than the broader robotics-research-spinout identity the Oregon State University lineage initially implied.

Unitree is the Chinese humanoid that has redrawn the cost curve. Founded in 2016 in Hangzhou by Wang Xingxing, Unitree began as a quadruped specialist and pivoted into humanoids with the H1 (2023) and G1 (2024). The G1 ships at approximately $16,000, the H1 has appeared in commercial pilot deployments and in Spring Festival Gala performances, and the public demonstrations through 2025 have escalated to a Unitree-versus-Unitree humanoid boxing match that surfaced widely in robotics-research social media. Industry reporting placed Unitree at more than 5,500 humanoid shipments in 2025, with Chinese manufacturers accounting for roughly 90 percent of the global humanoid unit volume that year. The May 2026 launch of a humanoid-robot motion App Store and the GD01 manned mecha at approximately 3.9 million yuan signalled the company's intent to define both the consumer-application platform and the upper end of the form-factor frontier.

The Unitree cost curve is the structural threat the US humanoid Insurgents have not yet fully priced. If the Chinese cohort sustains the order-of-magnitude price advantage through 2026 to 2027, and if Skild Brain or Physical Intelligence's π series provides a credible US-aligned model layer that can run on Chinese hardware, the commercial topology of the humanoid market may end up looking more like the Android-versus-iPhone split than like the closed integrated-platform model that Figure and Tesla are building toward.

The foundation-model layer

The two principal robotics-AI foundation-model Insurgents are structurally distinct from the humanoid hardware cohort. Both Physical Intelligence and Skild AI are explicitly positioned as the software layer that hardware partners integrate, not as vertically integrated platforms.

Physical Intelligence (Karol Hausman, Sergey Levine, Chelsea Finn, Lachy Groom, with founder lineage tracing to Google DeepMind robotics, UC Berkeley BAIR, and Stanford) released π₀ in October 2024 and the more recent π₀.₅ as open-weights research models with detailed technical reports. The choice to release weights, in contrast to Figure's closed Helix and DeepMind's closed Gemini Robotics, has built Pi's research-community standing rapidly.

Skild AI (Deepak Pathak and Abhinav Gupta, both Carnegie Mellon faculty) frames the bet as "one brain, any robot": a single set of weights trained to operate across quadrupeds, humanoids, and manipulators rather than a per-platform specialised model. The cross-platform thesis is unproven at commercial scale, but the public demonstrations across multiple form factors are the most ambitious published cross-embodiment work outside the Open X-Embodiment academic collaboration.

The frontier-lab analogs are Google DeepMind's Gemini Robotics, NVIDIA Research's open-weights Isaac GR00T N1 and N1.5, Toyota Research Institute's Large Behavior Models, and Meta AI's embodied-research line. The competitive dynamic within the foundation-model layer is one of the most actively contested in AI research today, with hardware partners (Figure, 1X, Boston Dynamics, Unitree, Apptronik, Agility, the auto OEMs) as the implicit customers each is competing to serve.

The autonomous-vehicle cohort

The AV cohort has split into a modular-stack camp (Waymo, Zoox, the Chinese robotaxi operators) and an end-to-end camp (Tesla FSD V12 onward, Wayve).

Waymo crossed 200,000 weekly paid robotaxi rides in early 2025 across Phoenix, San Francisco, Los Angeles, and Austin, with Atlanta, Miami, and Washington DC scheduled for additional 2025 to 2026 launches. The October 2024 $5.6 billion capital raise funded the commercial-expansion roadmap. Waymo's modular sensor-fusion approach (lidar, camera, radar) and its quarterly safety reporting through California DMV filings constitute the most rigorous public dataset on autonomous-driving safety performance in commercial operation.

Wayve is the European AV2.0 standard-bearer and the closest architectural analog to Tesla FSD outside of Tesla itself. The end-to-end neural-network approach, the OEM-supplier business model (Wayve sells AI Driver software to automakers rather than operating a robotaxi service), and the GAIA and LINGO research releases position Wayve as the structurally most credible commercial vehicle for the architecture that the rest of the robotics field is converging on. The March 2024 hire of Erez Dagan from Mobileye signalled commercial-execution credibility to OEM customers; the May 2024 $1.05 billion Series C funded the buildout.

Tesla FSD is the largest deployed end-to-end driving policy by fleet size (more than seven million Tesla vehicles), and FSD V12's transition to a fully end-to-end neural network in early 2024 was the field's most consequential validation of the architecture. The Tesla-Waymo safety-record comparison is methodologically inconclusive, but the architectural debate has shifted: as of 2026, the question is no longer whether end-to-end can match modular but how soon.

Zoox, the Amazon subsidiary, launched commercial robotaxi service in Las Vegas in June 2025 with its purpose-built bidirectional vehicle, and is expanding into the SF Bay Area through 2025 to 2026. The closed-loop integration between Amazon's infrastructure and Zoox's commercial deployment is structurally distinct from any other AV operator.

Defense and autonomous edge

The defense-autonomy cohort has scaled into the same capital ranges as the consumer-and-enterprise Insurgents, and the architectural overlap with humanoid and AV robotics is increasing.

Anduril Industries is the structural anchor. The June 2025 Series G of $2.5 billion at $30.5 billion led by Founders Fund, the secondary-market trades implying valuations as high as $84.5 billion by late 2025, the February 2025 takeover of the US Army's IVAS program from Microsoft (a $22 billion, 10-year contract), and the late-2024 OpenAI partnership for Lattice integration all positioned Anduril alongside frontier AI labs in capital terms. The Roadrunner counter-drone platform, the Bolt loitering munition line, the Dive-LD autonomous underwater vehicle, and the Lattice operating system constitute one of the broadest autonomous-systems portfolios in defense.

Shield AI is the autonomy-stack specialist, with the Hivemind software operating in GPS- and communications-denied environments and the V-BAT VTOL fixed-wing platform (acquired through the 2021 Martin UAV deal) deployed across USSOCOM, US Coast Guard, and DoD Replicator selections. The May 2024 Series F at a $2.7 billion valuation funded the platform expansion.

The structurally interesting cross-pollination is that Anduril and Shield AI are now competing for the same AI talent as the humanoid Insurgents and the AV cohort, and the autonomy-stack architectures (perception → world-model-based planning → low-level control) are substantially similar across the defense and commercial cohorts. Whether the commercial-side and defense-side software stacks ultimately converge or fork into distinct branches is one of the more underwatched architectural questions in the field.

Quadrupeds, mobile manipulators, and the rest

The non-humanoid form factors have anchored the most reliable commercial-deployment metrics in the field. Boston Dynamics' Spot quadruped, deployed in thousands of units across industrial inspection, security, construction monitoring, and utilities applications, is the largest commercial legged-robot deployment globally.

Agility Robotics' Digit (a bipedal robot with bird-like, backward-bending legs, optimised for warehouse logistics, not a traditional humanoid form factor) is the strongest public counterexample to the "humanoids are economically premature" argument; the 100,000-plus totes logged at GXO Logistics in 2024 to 2025 are the most rigorous public deployment data for bipedal robots in commercial operation. The Stretch warehouse-logistics robot from Boston Dynamics, ANYbotics' quadrupeds in industrial inspection, and the broader cohort of mobile-manipulator startups (Dexterity AI, Symbotic, Berkshire Grey, Locus Robotics) round out the commercial-deployment story that the humanoid cohort is still aspiring to.

The form-factor honourable mentions worth flagging: surgical robotics (Intuitive Surgical's da Vinci, Verb Surgical, Vicarious Surgical, Medtronic Hugo, plus newer entrants like Moon Surgical and CMR Surgical) remains the most economically successful robotics category by a wide margin and is now beginning to integrate VLA-style models for assistance; agricultural robotics (John Deere's autonomous tractor line, FarmWise, Carbon Robotics, Iron Ox) is scaling rapidly on a data-flywheel that more closely resembles the AV cohort than the humanoid one; marine and underwater autonomy (Saildrone, Anduril's Dive-LD, Saronic, Sea Machines) sits between defense and commercial. None of these are profiled on nextomoro yet, but they sit inside the same software-stack convergence that the labs that are profiled exemplify.

What you can actually buy today

The cohort-level capital concentration the earlier sections trace is a story about funding rounds and demonstration cadence, not shipped product. The robotics market as of May 2026 is structurally asymmetric in a way that public conversation often obscures: a small number of platforms ship at scale to anyone with a credit card, a larger set is pilot-deployed to enterprise customers under negotiated terms, and the headline humanoid platforms remain mostly pre-product. The reality check matters because the data-flywheel argument the next section develops depends on which platforms actually accumulate operational hours in the world.

Quadrupeds are the most commercially mature legged-robot category. Unitree's Go2 ships at approximately $1,600 for the consumer variant, the lowest entry price in the legged-robot market by an order of magnitude, with higher-spec Pro configurations stepping up into the low five figures. The B2 industrial variant prices in the $30,000 to $60,000 range. Boston Dynamics' Spot, the industry-standard inspection-and-monitoring quadruped, lists at approximately $74,500 plus integration and support fees, with thousands of units deployed across utility inspection, security, construction monitoring, entertainment, and academic research. ANYbotics' ANYmal serves heavier industrial environments under custom-quote pricing. The quadruped market has been commercial for half a decade and is the closest thing the legged-robot space has to a mature product category.

Bipedal humanoids available for direct purchase are concentrated in research and developer markets. Unitree's G1 at $16,000 is the lowest-cost commercially shipping full-size bipedal humanoid globally; the H1 in the $90,000 to $100,000 range serves higher-spec research customers. Engineered Arts' Ameca and RoboThespian ship at roughly $130,000 to $150,000 for the entertainment-and-museum-installation segment, with a multi-decade RoboThespian deployment history. 1X's NEO Gamma is taking consumer pre-orders against early-access deposits with a target $20,000 to $30,000 price point, the bridge between the Unitree research-tier pricing and the mass-market consumer pricing the rest of the humanoid cohort is implicitly targeting. Rainbow Robotics' HUBO research platforms ship to academic robotics labs under custom-quote pricing. Compared with the Insurgent valuation ladder the chart above shows, the actually purchasable humanoid market is roughly two orders of magnitude smaller in dollar volume than the capital invested in the cohort.

Industrial and warehouse robots straddle direct-purchase and robot-as-a-service models. Boston Dynamics' Stretch warehouse-logistics robot is direct-purchase at approximately $200,000 plus deployment cost. Agility Robotics' Digit operates under robot-as-a-service contracts (the industry's first humanoid RaaS arrangement) at GXO Logistics, Spanx, and the Amazon trial relationships, with no direct-purchase pricing publicly disclosed. UBTECH's Walker S has logged the largest publicly reported humanoid factory deployment in the Chinese automotive cohort (Geely, Nio, BYD, Foxconn pilots) but remains an enterprise pilot rather than a retail product. Kawada Robotics' NEXTAGE stationary dual-arm humanoid is the longest-deployed commercial humanoid platform globally, shipping since 2010 to Japanese electronics-and-semiconductor manufacturers under direct-purchase terms.

Consumer-scale household robotics has been a real industry for two decades. iRobot's Roomba and Braava lines, at $300 to $1,500 retail, have accumulated tens of millions of deployed units globally. The Roomba's data-flywheel volume is structurally larger than every humanoid Insurgent's pilot deployment combined and provides the operational baseline against which the humanoid cohort's general-purpose-home-robot thesis is implicitly measured.

The Insurgent flagships are mostly not for sale. Tesla AI's Optimus, Figure AI's Figure 02, Apptronik's Apollo, Sanctuary AI's Phoenix, Agibot's Yuanzheng A2, XPeng Robotics' Iron, and Fourier Intelligence's GR-2 are in enterprise-pilot deployment, but none ships through a consumer or general-business retail channel as of May 2026. The combined value ascribed to these seven platforms exceeds $100 billion at the headline valuations charted above, against zero retail SKUs and a combined deployed-unit count that public disclosures suggest is in the low thousands. The gap between funding and shipping is the structural test the 2026 to 2027 cycle will resolve.

Capability roadmap

Knowing what ships today is half the picture. The other half is what each capability looks like across the next 36 months, and which constraint sets the timing. The chart below maps eight headline capabilities against their progression through three maturity tiers (research demo, pilot deployment, commercial scale). The dotted line is the present.

The structural pattern is that the hardware-side capabilities (locomotion, mass production, sub-$10K pricing) are progressing through the maturity tiers faster than the model-and-AI-side capabilities (dexterous manipulation, long-horizon autonomy, household deployment). That gap is the central editorial point of the four-layer-stack analysis earlier in this essay: the field has shipped the platforms; the policies that run on them are still maturing.

Capability-by-capability framing, with the primary blocker for each:

Bipedal locomotion
Today (May 2026): Mature outdoor walking on the Unitree G1 at $16K
12 to 18 months: Dynamic gait plus disturbance recovery on consumer-priced units
18 to 36 months: Acrobatic and gymnastics-tier across cohort
Primary blocker: Battery energy density at human-form-factor weight

Dexterous manipulation
Today: Brittle; scripted or teleop-assisted contact-rich tasks
12 to 18 months: Reliable pick-place with VLA-driven policy
18 to 36 months: Reliable contact-rich assembly (peg-in-hole, threading, fabric folding)
Primary blocker: Training-data scarcity for fingertip-contact dynamics

Long-horizon autonomy
Today: Sub-10-minute unsupervised stretches in pilots
12 to 18 months: Hour-long household chores via VLA plus RL hybrid
18 to 36 months: Full-day shifts in structured environments
Primary blocker: Foundation-model robustness under distribution drift

Language-grounded instruction
Today: Single-sentence instructions reliable; multi-step brittle
12 to 18 months: Multi-step plans with episodic memory
18 to 36 months: Conversational re-planning during execution
Primary blocker: World-model fidelity to physical dynamics

Sub-$10K consumer humanoid
Today: Only Unitree G1 below $20K
12 to 18 months: Chinese cohort drives sub-$10K research SKUs
18 to 36 months: US and EU cohort reach sub-$10K mass market
Primary blocker: Actuator and battery BOM compression

Mass production (>10K units/yr)
Today: Unitree at approximately 5,500 humanoids/yr (industry estimate)
12 to 18 months: Tesla AI Optimus plus Chinese cohort scaling above 10K
18 to 36 months: Multiple US-cohort platforms above 10K
Primary blocker: Manufacturing supply-chain capacity and contract-manufacturer build-out

Household deployment at scale
Today: Effectively zero; 1X NEO pre-orders open
12 to 18 months: NEO and Figure AI consumer launches; under 10K units globally
18 to 36 months: 100K to 1M units globally across cohorts
Primary blocker: Consumer-safety certification and household-task training data

$1B/yr humanoid product line
Today: Zero
12 to 18 months: Apptronik and Agility Robotics close to threshold via enterprise pilots
18 to 36 months: First $1B humanoid product line emerges
Primary blocker: Per-unit-economics gap to warehouse labor cost (approximately $20-25/hour)

Four notes on the projections.

The Chinese cohort leads the cost curve. Sub-$10K consumer pricing reaches the research tier first in the Chinese cohort (Unitree is already at $16K, and the structural cost advantage from vertical integration on actuators and supply-chain proximity is durable), then bridges into mass-market US and EU pricing through 2028 as foundation-model partnerships mature. The Android-versus-iPhone analog from earlier in the essay sharpens here.

The dexterous-manipulation bottleneck is data, not hardware. The leading dexterous hands (Sanctuary AI's, the Tesla AI Optimus second-generation design, the various academic and commercial five-finger designs from Shadow Robot, Inspire Robotics, and others) are already mechanically capable. The bottleneck is teleop-and-simulation training data for the kinds of contact-rich fingertip tasks (peg-in-hole, threading, fabric folding, kitchen prep) that consumer and enterprise applications require. The data-flywheel section below dissects why this bottleneck is structural.

Household deployment moves on a different curve than factory deployment. Factory deployment unlocks at the unit-economics threshold (humanoid hourly opex below US warehouse labor cost), which Apptronik, Agility Robotics, and Figure AI all credibly hit between 2027 and 2028 at projected production volumes. Household deployment additionally depends on safety certification (consumer regulators have not yet established a humanoid-safety framework), household-task training data (a residential-tasks analog to the AgiBot World dataset does not yet exist publicly), and consumer price elasticity. The household curve trails factory deployment by roughly 18 to 24 months.

The hardest projection is mass production. Tesla AI, Unitree, Figure AI, Agibot, and UBTECH have each published production-volume targets that would individually make them the largest humanoid-shipping company globally; the published targets are not internally consistent across cohorts and would imply a humanoid market several times larger than independent industry analysts currently project. The 18-to-36-month band on the chart treats only Unitree and Tesla as credibly reaching 10K-per-platform-per-year on conservative assumptions; aggressive readings of the public commitments would pull commercial scale forward by 12 to 18 months.

The data flywheel

The bottleneck in robotics, as in all of contemporary AI, is data. Robotics has the additional constraint that data is physical, slow to collect, and embodiment-specific. Three flywheels are emerging in parallel.

Fleet operation is the cleanest. Waymo's 200,000-plus weekly rides produce continuous real-world data that feeds back into Waymo Driver training. Tesla's seven-million-vehicle fleet does the same at much larger scale (and at substantially noisier label quality). Digit's 100,000 totes at GXO and the NEO consumer units shipping to early-access homes through 2026 feed the same kind of loop. The fleet-operation flywheel scales linearly with deployment and is, by orders of magnitude, the most data-efficient lane available.

Teleoperation farms are the explicit substitute when no fleet exists. Physical Intelligence, Figure, 1X, Skild, and TRI all operate dedicated teleoperation centres, with derivatives of the ALOHA, Mobile ALOHA, and UMI handheld-gripper rigs lowering the cost per demonstration-hour by an order of magnitude relative to traditional teleop. The economics are still hard. One hour of teleoperation produces approximately one hour of training data, against industry estimates that generalist robot policies require millions of hours of dexterous contact-rich demonstrations. On its own, teleop will plausibly never be the dominant data source.
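The one-hour-in, one-hour-out arithmetic can be made concrete with a rough sketch. The station count and daily operating hours below are hypothetical inputs chosen for illustration, not figures disclosed by any of the labs named above; the only assumption taken from the text is that one hour of teleoperation yields roughly one hour of training data.

```python
def teleop_years_needed(target_hours: float,
                        stations: int,
                        hours_per_station_per_day: float,
                        days_per_year: int = 365) -> float:
    """Calendar years a teleop farm needs to collect target_hours of demos.

    Assumes one hour of teleoperation yields one hour of training data.
    """
    if stations <= 0 or hours_per_station_per_day <= 0 or days_per_year <= 0:
        raise ValueError("stations, hours and days must all be positive")
    hours_per_year = stations * hours_per_station_per_day * days_per_year
    return target_hours / hours_per_year


# Hypothetical farm: 100 rigs, each operated 16 hours a day, every day.
years = teleop_years_needed(1_000_000, stations=100, hours_per_station_per_day=16)
print(f"{years:.2f} years per million demonstration-hours")
```

Under those illustrative inputs a single million hours takes well over a year of continuous operation, which is why the requirement of "millions of hours" pushes labs toward fleets and synthetic data.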

Synthetic data from world models is the lane that has scaled most aggressively over the past 18 months. NVIDIA's Cosmos + GR00T-Dreams pipeline turning 88 hours of teleop seed data into 827 hours of synthetic video and 780,000 training trajectories is the cleanest published example. Wayve's GAIA-2 and GAIA-3 do the same for driving. 1X's World Model does it for household robotics. The pattern is consistent: train a generative world model on real footage, use it as a controllable simulator for policy training, use it as a counterfactual evaluator for safety analysis. The remaining open question is fidelity. World models that look photorealistic to humans may still be off-distribution for low-level dynamics, and the field is actively building benchmarks (sim-as-judge agreement with real-world deployment outcomes) to measure that.
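Two small calculations sit behind this paragraph: the amplification ratio implied by the published 88-to-827-hour figures, and a way to score how often a simulator's verdict matches the real-world outcome. The sketch below is illustrative only. The essay does not say how sim-as-judge agreement is measured, so the plain agreement rate used here is an assumption (real benchmarks may prefer correlation or rank-based metrics), and the example outcome lists are made up.

```python
def amplification(seed_hours: float, synthetic_hours: float) -> float:
    """Hours of synthetic video produced per hour of teleop seed data."""
    if seed_hours <= 0:
        raise ValueError("seed_hours must be positive")
    return synthetic_hours / seed_hours


def agreement_rate(sim_success: list[bool], real_success: list[bool]) -> float:
    """Fraction of paired trials where the simulated and real outcomes match."""
    if not sim_success or len(sim_success) != len(real_success):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(s == r for s, r in zip(sim_success, real_success))
    return matches / len(sim_success)


# Figures quoted in the text: 88 hours of seed data, 827 hours of synthetic video.
ratio = amplification(88, 827)
print(f"amplification: {ratio:.1f}x")

# Hypothetical paired outcomes for four policy rollouts (not real data).
sim = [True, True, False, True]
real = [True, False, False, True]
agreement = agreement_rate(sim, real)
print(f"sim/real agreement: {agreement:.2f}")
```

A high amplification ratio is only useful if the agreement score is also high; a simulator that multiplies data but disagrees with deployment outcomes multiplies the wrong signal.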

The data-flywheel competitive dynamic favours, in descending order of structural strength: companies with deployed fleets at scale (Tesla, Waymo, then Boston Dynamics on Spot, then Digit on warehouse), companies with the largest teleop farms and the lowest cost per demonstration (Physical Intelligence, Figure, 1X, Skild, TRI), and companies with the strongest world-model infrastructure (NVIDIA, Wayve, DeepMind, 1X). The labs that combine all three (NVIDIA Research is the leading example; Wayve in the AV cohort) are structurally the best positioned for the 2027 to 2028 model-quality leap that everyone in the field is now pricing in.

What to watch

Five concrete signals over the next twelve to eighteen months.

The autonomy-to-teleop ratio for consumer-shipping humanoids. 1X's NEO consumer units and Figure's Helix-on-Figure-02 deployments will produce the first public dataset on how often a human operator has to take control. The Waymo analog (miles between disengagements) maps directly. Public targets of more than 50 to 100 task completions per human intervention will be the threshold that separates a viable consumer product from a teleop service masquerading as one.
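As a rough illustration of how such a ratio might be computed from deployment logs, the sketch below counts completed tasks per human intervention. The `Episode` record, its fields, and the sample log are hypothetical; no vendor has published a log format. Counting a task as completed even when an operator stepped in is also an assumption, and a stricter definition would lower the number.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    completed: bool      # did the task finish successfully?
    interventions: int   # times a human operator took control


def completions_per_intervention(episodes: list[Episode]) -> float:
    """Completed tasks divided by total human interventions across a log."""
    completed = sum(e.completed for e in episodes)
    interventions = sum(e.interventions for e in episodes)
    if interventions == 0:
        return float("inf")  # no takeovers observed in this log
    return completed / interventions


# Hypothetical log: 98 clean runs, one assisted success, one assisted failure.
log = [Episode(True, 0)] * 98 + [Episode(True, 1), Episode(False, 1)]
ratio = completions_per_intervention(log)
print(f"{ratio:.1f} completions per intervention")
```

On this made-up log the ratio lands just under the 50-completion end of the threshold discussed above, which shows how sensitive the metric is to a handful of takeovers.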

Dollar-per-task economics in warehouse and manufacturing settings. Apptronik's Apollo at Mercedes and Jabil, Agility's Digit at GXO and Spanx, and Boston Dynamics' Atlas at Hyundai are the cleanest commercial proving grounds. The number to watch is humanoid-hourly opex against the US warehouse minimum-wage-plus-benefits blended cost of approximately $20 to $25 per hour. The first quarterly disclosure that a deployed humanoid operates below that threshold at any meaningful scale will be a structural commercial inflection.
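The comparison can be sketched as a simple hourly-cost build-up. Every input below (unit cost, service life, utilisation, maintenance, energy, remote supervision) is a hypothetical placeholder rather than a disclosed figure from Apptronik, Agility, or Boston Dynamics; only the $20 to $25 per hour labor benchmark comes from the text.

```python
def humanoid_hourly_opex(unit_cost: float,
                         service_life_years: float,
                         operating_hours_per_year: float,
                         annual_maintenance: float,
                         energy_cost_per_hour: float,
                         supervision_cost_per_hour: float) -> float:
    """All-in cost per operating hour: amortised hardware plus running costs."""
    if service_life_years <= 0 or operating_hours_per_year <= 0:
        raise ValueError("service life and operating hours must be positive")
    amortised = unit_cost / (service_life_years * operating_hours_per_year)
    maintenance = annual_maintenance / operating_hours_per_year
    return amortised + maintenance + energy_cost_per_hour + supervision_cost_per_hour


LABOR_LOW, LABOR_HIGH = 20.0, 25.0  # blended US warehouse cost range from the text

# Hypothetical unit: $150K robot, 5-year life, 4,000 operating hours per year.
opex = humanoid_hourly_opex(
    unit_cost=150_000,
    service_life_years=5,
    operating_hours_per_year=4_000,
    annual_maintenance=15_000,
    energy_cost_per_hour=0.50,
    supervision_cost_per_hour=5.00,
)
print(f"${opex:.2f}/hour, below labor range: {opex < LABOR_LOW}")
```

The build-up makes the sensitivities visible: utilisation and remote-supervision cost move the result as much as the sticker price does, so a disclosure is only meaningful if it states all of them.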

Whether Skild Brain and π₀.₅ (or successors) get adopted on hardware platforms outside their developers' direct integration. The cross-platform foundation-model thesis depends on humanoid hardware partners adopting external model layers instead of building in-house. Boston Dynamics' August 2025 adoption of TRI's LBM is the strongest public datapoint in favour. Figure's June 2025 exit from the OpenAI collaboration is the strongest public datapoint against. The 2026 to 2027 partnership announcements will tell which architecture wins.

Project Prometheus's first public artifact. Jeff Bezos and Vik Bajaj's November 2025 launch raised $16.2 billion in two rounds at $38 billion post-money in roughly six months, a capital concentration on a scale that has no near analog in the recent Insurgent cohort. The thesis ("physical-world AI for engineering, manufacturing, aerospace, robotics, drug discovery") is broad enough to be unfalsifiable in 2026 but will resolve through 2027 to 2028 as the company ships its first concrete output. The question is whether Prometheus operates as a vertically integrated Insurgent (a Figure-or-Tesla shape) or as a horizontal research platform (a DeepMind-or-Anthropic shape applied to physical systems).

The geopolitical compute supply chain for robotics-AI training. The compute-rent dynamic identified in the Anthropic-Colossus essay applies with full force to robotics. World-model training, large-scale VLA pretraining, and policy training at GR00T-class scale are now consuming tens of thousands of GPUs per training run. NVIDIA Research's GR00T-Dreams pipeline, the Cosmos training infrastructure, and the implicit reliance on NVIDIA hardware across every Insurgent in the cohort position NVIDIA as the structural platform layer for robotics-AI in the same way it became the structural platform layer for language-model training. Whether US-aligned compute remains the platform of choice for Chinese humanoid manufacturers (Unitree, Fourier, Agibot) is the geopolitical question the next two years will resolve.

The cohort that nextomoro tracks is now large enough and capital-rich enough to constitute a coherent industry. The architectural convergence on the dual-system VLA stack, the cross-cohort competitive dynamics between humanoid Insurgents and foundation-model platforms, the emergence of defense autonomy as a near-frontier compute consumer, and the structural data-flywheel advantages accruing to the labs with deployed fleets all suggest the same conclusion: the 2025 capital cycle priced the hardware. The 2026 capital cycle will price the model layer and the data. The labs that ship working policies on diverse embodiments will be the ones that capture the resulting value.

Sources

Companion profile: Figure AI. Series C valuation, Helix release, BMW Spartanburg deployment.
Companion profile: 1X. NEO consumer pricing, EVE deployment history, OpenAI Startup Fund relationship.
Companion profile: Tesla AI. Optimus, FSD architecture transition, Cybercab, Dojo.
Companion profile: Boston Dynamics. Electric Atlas, Spot fleet, Hyundai ownership, Boston Dynamics AI Institute.
Companion profile: Apptronik. Apollo humanoid, Mercedes and Jabil deployments, Google DeepMind partnership, $5.3 billion February 2026 valuation.
Companion profile: Agility Robotics. Digit humanoid, GXO Logistics robot-as-a-service contract, March 2026 rebrand to Agility.
Companion profile: Unitree. G1 and H1 humanoids, $16K cost curve, 5,500+ shipments in 2025, GD01 manned mecha launch.
Companion profile: Physical Intelligence. π₀ and π₀.₅ release history, Series A.
Companion profile: Skild AI. Skild Brain cross-platform model, CMU founder lineage.
Companion profile: Wayve. AV2.0 thesis, GAIA-1 and GAIA-2, May 2024 Series C, Erez Dagan hire.
Companion profile: Waymo. 200,000-plus weekly rides, Co-CEO structure, October 2024 capital raise.
Companion profile: Zoox. Las Vegas commercial launch June 2025, Amazon subsidiary structure.
Companion profile: Anduril Industries. June 2025 Series G, IVAS contract takeover, Lattice and the Roadrunner, Bolt, Dive-LD portfolio.
Companion profile: Shield AI. Hivemind autonomy stack, V-BAT platform, May 2024 Series F.
Companion profile: Toyota Research Institute. Diffusion Policy, Large Behavior Models, August 2025 Atlas collaboration with Boston Dynamics.
Companion profile: Honda Research Institute. ASIMO lineage, continuing robotics research.
Companion profile: Project Prometheus. Bezos and Bajaj $38 billion physical-world AI Insurgent.
Companion profile: NVIDIA Research. Isaac GR00T N1.5, Cosmos world models, GR00T-Dreams blueprint, Nemotron Coalition.
Companion essay: Compute rent on the broader pattern of compute-as-bottleneck that this essay extends with the robotics-data-flywheel framing.
Companion essay: AI vintages for the capital-cycle context that the humanoid-Insurgent funding wave participates in.
Companion essay: The frontier lab exodus for the talent-side dynamics that Physical Intelligence, Skild, and the Insurgent cohort have benefited from.
Open X-Embodiment dataset. Cross-embodiment training dataset.
Physical Intelligence π₀ blog. October 2024 model release.
Figure Helix announcement. February 2025 VLA release.
Wayve GAIA-1 paper. World-model research.
Boston Dynamics Atlas product page. Electric Atlas commercial positioning.
NVIDIA Cosmos. Physical-AI world-model platform.
