Notable agents robotics OpenAI

LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories

Baochang Ren, Xinjie Liu, Xi Chen, Yanshuo Liu, Chenxi Li, Daqi Gao

Published: Jun 11, 2026 — 17:03 UTC

Problem
The paper addresses the gap in the capability of AI systems to execute scientific protocols in laboratory environments, where existing Vision-Language-Action (VLA) models have primarily been trained on household and tabletop tasks. This limitation restricts the application of AI in scientific research, where the execution of protocols often involves complex instruments and specific workflows. The authors highlight the need for laboratory-specific supervision and a unified learning framework that can accommodate diverse robotic embodiments. This work is presented as a preprint, indicating it has not yet undergone peer review.

Method
The authors propose LabVLA, a VLA model that integrates a two-stage training approach. The first stage involves FAST action token pretraining, which enhances the Qwen3-VL-4B-Instruct backbone to become action-aware before any continuous control is learned. The second stage, flow matching posttraining, incorporates a DiT action expert while maintaining knowledge insulation. To support this model, the authors developed RoboGenesis, a simulation-based workflow and data engine that generates structured demonstrations of laboratory workflows from atomic skills. This engine validates and filters rollouts and exports data across various robot profiles, addressing the data bottleneck in training.

Results
LabVLA was evaluated on the LabUtopia benchmark, achieving the highest average success rate compared to all evaluated baselines. Specifically, it demonstrated a success rate of 85% in in-distribution settings and 78% in out-of-distribution scenarios, significantly outperforming the next best model, which achieved 70% and 65% respectively. These results indicate a substantial improvement in the model’s ability to generalize across different experimental conditions and robot embodiments.

Limitations
The authors acknowledge several limitations, including the reliance on simulation for training, which may not fully capture the complexities of real-world laboratory environments. Additionally, the model’s performance may be constrained by the diversity of the training data and the specific robot embodiments used. The paper does not address potential issues related to the scalability of the RoboGenesis engine or the generalizability of the learned policies to novel experimental setups.

Why it matters
The development of LabVLA represents a significant advancement in bridging the gap between AI reasoning and physical execution in scientific laboratories. By enabling robots to autonomously execute complex protocols, this work has the potential to enhance the efficiency and accuracy of scientific research. The implications extend to various fields, including biology, chemistry, and materials science, where automated experimentation can accelerate discovery processes. This research contributes to the growing body of work on integrating AI with robotics in specialized domains, as published in arXiv cs.AI.

By Callan Zhang · Jun 11, 2026 · Editorial standards →

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.AI