Collecting robot training data is dirty, unglamorous work. Some AI labs are already paying XDOF to do it.
Published Jun 17, 2026, 15:00 UTC
Physical AI has yet to match the capabilities of large language models (LLMs), and a lack of training data is a central obstacle. Some AI labs are now paying XDOF, a company that specializes in collecting robot training data, to help close that gap. The arrangement underscores how pressing the data problem has become for the field.
The gap between the data available to LLMs and the data available to physical AI has become increasingly apparent. LLMs can train on vast amounts of existing text, whereas physical AI systems need extensive real-world data to learn and adapt. According to a claim highlighted by TechCrunch AI, “If physical AI is going to match the accomplishments of LLMs, there’s a data problem that needs to be solved.” The statement reflects a broader industry view that, without adequate data, progress on robust physical AI will stall.
XDOF supplies the infrastructure and expertise to gather and curate the data that physical AI systems require. Its work covers not only collection but also checking quality and relevance, both of which matter for training models effectively. If the effort succeeds, it could help physical AI reach milestones comparable to those of LLMs, with possible gains in robotics, autonomous vehicles, and other applications that depend on physical AI.
Other companies in the AI space face similar data challenges. As firms race to build more sophisticated systems, the ability to collect and use high-quality training data is likely to become a key differentiator. That may drive more investment in data collection technologies and partnerships, as firms recognize that their AI initiatives depend on robust datasets.
Looking ahead, the AI community will be watching how XDOF’s efforts affect the development of physical AI and whether they help narrow the gap with LLMs. How data collection strategies evolve will likely shape both market dynamics and technological capabilities.
By Callan Zhang · Jun 17, 2026
Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.
Source: TechCrunch AI