
Bridgewater's Finance Tests Show GPT and Claude's Limitations on Accuracy

Published: Jul 3, 2026 — 11:16 UTC

Also in this story: OpenAI, Claude, Bridgewater, Thinking Machines Lab, Gemini

Bridgewater’s finance tests found that both GPT and Claude reached an accuracy rate of 84.7 percent. In contrast, the Qwen3-235B model from Thinking Machines Lab, the startup founded by former OpenAI CTO Mira Murati, costs one-fourteenth as much as Gemini, Claude, and GPT. The accuracy figures have not been independently verified, however, which leaves their reliability open to question. The results arrive amid ongoing debate about how well LLMs perform in specialized domains such as finance, where precise data is critical. According to The Decoder, practitioners may consider Qwen3-235B a viable alternative for financial tasks because of its lower cost and competitive accuracy.

By Callan Zhang · Jul 3, 2026

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: The Decoder