FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model
Mahmood Alzubaidi, Uzair Shah, Raden Muaz, Ines Abbes, Nader Mohammed, Abdullatif Magram
- Published
- Jun 9, 2026 — 17:03 UTC
Problem
This paper addresses the critical shortage of trained sonographers in low- and middle-income countries, where over half of pregnant women lack access to skilled ultrasound services. Current deep learning methodologies typically tackle detection, segmentation, or classification tasks in isolation, necessitating separate models and expert-defined labels during inference. The authors present FADA, a preprint work that aims to unify these tasks into a single model, thereby streamlining the ultrasound interpretation process without requiring external labels.
Method
FADA is built on the Qwen3.5-VL architecture and integrates knowledge from four domain-specific foundation models: FetalCLIP, UltraSAM, USF-MAE, and UltraFedFM. The model employs a selective distillation approach, where feature alignment is applied exclusively to annotation tasks, while clinical interpretation is achieved through standard fine-tuning. This method allows FADA to outperform traditional full distillation techniques across various evaluation metrics. The recommended variant, FADA-SKD, is designed to be trainable on a single consumer-grade GPU and is optimized for edge deployment, running on a Qualcomm Snapdragon 7 Gen 1 smartphone with 12 GB of RAM. The model completes the entire five-phase pipeline in approximately 60 seconds offline, demonstrating its practicality for real-world applications.
Results
FADA-SKD achieves a mean Dice score of 0.8820 for segmentation tasks and a mean Average Precision (mAP) of 0.7671 at a 0.50 IoU threshold for detection tasks. Additionally, it reports 100% compliance with structured interpretation standards. Validation by expert sonographers on a dataset of 237 images indicates that 73.5% of interpretations received perfect scores under clinician guidance, confirming the model’s clinical acceptability in both autonomous and human-in-the-loop scenarios.
Limitations
The authors acknowledge that the work is a preprint and has not undergone peer review, which may affect the robustness of the findings. Additionally, while the model demonstrates strong performance in controlled settings, its efficacy in diverse clinical environments and with varied ultrasound equipment remains to be fully validated. The reliance on a single consumer GPU for training may also limit scalability in more complex scenarios requiring larger datasets or more extensive computational resources.
Why it matters
FADA represents a significant advancement in the integration of AI into prenatal care, particularly in resource-constrained environments where access to trained professionals is limited. By enabling efficient, label-free ultrasound interpretation, this work has the potential to enhance diagnostic capabilities and improve maternal and fetal health outcomes. The implications of this research extend to the broader field of medical imaging and AI-assisted diagnostics, paving the way for future innovations in accessible healthcare solutions, as published in arXiv.
By Turing Wire editorial staff · Jun 9, 2026 · Editorial standards →
Source: arXiv cs.AI