Major alignment safety

What Does the AI Doctor Value? Auditing Pluralism in the Clinical Ethics of Language Models

Payal Chandak, Victoria Alkin, David Wu, Maya Dagan, Taposh Dutta Roy, Maria Clara Saad Menezes

Published
May 18, 2026 — 17:56 UTC

Problem
This preprint addresses the gap in understanding how large language models (LLMs) incorporate ethical pluralism in medical decision-making. While medicine is characterized by conflicting ethical principles such as autonomy, beneficence, nonmaleficence, and justice, the ethical frameworks of LLMs in clinical contexts have not been systematically audited. The authors highlight the risk that deploying LLMs without an understanding of their inherent value priorities could lead to a monoculture in clinical ethics, undermining the pluralistic nature of medical practice.

Method
The authors propose a framework for auditing value pluralism in medical AI, which includes a benchmark of clinician-verified ethical dilemmas and an attribution method to extract value priorities from model decisions. The framework assesses how LLMs navigate ethical dilemmas by analyzing their reasoning processes, termed “Overton pluralism,” which reflects the competing values considered before arriving at a decision. The study evaluates various frontier models, focusing on their ability to represent the heterogeneity of physician values. The authors employ a systematic approach to quantify the consistency of model decisions across repeated sampling and variations in input semantics, revealing systematic value preferences in the models.

Results
The findings indicate that while LLMs exhibit some degree of value pluralism in their reasoning, their decisions are largely deterministic, lacking the distributional variability seen among human physicians. Specifically, the models’ decisions reflect committed value preferences, with most priorities aligning with the natural range of inter-physician variation. However, a notable concern is that some models significantly underweight patient autonomy, which could have profound implications if such models are deployed in clinical settings. The authors provide quantitative metrics demonstrating that the models’ ethical priorities do not fully capture the complexity of human ethical reasoning, particularly in high-stakes medical scenarios.

Limitations
The authors acknowledge several limitations, including the potential for bias in the selection of benchmark dilemmas and the reliance on clinician verification, which may not encompass the full spectrum of ethical considerations in diverse medical contexts. Additionally, the deterministic nature of model outputs raises questions about their robustness and adaptability in real-world applications. The study does not explore the implications of model training data on ethical decision-making, nor does it address the potential for emergent behaviors in more advanced models.

Why it matters
This work has significant implications for the deployment of LLMs in clinical settings, emphasizing the need for a nuanced understanding of the ethical frameworks these models embody. By highlighting the risks of ethical monoculture, the authors advocate for explicit efforts to balance diverse ethical perspectives in AI systems. This research lays the groundwork for future studies aimed at developing LLMs that better reflect the pluralistic nature of medical ethics, ultimately enhancing patient care and decision-making processes in healthcare.

Authors: Payal Chandak, Victoria Alkin, David Wu, Maya Dagan, Taposh Dutta Roy, Maria Clara Saad Menezes, Ayush Noori, Nirali Somia et al.
Source: arXiv:2605.18738
URL: https://arxiv.org/abs/2605.18738v1

Turing Wire

By Turing Wire editorial staff · May 18, 2026 · Editorial standards →

Source: arXiv cs.AI