Is It You or Your Environment? A Bayesian Inference Framework for Genomically-Anchored Personalized Physiological Interpretation
Aruna Dey, Suraj Biswas
- Published
- Jun 11, 2026 — 16:38 UTC
Problem
Personalized health AI systems encounter a significant cold-start problem, where machine learning models require extensive individual behavioral data to differentiate between constitutional variations and environmentally driven deviations. This paper addresses this gap by proposing a novel Bayesian inference framework that utilizes genomic profiles as fixed, exogenous anchors for personalized physiological interpretation. The work is presented as a preprint and has not undergone peer review.
Method
The authors propose a Bayesian model where an individual’s genomic profile serves as a personalized prior, denoted as G-hat = μ + Σ(β_i * g_i), with β_i representing GWAS-derived effect sizes and g_i as risk-allele counts. This model initializes a belief state that allows for the separation of physiological measurements (P) into constitutional deviations (delta = P - G-hat). As behavioral data accumulates, the prior is dynamically adjusted using a decay function, transitioning from a genomic-dominated inference to one based on empirical baselines. The framework is applied across six physiological domains, with genomic priors graded by evidence strength, distinguishing between robustly replicated anchors (e.g., FTO, FADS1/2, FKBP5) and contested candidate genes (e.g., SLC6A4, MAOA, DRD2). The authors also delineate the inference boundaries between association, Mendelian randomization, and individual causation, establishing four deployment constraints: evidence-graded priors, dynamic decay, ancestry-matched effect sizes, and attribution-focused outputs.
Results
The proposed framework demonstrates its efficacy through various physiological scenarios, showcasing its ability to generate distinct suppression and enhancement hypotheses based on individual genomic profiles. While specific numerical results and comparisons to baseline models are not detailed in the abstract, the authors emphasize the framework’s capacity to provide personalized insights that traditional models cannot achieve without a genomic anchor.
Limitations
The authors acknowledge several limitations, including the reliance on the availability and accuracy of genomic data, potential biases in effect size estimates, and the need for extensive validation across diverse populations. Additionally, the dynamic decay mechanism may introduce complexities in real-time applications, which are not fully explored in the paper. The lack of empirical validation and performance metrics against established benchmarks is a notable gap.
Why it matters
This work has significant implications for the development of personalized health AI systems, particularly in enhancing the interpretability and accuracy of physiological assessments. By integrating genomic data into the inference process, the framework offers a promising approach to overcoming the cold-start problem, potentially leading to more effective health interventions. The methodology and findings contribute to the broader discourse on causal inference in personalized medicine, as discussed in related literature, and are available on arXiv.
By Turing Wire editorial staff · Jun 11, 2026 · Editorial standards →
Source: arXiv cs.AI