Notable multimodal

ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction

Amaya Gallagher-Syed, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Gregory Slabaugh

Published
May 20, 2026 — 17:43 UTC

Problem
This paper addresses the gap in multimodal cancer survival prediction by integrating histopathology and transcriptomics in a biologically interpretable manner. Existing methods often lack interpretability and fail to leverage the rich biological context provided by both imaging and genomic data. The authors present ProtoPathway, a preprint framework that aims to enhance survival prediction accuracy while providing insights into the biological mechanisms underlying cancer progression.

Method
ProtoPathway employs a dual-encoder architecture. On the histopathology side, it utilizes $K$ learnable morphological prototypes that are trained end-to-end with a survival prediction objective. This approach allows for the transformation of variable-length image patches into fixed-size prototype tokens through a soft assignment mechanism, effectively compressing the input data while retaining relevant features. On the genomic side, a bipartite graph neural network encodes gene expression data within the Reactome pathway hierarchy. This network generates pathway embeddings that incorporate both individual gene contributions and their relationships within biological pathways via bidirectional message passing on a shared gene-pathway graph. The model employs cross-modal attention over a prototype-pathway matrix, enabling prototypes to query pathways and model the biological interactions between molecular programs and tissue morphology. The attention matrix serves as an interpretability output, providing insights into the biological hierarchy from genes to pathways and spatial tissue maps.

Results
The authors evaluate ProtoPathway on five cancer cohorts from The Cancer Genome Atlas (TCGA), demonstrating competitive or superior performance in survival prediction compared to existing multimodal baselines. Specific metrics include improved concordance index (C-index) scores, although exact numerical results are not disclosed in the abstract. The model also exhibits reduced computational costs and enhanced biological interpretability, validated through fold-stratified rank-based population-level analysis, which confirms the model’s ability to provide meaningful biological insights.

Limitations
The authors acknowledge that while ProtoPathway improves interpretability and performance, it may still be limited by the quality and representativeness of the training data. Additionally, the reliance on the Reactome pathway database may restrict the model’s applicability to other biological contexts or pathways not covered in this resource. The paper does not address potential biases in the training data or the generalizability of the model across diverse cancer types beyond those evaluated.

Why it matters
ProtoPathway’s integration of multimodal data with a focus on biological interpretability has significant implications for cancer research and clinical practice. By providing a framework that not only predicts survival outcomes but also elucidates the underlying biological mechanisms, this work paves the way for more informed therapeutic strategies and personalized medicine approaches. The availability of the source code and model weights facilitates further research and development in this area, potentially leading to advancements in multimodal machine learning applications in oncology.

Authors: Amaya Gallagher-Syed, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Gregory Slabaugh
Source: arXiv:2605.21454
URL: https://arxiv.org/abs/2605.21454v1

Turing Wire

By Turing Wire editorial staff · May 20, 2026 · Editorial standards →

Source: arXiv cs.CV