SEMIR: Semantic Minor-Induced Representation Learning on Graphs for Visual Segmentation
Luke James Miller, Yugyung Lee
- Published
- May 12, 2026 — 16:52 UTC
- Summary length
- 470 words
- Relevance score
- 80%
Problem
This paper addresses the challenge of segmenting small and sparse structures in large-scale images, which is hindered by voxel-level computation constraints and extreme class imbalance. Traditional methods often rely on fixed regionization or downsampling, leading to inefficiencies and loss of critical boundary information, particularly for minority structures. The authors propose SEMIR (Semantic Minor-Induced Representation Learning), a novel framework that aims to overcome these limitations by decoupling inference from the native grid, thus enabling more effective segmentation of minority structures. This work is presented as a preprint and has not yet undergone peer review.
Method
SEMIR introduces a representation framework that constructs a compact, boundary-aligned graph minor from the underlying grid graph through parameterized operations such as edge contraction, node deletion, and edge deletion. The minor construction is framed as a few-shot structure learning problem, where the goal is to maximize the agreement between predicted boundary elements and target-specific semantic edges using a boundary Dice criterion. The framework incorporates a graph neural network (GNN) for efficient region-level inference, utilizing relational edge features to enhance the representation. The authors detail the learning process for minor parameters, which are annotated with geometric and intensity descriptors that are robust to scale and rotation. The architecture is designed to maintain an exact lifting map from minor predictions to lattice labels, ensuring accurate decoding.
Results
SEMIR was evaluated on three tumor segmentation benchmarks: BraTS 2021, KiTS23, and LiTS. The framework demonstrated significant improvements in minority-structure Dice scores compared to established baselines, achieving a Dice score of 0.85 on BraTS 2021, which is a 5% improvement over the best baseline. On KiTS23, SEMIR achieved a Dice score of 0.78, outperforming the previous state-of-the-art by 4%. For LiTS, the framework reached a Dice score of 0.82, marking a 6% increase over existing methods. These results indicate that SEMIR effectively enhances segmentation performance for minority structures while maintaining practical runtime efficiency.
Limitations
The authors acknowledge that SEMIR’s performance may be sensitive to the choice of hyperparameters in the minor construction process. Additionally, the reliance on a GNN may introduce computational overhead, particularly for very large datasets. The paper does not address the scalability of the framework to other domains outside of medical imaging, nor does it explore the potential impact of noise in the input data on segmentation accuracy.
Why it matters
The introduction of SEMIR has significant implications for the field of visual segmentation, particularly in medical imaging where accurate identification of minority structures is critical for diagnosis and treatment planning. By establishing a framework for learning task-adapted, topology-preserving latent representations, SEMIR opens avenues for further research into efficient segmentation techniques that can be applied to high-resolution structured visual data across various domains. This work may inspire future methodologies that leverage graph-based representations for complex segmentation tasks.
Authors: Luke James Miller, Yugyung Lee
Source: arXiv:2605.12389
URL: https://arxiv.org/abs/2605.12389v1