Forecasting Conceptual Diffusion in Science: The Case of Quantum Computing
Thomas Maillart, Thibaut Chataing, David Dosu, Paul Bagourd, Julian Jang-Jaccard, Alain Mermoud
- Published
- Jun 2, 2026 — 17:12 UTC
Problem
This work addresses the gap in understanding the dynamics of scientific concept diffusion, particularly distinguishing between endogenous consolidation and exogenous diffusion. The authors highlight the lack of predictive models in the literature that can effectively forecast these phenomena, especially in rapidly evolving fields like quantum computing. The study is presented as a preprint, indicating that it has not yet undergone peer review.
Method
The authors construct a temporally resolved concept co-occurrence network using the quantum computing subtree from OpenAlex. They employ LightGBM, a gradient boosting framework, to model four key outcomes: endogenous reinforcement, exogenous diffusion, their ratio, and diffusion entropy. The features used for training include distributional and diversity-aware metrics, which capture the structural properties of the citation network. The model’s performance is evaluated against the overall growth of the scientific literature, with a focus on the upstream citation lineage and downstream diffusion patterns. The authors also conduct SHAP (SHapley Additive exPlanations) analyses to interpret feature importance and validate their findings across other domains, including robotics, advanced materials, and neuro implants.
Results
The results indicate that endogenous reinforcement in quantum computing is largely unpredictable, with a low R² value, while exogenous diffusion and entropy show strong predictability, achieving R² values up to 0.78. The predictive power is driven by factors such as upstream heterogeneity, citation breadth, and distributional dispersion. In cross-domain replications, exogenous diffusion remains the top-ranked target, with R²_test values ranging from 0.60 to 0.87. Notably, in the neuro implants domain, endogenous predictability improves significantly, reaching R²_test = 0.83. Case studies reveal that increases in entropy correlate with the emergence of new conceptual frontiers, while decreases signal technological convergence or paradigm shifts.
Limitations
The authors acknowledge that the unpredictability of endogenous reinforcement in quantum computing may limit the model’s applicability in certain contexts. Additionally, the study’s reliance on citation data may introduce biases related to the visibility and impact of concepts. The generalizability of findings across all scientific domains is also questioned, as the observed asymmetry in quantum computing does not uniformly extend to other fields.
Why it matters
This research provides a scalable framework for anticipatory scientometrics and technology foresight, enabling stakeholders to identify early signals of cross-domain uptake and inform innovation-oriented policy analysis. The findings underscore the importance of structural regularities in semantic and citation environments, which can guide future research in understanding scientific change. This work contributes to the broader discourse on the dynamics of knowledge diffusion, as published in arXiv.
By Turing Wire editorial staff · Jun 2, 2026 · Editorial standards →
Source: arXiv cs.LG