Towards Automated Discovery: A Review of Generative Models, Multimodal Learning and Closed-Loop Workflows in Inverse Materials Design
Anand Babu, Rogério Almeida Gouvêa, Gian-Marco Rignanese
- Published
- Jun 1, 2026 — 17:20 UTC
Problem — The paper addresses the gap in the literature regarding the automated discovery of materials through inverse design, focusing on the transition from forward prediction to targeted candidate proposals. It highlights the need for a comprehensive understanding of generative models, multimodal learning, and closed-loop workflows in the context of crystalline solids. This work is a preprint and has not undergone peer review.
Method — The authors conduct a systematic review of various generative modeling techniques applicable to crystal structure prediction, including variational autoencoders (VAEs), normalizing flows, autoregressive models, and diffusion models. They emphasize the importance of integrating physical constraints and feasibility checks throughout the design process. The paper discusses how these models learn from extensive databases to facilitate controllable sampling of periodic structures. Additionally, the authors explore multimodal learning approaches that combine diverse data types—such as crystal structures, thermodynamic properties, electronic information, and experimental data—to create a unified representation of chemical space. The review also covers inverse design strategies that merge conditional generation with optimization techniques like Bayesian optimization, reinforcement learning, and active learning.
Results — While the paper does not present original experimental results, it synthesizes findings from various studies, highlighting the effectiveness of different generative models in producing valid and novel materials candidates. The authors note recurring challenges such as surrogate exploitation and diversity collapse, which can hinder the performance of generative models. They propose evaluation metrics for discovery-grade practices, focusing on validity, novelty, uniqueness, stability, and cost, although specific quantitative benchmarks are not provided.
Limitations — The authors acknowledge several limitations, including the potential for models to exploit surrogates, leading to overfitting and lack of generalizability. They also point out the challenges of distribution shift and the stability-synthesizability gap, which can affect the practical applicability of generated materials. Notably, the review lacks empirical validation of the discussed methodologies, relying instead on existing literature without presenting new experimental data.
Why it matters — This review is significant for researchers and engineers in materials science and machine learning, as it consolidates knowledge on generative models and their application in inverse materials design. The insights into multimodal learning and closed-loop workflows can inform future research directions and methodologies in automated materials discovery. The outlined evaluation practices provide a framework for assessing the effectiveness of generative approaches, which is crucial for advancing the field. This work contributes to the ongoing discourse on improving the reliability and efficiency of materials design processes, as published in arXiv cs.LG.
By Turing Wire editorial staff · Jun 1, 2026 · Editorial standards →
Source: arXiv cs.LG