Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future
Sihong Wu, Owen Jiang, Yilun Zhao, Tiansheng Hu, Yiling Ma, Kaiyan Zhang, Manasi Patwardhan, Arman Cohan
Problem
This survey addresses a gap in the literature on integrating AI, particularly large language models (LLMs), into the peer review process. Despite growing interest in automating various stages of peer review, no comprehensive synthesis of existing techniques, evaluation methods, and ethical considerations has been available. Note that the survey is itself a preprint and has not undergone peer review, so its findings and methodology should be treated as preliminary.
Method
The authors organize AI contributions to the peer review process into three areas: (i) peer review generation, (ii) after-review tasks, and (iii) evaluation methods. For review generation, they cover fine-tuning strategies for LLMs, agent-based systems, and reinforcement learning (RL)-based methods, along with emerging paradigms that improve generated reviews. After-review tasks include rebuttals, meta-reviews, and manuscript revisions informed by reviewer feedback. Evaluation methods are classified as human-centered, reference-based, LLM-based, or aspect-oriented. The authors also catalog datasets relevant to these tasks, compare modeling choices, and discuss the computational resources required for training and evaluation.
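To make the reference-based evaluation family concrete, here is a minimal sketch (illustrative only, not a method from the survey) that scores a generated review against a human-written reference using a ROUGE-1-style unigram F1; the function name and texts are hypothetical:

```python
from collections import Counter

def rouge1_f1(generated: str, reference: str) -> float:
    """ROUGE-1-style F1: unigram overlap between a generated and a reference review."""
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((gen & ref).values())  # count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Example: partial overlap between two short review snippets.
score = rouge1_f1("the results are strong", "the results are weak")
print(round(score, 2))  # → 0.75
```

Real reference-based pipelines typically use library implementations with stemming and multi-reference support; this sketch only shows the core overlap computation that such metrics share.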
Results
As a literature synthesis rather than an empirical study, the survey reports no quantitative results or comparisons against specific baselines. It instead offers a qualitative assessment of state-of-the-art techniques and their applicability across the stages of peer review. The authors argue that LLMs can improve both the efficiency and quality of peer reviews, though they do not quantify effect sizes or performance gains.
Limitations
The authors acknowledge several limitations, including the variable quality of LLM-generated reviews and the biases inherent in training datasets. They also raise ethical concerns around using AI in peer review, such as transparency, accountability, and the risk of perpetuating existing biases in academic publishing. A further limitation, not explicitly discussed, is the lack of empirical validation of the surveyed methods in real-world peer review settings, which leaves their practical applicability uncertain.
Why it matters
This survey matters because it provides a foundational overview for researchers and engineers seeking to apply AI to the peer review process. By synthesizing existing techniques and evaluation methods, it lays the groundwork for building robust AI systems that assist reviewers, editors, and authors. Its treatment of ethical considerations also underscores the need for responsible AI deployment in academic settings, which is essential for preserving the integrity of peer review.
Authors: Sihong Wu, Owen Jiang, Yilun Zhao, Tiansheng Hu, Yiling Ma, Kaiyan Zhang, Manasi Patwardhan, Arman Cohan
Source: arXiv:2604.27924
URL: https://arxiv.org/abs/2604.27924v1