
What Are LLMs Doing to Scientific Communication? Measuring Changes in Writing Practices and Reading Experience

Filip Miletić, Neele Falk

Published
May 19, 2026 — 14:54 UTC

Problem
This preprint examines how the integration of large language models (LLMs) into scientific writing has changed writing practices and the reading experience in Natural Language Processing (NLP). Although LLMs are now widely used, there is little empirical analysis of their effect on scientific communication, particularly of lexical and stylistic changes in published papers.

Method
The authors build two datasets: a naturalistic corpus of over 37,000 papers from the ACL Anthology published between 2020 and 2024, and a synthetic dataset of 3,000 human-written passages paired with their LLM-generated revisions. They run diachronic lexical analyses to track changes in word frequency and usage contexts over time. They also model more complex stylistic features, focusing on syntactic constructions, word complexity, and lexical diversity. Finally, a pilot annotation study with 20 domain experts evaluates the subjective experience of reading LLM-modified texts compared with the original human-written texts.
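
As a rough illustration of the diachronic frequency step, the sketch below computes how often a word occurs per year, normalized by that year's token count. It is not the authors' pipeline: the tokenizer, the function name `relative_frequency_by_year`, the toy corpus, and the example word "delve" are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase alphabetic words, hyphenated forms kept whole.
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def relative_frequency_by_year(docs, word, per=1_000_000):
    """Return {year: occurrences of `word` per `per` tokens}.

    `docs` is an iterable of (year, text) pairs. Normalizing by the yearly
    token total keeps years with more papers from dominating the counts.
    """
    hits = Counter()
    totals = Counter()
    for year, text in docs:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {
        year: per * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Toy stand-in for a corpus of dated papers (invented sentences).
toy_corpus = [
    (2020, "We propose a model and evaluate the model on two tasks."),
    (2024, "We delve into the model and delve into two tasks."),
]

freqs = relative_frequency_by_year(toy_corpus, "delve", per=1000)
print(freqs)
```

Comparing such per-year rates across the 2020 to 2024 range is one simple way to spot words whose usage shifts over time; changes in usage context would need a separate analysis of the words surrounding each occurrence.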

Results
The analysis reveals significant changes in both lexical frequency and usage contexts, indicating trends towards both semantic specialization and generalization. LLM-modified texts show a higher prevalence of specific syntactic constructions, more complex word choices, and longer words, along with lower lexical diversity. In the pilot study, experts rated LLM-improved texts as more understandable and engaging than the originals, with an average understandability rating of 4.2 out of 5. Qualitative feedback, however, revealed pervasive skepticism towards LLMs: experts voiced concerns about what AI-assisted writing means for the integrity of scientific communication.
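
Two of the stylistic signals named above, lexical diversity and word length, can be approximated with very simple statistics. The sketch below uses the type-token ratio as a stand-in for lexical diversity; the summary does not say which measure the authors use, so that choice, the function name `style_profile`, and both example sentences are assumptions.

```python
import re


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def style_profile(text):
    """Return two coarse style statistics for a passage.

    type_token_ratio: distinct words divided by total words (lower means
    more repetition). mean_word_length: average characters per word.
    """
    tokens = tokenize(text)
    if not tokens:
        return {"type_token_ratio": 0.0, "mean_word_length": 0.0}
    return {
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


# Invented example pair: a plain sentence and a wordier hypothetical revision.
original = "We test the model on two sets"
revised = "We comprehensively evaluate the model and evaluate the baseline"

original_profile = style_profile(original)
revised_profile = style_profile(revised)
print(original_profile)
print(revised_profile)
```

The type-token ratio is sensitive to passage length, so a fair comparison between an original and its revision would use passages of similar length or a length-corrected measure.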

Limitations
The authors acknowledge that their findings rest on a limited time span (2020 to 2024) and may not capture long-term trends in scientific writing. The pilot study's small sample (20 experts) may not be representative of the broader scientific community, which limits the generalizability of the subjective reading experience results. The study also does not explore the biases LLMs may introduce into scientific discourse or the implications of these changes for the peer review process.

Why it matters
The findings bear on the future of scientific communication as LLMs become more integrated into writing workflows. The observed changes in writing practices could influence how research is perceived and understood, and so affect knowledge dissemination and collaboration within the scientific community. The mixed reactions from domain experts also point to a need for continued discussion of the ethical and practical consequences of AI in academia: LLMs may improve readability, but they raise questions about authorship, originality, and the role of human expertise in scientific writing.

Authors: Filip Miletić, Neele Falk
Source: arXiv:2605.19936
URL: https://arxiv.org/abs/2605.19936v1

Turing Wire

By Turing Wire editorial staff · May 19, 2026

Source: arXiv cs.CL