Smarter edits? Post-editing with error highlights and translation suggestions
Fleur V. J. van Tellingen, Gautam Ranka, Dora Žugčić, Joyce van der Wal, Andrea Camasta, Livio Guerra, Alina Karakanta
Published: May 20, 2026, 13:09 UTC
Problem
This preprint examines whether enhanced post-editing (PE) features for machine translation (MT) actually help translators, focusing on two of them: error highlights derived from quality estimation (QE) and suggestions from automatic post-editing (APE). Although MT quality keeps improving, empirical evidence for the benefits of these tools remains limited, particularly in professional translation contexts.
Method
The authors conducted a user study with professional translators working from English into Dutch. Participants post-edited MT output under three conditions: standard PE with no assistance, PE with QE-derived error highlights, and PE with APE-derived error highlights plus correction suggestions. The study measured productivity (time taken), translation quality (scored against a rubric), and user experience (via surveys). In the APE condition, LLM-derived error highlights pointed translators to likely errors, and the accompanying correction suggestions were meant to make fixing them easier. The study does not disclose specific architectural details of the APE system or the computational resources used for training.
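To make the three conditions concrete, here is a minimal sketch of how highlights and suggestions could be represented and applied. It is not the authors' tooling: the `Highlight` class, the `[[...]]` rendering, the toy Dutch sentence, and the seconds-per-word measure are all assumptions made for illustration, and the sketch assumes highlighted spans do not overlap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Highlight:
    """Hypothetical record: a flagged character span [start, end) in the MT output."""
    start: int
    end: int
    # Only filled in for the condition that also offers correction suggestions.
    suggestion: Optional[str] = None


def render(mt: str, highlights: list[Highlight]) -> str:
    """Wrap each highlighted span in [[...]] so the flagged text stands out."""
    out, cursor = [], 0
    for h in sorted(highlights, key=lambda h: h.start):
        out.append(mt[cursor:h.start])
        out.append("[[" + mt[h.start:h.end] + "]]")
        cursor = h.end
    out.append(mt[cursor:])
    return "".join(out)


def accept_suggestions(mt: str, highlights: list[Highlight]) -> str:
    """Apply every available suggestion, right to left so earlier offsets stay valid."""
    for h in sorted(highlights, key=lambda h: h.start, reverse=True):
        if h.suggestion is not None:
            mt = mt[:h.start] + h.suggestion + mt[h.end:]
    return mt


def seconds_per_word(edit_seconds: float, source_text: str) -> float:
    """Assumed productivity measure: editing time divided by source word count."""
    return edit_seconds / max(len(source_text.split()), 1)


source = "The cat sat on the mat."
mt = "De kat zit op de mat."  # toy MT output with a tense error ("zit" vs "zat")

highlight_only = [Highlight(7, 10)]              # highlight without a suggestion
with_suggestion = [Highlight(7, 10, "zat")]      # highlight plus a correction suggestion

print(render(mt, highlight_only))                # De kat [[zit]] op de mat.
print(accept_suggestions(mt, with_suggestion))   # De kat zat op de mat.
print(seconds_per_word(12.0, source))            # 2.0
```

In this framing, standard PE corresponds to an empty highlight list, so the same session data can be compared across all three conditions.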
Results
Neither of the assisted conditions produced significant productivity or quality improvements over standard PE. However, participants rated APE-derived highlights more favorably than QE-derived highlights, suggesting a preference for the APE approach. The correction suggestions from the APE system also improved overall user experience, although no quantified metrics for this improvement were provided. The study thus shows that translators receive different error-highlighting methods differently, even when none of them yields measurable gains in productivity or quality.
Limitations
The authors acknowledge several limitations. The lack of significant productivity and quality gains across conditions may indicate that the enhancements do not substantially change the post-editing workflow. The study is also limited to a single language pair (English to Dutch) and a particular group of professional translators, which may restrict the generalizability of the results. Finally, the absence of detailed information about the APE system’s architecture and training data limits reproducibility and further exploration of the method’s efficacy.
Why it matters
This work contributes to the ongoing discussion of how advanced post-editing tools fit into professional translation workflows. By showing that translators prefer APE-derived highlights and suggestions, the study points to ways of improving user experience in MT systems. The findings may inform future research on the design of post-editing tools, emphasizing user-centered approaches to features intended to improve the efficiency and satisfaction of professional translators. Over time, this could lead to MT systems that better support human translators, although the study itself found no measurable gains in translation quality or productivity.
Authors: Fleur V. J. van Tellingen, Gautam Ranka, Dora Žugčić, Joyce van der Wal, Andrea Camasta, Livio Guerra, Alina Karakanta
Source: arXiv:2605.21135
URL: https://arxiv.org/abs/2605.21135v1
By Turing Wire editorial staff · May 20, 2026
Source: arXiv cs.CL