109989 Apr 2026
It achieves a high success rate because LLMs are highly likely to follow instructions appearing at the very beginning of a prompt.
It has proven effective even against common "reviewer defenses," such as light editing or rephrasing.
The number 109,989 is the total count of combinations created by pairing the 9,999 most common surnames (from U.S. Census data) with a random year between 2014 and 2024. That range covers 11 years inclusive, so 9,999 × 11 = 109,989.
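The count above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function and constant names are illustrative assumptions and do not come from the study.

```python
# Illustrative constants taken from the description above.
NUM_SURNAMES = 9_999                   # most common surnames (U.S. Census data)
FIRST_YEAR, LAST_YEAR = 2014, 2024     # inclusive year range


def watermark_space_size(num_surnames: int = NUM_SURNAMES,
                         first_year: int = FIRST_YEAR,
                         last_year: int = LAST_YEAR) -> int:
    """Return the number of distinct (surname, year) pairs."""
    num_years = last_year - first_year + 1  # inclusive range: 11 years
    return num_surnames * num_years


print(watermark_space_size())
```

The `+ 1` matters: the total only comes out to 109,989 if both 2014 and 2024 are counted.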
As a tool for academic integrity, this framework has several notable advantages and limitations, according to the study's findings:
The primary limitation is that it relies on indirect prompt injection (hidden text placed in the source PDF), so it only works if the reviewer uploads that specific document to an AI tool. (Source: "Detecting LLM-Generated Peer Reviews," arXiv.)
The system prompts an LLM to start its review with a specific phrase, such as: "Following [Surname] et al. ([Year]), this paper...".
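The phrase template above can be sketched in Python as follows. This is a hedged illustration, not the study's implementation: the three-name `SURNAMES` list is a placeholder for the 9,999 Census surnames, and the helper names (`sample_watermark`, `opening_phrase`, `starts_with_watermark`) are hypothetical. The check is deliberately simple and only looks at the start of the review.

```python
import random
import re

# Placeholder list; the described scheme uses the 9,999 most common U.S. surnames.
SURNAMES = ["Smith", "Johnson", "Williams"]
YEARS = range(2014, 2025)  # 2014..2024 inclusive


def sample_watermark(rng: random.Random) -> tuple[str, int]:
    """Pick a random (surname, year) pair to serve as the watermark."""
    return rng.choice(SURNAMES), rng.choice(YEARS)


def opening_phrase(surname: str, year: int) -> str:
    """Build the opening phrase the hidden instruction asks the LLM to use."""
    return f"Following {surname} et al. ({year}), this paper"


def starts_with_watermark(review: str, surname: str, year: int) -> bool:
    """Check whether a review opens with the chosen fake citation.

    Whitespace is matched loosely so minor reformatting does not break the check.
    """
    pattern = rf"^\s*Following\s+{re.escape(surname)}\s+et\s+al\.\s*\(\s*{year}\s*\)"
    return re.search(pattern, review) is not None


rng = random.Random(0)
surname, year = sample_watermark(rng)
review = opening_phrase(surname, year) + " studies watermarking of peer reviews."
print(starts_with_watermark(review, surname, year))
```

A production detector would need a more tolerant match to survive the light editing and rephrasing mentioned elsewhere in this summary; the strict prefix match here is only meant to show the idea.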
Based on recent research on the detection of AI-generated content, 109989 refers to a specific set of 109,989 possible watermarks used to identify peer reviews written by Large Language Models (LLMs).
The framework provides strong statistical guarantees, maintaining a low "family-wise error rate" (FWER), which prevents human-written reviews from being falsely flagged as AI.
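To give a rough sense of why a large watermark space helps keep the FWER low, here is a back-of-the-envelope bound. It rests on assumptions that are not stated in the text above: the watermark is drawn uniformly at random from the N = 109,989 pairs, independently of what a human reviewer writes, and a review is flagged only if it opens with exactly the chosen pair. The study's actual test and guarantee may differ.

```latex
% Assumed setup: N equally likely watermarks, n human-written reviews.
% A single human review can open with at most one (surname, year) pair, so
\Pr[\text{review } i \text{ falsely flagged}] \le \frac{1}{N}, \qquad N = 109{,}989 .
% Union bound over all n human-written reviews:
\mathrm{FWER} = \Pr[\text{at least one false flag}] \le \sum_{i=1}^{n} \frac{1}{N} = \frac{n}{N} .
% Example: n = 1{,}000 reviews gives FWER \le 1000 / 109989 \approx 0.0091 .
```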