Can AI Replace Human Peer Reviewers? A Comparative Analysis of AI-Generated and Human Expert Reviews

Pit Pichappan, Preethi Pichappan
Journal of Digital Information Management, 23(4), 2025
https://doi.org/10.6025/jdim/2025/23/4/220-233
https://www.dline.info/fpaper/jdim/v23i4/jdimv23i4_2.pdf

This work analyses the performance of AI-generated peer reviews compared with human expert reviews across 62 manuscripts in the Real-Time Intelligent Systems track of the Springer Lecture Notes in Networks and Systems. Using four large language models (ChatGPT 3.5, Perplexity AI, Qwen-3 Max, and DeepSeek), the study evaluated 141 reviews, both AI-generated and human-written, against 12 quality criteria, with scoring by five domain experts. Results show that human reviews scored higher overall (mean = 3.98 vs. 3.15 for AI), with greater consistency and depth, particularly in methodological critique, literature contextualisation, and review confidence. AI reviews were more generic and struggled with scholarly subtlety, though they excelled at summarisation and formatting checks. While the difference only approached statistical significance (p = 0.08), the effect size (Cohen's d = 0.56) indicated a moderate practical gap. The study concludes that AI cannot replace human reviewers but may ethically augment the process in hybrid models under human oversight.
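
For readers unfamiliar with the effect size reported above, the following is a minimal sketch of how Cohen's d is commonly computed for two independent groups, assuming the pooled-standard-deviation form. The abstract does not say which variant the authors used, so this is an assumption. The function name `cohens_d` and the sample scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD.

    Assumption: pooled-SD variant; the study may have used a different one.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its
    # degrees of freedom (n - 1).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical review scores on a 1-5 scale, invented for illustration only.
human_scores = [4.0, 4.5, 3.5, 4.0]
ai_scores = [3.0, 3.5, 2.5, 3.0]

d = cohens_d(human_scores, ai_scores)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's conventional benchmarks (roughly 0.2 small, 0.5 medium, 0.8 large), the reported d = 0.56 falls in the medium range, which is consistent with the abstract's description of a moderate practical gap even though the p-value did not reach the usual 0.05 threshold.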