The world’s largest machine-learning conference is facing a problem the field has been warning about for years: AI systems are now reviewing AI research, often without disclosure, and sometimes without understanding the work they’re evaluating. According to new reporting from Nature, more than one-fifth of the peer reviews submitted to the 2026 International Conference on Learning Representations (ICLR) were allegedly fully AI-generated. Over half showed clear signs of AI involvement. That’s a remarkable shift for a process meant to depend on expert judgment and subject-matter familiarity.
Peer review is supposed to be the backbone of academic rigor: researchers assess one another’s work before it earns a place at major venues. But Pangram, a US-based company that develops tools for detecting AI-generated text, screened all 19,490 papers and 75,800 peer reviews submitted to ICLR and found widespread reliance on language models. While fewer full papers were entirely generated by AI, the numbers are still notable: roughly 1% of manuscripts appeared fully machine-written, and another 9% contained more than 50% AI-generated text. These findings, based solely on Pangram’s analysis, have yet to undergo external verification.
Researchers began noticing oddities in the review process firsthand. Carnegie Mellon scientist Graham Neubig received a peer review that appeared synthetic and ultimately appealed publicly for help investigating the issue. At the University of Copenhagen, computer scientist Desmond Elliott reported that a review of his student’s work “missed the point of the paper” so thoroughly that the student suspected it came from a large language model. Pangram flagged the review as fully AI-generated, supporting the student’s suspicion.
The consequences aren’t theoretical. Some authors have withdrawn submissions after receiving error-filled, AI-generated reviews that misinterpreted methods or criticized claims that never appeared in the work. Neubig told Nature that the rapid, near-exponential growth of AI research over the past five years has already strained the reviewing system, and that AI-generated reviews are compounding the problem rather than easing it.
This isn’t an isolated academic issue. AI-assisted shortcuts have already become common in education, prompting teachers to revert to in-person essays to maintain accountability. Similar patterns are emerging among professionals: US courts have documented filings that include AI-hallucinated case citations, and consultants and IT workers report a rising tide of generic, AI-generated output slipping into formal workflows. The trend reflects a broader challenge: as AI tools become ubiquitous, distinguishing careful expertise from convincing synthetic text becomes more difficult, especially in fields that depend on precision.
The situation unfolding around ICLR underscores an uncomfortable reality: tools originally developed to advance the study of AI are now introducing noise into the very process that evaluates that research. How conferences and journals adapt, whether through disclosure rules, detection tools, or more fundamental changes to peer review, will shape how credible and transparent AI scholarship remains in the years ahead.

