Reddit blocks Wayback Machine from archiving user posts amid AI licensing push

Reddit has begun blocking the Internet Archive’s Wayback Machine from preserving user-generated content, including post detail pages, comments, and user profiles. While the Reddit homepage and daily top post titles can still be archived, the deeper content that once formed a searchable record on the archive will no longer be saved.

The company says the decision is aimed at preventing artificial intelligence firms from scraping historical Reddit data without authorization. A spokesperson told The Verge that the Wayback Machine must demonstrate it can comply with platform rules—such as respecting deletions of removed content—before full access could be restored.

However, the move comes against a backdrop of Reddit monetizing its vast store of user posts by licensing it to major AI companies. In 2024, Reddit signed $60 million annual agreements with both Google and OpenAI, allowing them to train AI systems on Reddit data. Around the same time, Reddit also restricted access for search engines like Microsoft Bing and DuckDuckGo unless similar deals were struck.

Critics note that while Reddit frames the block as a privacy measure, users themselves have no way to opt out of having their public posts sold or used to train AI. The only guaranteed way to avoid such usage is to stop posting, and even then, existing content remains under Reddit’s control.

CEO Steve Huffman has been clear about the company’s position, calling Reddit’s data “really valuable” and stating in 2023 that it should not be given to large tech companies for free. This stance aligns with Reddit’s broader efforts to increase revenue and reduce heavy losses: in 2024 the company posted a net loss of $484.3 million, up sharply from $90.8 million in 2023.

For now, the change means the public record of Reddit’s discussions will be harder to preserve outside the platform’s own ecosystem—unless the Internet Archive and Reddit reach terms that satisfy both sides.