A study by Stanford, Imperial College London, and the Internet Archive reveals that since 2022, more than a third of newly created websites contain AI-generated content. The analysis, covering samples up to May 2025, used Pangram v3 software to detect synthetic text. The web is automating at an accelerating pace. 🌐
Pangram v3: the detector that exposes the synthetic footprint 🤖
Pangram v3, software developed to identify the linguistic patterns typical of language models, analyzed a massive set of pages. It found that AI-generated content proliferates not only on blogs and affiliate sites, but also in forums and on news portals. The detector can differentiate between human and synthetic text with a low error rate, although increasingly advanced models keep making the task harder.
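To get an intuition for what "detecting linguistic patterns" means, here is a deliberately crude sketch. It is NOT Pangram's method (real detectors use trained language-model classifiers); it just flags text whose vocabulary diversity falls below a made-up threshold, as one toy signal of machine-like repetition. All names and the threshold are invented for illustration.

```python
# Toy illustration only: a crude lexical heuristic, not Pangram v3's approach.
# Real detectors rely on trained classifiers; everything here is hypothetical.

def type_token_ratio(text: str) -> float:
    """Ratio of unique words to total words; repetitive text scores lower."""
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)

def looks_synthetic(text: str, threshold: float = 0.5) -> bool:
    """Flag text whose vocabulary diversity falls below a chosen threshold."""
    return type_token_ratio(text) < threshold

repetitive = "the model said the model said the model said the model"
varied = "Every sentence here brings fresh words, odd turns, new ideas."
print(looks_synthetic(repetitive))  # True: low diversity
print(looks_synthetic(varied))      # False: high diversity
```

A single surface feature like this is easy to fool in both directions, which is exactly why production detectors are far more sophisticated and why their error rate, while low, is not zero.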
Soon we'll need an AI to know what isn't AI 😅
Here's the irony: more and more websites are writing themselves, but nobody seems to read them. Soon we'll have an internet full of machine-generated articles being read by other machines to train new machines. In that loop, humans will be like the friend who arrives late to the party and finds only crumbs. At least we're still better at telling bad jokes.