The UK Competition and Markets Authority has issued a binding order requiring Google to let website owners, such as newspapers and digital media outlets, prevent their content from being used in AI-powered search features such as AI Overviews. The decision seeks to balance publishers' control over their content against the advancement of Google's automated tools.
The technical mechanism behind data exclusion ⚙️
Google will have to implement a technical mechanism that lets websites declare exclusions, similar to the controls that already exist for traditional search crawling. Those controls, such as the noindex robots meta tag or the rules in a robots.txt file, can now be applied specifically to stop Google's AI from extracting snippets or training models on the content. Publishers will be able to configure their sites to block artificial intelligence crawlers, which implies changes to Google's indexing infrastructure and to the way its algorithms process public information.
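To make the idea concrete, here is a minimal sketch in Python of how a publisher might check that a robots.txt file blocks an AI-related crawler while leaving ordinary search crawling open. It rests on several assumptions: the exact control Google will offer under the order is not specified here, so the sketch borrows Google-Extended, an existing robots.txt token that Google documents for AI-related uses of content, purely as an illustration. The example.com URL and the `is_allowed` helper are hypothetical, and the rules are evaluated with Python's standard `urllib.robotparser`, which may not match Google's own parser in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site: block the AI-related token,
# allow every other crawler (including ordinary search crawling).
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://example.com/news/story"  # hypothetical URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, article))
```

With these rules, the check reports the article as blocked for Google-Extended and allowed for Googlebot. A robots.txt rule only states a preference that a well-behaved crawler honors; it does not technically prevent access, which is part of why a regulator-backed obligation matters to publishers.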
AI now asks for permission, like a kid in a candy store 🍬
Now it turns out that artificial intelligence, that know-it-all, do-it-all entity, has to ask permission before using newspaper texts. It's like a super-gifted robot that shows up at your house, drinks your milk, and uses your Wi-Fi without asking, until one day you tell it: hey, ask before raiding the fridge. Google will have to bite its algorithmic tongue and accept that not everything that glitters on the internet belongs to it.