
AI search doesn’t just translate or localize what you see. It actively determines which sources, institutions, and versions of reality appear in the first place.

Catalonia is a revealing stress test for this system. Two languages coexist in the same territory, which makes it easier to detect how retrieval patterns shift. When you run identical queries in Catalan and Spanish across Google AI Overviews and ChatGPT, the contrasts go well beyond phrasing and expose deeper structural issues that affect far more than multilingual regions.

Catalonia as a stress test for AI search

Did you know that if you search for Tradicions de Sant Jordi (Saint George’s Traditions, written in Catalan), Google Translate will label the source language as Occitan?

Probably not. Most Catalan speakers are unaware of this, in part because Translate’s guess isn’t entirely incorrect: Catalan and Occitan share a common Romance origin, and some linguistic taxonomies group them together. The classification is, in a narrow sense, defensible. But statistically, it’s a strange choice, and exactly the kind of small, telling anecdote that hints at a much larger infrastructural problem underneath.

Google Translate showing “Detectado: Occitano” with input “Tradicions de Sant Jordi” and output “Tradiciones de San Jorge”

Occitan has around 200,000 speakers, mainly in southern France. Catalan, by contrast, has about 9 million speakers and is a co-official language of Catalonia, one of Europe’s more affluent regions and home to a city where Google has maintained operations for over two decades. Yet, when queried from a Barcelona IP, Google’s translation tool still concludes that the more likely source language is the…