
Two and a half years ago, I published an article on Search Engine Land arguing that retrieval-augmented generation (RAG) represented the future of search. In that piece, I said RAG wasn’t Google’s knee-jerk response to ChatGPT, but rather the architecture they’d been assembling since the REALM paper in August 2020. SGE (now AI Overviews) was simply the production rollout of that work.

Everything that has unfolded since has reinforced that view. The specific single-pass RAG pipeline I outlined back then (query → retriever → top-k chunks → LLM → answer with citations), however, is already outdated. All leading AI search systems have evolved beyond it.

Google AI Mode, ChatGPT Search, Perplexity Pro Search, Claude with Computer Use, Gemini Deep Research, and even Microsoft Copilot’s Researcher and Analyst agents now rely on a different setup. They plan. They orchestrate across tools. They retrieve, read, then retrieve again. They evaluate their own initial drafts and decide whether to iterate. The retrieve-once-then-generate pattern that defined the first generation has been left behind.

This is agentic RAG, and it has become the norm. If your GEO strategy is still tuned for single-shot retrieval, you’re optimizing for an ecosystem that no longer operates that way. Worse, in an agentic RAG world, you can’t see the intermediate gatekeepers turning you down; you only see whether you appear in the final response. The classic reverse-engineering toolkit (rank tracking, citation tallies, even prompt-level sampling) exposes only the last step of a multi-stage pipeline. Everything upstream is opaque. By the time you reach…