Is GEO just SEO in disguise? What Google claims and where it doesn’t add up


The question of whether Generative Engine Optimization (GEO) is a standalone discipline or merely a marketing rebrand of classic SEO has very practical consequences. It affects budgets, strategies, and the way companies prepare their websites for the world of AI search. In its recently published guide “Optimizing your website for generative AI features on Google Search”, Google answers the question unequivocally: “From Google Search’s perspective, optimizing for generative AI search is optimizing for the search experience, and thus still SEO.”[1] In other words: no GEO, no AEO, no AI SEO – just good old SEO. The guide also recommends ignoring things like llms.txt, content chunking, rewriting texts for AI, structured data, or chasing inauthentic mentions. From Google Search’s perspective, all of this makes internal sense. But it’s worth asking whether the position holds up outside the bubble of SEO consultants and Google itself.

In the arguments against GEO as a standalone discipline, two claims keep coming up that deserve clarification – one rooted in history, the other stemming from a technical misunderstanding of how AI systems actually work.

First claim: “Google dominates, therefore GEO = SEO for Google”

The thesis goes roughly like this: Google is and will remain the dominant supplier of information, so any debate about GEO outside of Google is pointless, and those preoccupied with GEO (typically heard from the SEO consultant community) are selling snake oil, because “GEO is just SEO.” The Google document mentioned above (links to all sources are at the end of this article) then serves as a very strong argument for this view.

The truth is that for the past twenty years, this held up. Google really did dominate online information retrieval, and SEO – with its set of techniques for optimizing for search engines – was a fairly critical marketing channel for online visibility. But user behaviour is actually changing, and the data confirms it.

ChatGPT crossed 900 million weekly active users in February 2026, more than double the 400 million reported in February 2025 [2][3]. The Gemini app surpassed 750 million monthly active users in Q4 2025 [4][5]. At the same time, pressure on classic search keeps mounting – an Ahrefs study repeated on December 2025 data showed a 58% drop in CTR for the first position in the SERP when an AI Overview is displayed, compared to 34.5% in the April version of the same measurement [6][7]. Seer Interactive, analysing 3,119 informational queries, records a 49.4% to 65.2% decline in organic CTR for queries with AIO [8]. Pew Research Center, on a sample of 900 U.S. adults, recorded a drop from 15% to 8% of clicks on traditional results where an AI summary appeared; clicks on links inside the summary itself made up just 1% [9][10].

People are going elsewhere, and even where they stay, the clicks are migrating into AI interfaces. The argument that “Google is dominant, therefore GEO makes no sense outside of SEO” rests on a premise that is still factually true today, but one with a fast-approaching expiry date. And there’s one more fact: Google is cannibalising itself. The classic SEO victory, first position in the SERP, today brings roughly half the clicks it did two years ago wherever an AI Overview is shown. For companies and their brands, this means GEO is not a “just in case” investment, but preparation for a user exodus that has already begun.

Second claim: “AI doesn’t know me, so GEO makes no sense”

The second view is more common among non-technical marketing specialists and concerns the way AI systems actually retrieve information. A typical scenario: someone opens ChatGPT, asks “what do you know about brand X,” and gets an outdated or inaccurate answer. From this they conclude “there’s no point optimizing for GEO, AI doesn’t know us anyway” or “it gives old and inaccurate information.”

This reasoning overlooks two things at once: how modern AI systems actually retrieve information, and the weaknesses in the company’s own online communication. The latter is rarely welcome news.

First, consider how most chat interfaces of large language models such as ChatGPT, Perplexity, Claude, Copilot, or Gemini actually work today. When synthesising answers, they use what’s called RAG (retrieval-augmented generation): they pull information from the live web or from the index of one of the traditional search engines. When a user asks about something specific or recent, the model does not build its answer primarily from the pre-trained corpus. It runs a web search, identifies which pages likely contain the information, optionally fetches their current content, and generates an answer with citations from that.
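The retrieval flow described above can be sketched in a few lines. This is a toy illustration under loud assumptions: the `INDEX` list, the `example.com` URLs, and the word-overlap ranking are all hypothetical stand-ins, and a real system would query a web index and hand the retrieved text to a language model instead of echoing it.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


# Hypothetical stand-in for a search index; real systems query a live web index.
INDEX = [
    Page("https://example.com/restaurants-ostrava",
         "Best restaurants in Ostrava, updated this month."),
    Page("https://example.com/france",
         "Paris is the capital of France."),
]


def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words, ignoring , and ."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def search(query: str, index: list[Page], top_k: int = 1) -> list[Page]:
    """Toy retrieval: rank pages by how many query words they contain."""
    query_words = words(query)

    def overlap(page: Page) -> int:
        return len(query_words & words(page.text))

    ranked = sorted(index, key=overlap, reverse=True)
    return [page for page in ranked[:top_k] if overlap(page) > 0]


def answer(query: str, index: list[Page]) -> str:
    """Build an answer from retrieved text and attach each source as a citation."""
    hits = search(query, index)
    if not hits:
        return "No source found."
    # Assumption: a real system would pass the retrieved text to an LLM here.
    return " ".join(f"{page.text} [source: {page.url}]" for page in hits)
```

Calling `answer("best restaurant in Ostrava", INDEX)` returns the Ostrava page text together with its URL. The point of the sketch is only that the answer is assembled from what retrieval finds at query time, not from what the model memorised during training.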

So it’s not about getting into the training data. Within any timeframe a company can influence, that simply can’t be done, and for current and frequently changing data it is essentially impossible. The point is to be a citable, authoritative, and well-structured source at the moment the AI triggers retrieval. The main pre-trained knowledge corpus is relevant for questions like “what is the capital of France,” not for “where can I find the best restaurant in Ostrava.” And the overwhelming majority of what people actually do in AI systems are precisely these commercial and current queries. The question is therefore not “does AI know me from its training data,” but “is my content well-structured, citable, consistent, and current enough for AI to pick it up during retrieval?” That’s a somewhat different discipline from classic SEO, even though they overlap at many points.

In the GEO debate, one more thing is often forgotten: Google is just one of the answer systems. ChatGPT, Perplexity, Claude, Gemini, Copilot – each of them has different retrieval, different sources, a different way of selecting citations, and a different market position in a given region. Optimization that doesn’t matter to Google can make the difference between citation and being skipped on another system, and vice versa. Google’s guide describes optimization for Google. Generalising it to the entire AI search ecosystem is not understanding – it’s distortion.

And now let’s get a bit polemical…

Let me now look at the document from a GEO perspective. The document states, for example, that “Google systems are able to understand the nuance of multiple topics on a page and show the relevant piece to users.” From this Google concludes that content chunking is not necessary.

But when we look at what AI Overviews have actually produced from launch in May 2024 essentially until today, we see a different picture. The feature recommended adding non-toxic glue to pizza to help the cheese stick better (the source was an eleven-year-old joke from Reddit). It recommended eating at least one small rock a day, allegedly according to “geologists at UC Berkeley” – in reality a satirical article from The Onion. It claimed that grease fires should be put out by adding more oil, that Obama was the first Muslim president of the USA, and that Andrew Johnson earned university degrees between 1947 and 2012, despite having died in 1875 [11][12]. When a system can’t distinguish that a satirical article about eating rocks is not a source of nutritional recommendations, it’s bold to claim it understands subtle nuances across multiple topics on a single page.

We don’t have to look far for a more recent example. Just today, in a film, I heard the slang word “nargle”; the translators were trying to convey the “different vocabulary” of one of the characters. I was curious what these “nargles” actually were. With apologies to anyone from Brno, here in the Ostrava region we really don’t call narcs “nargles”. Even though both the interface and the query were in Czech, Google preferred the English name for the magical creatures from Harry Potter (yes, those pestering creatures the Medek brothers gave the lovely Czech name “škrkna”).

…maybe about what’s actually better for RAG

In the introduction, Google further describes how AI Overviews technically work: RAG retrieves specific passages from pages and generates an answer from them [1]. A few paragraphs later, it claims content chunking is not needed. Something doesn’t add up here. When retrieval works with passages and reranking selects them based on relevance to the query, it’s technically self-evident that a thematically clean, well-separated passage has a higher chance of being chosen as a citation than a long paragraph mixing five topics. This follows from how RAG technically works. And practice confirms it: thematically focused, well-structured pages get cited more often in AI answers than universal “everything about topic X” texts. Sure, chunking isn’t mandatory. But claiming it plays no role when the system literally retrieves chunks contradicts the very technical description in the introduction of the same guide.
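To make the passage argument concrete, here is a minimal scoring sketch. It assumes a plain bag-of-words cosine similarity as a stand-in for the learned embeddings and rerankers that production systems use, and both passages are invented for illustration.

```python
import math
import re
from collections import Counter

QUERY = "remove weeds without chemicals"

# Hypothetical passages: one focused on a single topic, one mixing several.
FOCUSED = ("Remove weeds without chemicals by pulling them by hand "
           "and mulching the soil.")
MIXED = ("Our company history, lawn mower prices, how to remove weeds, "
         "opening hours and the best coffee in town.")


def bag_of_words(text: str) -> Counter:
    """Count lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, passages: list[str]) -> list[tuple[float, str]]:
    """Return (score, passage) pairs, best match first."""
    query_vector = bag_of_words(query)
    scored = [(cosine(query_vector, bag_of_words(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

With these inputs, `rank(QUERY, [MIXED, FOCUSED])` puts the focused passage first, because the off-topic words in the mixed paragraph dilute its similarity to the query. Real rerankers are far more sophisticated, but the dilution effect is the same intuition the argument above relies on.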

…or about structured data

“Structured data isn’t required for generative AI search, and there’s no special schema.org markup you need to add” is of course technically true, because AI systems can parse HTML and understand synonyms. But functionally, it’s a significant oversimplification.

The RAG pipeline in AI systems needs facts in a clear, unambiguous form. Structured data is the cheapest way to offer them to the AI. Schema.org markup for Article, Product, FAQ, HowTo, Author, Organization, sameAs links to Wikidata and other knowledge bases – all of this is used by AI systems for grounding and entity recognition. Whether “Apple” in a given context is a company or a fruit is often determined more reliably by entity context than by natural text alone.
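As an illustration of what such markup looks like, the sketch below builds a schema.org `Organization` description and wraps it in a JSON-LD script tag. The brand name, the URLs, and the Wikidata identifier are placeholders, and the helper name `to_json_ld_script` is my own, not part of any library.

```python
import json

# Hypothetical organization; every value below is a placeholder.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Brand",
    "url": "https://www.example.com",
    # sameAs ties the entity to external knowledge bases (placeholder ID).
    "sameAs": ["https://www.wikidata.org/wiki/Q0000000"],
}


def to_json_ld_script(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> element for the page <head>."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(to_json_ld_script(organization))
```

The `sameAs` link is the part that helps with the “Apple the company or apple the fruit” problem: it states explicitly which known entity the page is about.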

There’s academic backing for this. The Princeton paper by Aggarwal et al., “GEO: Generative Engine Optimization”, presented at the KDD ’24 conference, assembled the GEO-bench benchmark with 10,000 queries from various domains and measured how specific content modifications affect visibility in generative responses [13][14]. The three most effective methods – adding sources, quotations, and statistics – increase visibility by 30 to 40% on the Position-Adjusted Word Count metric. That’s not exactly a marginal improvement. It signals that the way content is structured and supported plays a fundamental role in generative retrieval.
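To give a feel for what a position-weighted word-count metric measures, here is a simplified sketch. It is not the paper’s exact definition: the exponential decay by sentence position and the normalisation so that shares sum to 1 are my assumptions for illustration, and the source names are invented.

```python
import math
from collections import defaultdict


def position_weighted_shares(sentences: list[tuple[str, str]]) -> dict[str, float]:
    """Share of an answer attributed to each source.

    `sentences` holds (source_id, sentence_text) pairs in answer order.
    Earlier sentences get a larger weight (assumed decay: exp(-position / n)).
    """
    n = len(sentences)
    weighted: dict[str, float] = defaultdict(float)
    for position, (source, text) in enumerate(sentences):
        weight = math.exp(-position / n)
        weighted[source] += len(text.split()) * weight
    total = sum(weighted.values())
    if not total:
        return {}
    return {source: value / total for source, value in weighted.items()}


# Hypothetical answer: two sentences of equal length citing different sources.
example = [
    ("a.example", "one two three four five"),
    ("b.example", "one two three four five"),
]
shares = position_weighted_shares(example)
```

Even though both sources contribute the same number of words, `a.example` ends up with the larger share because its sentence comes first. That is the sense in which a visibility lift on such a metric means more words, earlier in the answer.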

Google itself has a knowledge graph built on structured data. Microsoft Copilot, according to Microsoft’s own statement, takes structured data from the Bing index. The claim that schema.org “is not needed” is therefore understood from a GEO perspective as follows: for mere indexing and appearance in rich results, it’s not a necessity, but for maximising the probability that an AI system will correctly recognise, classify, and cite your information, it’s one of the best cost/performance ratios available today.

…and what about brand mentions?

On brand mentions, Google writes: “Seeking inauthentic ‘mentions’ across the web isn’t as helpful as it might seem.” With the “inauthentic” caveat, that is easy to agree with: AI systems can detect and ignore fake mentions. In practice, however, AI systems cite and mention brands based on source diversity, knowledge graph context, and the tonality of appearances on reputable domains. Paid collaborations, PR, hidden advertising, or partner texts are not easy to detect, so the line between “authentic” and “inauthentic” is in reality much harder to draw than Google’s guide suggests.

An Ahrefs analysis of 75,000 brands found a correlation of 0.664 between brand mentions on the web and citation rate in AI answers. That’s roughly three times stronger than the relationship for classic backlinks (0.218), and importantly, mentions work even without a hyperlink – based purely on the brand name appearing in an authoritative context [15]. An Averi analysis on 680 million citations shows that ChatGPT and Perplexity share only 11% of cited domains; similarly, 12% comes from an independent study of 15,000 queries across ChatGPT, Perplexity, and Google AI [16]. Each platform has its own source pool and its own selection logic.

Citation rate in ChatGPT, Perplexity, and AI Overviews then correlates strongly with the number and quality of authentic mentions in authoritative sources: industry media, Wikipedia, expert databases, comparison sites, forums like Reddit, or PR coverage. Specific source preferences are currently quite narrow. ChatGPT cites Wikipedia in 47.9% of its top citations, Perplexity cites Reddit in 46.7%, and Google AI Overviews strongly prefers YouTube with a 23.3% citation share [17]. On top of this, domains with active profiles on comparison sites like G2 or Capterra show a three times higher probability of citation in ChatGPT than sites without such presence [18]. Both of these sites encourage rewarding reviewers. Likewise, many brands demonstrably reward their customers (and not just them) for writing “authentic” reviews. In classic SEO, you simply bought links for link juice as part of link building. In GEO, you acquire mentions for entity reputation, often without a link, for example through paid articles.

On query fan-out – the breaking down of a topic into related sub-queries – Google describes that the model generates parallel related queries. For example, “how to fix a lawn that’s full of weeds” → “best herbicides for lawns”, “remove weeds without chemicals”, “how to prevent weeds in lawn”. What does this mean for website owners? Is the all-encompassing article really the right source for AI to clearly split the information? Or is it better to chop it into clear, named entity-based parts?

Clearly, for visibility in this regime, it’s wise to cover related sub-topics. At the same time, Google warns in the text against “scaled content abuse”, meaning the creation of separate pages for every variation. The line between “legitimate coverage of related queries” and “spam scaled content” is again very blurry in the guide, and in practice the two are separated only by the quality and depth of treatment. For a GEO practitioner, this means regular calibration, and that calibration works differently than in classic SEO: there it is primarily about keyword coverage, whereas in GEO it is about covering the entity space.
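The fan-out idea lends itself to a simple coverage check. The sketch below takes the three sub-queries from Google’s example and asks which section of a page would answer each one. The page sections are hypothetical, and the word-overlap matching with a `min_overlap` threshold is an assumption standing in for real retrieval.

```python
import re

# Sub-queries taken from the fan-out example quoted above.
SUB_QUERIES = [
    "best herbicides for lawns",
    "remove weeds without chemicals",
    "how to prevent weeds in lawn",
]

# Hypothetical page: section heading -> section text.
SECTIONS = {
    "Herbicides": "Selective herbicides for lawns and when to apply them.",
    "Chemical-free removal": "Remove weeds without chemicals by hand pulling.",
    "Opening hours": "Our garden centre is open daily.",
}


def words(text: str) -> set[str]:
    """Lowercase alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_section(query: str, sections: dict[str, str], min_overlap: int = 2):
    """Return the heading sharing the most words with the query, or None."""
    query_words = words(query)
    scored = [(len(query_words & words(body)), heading)
              for heading, body in sections.items()]
    score, heading = max(scored)
    return heading if score >= min_overlap else None


def coverage(sub_queries: list[str], sections: dict[str, str]) -> dict:
    """Map each sub-query to the section that answers it (None marks a gap)."""
    return {query: best_section(query, sections) for query in sub_queries}
```

Here `coverage(SUB_QUERIES, SECTIONS)` maps the first two sub-queries to their sections and returns `None` for the prevention query. A gap like that is the kind of signal worth acting on with one well-developed section, rather than with a thin page per keyword variation.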

How credible is the official Google statement?

After several technical inconsistencies, one more question deserves mention: how trustworthy are Google’s official statements in general?

In May 2024, internal documentation of the Google Search Content Warehouse API was accidentally posted on GitHub. It consisted of 2,596 modules and 14,014 attributes, described and analysed by Rand Fishkin (SparkToro) and Mike King (iPullRank) [19][20][21]. Comparing Google’s official statements with what the documentation actually describes revealed a number of contradictions. John Mueller, Search Advocate at Google, has long claimed that Google does not use Chrome browsing data for ranking purposes. The documentation describes the NavBoost system, which uses click signals and user interactions. The existence of NavBoost was also confirmed by Pandu Nayak’s testimony in the DOJ vs. Google antitrust trial [19]. Gary Illyes labelled CTR and dwell time as “generally made up crap,” and yet the leaked documentation contains attributes like goodClicks, badClicks, and lastLongestClicks. Google representatives long denied the existence of “site authority,” a sandbox for new sites, and separate evaluation of subdomains. How then to explain the siteAuthority and hostAge attributes?

The authenticity of the documents was confirmed by former Google employees. Google itself did not deny their authenticity; it only stated that it warns against “inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information” [20]. From Google’s perspective, this makes sense: the fight against manipulation (and what else is SEO…?) is its legitimate interest, even at the price of a not-always-truthful description of how its techniques work and some mild disinformation. From the perspective of SEO and GEO practice, however, this means there is no real basis for treating Google’s official blog as an objective description of how Search works. More reliable sources are quantitative studies, research, and experiments from practice.

What Google writes in the documentation vs. what it actually runs

The same official Google guide writes: “You don’t need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search.” The funny thing is that the very same article is publicly available at the URL https://developers.google.com/search/docs/fundamentals/ai-optimization-guide.md.txt with the HTTP header Content-Type: text/markdown; charset=utf-8 [22]. That is, precisely what – according to Google’s recommendations – you don’t actually need: a structured, condensed, machine-readable Markdown version of the content for easier processing by machines and AI systems. While the HTML page is full of noise for an LLM (CSS, JS, navigation, headers, sidebars, cookie bars, side elements), the Markdown version is clean content, easily parseable, cheap on tokens.
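Anyone can check a claim like this with a header request. The sketch below uses only the Python standard library. The function names are mine, and `fetch_content_type` needs network access, so the header parsing is kept separate and can be tried offline.

```python
from urllib.request import Request, urlopen


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset and normalise the media type."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_markdown(content_type_header: str) -> bool:
    """True when the response declares itself as Markdown."""
    return media_type(content_type_header) == "text/markdown"


def fetch_content_type(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and return the Content-Type header (needs network)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")
```

For the header value quoted above, `is_markdown("text/markdown; charset=utf-8")` is true, while an ordinary HTML page reporting `text/html; charset=utf-8` is not. Passing the `.md.txt` URL to `fetch_content_type` would show what the server returns at the moment you run it, which may of course change over time.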

And it’s not an isolated case. Google to this day runs a classic llms.txt file on its Agent Development Kit project – exactly the format it discourages creating in its documentation. The file at adk.dev/llms.txt contains a structured map of the documentation for AI agents and coding assistants, and Google itself recommends connecting it via an MCP server [23]. Anthropic, OpenAI, Perplexity, Stripe, Cloudflare, and other tech players are going the same way.


So what about llms.txt?

The empirical situation regarding llms.txt is currently fairly clear-cut. An OtterlyAI experiment measured 90 days of server logs: out of 62,100 AI bot requests, only 84 hit llms.txt, roughly 0.1% [24]. John Mueller stated on Bluesky “FWIW no AI system currently uses llms.txt.” and compared its value to the meta keywords tag, a signal search engines abandoned long ago because it’s controlled by the site owner [25]. Gary Illyes confirmed at Google Search Central Deep Dive 2025 that Google does not support llms.txt and does not plan to [26]. Ahrefs summarises in its analysis: no major LLM provider has formally adopted llms.txt, and there is no evidence the file improves retrieval, traffic, or model accuracy [24].
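The same measurement is easy to repeat on your own logs. The sketch below assumes a combined-log-style access log and matches bots by user-agent substring. The bot list and the sample lines are illustrative, not the setup of the experiment cited above.

```python
import re

# Example AI crawler user-agent substrings (an assumption; extend as needed).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Matches the request part of a log line, e.g. "GET /llms.txt HTTP/1.1".
REQUEST_PATH = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def llms_txt_hits(log_lines: list[str]) -> tuple[int, int]:
    """Return (requests for /llms.txt, all requests) made by AI bots."""
    hits = total = 0
    for line in log_lines:
        if not any(bot in line for bot in AI_BOTS):
            continue
        match = REQUEST_PATH.search(line)
        if not match:
            continue
        total += 1
        if match.group(1).split("?", 1)[0] == "/llms.txt":
            hits += 1
    return hits, total


def percentage(hits: int, total: int) -> float:
    """Share of hits in percent, or 0.0 for an empty log."""
    return 100 * hits / total if total else 0.0


# Hypothetical sample lines in combined log format.
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '203.0.113.6 - - [01/Jan/2026:00:00:02 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "ClaudeBot/1.0"',
    '203.0.113.7 - - [01/Jan/2026:00:00:03 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
```

On the sample, `llms_txt_hits(SAMPLE)` counts one llms.txt request out of two AI bot requests; the third line is ignored because it does not come from a listed bot. Applied to the reported figures, `percentage(84, 62100)` comes out at about 0.14%, in line with the rounded 0.1% quoted above.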

Kai Spriestersbach, in his piece “The llms.txt is dead. More precisely: a dud.” [24], adds another important and fair point: the existence of llms.txt on the dev docs of Anthropic, OpenAI, or Perplexity does not mean that ClaudeBot, GPTBot, or PerplexityBot reads your llms.txt during web retrieval. The same applies to adk.dev/llms.txt from Google – it’s publishing for developers and coding assistants, not proof that the system reads other websites’ llms.txt files. Spriestersbach puts it aptly: just because a restaurant has a menu doesn’t mean it reads other restaurants’ menus before cooking.

This objection is justified and can’t be sidestepped. But what does it imply for the GEO debate?

First: academic quantitative research into the impact of llms.txt on visibility in AI answers does not yet exist. What we have are server logs and statements from provider representatives. That’s a strong indication of “doesn’t work today,” but weak proof of “will never work.” Standards go through phases of acceptance, rejection, and re-acceptance. Robots.txt took time to be adopted; schema.org went through a period of scepticism. Being an early adopter in the case of llms.txt costs a few minutes of work, and the risk is zero.

Second, and this is the point for the GEO debate: the dispute about llms.txt as a specific format does not refute the necessity of GEO as a discipline. Spriestersbach himself recommends in conclusion “what to do instead”: content quality and citability, semantic structuring, topical authority, monitoring AI visibility. All of these are activities that classic SEO did not measure, optimise, or organise in the same way.

Third: the fact that Google itself serves Markdown versions of its documentation at .md.txt URLs and runs llms.txt at adk.dev is no accident. Structured, condensed, machine-readable content for AI systems is a trend that big players are following regardless of whether they formally support a specific standard. Whether the winning format will be called llms.txt, llms-full.txt, .md.txt, .md, or something else is a question that cannot be answered today. That the road leads in this direction is clear from what providers actually do, not from what they write in documentation for everyone else.

GEO is not a rebrand of SEO. The goal is different

A large part of the methodology overlaps between SEO and GEO. Quality content, technical cleanliness, authority (E-E-A-T), expert grounding, indexability, semantics – all of this held, holds, and will continue to hold. Google is right here, and nobody reasonable disputes it.

But the goal is different.

SEO tries to get a user onto the website. Success means a click-through, a visit, a conversion. The key metrics are organic traffic, position in the SERP, CTR from the SERP. GEO wants to get the brand, product, or information into the answer the user reads directly in the AI interface – ideally preserving truthfulness, consistency, and as much control as possible over what the AI says about the brand. Success means a mention, a citation, a recommendation, share of voice in AI answers – often without an immediate click to the website.

Different goal, different metrics, different tactics. Anyone who has measured traffic from AI Overviews or ChatGPT search knows that clicks are no longer the only metric. What matters is being in the answer, being a cited authority, not being skipped over, and influencing how the AI talks about the brand. That is measured and optimised differently than position in the SERP.
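What “measured differently” can look like in practice: the sketch below computes a mention rate and a citation rate over a set of collected AI answers. The `AiAnswer` structure, the brand name, and the `.example` domains are hypothetical, and how the answers are collected in the first place is deliberately left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class AiAnswer:
    prompt: str
    text: str
    cited_urls: list[str] = field(default_factory=list)


def mention_rate(answers: list[AiAnswer], brand: str) -> float:
    """Fraction of answers whose text mentions the brand (case-insensitive)."""
    if not answers:
        return 0.0
    hits = sum(brand.lower() in a.text.lower() for a in answers)
    return hits / len(answers)


def cites_domain(answer: AiAnswer, domain: str) -> bool:
    """True if any cited URL is on the domain or one of its subdomains."""
    for url in answer.cited_urls:
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            return True
    return False


def citation_rate(answers: list[AiAnswer], domain: str) -> float:
    """Fraction of answers that cite the domain at least once."""
    if not answers:
        return 0.0
    return sum(cites_domain(a, domain) for a in answers) / len(answers)


# Hypothetical collected answers for four tracked prompts.
ANSWERS = [
    AiAnswer("best crm for small teams",
             "Popular options include ExampleBrand and others.",
             ["https://www.brand.example/pricing"]),
    AiAnswer("crm with invoicing", "Several tools offer invoicing.",
             ["https://reviews.example/crm"]),
    AiAnswer("is ExampleBrand any good", "ExampleBrand gets solid reviews."),
    AiAnswer("cheap crm", "Look at open-source tools."),
]
```

For this sample, the brand is mentioned in half of the answers but cited in only a quarter of them. Neither number shows up in a traffic report, which is exactly why position in the SERP and clicks alone no longer describe visibility.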

The cited llms.txt sceptics (Spriestersbach, Ahrefs) describe in their “what to do instead” recommendations exactly the disciplines that fall under GEO: citability, semantic structuring, topical authority, AI visibility monitoring. The dispute is therefore more about nomenclature than substance. We can call it AI SEO, GEO, AEO, LLMO, and honestly the name doesn’t matter. But pretending it’s exactly the same as classic SEO is about as solid as claiming content marketing is just copywriting. Part of the methodology overlaps; the goal is not the same.

Conclusion

According to Google, GEO is just SEO. It says so because, from its perspective, this is true – from Google Search’s standpoint, it really is still optimization for the search experience. And from Google’s standpoint, it’s also logically more advantageous to claim that no new disciplines are needed, no new files are needed, and you should keep doing what you’ve been doing for years.

But Google is not the only system. AI Overviews hallucinations show that understanding nuance has serious limits. The RAG paradox in its own guide shows that chunking technically plays a role, even if the text later denies it. Princeton research quantifies that content adjustments for generative engines yield a 30 to 40 percent visibility lift. The 2024 leak of internal documentation shows that taking Google’s official statements as objective truth about how Search works has no real basis. And the fact that Google itself serves a Markdown version of its documentation telling you that Markdown versions are not needed, while simultaneously running llms.txt at adk.dev, is vivid evidence that reality is more complex than the PR text suggests.

As for llms.txt as a specific format, the empirical data so far is fairly clear: large AI systems currently don’t read it. Academic quantitative research on its impact on visibility in AI answers is still missing, so how this develops remains to be seen. But this sub-question is not the main one. The main thing is that the difference between SEO and GEO is not whether you create one specific text file, but what goals you set and how you measure success. The questions GEO is asking today are the same ones the newly forming SEO was asking twenty years ago. Some answers will turn out to be dead ends over time, others will become new standards. That is a normal phase in the development of a discipline, not an argument that the discipline itself doesn’t exist or doesn’t work.

Sources
