Appearing in AI answers is not the same as ranking in search. It helps to understand which pages models pick most often when they assemble an answer, and what patterns drive that choice.
This article was prepared by Ivan Sivakov, SEO Senior at Why SEO Serious. Ivan has worked in SEO for almost ten years. He has led projects in e-commerce, and also worked with medical and education websites. He has seen how search systems change, and what drives growth in competitive niches.
In this piece, he breaks down which sites appear in AI answers from ChatGPT, Perplexity, and other AI systems, and what these choices have in common.
How do AI systems get information about websites?
AI search tools do not crawl the web like classic search bots. Most of them rely on a separate search layer.
This layer pulls data from different search engines and compiles an answer from many fragments. Not all models refresh their knowledge base regularly, though. Some operate on a fixed snapshot of data, so their answers can lag behind reality. For example, DeepSeek’s knowledge snapshot has been updated, yet its listed cutoff is still May 2025. Without access to external sources, a model like this may rely on outdated information.
Which sources do LLMs rely on?
Developers rarely disclose exact sources and mechanisms. If they did, it would turn into a checklist that people could use to game results. A common assumption is that language models draw on data from two major search engines: Google and Bing.
AI can also fetch data through aggregators (for example, a “web search” layer inside an AI product). These systems compare many sources and generate a final answer, sometimes with links to the original pages.
Not every answer includes links. For broad queries, if the model is confident in the response, it may show no direct sources at all.
SERP vs AI answers: what the data shows
We analysed about 400 blog pages that received traffic from generative results over the last four months (based on Google Analytics). The goal was to compare which pages show up in AI answers and what kinds of queries lead to that visibility.
Here’s what we found:
- 83% of queries are linked to pages that already appear in AI results.
- 6.2% of all queries on the site generated clicks from the AI Overview block.
- 97% of queries where the page ranks #1 in organic results also show up in AI Overview (the remaining 3% come from other top positions).
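As a rough illustration of how sessions from AI assistants might be separated out of an analytics export, here is a minimal Python sketch. The host list, the function names, and the `(referrer_url, landing_page)` input shape are assumptions made for this example, not the setup used in the study. It only catches traffic that arrives with an AI-assistant referrer.

```python
from urllib.parse import urlparse

# Hypothetical list of referrer hosts treated as AI assistants.
# Adjust it to whatever actually shows up in your own reports.
AI_REFERRER_HOSTS = {
    "chatgpt.com",
    "chat.openai.com",
    "perplexity.ai",
    "copilot.microsoft.com",
    "gemini.google.com",
}


def is_ai_referrer(referrer_url: str) -> bool:
    """Return True if the referrer host is, or is a subdomain of, a listed AI host."""
    # urlparse only fills in the hostname when the URL includes a scheme.
    host = (urlparse(referrer_url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in AI_REFERRER_HOSTS)


def pages_with_ai_traffic(sessions):
    """Collect landing pages that received at least one session from an AI referrer.

    `sessions` is an iterable of (referrer_url, landing_page) pairs.
    """
    return {page for referrer, page in sessions if is_ai_referrer(referrer)}
```

Feeding the function rows exported from an analytics tool gives the set of pages to compare against organic rankings.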
What query length tells us
We also looked at which queries most often drove AI clicks.
| Number of words in the query | Number of queries |
|---|---|
| 1 | 1 |
| 2 | 78 |
| 3 | 132 |
| 4 | 94 |
| 5 | 75 |
| 6 | 54 |
| 7 | 43 |
| 8 | 33 |
| 9 | 10 |
| 10 | 3 |
What this suggests:
- 2–4 words make up the core of the semantic demand (304 of 523 queries, about 58%).
- 5–7 words are longer queries that reflect specific user needs.
- 8+ words are rarer, but they tend to be more natural, conversational phrasing.
Takeaway: for AI answers, the main focus should be on 2–4 word queries. This is the broadest and highest-volume segment, and it shapes baseline demand. Queries of 5–7 words are more specific. They are useful for expanding reach in conversational search and for addressing narrower needs.
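The share of each length bucket can be recomputed directly from the table. The short Python sketch below does that. The counts are copied from the table, while the `share` helper and the bucket boundaries are just one illustrative way to group them.

```python
# Query counts by number of words, copied from the table above.
QUERIES_BY_LENGTH = {1: 1, 2: 78, 3: 132, 4: 94, 5: 75, 6: 54, 7: 43, 8: 33, 9: 10, 10: 3}


def share(counts, lo, hi):
    """Percentage of all queries whose word count falls within [lo, hi]."""
    total = sum(counts.values())
    in_range = sum(n for words, n in counts.items() if lo <= words <= hi)
    return 100 * in_range / total


for label, lo, hi in [("2-4 words", 2, 4), ("5-7 words", 5, 7), ("8+ words", 8, 10)]:
    print(f"{label}: {share(QUERIES_BY_LENGTH, lo, hi):.1f}%")
# 2-4 words: 58.1%
# 5-7 words: 32.9%
# 8+ words: 8.8%
```

With these counts, the 2–4 word bucket works out to roughly 58% of the 523 queries. The 5–7 word bucket accounts for about a third.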
What this means in practice
A few clear conclusions follow from the data:
- Pages that rank #1 in organic search are very likely to appear in AI Overviews.
- Most queries that trigger AI visibility overlap with the project’s core keyword set. The broader and deeper the topical coverage, the more likely an AI system will reuse your content as a source.
- Long-tail queries reflect real user needs, and they often become the exact phrasing AI uses to build answers.
Which sources do neural systems use most often?
While AI answers in search engines (AI Overviews) are built mostly from sites that already rank at the top, third-party language models (LLMs) choose sources based on a site’s trust and topical relevance. Among millions of sources, there are some they rely on especially often:
- Large niche forums with active discussions and real case breakdowns (Reddit, Stack Overflow).
- Trusted reference resources with clear structure and factual coverage (Wikipedia and similar projects).
- Major news outlets and industry publications.
- Expert blogs, knowledge bases, and documentation.
- Platforms built on user experience and reviews (marketplaces, Q&A sites).
TL;DR
Key points for getting pages into AI answers:
- Appearing in search-engine AI answers (AI Overviews) is directly tied to strong organic rankings.
- Pages that rank #1 most often appear in AI Overviews (97% of the time).
- Most queries that drove AI clicks are short, 2–4 words long.
- Long-tail queries reflect specific demand and expand overall reach.
- To show up in third-party LLM products (ChatGPT, Perplexity, etc.), it’s important to work with trusted, authoritative sources that are recognised as experts in your niche.