How Search Engines Work: Retrieval, Ranking & AI Systems (for SEOs and Business)

Most companies investing in organic visibility are optimising for ranking — and skipping the stage that determines whether their content enters the competition at all. Google and other modern search engines operate through a multi-stage pipeline: retrieval (candidate generation using inverted indexes and vector embeddings), ranking (machine-learned scoring and re-ranking), and — increasingly — AI interpretation (confidence-weighted passage arbitration and answer synthesis through Retrieval-Augmented Generation). If your content isn’t eligible for retrieval, ranking optimisation is irrelevant.

I’m Szymon Słowik, SEO consultant & strategist, and this article maps the full pipeline and frames what each stage means for how you allocate budget, structure architecture, and measure returns on organic visibility.

Understanding this pipeline is what underpins Semantic Retrieval Optimization (SRO) — an approach I learned from Sergey Lucktinov and Koray Tuğberk Gübür, and which I now apply in every strategic engagement to structure content so that it’s eligible for retrieval, competitive in ranking, and citable by AI.

TL;DR for business, marketing & strategy leaders

This is a pillar page defining how modern search systems actually select information. It explains the retrieval → ranking → AI arbitration pipeline and why visibility now depends on eligibility before position. If you invest in SEO, AI visibility, or content strategy, this is the structural layer your decisions sit on.

  • Ranking happens after retrieval — and most SEO ignores the gate.
  • AI systems evaluate passages, not pages — selection replaces position.
  • Visibility now depends on entity clarity, structural extractability, and probabilistic confidence.

In simple terms, it looks like this:

Three-layer visibility funnel (simplified)

If your content is not architected for retrieval and AI selection, no amount of ranking optimisation will make it visible. Without retrieval-ready architecture, scaling content and links increases cost — not visibility.

Most SEO advice starts at ranking. That’s the wrong entry point.

Ranking models only score content that has already been retrieved. If a page fails the retrieval gate — due to rendering issues, semantic ambiguity, entity confusion, or structural problems — it never enters the candidate set. This distinction isn’t academic. It changes where you allocate budget, how you structure information architecture, and what you report to the board. Ahrefs’ research found that 96.55% of web pages get zero traffic from Google — and most of that failure happens at retrieval, not ranking. The content exists in the index. It’s absent from the competition.

If you’re a founder or CMO at a scaling B2B SaaS or tech company spending €50K–€200K/year on SEO, this pipeline is the lens through which every investment decision should be evaluated. Not “are we ranking?” but “are we even being retrieved — and for which queries and topics?”

Real life example

In a recent audit for a boutique M&A advisory firm, the pattern was textbook. They’d invested in keyword targeting and link acquisition — new referring domains were climbing steadily — but organic visibility stayed flat, and had recently started to decline. No ranking movement. No traffic growth. The links weren’t broken. The content wasn’t thin. The problem was upstream.

Their “About Us” page was interchangeable with any consulting firm in any vertical — no named founders, no credentials, no case studies, no entity signals connecting the brand to the people behind it. Service pages described what M&A advisory is, not what this firm does differently or for whom. From the retrieval system’s perspective, there was nothing to disambiguate — no reason to prefer this domain over any other candidate making the same generic claims.

More referring domains, but less traffic; source: Ahrefs

We’re now reworking their content with four main interventions:

  • Proper ICP framing on every service page — who this firm serves, what constraints those clients face, and what outcomes look like.
  • Structured data connecting the brand entity to its founders, their credentials, and documented engagements.
  • Topical coverage that fills the gaps between their existing pages.
  • Content enrichment for stronger contentEffort scoring and information gain.

The goal isn’t just more content. It’s making the content they have retrievable — by increasing E-E-A-T signal density, resolving entity ambiguity, and closing the semantic loops that retrieval systems use to build confidence. Links amplify a signal the system trusts. Without that trust foundation, they were amplifying nothing.

The pipeline at a glance

Search engines operate through five stages, each with distinct functions and failure modes:

  1. Crawling — discovering and downloading pages via Googlebot and other crawlers.
  2. Indexing — parsing, rendering, and storing content in the inverted index and vector index.
  3. Retrieval — candidate generation using lexical matching (BM25) and neural matching (vector embeddings) to select documents that might satisfy a query.
  4. Ranking — machine-learned scoring, re-ranking via neural models (BERT, transformer architectures), and signal-based adjustments (Navboost, twiddlers) to order candidates by predicted relevance.
  5. AI interpretation — passage-level arbitration via confidence-weighted scoring, entity alignment checks, and answer synthesis through Retrieval-Augmented Generation (RAG).

Each stage is a filter. Content that fails any stage is invisible to the next. The strategic question is: which stage is your binding constraint?

The Retrieval Stage: Selection Before Competition

Retrieval is the candidate selection process. When a user submits a query, the system selects a subset of documents from an index containing an estimated 400 billion pages (a figure referenced in Google’s antitrust trial testimony). Scoring each one against every query is computationally infeasible — so the system filters first. It is a gate, not a scoring curve. If a page does not pass retrieval, it does not compete — regardless of backlinks, authority, or production cost.

Think of it as a procurement shortlist, not a performance review. The index narrows billions of documents to typically 1,000–10,000 candidates in milliseconds. Only those candidates proceed to ranking. Everything else is invisible regardless of quality.

For a business investing in content production, this has a direct cost implication. Every page that fails retrieval eligibility represents sunk cost — writer time, editorial cycles, design resources — producing assets that never enter the competition.

What determines retrieval eligibility

Retrieval eligibility is a function of three interdependent layers. Each must be satisfied; failure at any layer eliminates the page from the candidate set.

Layer 1: Technical accessibility. This is the gateway test. If Googlebot can’t render the page — or if performance falls outside acceptable thresholds — the content doesn’t exist from the system’s perspective.

Crawling and indexing fundamentals: TTFB under 500ms, LCP under 2.5s on mobile, clean DOM under 1,500 nodes, no accordion-hidden primary content, no render-blocking JavaScript that prevents content parsing. Technical compliance is necessary, not sufficient — but without it, the other layers are irrelevant.

Layer 2: Semantic clarity. Does the content explicitly address the entities and relationships the query implies? A SaaS company’s “platform overview” page that never mentions specific capabilities, integration types, or use-case contexts by name might be indexed but never retrieved for the queries its potential customers actually search. The page exists in the index. It’s absent from the competition.

Entity resolution is part of this layer — can the system confidently associate the page with the correct entities in its Knowledge Graph? If your brand, your product category, or your topic area creates disambiguation problems — multiple meanings, competing entities, unclear relationships — the retrieval system hedges. And hedging means exclusion.

Layer 3: Trust and authority signals. This is where the system evaluates whether a retrieved candidate deserves to compete. Navboost (behavioural data indicating user satisfaction), external corroboration (other authoritative sources referencing your content or brand), and link portfolio quality all contribute to what the DOJ trial documents revealed as a site-level trust assessment. A domain with strong trust signals gets more generous retrieval — its pages are included in candidate sets for a wider range of queries. A domain with weak or conflicting signals faces a narrower retrieval window.

Structural coherence operates across all three layers. Heading hierarchy, passage-level clarity, and internal linking all contribute to what I’d frame as semantic cost — the computational effort the system expends to understand what a page is about. Lower semantic cost means higher retrieval probability. This is an engineering problem, not a copywriting problem.

Content investment vs visibility outcome

This is where Brand, UX, and Semantics converge — the BUXS model I use as a diagnostic lens in every strategic engagement. Each dimension affects retrieval eligibility independently, but they multiply each other’s impact.

Lexical vs Neural Retrieval: How Candidate Generation Actually Works

Most SEO discussions treat retrieval as a black box — content goes in, candidates come out. But the mechanism matters, because it determines what kind of content gets selected and why.

Modern search engines use two retrieval systems in parallel, each with different selection logic.

Lexical retrieval: the inverted index and BM25

The inverted index is the foundational data structure of search. It maps every term in the corpus to the documents that contain it — essentially a reverse lookup table. When a query arrives, the system identifies documents that share terms with the query, then scores them using BM25 — a probabilistic relevance function that weights term frequency, document length, and inverse document frequency.

BM25 is a sparse retrieval method. It matches on exact terms and their known variants. At over 13 billion queries per day (DemandSage, 2025–2026 estimates), speed is not a preference — it is a constraint. BM25 remains the backbone of candidate generation in Google and most search systems precisely because it scales under that load.
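To make the mechanics concrete, here is a minimal BM25 sketch in Python. The function, toy corpus, and parameter values are illustrative (production systems precompute term statistics in the inverted index and tune k1 and b); this is a sketch of the scoring formula, not Google’s implementation.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document against a query using BM25.

    `corpus` is a list of documents (each a list of terms), used for
    IDF and average document length. Illustrative only: real engines
    precompute these statistics at indexing time.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)             # document frequency
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        freq = tf[term]
        # Term frequency saturates via k1; b normalises by document length.
        score += idf * (freq * (k1 + 1)) / (
            freq + k1 * (1 - b + b * len(doc_terms) / avg_len))
    return score

corpus = [
    ["project", "management", "software", "for", "teams"],   # buyer's language
    ["workflow", "orchestration", "platform"],               # vendor's own terminology
]
query = ["project", "management", "software"]
```

Scoring this toy corpus shows the lexical gap in action: the page that shares no terms with the query scores exactly zero, so it never enters the lexical candidate set regardless of its quality.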

The business implication: terminology precision matters at the retrieval stage. If your B2B SaaS company calls its product a “workflow orchestration platform” but your prospects search for “project management software,” the lexical gap may prevent retrieval before any ranking model gets involved.

Neural retrieval: vector embeddings and ANN search

Neural retrieval operates on meaning rather than terms. Documents and queries are converted into dense vector representations — embeddings — using transformer models. Retrieval becomes a geometry problem: find the document vectors closest to the query vector in embedding space.

Because searching billions of vectors exhaustively is too slow, systems use Approximate Nearest Neighbor (ANN) algorithms to find high-similarity candidates efficiently. ANN indexing trades a small amount of recall precision for massive speed gains — making neural retrieval viable at web scale.

Neural retrieval captures semantic relationships that BM25 misses. A page about “reducing customer acquisition cost” can be retrieved for queries about “improving CAC efficiency” even if the exact terms don’t match — because the embedding vectors are close in the semantic space.
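A toy sketch of that geometry, with hypothetical 3-dimensional embeddings (real models produce hundreds of dimensions, and web-scale systems replace the exhaustive loop with an ANN index such as HNSW):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors: closeness in embedding space."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Hypothetical embeddings; a real system derives these from a transformer model.
docs = {
    "reducing-customer-acquisition-cost": [0.90, 0.20, 0.10],
    "improving-cac-efficiency":           [0.85, 0.25, 0.15],
    "office-furniture-guide":             [0.10, 0.90, 0.30],
}
query = [0.88, 0.22, 0.12]  # embedding of a query like "lower CAC"

# Exhaustive nearest-neighbour search; ANN indexes approximate this at scale.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

The two CAC pages rank ahead of the unrelated one even though none of them shares exact terms with the query; that is the semantic matching BM25 cannot provide.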

Why both systems run in parallel

Google’s retrieval pipeline runs lexical and neural retrieval simultaneously, merging their candidate sets before passing them to the ranking stage. This dual approach means your content needs to satisfy both selection logics:

  • Lexical coverage — include the precise terms and entity names your audience uses.
  • Semantic depth — cover the conceptual neighbourhood so that neural retrieval associates your content with related queries.

Content that excels at one but fails the other leaves retrieval probability on the table. Semantic SEO addresses both — it structures content around entities and relationships (serving neural retrieval) while maintaining terminology precision (serving BM25).
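How the two candidate sets get merged is not public, but reciprocal rank fusion (RRF) is a standard technique in hybrid retrieval and illustrates the principle: documents that surface in both lists rise.

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal rank fusion over several ranked candidate lists.

    k=60 is the constant from the original RRF paper; Google's actual
    merging logic is not public, so treat this as an illustration.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc-a", "doc-b", "doc-c"]   # BM25 candidates (hypothetical IDs)
neural  = ["doc-b", "doc-d", "doc-a"]   # embedding candidates
merged  = rrf_merge([lexical, neural])  # doc-b and doc-a lead: both systems agree
```

Documents retrieved by only one system (doc-c, doc-d) still compete, but from a weaker position — the retrieval probability left on the table.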

The Ranking Stage: Ordering the Retrieved Set

Once the candidate set is assembled — typically 1,000–10,000 documents from combined lexical and neural retrieval — ranking begins. This is where Google applies its machine-learned scoring stack, and where most SEO discussions (and most budgets) live.

But ranking is not a single algorithm. It’s a layered system: initial scoring via lightweight models, then progressively deeper re-ranking via transformer-based neural models (including BERT and its successors) that evaluate query-document relevance at the token level.

Navboost and user interaction signals

Navboost uses aggregated click data — specifically, click patterns, dwell time, and navigation behaviour — to adjust rankings based on real user satisfaction signals. This is not a simple click-through-rate metric. It’s a behavioural model trained on approximately 13 months of click logs — as revealed in DOJ antitrust trial testimony (document PXRD003, Durrett testimony demonstrative) — that adjusts the ranking of URLs based on whether users found what they were looking for. Pandu Nayak’s interview notes (PXR0357) further confirmed the weight of interaction signals in the ranking pipeline.

The business implication is direct: pages that earn clicks but fail to satisfy — high bounce rates, quick returns to SERP, pogo-sticking — receive a negative signal that compounds over time. For B2B companies with complex products, this means your content can’t just be keyword-relevant. It must resolve the searcher’s actual decision problem. If a CMO searches “SEO for SaaS” and lands on a page that talks about SEO generically without addressing SaaS-specific constraints — CAC pressure, long sales cycles, multi-stakeholder buying committees — that’s a dissatisfaction signal the system records.

More context: US and Plaintiff States v. Google LLC — DOJ case page

UX isn’t a nice-to-have in ranking. It’s a weighted variable in the scoring function.

Neural re-ranking: BERT and transformer models

After initial scoring, top candidates pass through neural re-ranking models. Google’s integration of BERT (Bidirectional Encoder Representations from Transformers) — now applied to nearly every English-language query — and subsequent transformer architectures allows the ranking system to evaluate the relationship between query and document at a much deeper semantic level than BM25 alone.

Where BM25 matches terms, BERT understands context. The phrase “apple support” near “phone” resolves to Apple Inc. tech support, not fruit assistance. This contextual understanding at the re-ranking stage means content that is semantically precise — where entity relationships are explicit and unambiguous — scores higher than content that relies on keyword density without contextual clarity.

For content strategy, this reinforces a principle: the ranking system rewards semantic precision, not keyword repetition.

Twiddlers and personalisation

Twiddlers are modular re-ranking functions that adjust scores after the initial ranking pass. They can boost or demote results based on freshness, content type, location, language, and dozens of other contextual variables. They operate as a second-pass correction layer — meaning the SERP you see is not a static output of one algorithm but the product of multiple sequential adjustments.

Personalisation operates after retrieval, not before — an important distinction. The candidate set is assembled without personalisation. Then, user-specific signals (search history, location, language preferences) adjust the ranking of already-retrieved candidates. This means personalisation doesn’t determine whether your content competes. It influences where it’s positioned within the competition. Retrieval eligibility remains the universal gate.

For strategic planning, this matters because SERP positions aren’t deterministic. They shift by context. The question isn’t “what’s the ranking factor?” — it’s “which combination of signals, in which context, for which query class, produces the outcome we’re investing towards?”

Site quality score and domain-level signals

Google maintains a site quality score — a domain-level assessment that influences how all pages on a domain are ranked. The Google Leak materials confirmed attributes feeding into this score, including patterns around content quality, user engagement, and E-E-A-T signals at the site level.

This is where brand as an entity becomes a strategic asset — not in the vague “brand awareness” sense, but as a measurable input to a scoring system. A strong site quality score acts as a rising tide — it improves the baseline ranking position of every page on the domain. Conversely, domain-level quality problems can suppress even excellent individual pages.

The capital allocation implication for decision makers: sometimes the highest-ROI SEO investment is pruning, not producing. Removing content that degrades domain signals can unlock performance for the content that deserves to compete.

This is exactly what I did when I migrated my Polish site from a *.pl ccTLD to a subdirectory here: I cut a large chunk of pages that were no longer relevant. I’ll publish a case study on this later.

The AI Interpretation Layer: Retrieval-Augmented Generation and Answer Assembly

AI systems — Google AI Overviews, ChatGPT with web browsing, Perplexity, Gemini — introduce a third stage that sits on top of traditional retrieval and ranking. This isn’t a replacement for search. It’s an interpretation and synthesis layer that uses search results as grounding data.

The distinction matters technically. Semantic search is a retrieval method — finding content based on meaning rather than exact terms. RAG (Retrieval-Augmented Generation) is a system architecture: it uses retrieval (including semantic search) as input to a language model that then generates a synthesised answer. The concept was formalised by Lewis et al. (2020) as an architecture that augments language models with a retrieval component to ground generation in external knowledge. Every AI search product currently shipping — from Google’s AI Overviews to Perplexity — is fundamentally a RAG implementation.

For business leaders, the strategic question isn’t whether AI search matters — it does — but how it changes the economics of organic visibility.

How AI systems retrieve content

When an AI system processes a complex query, it doesn’t simply search once. It decomposes the query into multiple fan-out queries — sometimes 10 to 30 sub-queries — each targeting a different facet of the question. Each sub-query triggers its own retrieval pass, generating a broad candidate set of 100 to 300 passages.

These passages — not pages — are then evaluated through confidence-weighted arbitration. The system scores each passage across four dimensions: relevance to the sub-query, factual consistency with the model’s parametric knowledge, entity alignment with the Knowledge Graph, and source authority signals. The 4 to 5 passages that survive this arbitration form the basis of the AI-generated answer.
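That arbitration can be sketched as a weighted score over the four dimensions. The weights and passage scores here are hypothetical placeholders (real systems learn them); the point is the shape of the computation, not the numbers.

```python
def arbitrate(passages, top_k=5, weights=None):
    """Confidence-weighted passage selection across four dimensions.

    Weights are hypothetical placeholders, not published values.
    """
    weights = weights or {
        "relevance": 0.4,
        "factual_consistency": 0.3,
        "entity_alignment": 0.2,
        "source_authority": 0.1,
    }
    def confidence(p):
        return sum(weights[dim] * p[dim] for dim in weights)
    return sorted(passages, key=confidence, reverse=True)[:top_k]

passages = [
    {"id": "clear-claim", "relevance": 0.9, "factual_consistency": 0.8,
     "entity_alignment": 0.9, "source_authority": 0.7},
    {"id": "vague-claim", "relevance": 0.9, "factual_consistency": 0.4,
     "entity_alignment": 0.3, "source_authority": 0.8},
]
survivors = arbitrate(passages, top_k=1)
```

Note that both passages are equally relevant; the vague one loses on entity alignment and factual consistency, which is exactly where ambiguous content gets filtered out of AI answers.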

The unit of competition in AI search is the passage, not the page. And the unit of visibility is the citation, not the click.

This shifts the ROI model. In traditional search, you measure clicks and sessions. In AI-augmented search, the value increasingly comes from being cited as a source — which builds brand recognition and authority even when the user doesn’t click through. The metric shifts from clicks in the browser to clicks in minds.

What makes content citable by AI

Passage clarity is the primary arbitration signal. A passage that makes a clear, self-contained claim — with explicit entity relationships and bounded scope — is far more likely to survive confidence scoring than a passage that requires surrounding context to make sense.

Entity accuracy matters because AI systems cross-reference claims against their Knowledge Graph. If your content makes a claim that conflicts with established entity relationships — or if it’s ambiguous about which entity it’s discussing — the confidence score drops, and another source gets cited.

Structural coherence at the passage level means each major section of your content should function as an independent retrieval unit. Each block of 150–300 words should contain a declarative claim, define the primary entity, provide supporting evidence, and close the semantic loop.

Why AI hallucinations are a retrieval quality problem

When AI systems produce inaccurate information — hallucinations — the cause is often insufficient retrieval rather than model failure. When the retrieval step doesn’t surface high-confidence passages for a facet of the query, the language model fills the gap from its parametric memory — the statistical patterns learned during training. Parametric memory doesn’t distinguish fact from plausible-sounding fiction. Current benchmarks place hallucination rates at roughly 3–8.5% depending on model and task complexity (GPT-4 benchmarks ~3%; other models higher) — low enough to build user trust, high enough to make retrieval quality the critical variable.

The implication for content strategy: being the highest-confidence retrievable source on your topic doesn’t just earn citations. It reduces the probability that AI systems hallucinate about your category — which protects your brand positioning in AI-generated answers.

Multimodal retrieval and future surfaces

AI retrieval is expanding beyond text. Google’s AI systems already evaluate images, video, and structured data as retrieval candidates. Google AI Mode introduces follow-up sessions where the AI re-retrieves based on the user’s conversational refinements — each follow-up generating new fan-out queries against the same and adjacent retrieval pools.

AI agents — autonomous systems that execute multi-step research tasks — represent the next retrieval surface. The scale of AI retrieval is already significant: AI Overviews reach an estimated 2 billion users monthly and appear in roughly 21–55% of Google searches, depending on query category and market (Google announcements; BrightEdge and Ahrefs tracking, 2025–2026). When an agent searches for “best enterprise deployment tools,” it doesn’t browse SERPs. It retrieves passages, cross-references them, and synthesises recommendations without human click behaviour entering the loop. Being structurally retrievable for agent queries is an emerging competitive dimension.

Traditional Search vs AI-Native Answer Engines

| Dimension | Traditional search | AI answer engines |
| --- | --- | --- |
| Unit of retrieval | Page | Passage |
| Output format | Ranked list of links | Synthesised answer with citations |
| Success metric | Click-through rate, position | Citation, source attribution |
| Arbitration method | Re-ranking via scoring models | Confidence-weighted passage selection |
| User behaviour | Click → visit → evaluate | Read answer → maybe click source |
| Content requirement | Page-level relevance | Passage-level self-containment |

And the strategic mindset shift:

| Traditional SEO thinking | Retrieval-first strategy |
| --- | --- |
| “Rank for target keywords” | “Be retrieved for target query classes” |
| “Build links to improve authority” | “Build entity clarity to expand retrieval surface” |
| “Create content for keyword coverage” | “Structure content as retrievable knowledge units” |
| “Measure rankings and traffic” | “Measure retrieval coverage and citation rate” |
| “Optimise pages” | “Engineer retrieval eligibility across the network” |
| “Win positions” | “Win the candidate set” |

From PageRank to Transformers: How Search Architecture Evolved

PageRank era (1998–2010) — Google’s foundational insight: links between pages function as votes of confidence. Ranking was dominated by link graph analysis. Retrieval was almost entirely lexical — inverted index lookups with basic term matching.

Machine-learned ranking (2010–2015) — Google replaced hand-tuned ranking functions with ML models that weight hundreds of signals simultaneously. SEO started becoming a signal optimisation problem rather than a link-counting exercise.

RankBrain (2015) — Google’s first neural network for query understanding. RankBrain mapped ambiguous queries into vector space — the first step toward neural retrieval. Queries never seen before could match relevant content through semantic similarity.

BERT (2019) — Transformer-based language understanding applied to ranking. BERT enabled contextual query-document evaluation — resolving ambiguities, understanding negation, interpreting prepositions. Ranking became truly semantic, not just statistical.

Passage indexing (2021) — Google began ranking individual passages within pages independently. A single well-structured paragraph could be retrieved for queries the page as a whole didn’t target. The unit of relevance shifted from page to passage.

RAG and AI Overviews (2023–present) — Retrieval-Augmented Generation introduced AI-driven answer synthesis. Search results became grounding data for language models. The system now retrieves passages, scores confidence, and assembles synthesised answers — adding an interpretation layer on top of the traditional pipeline.

Each era added a layer. Today’s system runs PageRank-derived link signals, ML ranking, neural re-ranking, passage indexing, and AI synthesis simultaneously. Strategy that accounts for only one layer optimises for a fraction of the system.

Query Understanding: The System’s Side of Intent

Search engines don’t receive queries passively. They interpret, classify, expand, and augment them before retrieval even begins. Query understanding is the system’s attempt to model what the user actually wants — which is often quite different from what they typed.

Query classification determines which content formats are eligible. A navigational query retrieves brand pages. An informational query retrieves in-depth articles. A transactional query retrieves product pages and comparison content. If your content format doesn’t match the inferred intent class, it’s filtered out at the retrieval stage — not demoted in ranking.

Query augmentation expands the original query into semantically related variations. Google uses synonym expansion, entity resolution, and concept mapping. AI systems use fan-out queries — 10 to 30 sub-queries per complex question. Your content competes not just for the literal query typed, but for dozens of semantically related variations the system generates internally.

This is why keyword-centric SEO has diminishing returns at scale. The system isn’t matching keywords — it’s matching intent patterns, entity relationships, and semantic neighbourhoods. A B2B SaaS company that structures content around the complete decision journey — problem awareness, solution evaluation, vendor comparison, implementation planning — is eligible for far more queries than one targeting individual keyword strings.

From Pipeline Understanding to Capital Allocation Decisions

Understanding the retrieval pipeline changes how you allocate budget towards organic visibility. Most SEO strategies are implicitly optimised for ranking — keyword targeting, backlink acquisition, on-page signals. These matter. But they only matter for content that has already been retrieved.

Diagnosing the binding constraint

If you’re allocating 80% of your SEO budget to ranking optimisation — links, new content production, on-page tweaks — and 20% to technical and structural work (which includes semantic SEO), you might have the ratio backwards, at least for a while. In economic terms, the marginal return on ranking investment approaches zero when the retrieval constraint is binding. Start with foundations, then build new floors.

A structured diagnostic:

  1. Audit retrieval eligibility. Are your target pages actually appearing in candidate sets? Google Search Console’s performance data — specifically, the split between impressions and clicks — can separate retrieval problems from ranking problems. A page with zero impressions for a target query has a retrieval issue. A page with impressions but no clicks has a ranking or SERP presentation issue. These require entirely different interventions.
  2. Assess semantic cost. How much computational effort does the system need to understand what your page is about? High semantic cost — caused by vague entity references, missing structured data, poor heading hierarchy, or orphaned content — reduces retrieval probability.
  3. Evaluate passage-level quality for AI eligibility. Are your key claims structured as self-contained, citable passages? For B2B SaaS companies competing for consideration in AI-generated comparisons and recommendations, this is where deals begin — before the prospect ever visits your site.
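Step 1 of the diagnostic can be sketched as a simple triage over Search Console performance rows. The thresholds are illustrative and should be tuned to your query volumes.

```python
def triage(impressions: int, clicks: int, min_impressions: int = 10) -> str:
    """Classify a page/query pair from Search Console performance data.

    Illustrative thresholds only; calibrate `min_impressions` per market.
    """
    if impressions < min_impressions:
        # Rarely or never in the candidate set: fix eligibility first.
        return "retrieval issue"
    if clicks == 0:
        # Retrieved and shown, but never chosen: position or presentation.
        return "ranking/presentation issue"
    return "competing"
```

`triage(0, 0)` flags a retrieval issue; `triage(500, 0)` flags a ranking or presentation issue. The intervention differs completely between the two, which is the whole point of diagnosing before spending.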

Retrieval eligibility as competitive moat

Most competitors in most verticals are optimising for the same ranking signals — building links to the same anchor terms, targeting the same keywords, producing structurally similar content. The SERP is a local equilibrium. Incremental investment in the same signals yields diminishing returns.

Retrieval eligibility is a different axis of competition entirely. When you reduce semantic cost, improve entity clarity, and structure content for passage-level retrieval, you’re not competing on the same dimension. You’re expanding the number of queries your content is eligible for — including AI-generated query variations you never explicitly targeted.

In Porter’s terms, this is differentiation rather than cost leadership on the same competitive dimension. The compound returns on architectural investment — topical maps, entity resolution, semantic structure — often exceed the returns on incremental link acquisition, ceteris paribus. The architecture scales. The link-by-link approach doesn’t.

Semantic SEO as Retrieval Engineering

Semantic SEO — structuring content around entities, relationships, and meaning rather than keywords alone — is ultimately a retrieval engineering discipline. It reduces the cost of retrieval by making content machine-readable at every level — serving both BM25’s lexical matching and neural retrieval’s vector similarity simultaneously.

A well-designed Semantic Content Network (SCN) — what most practitioners call a topical map, though I’d argue the term undersells the architecture — creates a closed-loop system where every page reinforces entity relationships through internal links with descriptive anchors. The cluster structure signals to retrieval systems that the domain covers a topic comprehensively, not superficially.

Pillar pages serve as retrieval entry points for broad queries. Hub pages organise sub-topics and create the semantic bridges. Spoke pages capture the long-tail and specific intent variations. The result is compound eligibility — each new page strengthens the retrieval probability of every connected page.

This is the network effect in semantic architecture. And it’s the mechanism behind what the industry loosely calls “topical authority” — though “retrieval coverage” would be more precise.

The Investment Framework: Where to Put the Next Euro

If you’re a founder, CMO, or growth lead evaluating your organic visibility investment, the question this pipeline answers isn’t “how do search engines work?” — it’s “where should the next euro go?”

Retrieval is the qualifying round. If your content isn’t technically accessible, semantically clear, and entity-resolved, you’re not in the competition. No amount of link building or content production compensates for retrieval failure. Diagnosing this is the first step — and it’s where most companies never look.

Ranking is the competition. Once retrieved, your content competes on the strength of user satisfaction signals, authority, freshness, and contextual relevance. But even winning that competition delivers less than it used to — position #1 historically commanded roughly 40% of clicks, but on queries where AI features appear, CTR is compressing toward 19–26% (First Page Sage, 2025–2026 meta-analysis). The value is migrating from ranking position to retrieval-plus-citation eligibility.

AI citation is the emerging distribution channel. AI systems are increasingly where your prospects encounter solutions and form shortlists — before they ever visit your website. Being citable by AI requires passage-level clarity, entity accuracy, and source authority.

Architecture compounds, tactics don’t. Individual page optimisations deliver linear returns. Semantic architecture — topical maps, entity resolution, internal linking systems — delivers compound returns because each element strengthens the retrieval eligibility of the entire network. For scaling companies, this is the difference between an SEO cost centre and an organic growth engine.

SEO isn’t a ranking exercise. It’s structured influence over a probabilistic retrieval and ranking system — in service of business growth. The pipeline is the territory. Understanding it changes how you invest.


The diagnostic I run in every strategic engagement starts exactly here — mapping where the binding constraint sits across retrieval, ranking, and AI eligibility, then reallocating investment accordingly.

If that’s the conversation you need to have about your organic growth, that’s what I do.


Key Definitions

Retrieval — candidate generation. The process of selecting documents from the index that might satisfy a query, using lexical matching (inverted index + BM25) and neural matching (vector embeddings + ANN search).

Ranking — candidate ordering. Machine-learned scoring systems — including neural re-ranking via BERT and transformer models — that evaluate query-document relevance and assign positions.

Retrieval eligibility — the structural, semantic, and technical conditions that determine whether content enters the candidate set before ranking begins.

BM25 — a probabilistic scoring function used in lexical retrieval that weights term frequency, document length, and inverse document frequency to rank candidate relevance. Developed from the Okapi weighting scheme by Robertson et al.
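The scoring function can be sketched in a few lines. This is the standard textbook Okapi BM25 formulation; the parameter defaults (k1=1.2, b=0.75) are common conventions, and production search engines' actual parameters and variants are not public:

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Textbook Okapi BM25: for each query term, multiply a smoothed
    IDF by a saturating term-frequency component with document-length
    normalisation, then sum over all query terms."""
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n              # average doc length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)         # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        tf = doc_terms.count(term)                       # term frequency in doc
        norm = 1 - b + b * len(doc_terms) / avgdl        # length normalisation
        score += idf * (tf * (k1 + 1)) / (tf + k1 * norm)
    return score
```

Note that the same term frequency scores higher in a shorter document — the length-normalisation term is one reason thin boilerplate dilutes a page's lexical relevance.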

Passage indexing — Google’s capability to rank individual passages within a page independently of the page’s overall topic, treating each paragraph-level block as a separate retrieval candidate.

Retrieval-Augmented Generation (RAG) — a system architecture (Lewis et al., 2020) that augments language model generation with a retrieval component, grounding AI-generated answers in externally retrieved content rather than relying solely on parametric knowledge.
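The retrieve-then-generate loop can be sketched as follows. Here `embed` and `generate` are hypothetical stand-ins for an embedding model and a language model — illustrative interfaces, not any specific vendor API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rag_answer(question, index, embed, generate, k=5):
    """Minimal RAG sketch: retrieve the top-k passages by vector
    similarity, then ground the model's generation in them."""
    q_vec = embed(question)
    # Rank indexed passages by similarity to the query embedding
    ranked = sorted(index.items(), key=lambda kv: cosine(q_vec, kv[1]), reverse=True)
    context = [passage for passage, _ in ranked[:k]]
    prompt = ("Answer using only these passages:\n"
              + "\n".join(context)
              + "\nQuestion: " + question)
    # The answer is grounded in retrieved content, not parametric memory
    return generate(prompt), context
```

The practical implication: if your passages never enter `context`, the model cannot cite you, no matter how well the page ranks as a whole.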

Semantic cost — the computational effort a retrieval system expends to understand what a page is about. Lower semantic cost increases retrieval probability.

Arbitration — passage selection. The process by which AI systems score individual passages for factual confidence, entity alignment, and source authority before assembling an answer.

Fan-out queries — the sub-queries an AI system generates when decomposing a complex question, each targeting a different facet and triggering its own retrieval pass.

Semantic Content Network (SCN) — a hierarchical content architecture of pillar, hub, and spoke pages structured around entities and relationships to maximise retrieval coverage across a topic.

Navboost — Google’s behavioural ranking system that uses aggregated click and navigation data to adjust rankings based on real user satisfaction signals, as confirmed in DOJ antitrust trial documents.

Semantic Retrieval Optimization (SRO) — an approach for structuring content to optimise retrieval probability across both traditional search and AI systems (based on Koray Tuğberk Gübür and Sergey Lucktinov’s work). For further reading, see Lucktinov’s published work on the methodology.

Frequently Asked Questions

What is the difference between retrieval and ranking?

Retrieval is candidate generation — selecting which documents from the index enter the competition for a given query using inverted indexes (BM25) and vector embeddings (neural retrieval). Ranking is candidate ordering — applying machine-learned scoring models to determine position. Retrieval happens first. If content fails retrieval, ranking models never evaluate it. Most SEO budgets are allocated to ranking while retrieval eligibility remains undiagnosed.

What is retrieval eligibility?

Retrieval eligibility is the set of conditions a page must meet to be included in the candidate set for a query. These operate across three layers: technical accessibility (render speed, crawlability, clean DOM), semantic clarity (entity resolution, topical precision), and trust signals (Navboost patterns, link portfolio quality, external corroboration). Failure at any layer eliminates the page before ranking begins.

How do AI systems select which content to cite?

AI systems decompose queries into 10–30 fan-out sub-queries, retrieve 100–300 candidate passages, then apply confidence-weighted arbitration — scoring each passage for relevance, factual consistency, entity alignment, and source authority — to select 4–5 passages for answer assembly via RAG. The competitive unit is the passage, not the page.
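The fan-out → retrieve → arbitrate loop described in that answer can be sketched as follows. `decompose` and `retrieve` stand in for the system's query planner and retrieval layer, and the signal weights are illustrative assumptions, not published values:

```python
def fan_out_answer(question, decompose, retrieve, weights, top_n=5):
    """Sketch of fan-out retrieval with confidence-weighted arbitration:
    pool candidate passages from every sub-query, score each with a
    weighted confidence, keep the top few for answer assembly."""
    pool = []
    for sub_query in decompose(question):   # fan-out sub-queries
        pool.extend(retrieve(sub_query))    # one retrieval pass each

    def confidence(passage):                # confidence-weighted arbitration
        return sum(weights[signal] * passage[signal] for signal in weights)

    best = sorted(pool, key=confidence, reverse=True)[:top_n]
    return [p["text"] for p in best]
```

The point of the sketch: a passage competes inside the pool generated by every sub-query, so covering each facet of a complex question with a clearly scoped passage multiplies your chances of selection.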

What is Semantic Retrieval Optimization (SRO)?

SRO is a framework — originally developed by Koray Tuğberk Gübür and Sergey Lucktinov — for structuring content to optimise retrieval probability across both traditional search and AI systems. It focuses on reducing semantic cost, improving entity clarity, and ensuring passage-level quality. It treats organic visibility as a retrieval engineering problem rather than a ranking exercise.

What is topical authority, and how is it built?

Topical authority is the retrieval system’s assessment that a domain covers a topic comprehensively. A well-structured Semantic Content Network — with pillar, hub, and spoke pages linked through descriptive anchors — signals coverage and increases retrieval probability across the entire cluster. Architecture builds compound eligibility; isolated pages don’t.

How does BM25 differ from neural retrieval?

BM25 is a sparse, lexical retrieval method that matches exact terms and variants using the inverted index — fast and scalable but blind to semantic meaning. Neural retrieval converts documents and queries into dense vector embeddings and uses ANN search to find semantically similar content. Modern search systems run both in parallel and merge their candidate sets.
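One well-documented way to merge parallel candidate sets is Reciprocal Rank Fusion. Whether any production search engine uses RRF specifically is not public, so treat this as an illustration of the merging step, not a description of Google's pipeline:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several rankings (e.g. BM25's and a dense retriever's)
    by summing 1 / (k + rank) for each document across lists;
    k=60 is the constant from Cormack et al.'s original RRF paper."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked moderately well by both retrievers can outrank one that only a single retriever surfaced — which is why content that serves lexical matching and semantic similarity simultaneously tends to survive the merge.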

What is the difference between semantic search and RAG?

Semantic search is a retrieval method — finding content based on meaning rather than exact term matching, typically through vector embeddings. RAG is a system architecture that uses retrieval (including semantic search) as input to a language model, which then generates a synthesised answer grounded in the retrieved content. Semantic search finds relevant passages. RAG assembles them into an answer.

Why do most web pages get no search traffic?

Ahrefs’ research shows 96.55% of web pages receive no search traffic. Most of this failure happens at retrieval, not ranking — the content either doesn’t match any search demand, lacks the semantic clarity for retrieval systems to associate it with relevant queries, or fails technical eligibility thresholds. The content exists in the index but is absent from candidate sets. The highest-leverage fix is usually retrieval eligibility, not ranking optimisation.

How does AI search differ from traditional search?

Traditional search returns a ranked list of links for you to evaluate. AI search retrieves passages, scores them for confidence, and assembles a synthesised answer — citing sources as attributions. The unit shifts from page to passage, the metric from click to citation, and the competitive surface from ranking position to retrieval-plus-arbitration eligibility. Both systems share the same retrieval foundation — what changes is what happens after retrieval.

Why does retrieval eligibility matter for B2B companies?

B2B companies with complex products, long sales cycles, and multi-stakeholder buying committees depend on being present at the consideration stage — which increasingly happens through AI-assisted search, comparison queries, and solution-evaluation journeys. Retrieval eligibility determines whether your content is part of that conversation. If your product pages, comparison content, and methodology descriptions aren’t structured for passage-level retrieval, you’re invisible at the moment that shapes shortlists.

Sources

  • https://developers.google.com/search/docs/crawling-indexing
  • https://searchengineland.com/google-jeff-dean-ai-search-classic-ranking-retrieval-469386
  • https://docs.cloud.google.com/generative-ai-app-builder/docs/ranking-overview
  • https://arxiv.org/html/2407.21022v1
  • https://sitebulb.com/hints/performance/avoid-excessive-dom-size
  • https://law.justia.com/cases/federal/district-courts/district-of-columbia/dcdce/1:2020cv03010/223205/1436
  • https://www.justice.gov/atr/us-and-plaintiff-states-v-google-llc-2020-remedies-hearing-exhibits
  • https://www.demandsage.com/google-search-statistics/
  • https://blog.google/products/search/search-language-understanding-bert/
  • https://blog.google/products/search/generative-ai-google-search-may-2024/
  • https://ahrefs.com/blog/ai-overviews-study/
  • https://arxiv.org/abs/2311.09000
  • https://firstpagesage.com/reports/google-click-through-rates-ctrs-by-ranking-position/