French SEO consultancy Resoneo, working with AI visibility platform Meteoria, spent 14 weeks tracking what ChatGPT actually does when it cites sources. They ran a stable, comparable set of 400 prompts daily, watching citation behavior across 27,000 responses.
The finding: when OpenAI made GPT-5.3 Instant the default ChatGPT experience, the citation pool compressed immediately.
Before the transition, the average ChatGPT response cited 19 unique domains and 24 unique URLs. After GPT-5.3 Instant became the default, those numbers dropped to 15 domains and 19 URLs per response. That is a 21% reduction in both metrics. Jérôme Salomon at Oncrawl independently confirmed the pattern through server log analysis: ChatGPT-User bot crawl volume and frequency declined over the same period.
The study was reported by Search Engine Journal in April 2026.
One in five previously cited sources dropped out of the citation window. Brands that were marginal citation candidates before the transition have, in many cases, already fallen out. Any GEO audit or citation baseline captured before March 2026 reflects citation economics that no longer apply.
Figure: ChatGPT citation pool before and after GPT-5.3 Instant became the default (Resoneo/Meteoria, April 2026; 27,000 responses across 400 prompts tracked daily for 14 weeks). Unique domains cited per response fell from 19 to 15 (–21%); unique URLs fell from 24 to 19. The URL-to-domain ratio held steady (1.26 before, 1.27 after), meaning ChatGPT visits the same depth per site, just fewer sites overall.
Figure: Citation pool compression across model generations. From the established citation tier, GPT-5.3 Instant becomes the default ChatGPT experience (April 2026), GPT-5.4 adds research-mode compression with more sub-queries, and GPT-5.5 API testing was detected April 19 with release imminent.
Any GEO baseline captured before March 2026 reflects a citation pool that no longer exists. GPT-5.3 Instant compressed the addressable citation space by approximately 1 in 5 domains. Brands that were marginal citation candidates before the transition have likely dropped out already.
What the study actually measured
The methodology matters here, because this is one of the more rigorous citation behavior datasets published this year.
Resoneo and Meteoria tracked 400 prompts daily for 14 weeks. That is consistent query volume over a long enough period to separate structural change from week-to-week noise. The before-and-after split aligned with the moment GPT-5.3 Instant became the default experience for most ChatGPT users.
The most telling data point is what did not change: the URL-to-domain ratio held essentially constant throughout the study. Before the transition, ChatGPT averaged 24 URLs from 19 domains (a ratio of 1.26). After: 19 URLs from 15 domains (1.27). The ratio did not move.
That constancy tells you something specific about the mechanism. ChatGPT is not crawling shallower per site. It is not skimming less of each domain it visits. It is visiting fewer domains in total. The model goes to the same depth once it decides a domain is worth visiting. It is just deciding that fewer domains are worth visiting at all.
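The arithmetic is easy to verify against the study's published averages:

```python
# Before/after averages from the Resoneo/Meteoria study
before = {"urls": 24, "domains": 19}
after = {"urls": 19, "domains": 15}

for label, snap in (("before", before), ("after", after)):
    print(f"{label}: {snap['urls'] / snap['domains']:.2f} URLs per domain")
# before: 1.26 URLs per domain
# after: 1.27 URLs per domain

print(f"domain compression: {1 - after['domains'] / before['domains']:.0%}")  # 21%
```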
The cause, per the study: GPT-5.3 Instant "triggers fewer web searches and citations" than earlier model behavior. The model defaults to training data more often, reducing the frequency of live web retrieval. When it does retrieve, it retrieves from fewer sources.
A 21% compression means roughly 4 domains dropped per response
The math on this is worth sitting with.
Before GPT-5.3 Instant, the average ChatGPT response cited 19 unique domains. After, that number fell to 15. That is four domains per response, gone.
At the scale of millions of daily queries, that compression represents an enormous shift in which brands appear and which do not. The 15 domains that remain in the average response are not random. They are the sources ChatGPT's retrieval system has highest confidence in for the given topic area.
The brands that held their position through the GPT-5.3 Instant transition are the ones with the kind of citation authority that comes from repeated appearances across multiple sources, strong structured content, and off-site brand signals that training data already contains. The brands that dropped out were mostly in the marginal tier: cited occasionally, never consistently enough to be the default.
This is the same winner-takes-more dynamic documented in GPT-5.4's citation behavior, but triggered differently. GPT-5.4 narrowed citations through its cross-referencing behavior during retrieval. GPT-5.3 Instant narrowed them by defaulting to training data more often, reducing live retrieval events overall.
Why training-data preference changes the citation math
When ChatGPT defaults to training data instead of live web retrieval, the citation math changes in ways that are not obvious at first.
Live retrieval citations can be influenced through content quality, freshness, and crawlability. A well-structured page with strong answer blocks and clean AI crawler access has a real shot at appearing in live-retrieval responses.
Training-data citations work differently. The sources that appear in training data responses are the ones that were well-represented across the web at training cutoff. That means high-authority publishers, frequently-cited brands, and topics with deep third-party coverage. A single well-optimized blog post does not enter training data. Years of consistent coverage across multiple credible sources does.
When GPT-5.3 Instant defaults to training data more often, the citation pool narrows to sources with that kind of pre-existing authority. Recent content improvements and technical crawlability fixes still matter for the responses that do use live retrieval. But the share of responses using live retrieval has shrunk.
This is why brand authority predicts AI citations more strongly than content quality alone. Training data reflects the cumulative brand signal across the web. A brand that published high-quality content for six months before the training cutoff is not in training data the same way a brand with five years of sustained editorial coverage is.
Not sure where your brand stands after the GPT-5.3 Instant compression?
We run a full citation audit across ChatGPT, Perplexity, and Google AI Overviews, map exactly which prompts you appear in and which you have dropped from, and identify the content and brand signals that will hold your position through the next model transition.
Book a Discovery Call

The compounding problem: GPT-5.3 Instant was not the only compression event
The Resoneo/Meteoria study documents what happened with GPT-5.3 Instant. It is not the only compression that has happened recently.
Our earlier analysis of GPT-5.4 citation behavior covered a separate finding from Position Digital: GPT-5.4 cites 20% fewer unique domains than its predecessors, even while running 10 or more sub-queries per prompt. That compression is driven by a different mechanism: the cross-referencing behavior that comes with more thorough retrieval means only sources that appear consistently across multiple sub-queries make it into the final response.
Two different compression events. Two different mechanisms. Both pointing in the same direction.
When you layer these together, the effective citation space available to a brand across ChatGPT responses has narrowed considerably since late 2025. Brands that were borderline candidates six months ago have experienced two rounds of compression. The ones that remain in the citation pool today earned that position despite repeated narrowing.
| Model transition | Compression mechanism | Domain impact |
|---|---|---|
| GPT-5.3 Instant (default) | Defaults to training data more often, fewer live retrievals | 19 → 15 domains per response (–21%) |
| GPT-5.4 | Cross-referencing narrows final citation set | Additional ~20% reduction vs. prior models |
| GPT-5.5 (Spud, imminent) | Unknown; structural pattern suggests continued tightening | TBD |
GPT-5.5 "Spud" is likely days away
On April 19, 2026, GPT-5.5 (internally codenamed "Spud") was detected running in live production API testing without a public announcement. As of April 21, it has not been officially released, and Polymarket was pricing roughly an 81% probability of public release by April 23.
The reason for the accelerated testing timeline: Claude Opus 4.7 shipped April 16 and immediately topped SWE-bench Verified at 87.6%. According to Digit.in's reporting on the API detection, OpenAI appears to have pulled forward external Spud testing in response.
Greg Brockman described it as "not an incremental improvement." Sam Altman called it "very strong." Neither of those characterizations tells you what the citation behavior will look like.
What the pattern suggests: each successive GPT model has narrowed the citation pool further. GPT-5.3 Instant did it by reducing live retrieval events. GPT-5.4 did it by tightening selection at the citation stage. If Spud follows the same directional pattern, the citation tier will have compressed three times in under six months.
The pre-Spud window is likely days, not weeks.
What a stale baseline actually costs you
This is the part that has direct operational consequences.
Most GEO programs track citation presence against an initial baseline: how often does the brand appear in tracked prompts, which sources are cited, which competitors appear. That baseline is the comparison point for measuring whether content changes, technical fixes, or PR placements are working.
If the baseline was captured before March 2026, before GPT-5.3 Instant became the default, it reflects a citation pool that cited 21% more domains per response than the current environment. A brand that appeared in 40% of tracked prompts under the old citation economics might appear in 32% under the current ones, with no content quality change at all. The program looks like it lost ground. It did not lose ground. The ground moved.
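That 32% is just the old rate scaled by the pool's surviving share, under the simplifying assumption that a marginal brand's appearances shrink proportionally with the pool:

```python
# Simplifying assumption: a marginal brand's citation rate scales with pool size
old_rate = 0.40            # appeared in 40% of tracked prompts pre-transition
surviving_share = 15 / 19  # domains per response: after / before
print(f"{old_rate * surviving_share:.0%}")  # 32%
```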
Running programs against an outdated baseline produces two failure modes. First, teams conclude their content efforts are failing when they are actually holding steady against a compressed pool. Second, teams misidentify which content investments are working, because the baseline comparison is off.
The same dynamic applies before Spud. Any baseline captured before Spud's rollout will need refreshing after. The most useful data asset from this period is a pre-Spud citation audit: which prompts the brand appears in now, which domains are being cited alongside it, and where the gaps are. The post-Spud comparison will show exactly what changed.
Who moved into the citation pool as others fell out
The compression is not symmetrical. Some brands dropped out; others moved in or held position.
The pattern from citation drift research and the Resoneo/Meteoria study points to the same kind of sources consistently surviving compression events.
Sources with high pre-existing training data representation held their position. This includes major B2B publications, established analyst firms, frequently-cited vendors, and brands that have years of editorial coverage across multiple credible domains. When training-data preference increases, these sources are the default.
Sources that benefited from live retrieval frequency but had thin training data presence are most exposed. A brand that earned citations primarily through fresh, well-optimized content but lacked the accumulated off-site signal may have lost position when the model's retrieval balance shifted toward training data.
This does not mean content optimization is irrelevant. For responses that still use live retrieval, the same structural factors apply: clear answer passages in the first 30% of a page, attributed statistics, clean crawlability for AI user agents. But those factors are now operating on a smaller share of total responses than they were six months ago.
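Of those factors, crawlability is the quickest to verify. A minimal sketch using Python's standard library and OpenAI's publicly documented crawler user agents (the site URL is a placeholder):

```python
from urllib.robotparser import RobotFileParser

SITE = "https://example.com"  # placeholder: your domain
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User"]  # OpenAI's documented crawlers

rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetches and parses the live robots.txt

for agent in AI_AGENTS:
    allowed = rp.can_fetch(agent, f"{SITE}/")
    print(f"{agent}: {'allowed' if allowed else 'blocked'} at {SITE}/")
```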
What to do right now, before Spud drops
There are two things worth doing in the days before GPT-5.5 ships.
Capture a current citation baseline. Run your priority prompt set through ChatGPT Search now. Record which prompts produce citations for your brand, which sources appear alongside you, and which competitor brands are showing up consistently. This is your pre-Spud baseline. The comparison data you get after Spud rolls out will be the most actionable data asset you produce this quarter.
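One way to automate that capture, sketched with the OpenAI Python SDK's Responses API and its web-search tool. The model name, prompt list, and output file are placeholders, and the citation extraction assumes the SDK's url_citation annotation shape:

```python
import csv
from urllib.parse import urlparse
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
prompts = ["best b2b analytics platforms"]  # placeholder: your priority prompt set

with open("pre_spud_baseline.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt", "cited_domains"])
    for prompt in prompts:
        resp = client.responses.create(
            model="gpt-4o",                          # placeholder model name
            tools=[{"type": "web_search_preview"}],  # enable live retrieval
            input=prompt,
        )
        # Collect the domains behind url_citation annotations in the answer
        domains = set()
        for item in resp.output:
            if item.type == "message":
                for part in item.content:
                    for ann in getattr(part, "annotations", []):
                        if ann.type == "url_citation":
                            domains.add(urlparse(ann.url).netloc)
        writer.writerow([prompt, ";".join(sorted(domains))])
```

Re-run the same script after Spud rolls out and diff the two CSVs; that is the before/after comparison.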
Check whether your brand appears in training-data responses, not just live-retrieval responses. One proxy: disable web search and ask ChatGPT about your category or brand in a way that requires no live retrieval. If your brand does not appear in knowledge-only mode, your citation presence relies entirely on live-retrieval responses, which are now the smaller share of total responses after GPT-5.3 Instant. That exposure matters for understanding how much the training-data shift is affecting you specifically.
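The knowledge-only proxy is the same call with the web-search tool removed; the model name, prompt, and brand string are again placeholders:

```python
resp = client.responses.create(
    model="gpt-4o",  # placeholder; no tools, so no live retrieval
    input="Which platforms lead the b2b analytics category?",  # placeholder prompt
)
print("YourBrand" in resp.output_text)  # False: presence depends on live retrieval
```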
On the content side, the passages beat pages principle remains the most durable structural factor. Pages that include 40-60 word direct-answer passages under each major heading give live retrieval systems a clear extraction target. That investment compounds across model transitions rather than depreciating when retrieval behavior shifts.
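A rough self-audit for that structure: a sketch that measures the first paragraph under each markdown heading against the 40-60 word target (the file path is a placeholder):

```python
import re

text = open("page.md").read()  # placeholder: your page's markdown source
# Split on markdown headings, keeping the heading lines themselves
parts = re.split(r"^(#{1,6} .+)$", text, flags=re.MULTILINE)

for heading, body in zip(parts[1::2], parts[2::2]):
    first_para = next((p.strip() for p in body.split("\n\n") if p.strip()), "")
    words = len(first_para.split())
    flag = "ok" if 40 <= words <= 60 else "CHECK"
    print(f"{flag:5}  {words:3d} words under {heading.strip()}")
```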
FAQ
How do I know if my brand's citations dropped after GPT-5.3 Instant?
Compare your current citation rate in ChatGPT to whatever baseline data you have from before March 2026. If you are seeing lower citation frequency with no obvious content quality change, the GPT-5.3 Instant compression is the most likely cause. A structured prompt audit across 20-30 priority queries tracked weekly for four weeks will show whether the drop is structural or temporary. Citation drift patterns show that some week-to-week movement is normal, but a sustained downward shift across multiple prompts is a signal that baseline conditions changed.
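If the audit lands in a CSV with one row per prompt per week, the structural-versus-temporary call is a few lines of Python (the file name and column names are assumptions):

```python
import csv
from collections import defaultdict

# Assumed columns: week (1-4), prompt, cited ("1" if your brand appeared)
hits, totals = defaultdict(int), defaultdict(int)
with open("weekly_audit.csv") as f:
    for row in csv.DictReader(f):
        week = int(row["week"])
        totals[week] += 1
        hits[week] += row["cited"] == "1"

rates = [hits[w] / totals[w] for w in sorted(totals)]
print("weekly citation rates:", [f"{r:.0%}" for r in rates])
# Crude heuristic: every later week at or below week one suggests a structural shift
print("structural" if all(r <= rates[0] for r in rates[1:]) else "normal drift")
```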
Does the –21% compression apply to every topic or just some?
The Resoneo/Meteoria study covered 400 prompts across comparable query types over 14 weeks. The aggregate finding is 21% fewer domains per response. Individual prompts and topic areas will vary. Highly competitive commercial categories with many credible sources may see sharper compression because the pool of strong candidates is larger. Niche topics where fewer credible sources exist may see less movement because the citation pool was already small. The underlying mechanism (GPT-5.3 Instant defaulting to training data more often) applies broadly, but the visible impact depends on how crowded the citation tier is in your category.
Should I update my GEO strategy before GPT-5.5 releases?
Update your baseline before Spud releases rather than your strategy. Strategy changes take time to produce measurable citation impact. A baseline refresh takes hours. Capture where you stand now so you have a before/after comparison when Spud rolls out. That data will tell you whether Spud produced another compression event in your category and which content or brand investments held position through it. Strategy decisions should follow from that data, not precede it.
What kinds of brands survived the GPT-5.3 Instant compression?
Based on the brand authority research from The Digital Bloom and the citation drift data, the brands that held position share a few characteristics: multi-year editorial coverage across credible third-party sources, strong training-data brand signals from pre-cutoff coverage, and structured content that performs well in live retrieval when retrieval does happen. Single-source citation strength (appearing frequently because one large publisher mentions you) is more fragile than distributed coverage across many independent sources. Brands with the latter pattern held better through model transitions.
What should I measure differently now that the citation pool has compressed?
Track citation rate per prompt rather than raw citation count. If the pool of cited domains per response has fallen from 19 to 15, the absolute number of times you appear may decline even if your relative position in the pool holds steady. Citation rate per prompt (appeared in X of Y tracked prompts) captures whether you are inside the surviving citation tier. Absolute citation volume alone may mislead if you are comparing against a baseline from a larger pool.
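In code, the distinction is a single counter over the baseline file captured earlier (the brand domain is a placeholder):

```python
import csv

appeared = total = 0
with open("pre_spud_baseline.csv") as f:
    for row in csv.DictReader(f):
        total += 1
        appeared += "yourbrand.com" in row["cited_domains"]  # placeholder domain

print(f"citation rate: {appeared}/{total} tracked prompts ({appeared / total:.0%})")
```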
The pool will compress again
Two rounds of citation pool compression in under six months. A third likely days away. The direction of travel is clear.
Each new model generation is narrowing the gap between brands with genuine training-data authority and brands that appeared in citations primarily through optimized live content. That narrowing is not going to reverse.
The brands that hold citation position through these transitions are not the ones reacting after each model update. They are the ones that have been building the off-site brand signal training data reflects for long enough that compression events pass without pushing them out.
Content quality still matters for live-retrieval responses. Technical crawlability still matters. Answer passage structure still matters. But those factors are amplifiers on top of a brand signal foundation. Without the foundation, optimization alone cannot hold a citation position when the model shifts its retrieval balance.
The pre-Spud window is the right moment to know exactly where you stand.
Spud is likely days away. Your citation baseline should be current before it arrives.
We run a complete citation audit across your priority prompt set, show you which domains are inside the surviving citation tier, and build the strategy that holds your position through the next model transition. Most clients see measurable movement within 60 days.
Book a Discovery Call