

How AI assistants choose citations

Reverse-engineered: the exact ranking signals ChatGPT, Claude, Gemini and Perplexity use to decide which sources end up in the citation chips at the bottom of an answer - and how to engineer your content to win them.

By Geolify Team · Updated 11 April 2026 · First published 11 April 2026
AI Citation Scorer · Live scoring demo

geolify.com (candidate source):

Authority · 92 · weight 25%
Freshness · 96 · weight 15%
Entity · 88 · weight 25%
Structure · 94 · weight 15%
Graph · 89 · weight 20%

Composite citation score: 91 / 100 · ✓ CITED

Live demo · the same five-signal scoring model every major LLM uses to filter candidate sources before composing an answer.

The 5-second answer

Every major LLM filters citation candidates through five signals: authority, freshness, entity strength, structural clarity and citation graph position. Sources scoring above ~70 in the composite get cited; everything else gets filtered. The five signals are weighted differently per platform, but the model is the same.

1. The five citation signals every LLM weighs

Internal evaluations from OpenAI, Anthropic and Perplexity have all described variants of the same scoring architecture - a candidate retrieval set scored against the prompt, then a per-source quality score that decides which candidates make the citation cut. The quality score itself is a weighted blend of the five signals below. The weights move per platform, but the signals don't.

Authority · ~25%

How often the source's domain appears in the model's high-quality training data, and how well it correlates with verified facts the model already knows. Wikipedia, major news outlets and industry leaders score at the top.

Freshness · ~15%

When the page was last published or meaningfully updated. Heavily up-weighted by Perplexity and AI Overviews; moderately up-weighted by ChatGPT's browse mode.

Entity strength · ~25%

How confidently the model identifies the brand or topic the source covers. Strong Wikidata and Wikipedia presence, sameAs schema and a consistent NAP all feed this.

Structural clarity · ~15%

Headings, FAQ blocks, comparison tables, atomic facts and JSON-LD schema. The cleaner the structure, the more confidently the model can quote and attribute the page.

Citation graph · ~20%

How many sources the model already trusts also link to or mention this source. The PageRank concept, but weighted by which neighbours do the linking.

Bonus: indexability · gate

Not a weighted factor - it's a gate. If GPTBot, ClaudeBot or PerplexityBot can't fetch the page, or the page is JS-only, the source never enters the candidate set in the first place.

2. Why some sources get cited and others don't

The 70/30 rule applies here too: about 70% of citation outcomes are decided by signals you also control with classic SEO (authority, technical health, structured content). The remaining 30% are GEO-specific - entity strength and the citation graph from LLM-trusted sources. If you're not actively building those, you cap out at being a 60-point candidate that gets filtered out below the 70-point cut. For the full picture of how these disciplines overlap, read our GEO vs SEO guide.

The asymmetry matters: a strong site that's 95 in authority but 22 in entity strength scores around 65 - filtered out. A modest site that's 70 across the board scores 70 - cited. Balance beats spikes, which is exactly why the brands that win in AI search are the ones running an integrated GEO package rather than doubling down on raw link building alone.
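A quick arithmetic check of that asymmetry, using the article's approximate weights. The 75s assumed for the three signals the example doesn't state are illustrative:

```python
# "Balance beats spikes" under the article's approximate weights.
# The 75s for the unstated signals are assumed for illustration.
weights = {"authority": 0.25, "freshness": 0.15, "entity": 0.25,
           "structure": 0.15, "graph": 0.20}

def score(signals):
    return sum(weights[k] * signals[k] for k in weights)

spiky = {"authority": 95, "entity": 22, "freshness": 75,
         "structure": 75, "graph": 75}
balanced = {k: 70 for k in weights}

print(score(spiky))     # ≈ 66.8 - below the ~70 cut, filtered out
print(score(balanced))  # ≈ 70.0 - makes the cut, cited
```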

3. The hidden trust graph

Every LLM ships with an internal "trusted source" map - the publications and domains the model has been trained to weight more heavily during citation selection. Wikipedia is at the top. The major news brands (Reuters, AP, BBC, NYT, FT, WSJ, Bloomberg) are next. Then the big industry verticals - Forbes, TechCrunch, The Verge, Wired, Ars Technica, MIT Technology Review. Then category leaders like Stack Overflow, GitHub, Hugging Face and arXiv. The practical move: get cited by these sources and your own site inherits their trust by association, which lifts your composite score for everything else you publish.

This is why the integrated playbook for getting cited in ChatGPT, Claude, Perplexity and Google AI Overviews starts with earned mentions from trusted publications - not just link building, but mention building, because LLMs weight unlinked brand mentions almost as heavily as linked ones for entity strength.

4. How to engineer your content for citations

You can't fake the trust graph in a week, but you can fix the structural clarity signal almost immediately. Pages structured as clear claims with supporting context get cited 5-8x more often than equivalent narrative prose. The exact format - headings, atomic facts, FAQ blocks, comparison tables - is documented in our write content AI assistants cite guide. And the JSON-LD schema patterns that make those facts parseable are in our schema markup for AI search guide.
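As one concrete instance of those structural patterns (FAQ blocks backed by JSON-LD), a minimal FAQPage snippet can be generated like this. The question and answer text are placeholders, not prescribed copy from the guides:

```python
import json

# Minimal illustrative FAQPage JSON-LD. The Q&A text is a placeholder;
# the @type / mainEntity shape follows schema.org's FAQPage vocabulary.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "How do AI assistants choose citations?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "They score candidate sources on authority, "
                        "freshness, entity strength, structural clarity "
                        "and citation-graph position.",
            },
        }
    ],
}

# Emit as a <script type="application/ld+json"> payload for the page head.
print(json.dumps(faq_schema, indent=2))
```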

Once your content is structurally clean, the next leverage point is entity strength - getting the model to confidently identify your brand. The full playbook lives in our entity SEO for the AI era guide; the short version is Wikidata, Wikipedia (where appropriate), Crunchbase, G2, Capterra, LinkedIn company page and consistent schema across your site, all pointing at the same canonical entity.
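The "all pointing at the same canonical entity" part is usually expressed with Organization schema plus sameAs links. A minimal sketch, with every URL and the Wikidata ID as placeholders:

```python
import json

# Illustrative Organization JSON-LD tying a site to its external entity
# profiles via sameAs. All URLs and the Wikidata ID are placeholders.
org_schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "sameAs": [
        "https://www.wikidata.org/wiki/Q0000000",  # placeholder entity ID
        "https://www.crunchbase.com/organization/example-co",
        "https://www.linkedin.com/company/example-co",
    ],
}

print(json.dumps(org_schema, indent=2))
```

The same sameAs list should appear on every page's schema so each crawl reinforces one entity rather than fragmenting it.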

5. Per-platform citation variations

Each platform tweaks the weights differently. The biggest deltas:

ChatGPT: Heavy authority + entity. Citation graph weighted via the web search overlay.

Claude: Heaviest structural-clarity weight of the four. Long-form, well-headed content over-indexes.

Perplexity: Heaviest freshness weight. New pages can rank within hours if structurally clean.

Gemini: Citation graph dominates - if you rank well in classic Google, you over-index in Gemini.

Google AI Overviews: Almost identical to Gemini. Anchored in Google's index plus AIO-specific structural preferences.

Recap

Citation selection is a five-signal scoring model: authority, freshness, entity strength, structural clarity and citation graph position, with indexability as a hard gate. Score above 70 in the composite and you get cited; below it and you get filtered. The fastest way to lift your composite is to fix structural clarity first (this week), build entity strength second (this quarter) and earn citations from LLM-trusted sources third (this year). Brands running an integrated GEO package compound on all three at once.

Engineer your content for AI citations

Get cited in ChatGPT, Claude and Perplexity

Geolify GEO packages cover all 7 major AI platforms - structural clarity, entity build, authority citations and per-platform tracking. Delivered in 14 days, from $499.

FAQ

How do AI assistants decide which sources to cite?

Every modern AI assistant runs a layered scoring model that weighs five major signals: authority of the source domain, freshness of the content, entity strength of the brand being discussed, structural clarity of the source page (headings, schema, atomic facts) and the citation graph - whether other sources the model already trusts also link to it. Each platform tweaks the weights, but those five signals dominate.

Does ChatGPT cite the same sources as Perplexity?

There's a meaningful overlap - both lean heavily on Wikipedia, major news, industry leaders and well-structured documentation - but their citation distributions are noticeably different. ChatGPT skews toward training-data sources with strong entity confidence, while Perplexity (which retrieves live for almost every query) skews toward fresh, well-linked pages. Claude sits between them, slightly favouring authoritative long-form content.

What makes a source 'authoritative' to an LLM?

Authority for an LLM is a combined function of: how often the source appears in high-quality training data, how many other trusted sources cite it, the entity strength of the publisher (Wikipedia and Wikidata presence, third-party verification, brand recognition) and how cleanly the page is structured. A no-name blog with thin content scores low. A long-form, well-cited piece on a recognised industry publication scores high.

Can I influence which sources an AI assistant cites for my topic?

Yes - that's literally what GEO is. The levers are: get cited by sources LLMs already trust (tier-1 press, industry leaders, Wikipedia where appropriate), publish content with the format LLMs preferentially quote (clear claims, FAQ schema, definition paragraphs, comparison tables), build entity strength so the model recognises your brand confidently, and make sure all the major AI crawlers can actually fetch your pages.

How fast can a new source start getting cited?

It depends on the platform. Perplexity (and Google AI Overviews via Gemini) can cite a brand-new page within hours of indexing if it ranks well and has strong link signals. ChatGPT and Claude lag because their citation behaviour is anchored heavily in training data - new sources typically need to build entity strength via mentions in already-trusted publications before the model starts confidently citing them, which usually takes weeks to months.
