Outline Technologies — SEO, AEO & GEO Agency
Technical GEO · Complete Guide

LLMs.txt: The Complete Guide to What It Does (and Doesn't Do) in 2026

The honest guide to llms.txt. We'll tell you exactly which AI systems read it (spoiler: fewer than you think), which developer tools genuinely use it, and whether you should bother adding it to your site.

Abd Shanti · 13 min read · May 10, 2026
In This Guide
- What llms.txt actually is
- Who created it and why
- The honest adoption reality
- Which AI tools actually read it
- llms.txt vs llms-full.txt
- llms.txt vs robots.txt
- How to implement it correctly
- The real GEO signals that work
- Practical implementation checklist

What Is llms.txt?

llms.txt is a plain-text file you place at the root of your website — at yoursite.com/llms.txt — that gives AI language models a structured, markdown-formatted summary of what your site contains. Think of it as a curated index: instead of an AI crawler having to crawl every page and figure out what matters, you hand it a clean map.

The format is straightforward. A required # Site Name H1 heading comes first. An optional blockquote provides a one-sentence description. Then H2 sections list groups of links with brief descriptions. One section can be marked ## Optional to signal lower-priority content. That's the entire specification.

Minimal Valid llms.txt Example

# Your Company Name

> We help marketers track AI citations and brand visibility.

## Guides

- [What Is GEO?](https://yoursite.com/blog/what-is-geo/): Introduction to generative engine optimization
- [AEO Guide](https://yoursite.com/blog/what-is-aeo/): How answer engine optimization works

## API Docs

- [REST API Reference](https://yoursite.com/docs/api/): Full endpoint documentation

## Optional

- [Blog Archive](https://yoursite.com/blog/): All published articles

Who Created llms.txt and Why

The spec was proposed by Jeremy Howard, co-founder of Answer.AI (and, before that, of fast.ai), and published at llmstxt.org on September 3, 2024. Howard's reasoning was practical: websites are built for human readers, with navigation menus, cookie banners, and marketing copy that adds noise when an AI model tries to extract useful content. llms.txt removes that noise.

The proposal came at a moment of real excitement. AI coding assistants were taking off, and developers were genuinely frustrated that tools like Cursor or GitHub Copilot would hallucinate API details because the actual documentation was buried under layers of JavaScript-rendered HTML. A clean markdown index that pointed directly to canonical docs made obvious sense.

Within months, adoption spread through developer-tooling sites and documentation platforms — Mintlify, a popular docs platform, added automatic llms.txt generation for all its customers, which heavily inflated early adoption statistics. By May 2025, roughly 105 of the Majestic Million top websites had llms.txt files, up from just 15 in January 2025. That sounds promising until you realize it's 0.0105% of the top million sites.

The Honest Adoption Reality

Here's where we have to be direct with you, because most llms.txt articles are not.

What the Data Actually Shows

Otterly.AI ran a 90-day study across 62,100+ AI events tracked on real client websites. The number of AI requests that fetched the llms.txt file: 84 total. That's 0.1% of all AI traffic events. SE Ranking's study of 300,000 domains found 10.13% adoption — but this number is skewed because Mintlify generates llms.txt automatically for all its hosted docs, inflating the figure dramatically.

Google's John Mueller was even more blunt. In a June 2025 Bluesky post he wrote: "FWIW no AI system currently uses llms.txt." When someone pointed out that Google itself had added an llms.txt file to its own documentation site in December 2025, Mueller clarified that it was not an endorsement of the spec — and those files were subsequently removed.

The big consumer AI systems — ChatGPT, Perplexity, Gemini, Claude.ai — do not use your llms.txt file when deciding what to cite or recommend. They use their own crawlers, training data, and ranking signals. llms.txt is not in their retrieval pipeline.

Which AI Tools Actually Read llms.txt

The answer is more specific than "AI tools" — it's developer-facing AI tools, and mostly coding assistants.

- Cursor. The AI code editor fetches llms.txt when you add a documentation source to your project. It uses it to navigate large codebases and API references without reading every page.
- Claude Code (Anthropic). Claude's CLI tool reads llms.txt when browsing documentation during agentic coding tasks. It's part of the context-gathering phase before writing code.
- Windsurf (Codeium). Another AI-powered editor that respects llms.txt for documentation discovery.
- LangChain MCPDOC Server. The Model Context Protocol documentation server uses llms.txt as an index when serving documentation to LLM agents.
- Cursor Docs (@docs). When you use @docs in Cursor to pull in external documentation, llms.txt is how Cursor efficiently indexes what's available without brute-force crawling.

The pattern here is clear: if your audience includes developers who use AI coding tools, llms.txt is genuinely useful. If your audience is primarily consumers who search on Google, ChatGPT, or Perplexity, llms.txt has essentially no impact on your AI visibility.

The Cloudflare Angle

Cloudflare's AI Gateway and its "AI agents readiness" framework include llms.txt as a signal of an agent-friendly website. As AI-powered agents become more common (browsing the web on behalf of users), a well-structured llms.txt may grow in importance. It's a hedge against a future that isn't here yet.

llms.txt vs llms-full.txt: What's the Difference

The spec actually defines two files, and most guides conflate them or only mention one.

llms.txt

A navigation index. Contains links to your most important pages with brief descriptions. Designed to help an AI agent find the right page, not to serve content directly. Analogous to a table of contents.

Typically under 10KB

llms-full.txt

A complete content dump. Contains the actual full markdown text of your most important pages, concatenated into one file. Designed so an AI can ingest everything in one request without further crawling.

Can be megabytes for large sites

Which one should you implement? Both, if you can. llms.txt is the lightweight option — every site should have it. llms-full.txt makes more sense for documentation sites, developer tools, or knowledge-heavy sites where you want AI coding assistants to have instant access to your full content corpus.

For most marketing sites and blogs, llms.txt alone is sufficient. The content is sparse enough that a well-structured index is more useful than a giant concatenated dump.
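If you do build an llms-full.txt, the mechanics are simple: concatenate the markdown source of your key pages under one H1, with visible boundaries between pages. A minimal sketch (the page list and separator convention here are our own illustration, not part of the spec):

```typescript
// Sketch: build llms-full.txt by concatenating already-exported markdown pages.
// The page data below is hypothetical; in practice you'd load your real docs.
type DocPage = { title: string; markdown: string };

function buildLlmsFullTxt(siteName: string, pages: DocPage[]): string {
  const parts = [`# ${siteName}`];
  for (const page of pages) {
    // A horizontal rule before each page gives the LLM a clear boundary.
    parts.push("---", `## ${page.title}`, page.markdown.trim());
  }
  return parts.join("\n\n") + "\n";
}

const full = buildLlmsFullTxt("Your Company Name", [
  { title: "What Is GEO?", markdown: "Generative engine optimization is..." },
  { title: "AEO Guide", markdown: "Answer engine optimization is..." },
]);
```

Regenerate the file whenever the underlying pages change — a stale llms-full.txt is worse than none, because the AI will trust it over your live pages.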

llms.txt vs robots.txt: Not the Same Thing

A common question: isn't this what robots.txt is for? No, and the distinction matters.

| Feature | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Control crawler access (allow/disallow) | Guide AI to best content |
| Effect | De facto mandatory — compliant crawlers respect it | Voluntary — AI can ignore it |
| Established standard? | Yes — Robots Exclusion Protocol (1994, standardized as RFC 9309) | Proposed spec only — no RFC |
| Read by Google? | Yes, always | No (as of 2026) |
| Read by coding AI? | Sometimes | Yes (Cursor, Claude Code, etc.) |
| Content | Path directives only | Markdown with links + descriptions |

robots.txt is a blocking and access-control mechanism. llms.txt is an invitation and content guide. They serve completely different purposes, and a well-optimized site needs both.
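The two files also interact in one way: a broad Disallow rule can accidentally block crawlers from fetching llms.txt itself. A hypothetical robots.txt showing the pattern to check for:

```
# robots.txt — hypothetical example
User-agent: *
Disallow: /admin/
# Make sure no Disallow rule above matches the index files:
Allow: /llms.txt
Allow: /llms-full.txt
```

The explicit Allow lines are redundant if nothing blocks those paths, but they make the intent unambiguous.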

How to Implement llms.txt Correctly

If you've decided to implement it (and you should, given the low cost), here's what best practice looks like in 2026.

1. Place the file at the exact root: yoursite.com/llms.txt — not /blog/llms.txt or /docs/llms.txt. The spec requires root placement.

2. Start with a single H1 (#) that is your site or company name. This is the only required element in the entire spec.

3. Follow with an optional blockquote (>) giving a one-sentence description of what your site does. Keep it factual, not marketing copy.

4. Group your links into H2 (##) sections by content type: Guides, Docs, Tools, Blog Posts, etc. Don't dump everything — curate your 20–50 most important pages.

5. Write link descriptions that complete the sentence "This page explains..." — be specific and factual rather than vague.

6. Add an ## Optional section for lower-priority content like archive pages, older posts, or supporting material.

7. Serve the file as text/plain with UTF-8 encoding. No authentication gate, no redirect, no JavaScript rendering.

8. Keep llms.txt under 100KB. If you need more, use llms-full.txt for the heavy content and keep llms.txt as the lean index.
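Several of these rules can be checked automatically from the file's text alone. A minimal lint sketch of our own (the rule names and messages are illustrative, not from the spec):

```typescript
// Lint an llms.txt string against the checkable structural rules:
// a single H1 first, at least one H2 section, and the 100KB size budget.
// Serving details (root placement, text/plain, no auth) need an HTTP check.
function lintLlmsTxt(text: string): string[] {
  const problems: string[] = [];
  const lines = text.split("\n").map((l) => l.trimEnd());
  const nonEmpty = lines.filter((l) => l.trim() !== "");

  // Rule 2: the first non-empty line must be the one-and-only H1.
  if (!nonEmpty[0]?.startsWith("# ")) problems.push("first line is not an H1");
  if (lines.filter((l) => /^# /.test(l)).length > 1) problems.push("more than one H1");

  // Rule 4: links should live under H2 sections.
  if (!lines.some((l) => l.startsWith("## "))) problems.push("no H2 sections found");

  // Rule 8: keep the index lean.
  if (new TextEncoder().encode(text).length > 100 * 1024) problems.push("file exceeds 100KB");

  return problems;
}
```

Run it in CI whenever the file is regenerated, and fail the build on any returned problem.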

Next.js / Static Site Implementation

For Next.js with static export, place llms.txt in your /public folder. It will be served at the root automatically. For dynamic sites, you can also create a route handler at app/llms.txt/route.ts that generates the file programmatically from your CMS content.
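A minimal sketch of that route-handler approach. The page data here is a hypothetical stand-in for a CMS query, and the site name and description are placeholders:

```typescript
// app/llms.txt/route.ts — generate llms.txt on request with the right headers.
type LinkEntry = { title: string; url: string; description: string };

// Hypothetical content; in practice, fetch this from your CMS.
const sections: Record<string, LinkEntry[]> = {
  Guides: [
    {
      title: "What Is GEO?",
      url: "https://yoursite.com/blog/what-is-geo/",
      description: "Introduction to generative engine optimization",
    },
  ],
};

export function buildLlmsTxt(): string {
  const out = ["# Your Company Name", "", "> One factual sentence about your site.", ""];
  for (const [name, links] of Object.entries(sections)) {
    out.push(`## ${name}`, "");
    for (const l of links) out.push(`- [${l.title}](${l.url}): ${l.description}`);
    out.push("");
  }
  return out.join("\n");
}

export function GET(): Response {
  // Serve as text/plain with UTF-8, per the implementation steps above.
  return new Response(buildLlmsTxt(), {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

The dynamic approach keeps the index in sync with your CMS automatically — the main trade-off versus a static /public file is a tiny bit of per-request work, which you can eliminate with route caching.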

The Real GEO Signals That Actually Work

Since we've been honest about what llms.txt doesn't do, let's be equally clear about what the Princeton GEO study (the most rigorous academic research on the topic) found actually increases AI citation rates.

- Cite authoritative sources. Content that references credible third-party sources (studies, official data, recognized experts) saw citation rates increase by up to 30% in the Princeton study. AI models look for content that itself demonstrates epistemic rigor.
- Add specific statistics and data. Concrete numbers — percentages, dates, study sizes — make content more citable. "About half" becomes forgettable; "47% of marketers (HubSpot, 2025)" becomes quotable.
- Use quotation-style formatting. Direct quotes, clearly attributed statements, and pull-quote structures appear more frequently in AI-generated summaries. Structure your content so key insights are self-contained sentences.
- Write fluently with clear topic signals. The GEO study found that fluency improvements (clearer writing, logical structure, reduced jargon) consistently improved AI citation rates across all tested systems.
- Use structured markup. FAQ schema, HowTo schema, and Organization schema directly inform how AI systems parse and re-surface your content in answers. This is far more impactful than llms.txt for consumer AI search.
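For the structured-markup point, the schema.org FAQPage type is the most common starting place. A minimal JSON-LD fragment (the question and answer text are placeholders to adapt to your own content):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is llms.txt?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "A plain-text markdown index at your site root that points AI tools to your most important pages."
    }
  }]
}
</script>
```

Place it in the page head or body of the page whose content it describes, and keep the answer text consistent with the visible copy on the page.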

None of these require a file in your root directory. They require better content. That's a less convenient answer, but it's the honest one.

Should You Implement llms.txt? (Our Verdict)

Yes — but with calibrated expectations.

The implementation cost is genuinely low: one markdown file, no dependencies, no ongoing maintenance. The potential upside — however small today — exists on multiple axes: developer tools already using it, agent frameworks growing, and the possibility that consumer AI systems adopt it in the future.

What you should not do is treat llms.txt as your GEO strategy. It isn't. The marketers and agencies selling "llms.txt optimization" as a service are selling you something that has essentially no measurable impact on ChatGPT, Perplexity, or Gemini citations today.

Your GEO investment should go into: authoritative content with cited sources, FAQ and HowTo schema markup, consistent brand mentions across authoritative sites, and a clear entity definition for your brand. Those signals move the needle. llms.txt is a nice-to-have that takes 30 minutes to implement.

Practical llms.txt Checklist

- File placed at the exact root domain (/llms.txt)
- H1 heading is your official site/company name
- Blockquote with one accurate one-sentence description
- Links grouped into logical H2 sections
- Descriptions are specific (what does each page explain?)
- File served as text/plain, no auth gate
- File under 100KB (or a separate llms-full.txt for heavy content)
- robots.txt allows /llms.txt (don't accidentally block it)
- Sitemap and canonical URLs match what's in llms.txt
- Reviewed and updated when major new content is added
Written by Abd Shanti
Co-Founder, Outline Technologies

Abd co-founded Outline to help brands understand and act on AI-era visibility signals. He's been tracking AI crawler behavior, llms.txt adoption, and GEO metrics since the protocols emerged in late 2024.