What Is llms.txt?
llms.txt is a plain-text file you place at the root of your website, at yoursite.com/llms.txt, that gives AI language models a structured, markdown-formatted summary of what your site contains. Think of it as a curated index: instead of an AI crawler fetching every page and guessing what matters, you hand it a clean map.
The format is straightforward. A required # Site Name H1 heading comes first. An optional blockquote provides a one-sentence description. Then H2 sections list groups of links with brief descriptions. One section can be marked ## Optional to signal lower-priority content. That's the entire specification.
Minimal Valid llms.txt Example
```
# Your Company Name

> We help marketers track AI citations and brand visibility.

## Guides

- [What Is GEO?](https://yoursite.com/blog/what-is-geo/): Introduction to generative engine optimization
- [AEO Guide](https://yoursite.com/blog/what-is-aeo/): How answer engine optimization works

## API Docs

- [REST API Reference](https://yoursite.com/docs/api/): Full endpoint documentation

## Optional

- [Blog Archive](https://yoursite.com/blog/): All published articles
```
Who Created llms.txt and Why
The spec was proposed by Jeremy Howard, co-founder of Answer.AI (formerly fast.ai), and published at llmstxt.org on September 3, 2024. Howard's reasoning was practical: websites are built for human readers, with navigation menus, cookie banners, and marketing copy that adds noise when an AI model tries to extract useful content. llms.txt removes that noise.
The proposal came at a moment of real excitement. AI coding assistants were taking off, and developers were genuinely frustrated that tools like Cursor or GitHub Copilot would hallucinate API details because the actual documentation was buried under layers of JavaScript-rendered HTML. A clean markdown index that pointed directly to canonical docs made obvious sense.
Within months, adoption spread through developer-tooling sites and documentation platforms — Mintlify, a popular docs platform, added automatic llms.txt generation for all its customers, which heavily inflated early adoption statistics. By May 2025, roughly 105 of the Majestic Million top websites had llms.txt files, up from just 15 in January 2025. That sounds promising until you realize it's 0.0105% of the top million sites.
The Honest Adoption Reality
Here's where we have to be direct with you, because most llms.txt articles are not.
What the Data Actually Shows
Otterly.AI ran a 90-day study across 62,100+ AI events tracked on real client websites. The number of AI requests that fetched the llms.txt file: 84 total. That's 0.1% of all AI traffic events. SE Ranking's study of 300,000 domains found 10.13% adoption — but this number is skewed because Mintlify generates llms.txt automatically for all its hosted docs, inflating the figure dramatically.
Google's John Mueller was even more blunt. In a June 2025 Bluesky post he wrote: "FWIW no AI system currently uses llms.txt." When someone pointed out that Google had ironically added an llms.txt file to its own documentation site in December 2025, Mueller clarified it was not an endorsement of the spec — and those files were subsequently removed.
The big consumer AI systems — ChatGPT, Perplexity, Gemini, Claude.ai — do not use your llms.txt file when deciding what to cite or recommend. They use their own crawlers, training data, and ranking signals. llms.txt is not in their retrieval pipeline.
Which AI Tools Actually Read llms.txt
The answer is more specific than "AI tools" — it's developer-facing AI tools, and mostly coding assistants.
The pattern here is clear: if your audience includes developers who use AI coding tools, llms.txt is genuinely useful. If your audience is primarily consumers who search on Google, ChatGPT, or Perplexity, llms.txt has essentially no impact on your AI visibility.
The Cloudflare Angle
Cloudflare's AI Gateway and its "AI agents readiness" framework include llms.txt as a signal of an agent-friendly website. As AI-powered agents that browse the web on behalf of users become more common, a well-structured llms.txt may grow in importance. For now, it's a hedge against a future that isn't here yet.
llms.txt vs llms-full.txt: What's the Difference
The spec actually defines two files, and most guides conflate them or only mention one.
llms.txt: A navigation index. Contains links to your most important pages with brief descriptions. Designed to help an AI agent find the right page, not to serve content directly. Analogous to a table of contents. Typically under 10KB.

llms-full.txt: A complete content dump. Contains the actual full markdown text of your most important pages, concatenated into one file. Designed so an AI can ingest everything in one request without further crawling. Can be megabytes for large sites.
Which one should you implement? Both, if you can. llms.txt is the lightweight option — every site should have it. llms-full.txt makes more sense for documentation sites, developer tools, or knowledge-heavy sites where you want AI coding assistants to have instant access to your full content corpus.
For most marketing sites and blogs, llms.txt alone is sufficient. The content is sparse enough that a well-structured index is more useful than a giant concatenated dump.
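If you do maintain an llms-full.txt, generating it is trivial: concatenate the markdown bodies of your key pages under their titles. A minimal sketch follows; the `Page` type, the H1-per-page convention, and the `---` separator are our own choices, since the spec doesn't mandate a delimiter.

```typescript
// Sketch: assemble llms-full.txt from the full markdown text of key pages.
// Page shape and separator are illustrative conventions, not part of the spec.
type Page = { title: string; body: string };

export function buildLlmsFull(pages: Page[]): string {
  return pages
    .map((p) => `# ${p.title}\n\n${p.body.trim()}`) // one H1 per page
    .join("\n\n---\n\n"); // horizontal rule between pages
}

// Usage: read bodies from disk or your CMS, then write the result to
// public/llms-full.txt as part of your build step.
```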
llms.txt vs robots.txt: Not the Same Thing
A common question: isn't this what robots.txt is for? No, and the distinction matters.
| Feature | robots.txt | llms.txt |
|---|---|---|
| Purpose | Control crawler access (allow/disallow) | Guide AI to best content |
| Effect | Honored by all reputable crawlers | Voluntary; AI can ignore it |
| Established standard? | Yes — Robots Exclusion Protocol (1994) | Proposed spec only — no RFC |
| Read by Google? | Yes, always | No (as of 2026) |
| Read by coding AI? | Sometimes | Yes (Cursor, Claude Code, etc.) |
| Content | Path directives only | Markdown with links + descriptions |
robots.txt is a blocking and access-control mechanism; llms.txt is an invitation and content guide. They serve different purposes, and a well-optimized site needs both.
How to Implement llms.txt Correctly
If you've decided to implement it (and you should, given the low cost), here's what best practice looks like in 2026.
Place the file at the exact root: yoursite.com/llms.txt — not /blog/llms.txt or /docs/llms.txt. The spec requires root placement.
Start with a single H1 (#) that is your site or company name. This is the only required element in the entire spec.
Follow with an optional blockquote (>) giving a one-sentence description of what your site does. Keep it factual, not marketing copy.
Group your links into H2 (##) sections by content type: Guides, Docs, Tools, Blog Posts, etc. Don't dump everything — curate your 20–50 most important pages.
Write link descriptions that complete the sentence 'This page explains...' — be specific and factual rather than vague.
Add an ## Optional section for lower-priority content like archive pages, older posts, or supporting material.
Serve the file as text/plain with UTF-8 encoding. No authentication gate, no redirect, no JavaScript rendering.
Keep llms.txt under 100KB. If you need more, use llms-full.txt for the heavy content and keep llms.txt as the lean index.
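The rules above are easy to check mechanically. Here's a minimal lint sketch; the specific checks (leading H1, 100KB cap, descriptions after each link) mirror this article's guidance rather than anything the spec enforces, and `lintLlmsTxt` is a hypothetical helper name.

```typescript
// Sketch: lint an llms.txt body against the best practices listed above.
// The thresholds and rules reflect this article's guidance, not the spec.
export function lintLlmsTxt(text: string): string[] {
  const problems: string[] = [];
  const lines = text.split("\n");

  // Rule: the file must open with a single H1 site name.
  if (!lines[0]?.startsWith("# ")) {
    problems.push("First line must be an H1 (# Site Name).");
  }

  // Rule: keep the index lean; heavy content belongs in llms-full.txt.
  if (Buffer.byteLength(text, "utf8") > 100 * 1024) {
    problems.push("File exceeds 100KB; move heavy content to llms-full.txt.");
  }

  // Rule: every link line should carry a description after the colon,
  // e.g. "- [Title](url): specific, factual description".
  for (const line of lines) {
    if (line.startsWith("- [") && !/\):\s*\S/.test(line)) {
      problems.push(`Link missing description: ${line.slice(0, 40)}`);
    }
  }
  return problems;
}
```

Run it in CI against the file you serve, and fail the build on any reported problem.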
Next.js / Static Site Implementation
For Next.js with static export, place llms.txt in your /public folder. It will be served at the root automatically. For dynamic sites, you can also create a route handler at app/llms.txt/route.ts that generates the file programmatically from your CMS content.
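For the dynamic case, a route handler is a few lines. This sketch assumes a hypothetical getPages() CMS query; the handler itself uses only the standard web Response API, so no Next.js-specific imports are needed.

```typescript
// Sketch: app/llms.txt/route.ts generating llms.txt from CMS content.
// getPages() is a placeholder for your own CMS or database query.
type Entry = { title: string; url: string; description: string };

async function getPages(): Promise<Entry[]> {
  // Placeholder data; replace with a real CMS fetch.
  return [
    {
      title: "What Is GEO?",
      url: "https://yoursite.com/blog/what-is-geo/",
      description: "Introduction to generative engine optimization",
    },
  ];
}

export async function GET(): Promise<Response> {
  const pages = await getPages();
  const body = [
    "# Your Company Name",
    "> We help marketers track AI citations and brand visibility.",
    "## Guides",
    pages.map((p) => `- [${p.title}](${p.url}): ${p.description}`).join("\n"),
  ].join("\n\n");

  // Plain text, UTF-8, no redirect or auth gate, cached for a day.
  return new Response(body, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "public, max-age=86400",
    },
  });
}
```

Regenerating on each request (or behind the cache header shown) keeps the index in sync with your CMS without a manual editing step.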
The Real GEO Signals That Actually Work
Since we've been honest about what llms.txt doesn't do, let's be equally clear about what works. The Princeton GEO study (the most rigorous academic research on the topic) found that citing sources, adding quotations, and including relevant statistics measurably increased AI citation rates.
None of these tactics requires a file in your root directory. They require better content. That's a less convenient answer, but it's the honest one.
Should You Implement llms.txt? (Our Verdict)
Yes — but with calibrated expectations.
The implementation cost is genuinely low: one markdown file, no dependencies, no ongoing maintenance. The potential upside — however small today — exists on multiple axes: developer tools already using it, agent frameworks growing, and the possibility that consumer AI systems adopt it in the future.
What you should not do is treat llms.txt as your GEO strategy. It isn't. The marketers and agencies selling "llms.txt optimization" as a service are selling you something that has essentially no measurable impact on ChatGPT, Perplexity, or Gemini citations today.
Your GEO investment should go into: authoritative content with cited sources, FAQ and HowTo schema markup, consistent brand mentions across authoritative sites, and a clear entity definition for your brand. Those signals move the needle. llms.txt is a nice-to-have that takes 30 minutes to implement.
Abd co-founded Outline to help brands understand and act on AI-era visibility signals. He's been tracking AI crawler behavior, llms.txt adoption, and GEO metrics since the protocols emerged in late 2024.
