llms.txt: What It Is, Whether It Works, How to Ship One (2026)

llms.txt is a Markdown file at your domain root (yoursite.com/llms.txt) that gives AI systems a curated map of your most important pages with short descriptions. Proposed by Jeremy Howard in September 2024, it's been adopted by roughly 5-10% of top sites as of 2026. The honest verdict: it does not yet measurably increase AI citations from ChatGPT, Gemini, or Google AI Overviews, but Anthropic and Perplexity have signaled support, and it's a low-cost, half-day bet on forward-compatibility. Ship one, but don't expect a ranking bump. Here's the no-hype guide.

There's a lot of breathless marketing about llms.txt being "the new robots.txt." Most of it overstates the present and understates the optionality. This guide separates what's true today from what might be true tomorrow.

What llms.txt actually is

It's a plain Markdown file served at your root that acts as a curated "AI sitemap." Unlike a regular sitemap (exhaustive, machine-only), it's editorial: you pick the pages that matter and describe them in prose, optionally with an authoritative one-paragraph summary of your brand.

The spec is simple:

An H1 with your site/brand name
A blockquote with a one-paragraph description
H2 sections grouping curated links, each with a one-sentence description

That's it. Some sites also publish a heavier llms-full.txt with full content, though adoption of that is much lower (~1%).
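To make that three-part structure concrete, here is a minimal Python sketch that reads an llms.txt string into its H1, blockquote, and H2 link sections. The function name `parse_llms_txt` and the `- [title](url): description` link pattern are assumptions taken from the template later in this guide, not a reference parser for the proposal.

```python
import re

# One curated link per line: "- [Title](https://url): one-sentence description"
# (pattern assumed from this guide's template, not an official grammar)
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt body into title, summary, and H2 link sections."""
    title = None
    summary_lines = []
    sections = {}
    current = None  # name of the H2 section we are currently inside
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the summary
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }
```

A file that parses to a non-empty title, a non-empty summary, and at least one section with links has the shape described above; anything else is worth a second look before you publish.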

What llms.txt is NOT

It is not robots.txt. It does not control crawl access. You still need robots.txt to allow AI crawlers in the first place. (See robots.txt for AI crawlers.)
It is not a sitemap. It's curated and includes prose context, not an exhaustive URL list.
It is not an API or a ranking signal. It's a static text file, and no major provider treats it as a confirmed ranking factor.

robots.txt is access control; llms.txt is content navigation. They're complementary, not competing.
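The access-control half of that split can be checked with Python's standard `urllib.robotparser`. The sketch below asks, for a given robots.txt body, which AI crawlers may fetch a URL such as /llms.txt. The helper name `crawler_access` and the crawler list are illustrative assumptions, and the standard-library parser may not match every crawler's own interpretation of the rules.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens assumed for illustration; adjust to the bots you care about
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def crawler_access(robots_txt, url, agents=AI_CRAWLERS):
    """Return {agent: allowed} for `url` under the given robots.txt body."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}
```

For example, a robots.txt that disallows everything for GPTBot but allows all other agents would report GPTBot as blocked from /llms.txt and the other three as allowed. In that case the llms.txt file exists, but that one crawler is never permitted to read it.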

The honest truth about whether it works

Here's where most articles get vague. The data, as of 2026:

Adoption is niche. A May 2026 crawl of the Tranco Top 10,000 found ~5.86% had a valid llms.txt; a separate ~300,000-domain study found ~10%. Either way, it's far from universal and is concentrated in dev-tool and documentation sites.
Major LLM crawlers mostly don't fetch it. Server-log analysis consistently shows GPTBot fetches it only occasionally; ClaudeBot, Google-Extended, and PerplexityBot don't request it at meaningful volume. OpenAI's own documented recommendation is to use robots.txt for crawler control.
No measurable citation lift. Multiple analyses show that having an llms.txt does not measurably improve your odds of being cited by ChatGPT, Claude, Gemini, or Perplexity in their answer surfaces today. Google has compared it to the old keywords meta tag.
But the wind is shifting. Anthropic publicly confirmed support, and Perplexity has said it retrieves llms.txt to help prioritize page selection. IDE agents (Cursor, Continue, Cline) already use it.
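You can run the same kind of server-log check on your own site. The sketch below counts requests for /llms.txt per crawler, assuming access logs in the common "combined" format used by default in many Apache and nginx setups. The function name `llms_txt_hits`, the regex, and the bot list are assumptions for illustration; adapt them to your log format.

```python
import re
from collections import Counter

# Combined Log Format tail: "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
# Substrings expected in each crawler's user-agent header (assumed list)
BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def llms_txt_hits(log_lines):
    """Count /llms.txt requests per known AI crawler."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        # Ignore lines that don't parse or that request a different path
        if not match or match["path"].split("?")[0] != "/llms.txt":
            continue
        user_agent = match["ua"].lower()
        for bot in BOTS:
            if bot.lower() in user_agent:
                hits[bot] += 1
                break
    return hits
```

Feeding it a month of log lines gives you a per-crawler count you can compare against total bot traffic, which is a more reliable guide for your site than industry averages.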

So the realistic read: not (yet) an SEO play, but a developer-experience play with cheap optionality.

So should you ship one? Yes, with the right expectations

Ship llms.txt if any of these are true, and most sites qualify:

Your readers include developers using AI-assisted IDEs. Those tools already consume it, so it measurably improves how they work with your docs/API.
You have structured content worth surfacing (docs, product pages, key guides). You join the early-adopter corpus providers are most likely to experiment with.
You want low-cost forward-compatibility. The cost is a half-day. The day a major provider decides to respect it for citations, you're already correct.

Do not ship it expecting more AI citations or traffic this quarter. That's not where the evidence is.

How to create your llms.txt (copy-paste template)

Place this at yoursite.com/llms.txt:

# Your Brand Name

> One-paragraph, factual description of what your company does, who it serves, and what makes it distinct. Write it the way you'd want an AI to summarize you.

## Core pages
- [Product overview](https://yoursite.com/): What the product is and who it's for.
- [Pricing](https://yoursite.com/pricing): Plans and what each includes.

## Guides & documentation
- [Getting started](https://yoursite.com/docs/quickstart): Step-by-step setup.
- [Key concept guide](https://yoursite.com/guide): The definitive explainer on [topic].

## Comparisons
- [You vs Competitor](https://yoursite.com/vs/competitor): Honest head-to-head.

Then:

Keep it under ~100 entries and curated: this is a highlight reel, not a dump.
Confirm robots.txt allows AI crawlers (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot) to reach /llms.txt and the pages it links.
Verify the URL returns plain text in a browser.
Add it to CI/CD so it ships on every release and never goes stale.
Optionally publish llms-full.txt with fuller content if you have the appetite.
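One way to keep the file from going stale is to generate it at build time from a small curated data structure. The sketch below is a minimal Python version of that idea. The function name `render_llms_txt`, the `(title, url, description)` tuple shape, and the hard cap of 100 entries are assumptions for illustration rather than part of the llms.txt proposal.

```python
MAX_ENTRIES = 100  # the "~100 entries" guideline, treated here as a hard cap


def render_llms_txt(brand, summary, sections):
    """Render llms.txt text from {heading: [(title, url, description), ...]}."""
    total = sum(len(links) for links in sections.values())
    if total > MAX_ENTRIES:
        # Fail the build rather than ship an uncurated dump
        raise ValueError(f"{total} entries; keep the file under ~{MAX_ENTRIES}")
    out = [f"# {brand}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        out.append(f"## {heading}")
        out.extend(f"- [{title}]({url}): {desc}" for title, url, desc in links)
        out.append("")  # blank line between sections
    return "\n".join(out).rstrip("\n") + "\n"
```

In a release pipeline you would call this with your curated sections and write the result to the web root as llms.txt. Because the function raises on an oversized list, the build fails loudly instead of quietly publishing a file that has drifted past the curated size.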

Where this sits in your AI-search strategy

llms.txt is a nice-to-have, not the main event. The things that do move AI citations today (extractable answers, sourced data, schema, entity/authority signals, and crawler access) matter far more. Treat llms.txt as the cheap insurance policy you add after the fundamentals. (See generative engine optimization for the parts that actually move citations now.)

Where Okara fits

Okara's GEO agent handles the work that actually drives AI citations today: structuring content for extraction, adding schema, ensuring AI-crawler access, and tracking your citation share of voice. It can also generate and maintain a curated llms.txt as part of that setup, so you get the forward-compatibility without the manual upkeep. Point it at your URL to see what it would set up for your site, from the high-impact GEO fundamentals to low-cost extras like llms.txt.

Frequently asked questions

Does llms.txt actually help with AI search or SEO? Not measurably, as of 2026. Studies show no clear citation lift from having one, and major LLM crawlers rarely fetch it. It's best understood as low-cost forward-compatibility and a developer-experience improvement, not a ranking tactic.

What's the difference between llms.txt and robots.txt? robots.txt controls crawler access (which paths bots may fetch). llms.txt is a curated content map telling AI systems which pages matter and how they're organized. You need robots.txt to allow AI crawlers in; llms.txt gives them a guided entry point once allowed. They're complementary.

Which AI tools actually use llms.txt? Mostly IDE/coding agents (Cursor, Continue, Cline) and some MCP integrations today. Anthropic has confirmed support and Perplexity says it retrieves the file to help prioritize pages. OpenAI, Google, and others don't fetch it in meaningful volume yet.

How do I make an llms.txt file? Create a Markdown file with an H1 (brand name), a blockquote description, and H2 sections of curated links with one-line descriptions. Save it at your domain root as /llms.txt, confirm robots.txt allows AI crawlers, verify it returns plain text, and add it to your build so it stays current.

Is there an llms.txt generator? Yes, several exist, and some GEO tools generate one for you. But the file is simple enough to write by hand in under an hour; the harder part is curating the right pages and writing an accurate brand summary, which is worth doing manually.