May 26, 2026 · 8 min read

What is llms.txt? Why It Matters for AI Search (And How to Generate Yours Free)

If you have been anywhere near SEO or developer Twitter lately, you have seen a new file making the rounds. llms.txt is getting hyped like the second coming of structured data. It's not. It's a text file. The file provides a simple way to introduce your site directly to language models in plain language.

This guide explains what it is, the honest state of adoption (spoiler: it's messy), and how to generate llms.txt for free.

What is llms.txt?

llms.txt is a plain-text Markdown file that tells LLMs and AI crawlers what your site is about and where the good stuff lives. It sits at the root of your site, for example yoursite.com/llms.txt.
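To make "the root of your site" concrete, here is a minimal Python sketch that derives the llms.txt address from any page URL. The `llms_txt_url` helper and the example.com address are illustrative assumptions, not part of the proposal.

```python
from urllib.parse import urljoin

def llms_txt_url(page_url: str) -> str:
    """Return the root-level llms.txt URL for the site that serves page_url."""
    # The leading slash makes urljoin resolve against the site root,
    # discarding the path and query string of the page itself.
    return urljoin(page_url, "/llms.txt")

print(llms_txt_url("https://example.com/blog/some-post?ref=x"))
# https://example.com/llms.txt
```

Whatever page you start from, the file is expected at the same top-level location, much like robots.txt.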

AI bots do not “see” your website like humans do. They try to make sense of your website by crawling hundreds of pages of raw HTML. Honestly, most websites are a mess of popups, menus, cookie notices, and tracking scripts.

Instead of letting an AI piece things together, you hand it a clear summary: Here's who we are. Here's what we do. Here are the pages worth reading.

It is not mandatory to have one. No search engine is forcing you to add it. As AI search becomes common, a clear llms.txt is a pretty low-effort way of making sure your site is “AI readable.”

Why Was It Created?

Jeremy Howard, data scientist and co-founder of fast.ai and Answer.ai, proposed this idea in September 2024 with a post on X. It blew up in November 2024 when Mintlify auto-generated llms.txt for thousands of doc sites.

His reasoning was simple, and frankly, it made perfect sense. AI language models have limited context windows, and most websites waste that context on junk.

When an AI tries to understand your site by parsing raw HTML, it burns tokens on stuff that doesn't matter. A well-written llms.txt file tells the model what matters, nothing extra.
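A rough way to see the waste is to compare how much of a page is readable text versus markup and scripts. The sketch below uses Python's standard `html.parser` and counts characters as a crude stand-in for tokens, which is an assumption; real tokenizers behave differently. The `VisibleText` class, `content_ratio` function, and sample page are all made up for illustration.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def content_ratio(html: str) -> float:
    """Fraction of the raw HTML's characters that are readable text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    return len(text) / len(html) if html else 0.0

# Hypothetical page: one useful sentence wrapped in tracking and styling.
page = """<html><head><script>track('pageview');</script>
<style>.popup{display:block}</style></head>
<body><nav><a href="/">Home</a></nav>
<p>BrightCart helps online stores manage support.</p></body></html>"""

print(f"{content_ratio(page):.0%} of the characters are readable text")
```

Even on this tiny page, most of the characters are not content. Real pages with cookie banners and analytics scripts tend to fare worse.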

llms.txt vs. robots.txt vs. sitemap.xml

These three files live in the same neighborhood but do very different jobs. People mix these up all the time, but they are not interchangeable:

  • robots.txt: tells bots which pages they are allowed to crawl. Typical use: keeping bots out of staging environments or admin areas.
  • sitemap.xml: lists every URL you want indexed. Typical use: helping search engines navigate your site without wasting crawl budget.
  • llms.txt: gives AI models a clean Markdown summary of your site so they can skip the HTML noise. Typical use: giving AI crawlers context about your most important content.

✓ robots.txt = permission
✓ sitemap.xml = map
✓ llms.txt = a briefing

robots.txt and sitemap.xml tell bots where they can go. In contrast, llms.txt basically says, “here's what you need to understand.”
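The "permission" role of robots.txt is easy to see in code. This sketch uses Python's standard `urllib.robotparser` against a made-up robots.txt. The rules, the `ExampleBot` name, and the `allowed` helper are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /staging/
"""

def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Answer the only question robots.txt can answer: may this bot fetch this path?"""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)

print(allowed("/pricing"))      # True
print(allowed("/admin/users"))  # False
```

The answer is always a yes or a no. Nothing in robots.txt says what /pricing is about or why it matters, which is the gap llms.txt aims to fill.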

How llms.txt Is Structured

The beauty of llms.txt is that it uses Markdown, not overly messy XML or JSON. Here is a basic example of what it looks like:

# BrightCart

> BrightCart helps online stores manage customer support and order tracking.

## Key Pages

## Company

The rules are:

  • A site name at the top
  • A one-sentence description in a blockquote
  • Neat little sections with labeled links to your key pages

There is also an optional llms-full.txt version that can include full page content. Most people start with the standard version though.
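If you keep your key pages in a small data structure, rendering the standard file takes a few lines of Python. This is a minimal sketch, not an official tool: the `render_llms_txt` function is hypothetical, the example.com URLs and page notes are invented, and the `- [label](url): note` link style follows the format described in the llms.txt proposal.

```python
def render_llms_txt(name, summary, sections):
    """Render a site name, blockquote summary, and link sections as Markdown.

    sections maps a heading to a list of (label, url, note) tuples;
    note may be an empty string.
    """
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for label, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{label}]({url}){suffix}")
    return "\n".join(lines) + "\n"

# Hypothetical pages for the BrightCart example; the URLs are placeholders.
print(render_llms_txt(
    "BrightCart",
    "BrightCart helps online stores manage customer support and order tracking.",
    {
        "Key Pages": [
            ("Pricing", "https://example.com/pricing", "plans and limits"),
            ("Docs", "https://example.com/docs", ""),
        ],
        "Company": [
            ("About", "https://example.com/about", "who we are"),
        ],
    },
))
```

Keeping the page list as data makes it easy to regenerate the file whenever your site structure changes.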

The Honest Debate: Is Anyone Actually Reading This Thing?

Quite a few big names have jumped on board since it was introduced in late 2024. Amazon, Anthropic, OpenAI, Stripe, Google, and Microsoft all have live files now. Mind you, these aren't companies that adopt things casually. They understand how AI consumes information, and if they are implementing this, it is worth considering.

It is not just the tech giants. Smaller companies are using it too. BuiltWith reported that 844K sites had an llms.txt file in place by October 2025. Netkodo built one for an interior design brand, Scout & Nimble, as part of their GEO strategy. They mapped product categories and wrote an AI-friendly FAQ so that AI could accurately describe the brand's products when someone asked.

Ryan Law, Director at Ahrefs, put it well when he called llms.txt “a low-effort, low-risk bet on a future where AI agents are the primary way people interact with the web.”

The biggest use case today is dev tools. AI coding assistants like Cursor, GitHub Copilot, and Claude already pull docs in real time using llms.txt. If you publish API docs or developer guides, AI is probably trying to read your site right now. An llms.txt file makes that reading more accurate and cuts down on hallucinations.

The Uncomfortable Question: Do AI Bots Even Read It?

Here's the truth: most of them don't. Not yet. Studies have shown that major AI crawlers often skip the file.

  • Otterly AI shared results from a 90-day experiment showing that only 0.1% of AI bot requests were for llms.txt.
  • SE Ranking looked at around 300K domains and found zero link between having the file and getting more citations.
  • Another analysis from Limy.ai examined 500M+ bot traffic events. They found that major crawlers like ClaudeBot, PerplexityBot, GPTBot, and Google-Extended overwhelmingly ignore the file.
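You can run a small version of these experiments on your own server logs. The Python sketch below computes the share of AI-bot requests that ask for /llms.txt. The `llms_txt_share` function, the user-agent substrings, and the sample requests are all assumptions for illustration; they are not data from any of the studies above.

```python
# Substrings to look for in the User-Agent header (adjust to your own logs).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def llms_txt_share(requests):
    """requests: iterable of (user_agent, path) pairs.

    Returns the fraction of AI-bot requests whose path is /llms.txt.
    """
    bot_paths = [
        path.split("?")[0]  # ignore any query string
        for agent, path in requests
        if any(bot in agent for bot in AI_BOTS)
    ]
    if not bot_paths:
        return 0.0
    return sum(path == "/llms.txt" for path in bot_paths) / len(bot_paths)

# Made-up sample: four bot requests (one for llms.txt) and one human visit.
sample = [
    ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/pricing"),
    ("Mozilla/5.0 (compatible; ClaudeBot/1.0)", "/llms.txt"),
    ("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "/docs"),
    ("Mozilla/5.0 (compatible; GPTBot/1.0)", "/blog/post"),
    ("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0", "/llms.txt"),
]
print(f"{llms_txt_share(sample):.0%}")  # 25%
```

If the share stays near zero over a few weeks of real logs, your own traffic matches what the studies found.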

Understandably, this has led to some questions about the usefulness of the standard. Last year, Google's John Mueller flatly rejected the claim that any Google Search system reads or uses llms.txt. As of early 2026, none of the big AI players (OpenAI, Anthropic, Google) has publicly said it uses the file to generate answers.

3 Reasons You Should Still Implement It

The B2A use case is real (and growing)

Consumer bots like ChatGPT and Perplexity might skip your file, but AI coding agents absolutely read it. Cursor, Copilot, and Claude in tool-use mode pull your API docs in real time. A clean llms.txt helps them surface your content (product features, integration, pricing) accurately instead of inventing details from an outdated training set.

Vercel credits ChatGPT for roughly 10% of its signups. This is bot traffic converting to paying users, and llms.txt is part of what makes your site readable to those bots.

Forward-compatibility is cheap insurance

The standard was introduced in September 2024. Major platforms jumped on it within months. AI companies will likely start reading llms.txt at scale in the future, and they have every reason to, because it makes their answers better. You would want the file in place before the switch flips.

You can implement one in under an hour. However, if a competitor gets cited first, fixing that visibility issue takes way longer.

It forces you to clarify your positioning

Writing that one-sentence blockquote, “what does this site do, for whom, and why,” is a positioning exercise. Most founders sit down to write that sentence and realize that they have never stated it that cleanly anywhere on their site. Writing llms.txt forces you to clarify how your homepage and docs explain your product. This clarity alone is worth it even if no bot ever reads the file.

How to Generate Yours Free?

You can easily write llms.txt by hand since the format is simple enough. That said, most users would rather not collect URLs, format Markdown, and second-guess their one-sentence description. Don't worry, there's a faster way.

Okara has a free llms.txt generator. Drop in your URL, and it maps your site, finds the important pages, writes a one-sentence summary, and formats the Markdown. You review everything, tweak what needs tweaking, and download a finished llms.txt in a few minutes. No subscription required. The llms.txt generator is genuinely free.

(Okara also does a bunch of other things – SEO audits, content, community monitoring, GSC integration. You don't need any of that to use the llms.txt generator. Totally separate decision.)

For most people, the generator is worth the five minutes it takes. You walk away with a spec-compliant file you can drop into your site root today.
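Whether you write the file by hand or generate it, a quick structural check before uploading is cheap. The sketch below tests only the three basics covered earlier: a site name heading, a blockquote summary, and sections containing labeled links. The `check_llms_txt` function is hypothetical and is not a full validator for the proposal.

```python
import re

# Matches a Markdown list item that starts with a labeled link: - [label](url)
LINK = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)")

def check_llms_txt(text):
    """Return a list of problems; an empty list means the basic layout is present."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    problems = []
    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be a '# ' heading with the site name")
    if not any(line.startswith("> ") for line in lines):
        problems.append("missing '> ' blockquote summary")
    headings = [i for i, line in enumerate(lines) if line.startswith("## ")]
    if not headings:
        problems.append("no '## ' sections")
    for pos, start in enumerate(headings):
        # A section runs until the next '## ' heading or the end of the file.
        end = headings[pos + 1] if pos + 1 < len(headings) else len(lines)
        if not any(LINK.match(line) for line in lines[start + 1:end]):
            problems.append(f"section '{lines[start][3:]}' has no labeled links")
    return problems

draft = "# BrightCart\n\n> Support and order tracking for online stores.\n\n## Key Pages\n"
print(check_llms_txt(draft))
# ["section 'Key Pages' has no labeled links"]
```

An empty list means the skeleton is in place. Whether the summary and link choices are any good still needs a human read.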

Generate your llms.txt free at Okara.ai

Frequently Asked Questions

What does llms.txt actually do? It is a Markdown file on your website that gives AI models a clear overview of what your site does and its most important content.

Is llms.txt an official standard? It's a proposed one. Jeremy Howard published the spec in September 2024, and over 844K sites had adopted it by October 2025. That said, no standards body like the W3C has ratified it, and no major AI provider has confirmed using it in production.

Does Google use llms.txt? No. Google’s John Mueller confirmed in 2025 that no Google Search system reads or uses it for ranking.

Do ChatGPT or Perplexity use it? Not consistently. Studies from Otterly AI and Limy.ai show that major AI bots mostly skip the file in production. AI coding assistants and developer tools, on the other hand, actively use llms.txt to pull docs in real time.

How is llms.txt different from robots.txt? robots.txt controls what crawlers are allowed to crawl (permissions). llms.txt, on the other hand, explains what your site means in plain language (context).

How long does it take to create llms.txt? Writing one by hand takes 20–40 minutes if you know your site well. Using the free generator at Okara.ai takes under five. Once the file is ready, upload it to your site root and you are done.

Should every website implement llms.txt? It makes the most sense for a site with substantial content, documentation, and product info that AI might reference. A simple landing page with two paragraphs won’t see much benefit. However, given the near-zero cost, there’s little reason not to.
