What is llms.txt and do I need one?
llms.txt is an emerging convention that gives AI engines a clean markdown summary of your website's structure and key pages. It lives at your site root (yoursite.com/llms.txt), pairs with /llms-full.txt for the full content dump, and helps LLMs ingest your site cleanly. If you care about being cited by ChatGPT, Claude, or Perplexity, you should publish one.
Where llms.txt came from
llms.txt was proposed in 2024 by Jeremy Howard (co-founder of fast.ai and Answer.AI) as a convention for giving large language models a clean, structured way to understand a website. The proposal sits at llmstxt.org and has been adopted by a growing number of sites: Anthropic, Stripe, Cursor, Mintlify, and many others all publish one.
The core insight: AI engines parse HTML imperfectly, often miss content hidden in JavaScript-heavy pages, and waste tokens on navigation, ads, and tracking scripts. A markdown summary file cuts through that noise.
What llms.txt looks like
A typical llms.txt is short — usually a few hundred lines at most — and follows a simple structure:
# Your Company Name
> One-sentence description of what your business does.
A paragraph or two of context that an LLM can use to understand
what your business is, who it serves, and how it's structured.
## Section name (e.g. Services)
- [Page title](https://yoursite.com/page): One-line description
- [Another page](https://yoursite.com/page2): One-line description
## Another section
...
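To make the structure above concrete, here is a minimal sketch of how a tool might read an llms.txt file into its parts. The type and function names (LlmsDoc, LlmsLink, parseLlmsTxt) are illustrative assumptions, not part of the proposal, and the parser only handles the simple layout shown above.

```typescript
// Minimal llms.txt parser sketch. All names here are hypothetical.
interface LlmsLink {
  title: string;
  url: string;
  description: string;
}

interface LlmsDoc {
  title: string; // text of the H1
  summary: string; // text of the blockquote
  context: string[]; // free-text lines that appear before the first H2
  sections: Record<string, LlmsLink[]>; // H2 name -> links listed under it
}

function parseLlmsTxt(source: string): LlmsDoc {
  const doc: LlmsDoc = { title: "", summary: "", context: [], sections: {} };
  let current: LlmsLink[] | null = null;

  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "") continue;

    if (line.startsWith("## ")) {
      // A new section starts; later link lines are collected under it.
      current = [];
      doc.sections[line.slice(3).trim()] = current;
    } else if (line.startsWith("# ")) {
      doc.title = line.slice(2).trim();
    } else if (line.startsWith("> ")) {
      doc.summary = line.slice(2).trim();
    } else if (current !== null) {
      // Inside a section we expect "- [Title](url): description".
      const m = /^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/.exec(line);
      if (m) {
        current.push({ title: m[1] ?? "", url: m[2] ?? "", description: m[3] ?? "" });
      }
    } else {
      doc.context.push(line);
    }
  }
  return doc;
}

const sample = [
  "# Your Company Name",
  "> One-sentence description of what your business does.",
  "A paragraph of context.",
  "## Services",
  "- [Page title](https://yoursite.com/page): One-line description",
  "- [Another page](https://yoursite.com/page2): One-line description",
].join("\n");

const parsed = parseLlmsTxt(sample);
console.log(parsed.title, Object.keys(parsed.sections));
```

The point of the sketch is that every element of the file maps to one line-level rule, which is why the format is cheap for a machine to consume.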
The companion file /llms-full.txt contains the full content of your site as a single concatenated markdown document. An AI tool reads it when it wants the source material rather than the index.
How llms.txt differs from sitemap.xml
sitemap.xml is a machine-readable list of URLs with metadata about freshness and priority. It’s designed for search-engine crawlers to find pages. It contains no content — just URLs.
llms.txt is a human-and-machine-readable summary of your site’s content, structure, and purpose. It contains short descriptions of each page so an LLM can pick the right one to ingest, plus a top-level overview of what your business does. Both files coexist; they serve different consumers.
What to include
Your llms.txt should cover:
- A clear H1 with your company name
- A blockquote with a one-sentence description of what you do
- 1–2 paragraphs of context: who you serve, where you operate, what makes you specific
- A ## Services (or equivalent) section listing your main service or product pages with one-line descriptions
- A ## Locations or ## Coverage section if relevant
- A ## Case studies or ## Customers section linking to proof
- A ## Resources or ## Answers section linking to your highest-quality content
- A ## Optional section linking to /llms-full.txt and /sitemap.xml
Keep each line under 150 characters. Avoid marketing fluff — LLMs read this directly and will surface what you write.
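The checklist above can be turned into a small automated check. The sketch below is one possible linter, assuming only the rules listed in this section (H1 first, a blockquote, at least one section, lines under 150 characters). The function name lintLlmsTxt and the sample content ("Acme", example.com) are made up for illustration.

```typescript
// Checklist linter sketch for llms.txt. Rule set and names are assumptions.
const MAX_LINE_LENGTH = 150;

function lintLlmsTxt(source: string): string[] {
  const problems: string[] = [];
  const lines = source.split(/\r?\n/);
  const nonEmpty = lines.filter((l) => l.trim() !== "");

  // The first non-empty line should be the H1 carrying the company name.
  const first = nonEmpty[0] ?? "";
  if (!/^# \S/.test(first)) {
    problems.push("First line should be an H1, e.g. '# Your Company Name'");
  }

  // A blockquote should hold the one-sentence description.
  if (!nonEmpty.some((l) => l.startsWith("> "))) {
    problems.push("Missing blockquote description ('> ...')");
  }

  // At least one H2 section should list pages.
  if (!nonEmpty.some((l) => l.startsWith("## "))) {
    problems.push("No '## ' sections found");
  }

  // Every line should stay under the length limit.
  lines.forEach((l, i) => {
    if (l.length >= MAX_LINE_LENGTH) {
      problems.push(`Line ${i + 1} has ${l.length} characters (keep it under ${MAX_LINE_LENGTH})`);
    }
  });

  return problems;
}

const good = [
  "# Acme",
  "> We fix roofs.",
  "## Services",
  "- [Repairs](https://example.com/repairs): Roof repairs",
].join("\n");
const bad = "Acme\n" + "x".repeat(200);

console.log(lintLlmsTxt(good)); // no problems reported
console.log(lintLlmsTxt(bad)); // missing H1, blockquote, sections, and one long line
```

Running a check like this in the build keeps the file from drifting away from the structure as pages are added.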
What goes in llms-full.txt
/llms-full.txt should contain the actual content of your highest-value pages, formatted as markdown, with sections separated by ---. Most teams generate this file dynamically at build time from their content management system or content collections — that way it stays in sync with the live site automatically.
For a typical SME site, llms-full.txt is 5,000–50,000 words. For larger content libraries it can run much longer, though a very large file may be more than a model can read in a single pass.
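As a rough sketch of the concatenation step, the code below joins a list of pages into one markdown document separated by --- and counts the words in the result. The Page shape, the "Source:" line, and the function names are assumptions for illustration; a real build would pull the pages from a CMS or content collection.

```typescript
// llms-full.txt builder sketch. The Page shape and names are hypothetical.
interface Page {
  title: string;
  url: string;
  markdown: string; // page body, already converted to markdown
}

// Joins pages into one document, with "---" between sections.
function buildLlmsFull(pages: Page[]): string {
  const sections = pages.map(
    (p) => `# ${p.title}\n\nSource: ${p.url}\n\n${p.markdown.trim()}`
  );
  return sections.join("\n\n---\n\n") + "\n";
}

// Whitespace-based word count, useful for checking the file's size.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

const pages: Page[] = [
  { title: "Roof repairs", url: "https://yoursite.com/repairs", markdown: "We repair roofs.\n" },
  { title: "About", url: "https://yoursite.com/about", markdown: "Family business." },
];

const full = buildLlmsFull(pages);
console.log(`${countWords(full)} words`);
```

One caveat with this approach: a page whose own markdown contains a --- line would look like a section boundary, so a real generator may want to strip or escape those.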
Do you need one?
If your goal is to be cited by ChatGPT, Claude, Perplexity, or Google AI Overviews, then yes. Publishing llms.txt and llms-full.txt is one of the lowest-effort, highest-leverage things you can do for AEO (answer engine optimization). It takes a couple of hours to set up cleanly, costs nothing to host, and signals to AI engines that you’ve thought about how they consume your content.
If you don’t care about AI search at all, you can skip it. But the trajectory is clear: AI-driven discovery is growing fast, and the convention of publishing llms.txt is becoming standard for any site that takes content seriously.
How to publish one
Static option: Write llms.txt by hand and drop it in your site’s root directory. This is simple, but the file goes stale as the site changes.
Dynamic option (recommended): Generate it from your content collections or CMS at build time. On Astro this is a single endpoint at src/pages/llms.txt.ts that pulls from getCollection() and outputs markdown. The same pattern works for llms-full.txt. The file is regenerated on every build, so it stays in sync without manual upkeep.
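A hedged sketch of the dynamic option follows. The rendering logic is kept in plain functions so it can run outside Astro, and the Astro wiring is shown only in a comment because astro:content resolves only inside an Astro project. The Entry shape, the "services" collection name, the makeGet helper, and the sample data are all assumptions; adapt them to your own collection schema.

```typescript
// Sketch of an llms.txt endpoint. Entry shape and helper names are hypothetical.
interface Entry {
  path: string; // URL path of the page, e.g. "/services/repairs"
  data: { title: string; description: string };
}

const SITE = "https://yoursite.com"; // placeholder origin

// Renders one H2 section with a link line per entry.
function renderSection(name: string, entries: Entry[]): string {
  const links = entries.map(
    (e) => `- [${e.data.title}](${new URL(e.path, SITE).href}): ${e.data.description}`
  );
  return [`## ${name}`, ...links].join("\n");
}

// Builds a GET handler from any async loader, so the output can be tested
// without Astro.
function makeGet(load: () => Promise<Entry[]>) {
  return async (): Promise<Response> => {
    const entries = await load();
    const body =
      [
        "# Your Company Name",
        "> One-sentence description of what your business does.",
        renderSection("Services", entries),
      ].join("\n\n") + "\n";
    return new Response(body, {
      headers: { "Content-Type": "text/plain; charset=utf-8" },
    });
  };
}

// Inside src/pages/llms.txt.ts the wiring might look like this (not run here):
//
//   import { getCollection } from "astro:content";
//   export const GET = makeGet(async () =>
//     (await getCollection("services")).map((e) => ({
//       path: `/services/${e.id}`,
//       data: e.data,
//     }))
//   );

// Stand-in loader with made-up data, used in place of getCollection().
const demoGet = makeGet(async () => [
  { path: "/services/repairs", data: { title: "Repairs", description: "Roof repairs" } },
]);

demoGet()
  .then((res) => res.text())
  .then((text) => console.log(text));
```

Separating the renderer from the loader is a design choice rather than an Astro requirement: it lets the same function produce llms.txt in a test, a script, or the endpoint itself.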
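Whichever you choose, serve both files from your site root and make sure robots.txt does not block them. robots.txt has no standard directive for pointing at llms.txt (its Sitemap line covers sitemap.xml only), so a mention there is at most a comment that crawlers may ignore.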