
Robots.txt File: The Complete Guide for SEO in 2026

2025-12-08 6 min read

A misconfigured robots.txt file can block your entire site from Google. This guide explains every directive, common mistakes, and how to test your robots.txt correctly.

A single misconfiguration in your robots.txt can block all of Google's crawlers from your site, wiping your rankings overnight. It's a small file with enormous consequences. Here's everything you need to know.

What Is robots.txt?

robots.txt is a plain text file placed at the root of your domain (example.com/robots.txt). It uses the Robots Exclusion Protocol to tell web crawlers which pages they can and cannot access. Google respects it; malicious crawlers often ignore it.
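
Because the file always lives at the root, the robots.txt location for any page can be derived from the page's own URL. Here is a minimal Python sketch of that rule; the helper name robots_txt_url is made up for illustration. It also shows that each subdomain gets its own robots.txt, since rules apply per host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (same scheme, host, and port)."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=7"))  # https://example.com/robots.txt
print(robots_txt_url("https://shop.example.com/cart"))       # https://shop.example.com/robots.txt
```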

Syntax and Directives

# Each group below is a separate example; do not paste them all into one file
# Allow all crawlers everywhere
User-agent: *
Allow: /

# Block all crawlers from /admin
User-agent: *
Disallow: /admin/

# Block Googlebot only from /staging
User-agent: Googlebot
Disallow: /staging/

# Block everyone from everything (DANGEROUS)
User-agent: *
Disallow: /

# Link to sitemap
Sitemap: https://example.com/sitemap.xml
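
To see how these directives play out, you can feed a file to Python's standard-library parser, urllib.robotparser. The sketch below combines two of the examples above into one hypothetical file. One caveat: Python's parser is not Google's, and its matching may differ in edge cases such as wildcards, so treat the results as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file combining two of the example groups and a sitemap line.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /staging/

User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
base = "https://example.com"

print(rules.can_fetch("Googlebot", base + "/staging/new-home"))  # False
print(rules.can_fetch("Googlebot", base + "/admin/users"))       # True
print(rules.can_fetch("Bingbot", base + "/admin/users"))         # False
print(rules.can_fetch("Bingbot", base + "/staging/new-home"))    # True
print(rules.site_maps())  # ['https://example.com/sitemap.xml']
```

Note the second result: a crawler that has its own group follows only that group, so Googlebot is not covered by the /admin/ rule in the * group here. If you want a rule to apply to a crawler with its own group, repeat the rule inside that group.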

What to Block and What Not To

  • Block: /admin, /cart, /checkout, /search, /login, staging URLs, duplicate content pages
  • Never block: CSS and JS files (Google needs them to render pages) or your sitemap, and double-check that no rule blocks your entire site by accident
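
The "never block" items can be turned into a quick automated check. The following is a rough heuristic sketch, not a full parser: the function name find_risky_rules and the patterns it looks for are assumptions made for illustration, and it ignores which user-agent group a rule belongs to and any Allow rule that might override it.

```python
def find_risky_rules(robots_txt: str) -> list[str]:
    """Flag Disallow rules that block the whole site or CSS/JS files."""
    warnings = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() != "disallow":
            continue
        path = value.strip()
        if path in ("/", "/*"):
            warnings.append(f"line {number}: blocks the entire site")
        elif path.lower().endswith((".css", ".js", ".css$", ".js$")):
            warnings.append(f"line {number}: blocks CSS or JS files")
    return warnings


sample = "User-agent: *\nDisallow: /admin/\nDisallow: /*.js$\nDisallow: /\n"
for warning in find_risky_rules(sample):
    print(warning)
# line 3: blocks CSS or JS files
# line 4: blocks the entire site
```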

robots.txt vs noindex

These are different tools with different behaviors:

  • Disallow in robots.txt: Prevents crawling. Google won't visit the page, but may still index it if other pages link to it.
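  • noindex meta tag: Page can be crawled, but won't appear in search results. For pages you want de-indexed, this is the right tool. The page must not also be disallowed in robots.txt, because a crawler that cannot fetch the page never sees the noindex tag.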
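
The interaction between the two tools can be sketched in code. The example below, using only Python's standard library, reports what a crawler would likely do with a page given its robots.txt rules and its HTML. The names RobotsMetaFinder and deindex_status are hypothetical, the status strings are this sketch's own wording, and it only looks at the robots meta tag, not the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))


def deindex_status(robots_txt: str, url: str, html: str, agent: str = "Googlebot") -> str:
    """Describe how a Disallow rule and a noindex tag combine for one page."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    finder = RobotsMetaFinder()
    finder.feed(html)

    has_noindex = "noindex" in finder.directives
    crawlable = rules.can_fetch(agent, url)
    if has_noindex and crawlable:
        return "noindex will be seen"
    if has_noindex:
        return "conflict: crawler is blocked and never sees the noindex tag"
    if not crawlable:
        return "blocked from crawling, but may still be indexed via links"
    return "crawlable and indexable"


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(deindex_status("User-agent: *\nDisallow: /old/", "https://example.com/old/page", page))
# conflict: crawler is blocked and never sees the noindex tag
```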

Testing robots.txt

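Use the robots.txt report in Google Search Console (under Settings) to confirm that Google can fetch and parse your file, and the URL Inspection tool to check whether a specific URL is blocked. The older robots.txt Tester, formerly under Crawl, has been retired. Always test changes in a staging environment first.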
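
Alongside Search Console, a small local check can catch a bad rule before the file ever goes live. The sketch below runs a candidate robots.txt against a list of expectations; the URLs and the helper check_robots are hypothetical, and Python's urllib.robotparser only approximates how Google interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical expectations: (user agent, URL, should crawling be allowed?)
EXPECTATIONS = [
    ("Googlebot", "https://example.com/", True),
    ("Googlebot", "https://example.com/blog/robots-guide", True),
    ("Googlebot", "https://example.com/admin/login", False),
    ("Googlebot", "https://example.com/checkout/", False),
]


def check_robots(robots_txt: str, expectations) -> list[str]:
    """Return one message per URL whose allow/block result differs from expectations."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    failures = []
    for agent, url, expected in expectations:
        actual = rules.can_fetch(agent, url)
        if actual != expected:
            failures.append(f"{agent} {url}: expected allowed={expected}, got {actual}")
    return failures


candidate = "User-agent: *\nDisallow: /admin/\nDisallow: /checkout/\n"
print(check_robots(candidate, EXPECTATIONS))  # []

broken = "User-agent: *\nDisallow: /\n"
print(len(check_robots(broken, EXPECTATIONS)))  # 2
```

Running a check like this in your deployment pipeline means a stray "Disallow: /" fails the build instead of reaching production.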

