Allow Only Googlebot in robots.txt — Selective Crawler Access Guide

Sometimes you want Google to crawl your site freely while keeping other bots out. Maybe it's a private staging environment that should only be indexed when ready. Maybe you want to let Googlebot in but block scrapers, AI trainers, and competitor crawlers. Whatever the reason, robots.txt gives you the tools to do it.

The basic structure

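A robots.txt file groups rules under User-agent lines. A crawler follows the group whose User-agent value matches it most specifically and ignores the others, so a group that names a specific bot overrides the catch-all * group. The example below admits Googlebot and Bingbot and blocks everyone else; to allow only Googlebot, delete the Bingbot group.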

# Allow Googlebot full access
User-agent: Googlebot
Allow: /

# Allow Bingbot full access
User-agent: Bingbot
Allow: /

# Block everything else
User-agent: *
Disallow: /

Sitemap: https://example.com/sitemap.xml
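Before deploying rules like these, it can help to check them locally. The following is a minimal sketch using Python's standard urllib.robotparser module; the ExampleScraper name and the example.com URL are placeholders chosen for illustration. Python's parser is not Google's, so treat the result as a sanity check rather than proof of how Googlebot will behave.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, minus comments and the Sitemap line.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleScraper" is a made-up user agent standing in for any other bot.
for agent in ("Googlebot", "Bingbot", "ExampleScraper"):
    allowed = parser.can_fetch(agent, "https://example.com/page")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With these rules, the two named crawlers should come back allowed and the placeholder bot blocked, because it falls through to the catch-all group.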

A staging site example

If you have a staging or preview environment at staging.yoursite.com and you don't want it indexed by anyone, including Google, block every crawler:

User-agent: *
Disallow: /

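Simple. Every bot sees "disallow everything." Keep in mind that robots.txt blocks crawling but doesn't guarantee the URLs stay out of the index if other sites link to your staging environment. A noindex header on the staging server addresses that, but a crawler only sees the header on pages it is allowed to fetch, so noindex and Disallow work against each other on the same URL. For a staging site, password protection is the more dependable safeguard.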
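How the noindex header gets sent depends on your server stack. As one illustration, here is a minimal sketch of a WSGI middleware that adds X-Robots-Tag: noindex to every response. The names add_noindex and staging_app are hypothetical, and the sketch assumes the staging site is a Python WSGI application; on other stacks the same header would normally be set in the web server configuration.

```python
def add_noindex(app):
    """Wrap a WSGI app so every response carries X-Robots-Tag: noindex."""

    def wrapped(environ, start_response):
        def start_with_noindex(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_noindex)

    return wrapped


def staging_app(environ, start_response):
    """Hypothetical stand-in for the real staging application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"staging build\n"]


# The wrapped app can be served by any WSGI server.
app = add_noindex(staging_app)
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, app) can serve the wrapped app.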

Selective access by section

You can give Googlebot access to public sections while blocking private areas:

# Let Googlebot see everything except admin and user dashboards
User-agent: Googlebot
Disallow: /admin/
Disallow: /dashboard/
Disallow: /api/
Allow: /

# Block all other crawlers entirely
User-agent: *
Disallow: /
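A quick way to see which paths these rules open up is to run a handful of URLs through a parser. The sketch below again uses Python's urllib.robotparser; the sample paths and the OtherBot name are made up for illustration. One caveat: Python's parser applies the first matching rule in file order, while Google picks the most specific (longest) matching rule. The two agree here because the Disallow lines come before Allow: /.

```python
from urllib.robotparser import RobotFileParser

# The section-based rules from the example above.
SECTION_RULES = """\
User-agent: Googlebot
Disallow: /admin/
Disallow: /dashboard/
Disallow: /api/
Allow: /

User-agent: *
Disallow: /
"""

# Hypothetical sample paths covering public and private sections.
PATHS = ["/", "/blog/post", "/admin/users", "/dashboard/", "/api/v1/items"]


def allowed_paths(robots_txt: str, agent: str, paths: list) -> dict:
    """Map each path to True (crawlable) or False (blocked) for one agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


googlebot = allowed_paths(SECTION_RULES, "Googlebot", PATHS)
other = allowed_paths(SECTION_RULES, "OtherBot", PATHS)

for path in PATHS:
    print(f"{path:16} Googlebot={googlebot[path]!s:5} OtherBot={other[path]}")
```

Googlebot should be allowed on the public paths only, and the placeholder bot should be blocked everywhere.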

What robots.txt cannot do

Blocking a URL in robots.txt stops Google from crawling it, but Google might still index the URL if other sites link to it. The page could appear in search results with no title or description, just the URL. For pages that must never appear in search, use a noindex meta tag and leave the page crawlable: Google has to fetch the page to see the tag, so a robots.txt block on the same URL would hide it.
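To confirm that a page really carries the noindex tag, you can inspect its HTML. This is a minimal sketch using Python's standard html.parser; the class and function names are hypothetical, and it only looks at meta tags named robots or googlebot, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            for part in content.split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(sample))
```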

Also, some bots ignore robots.txt. Malicious scrapers, spam bots, and some data-collection tools don't respect the protocol. Compliance is voluntary, so treat robots.txt as instructions for legitimate crawlers, not as access control.

Build and test yours

Use the Robots.txt Generator to create a valid file with the exact rules you need. Once it's live, check it in Google Search Console (the robots.txt report under Settings, in the Crawling section) before assuming everything works.
