Question 1

Is robots.txt a security measure?

Accepted Answer

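No. Anyone, whether a human, a scraper, or a malicious bot, can read your robots.txt and ignore it. It is a voluntary convention honored by Google, Bing, and other reputable crawlers, not an enforcement mechanism. Because the file is public, listing a sensitive path in it also advertises that path. For actual access control, use HTTP authentication or server-side rules.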
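To make the point concrete, here is a minimal Python sketch of how little effort it takes to pull every disallowed path out of a robots.txt file. The function name `disallowed_paths` and the sample file are made up for illustration; a real scraper would simply download the file first.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every non-empty Disallow value found in a robots.txt body."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file: the "hidden" directories are visible to anyone who asks.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
Allow: /admin/public/
"""

print(disallowed_paths(SAMPLE))
```

Anything that must stay private should be protected by the server, not merely left out of crawlers' way.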

Question 2

What's the difference between Allow and Disallow?

Accepted Answer

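Disallow tells a crawler not to fetch paths that match the rule. Allow explicitly permits paths, which is useful for unblocking a sub-path of an otherwise disallowed directory: `Disallow: /admin/` together with `Allow: /admin/public/`. When both rules match a URL, crawlers that follow the standard (RFC 9309) apply the most specific one, meaning the longest matching path, so the order of the lines does not matter to them.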
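The longest-match behavior described in RFC 9309 can be sketched in a few lines of Python. This is an illustration only: `is_allowed` and the rule list are hypothetical names, it handles plain path prefixes (no wildcards), and individual crawlers may differ in details.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Decide whether `path` may be crawled.

    `rules` holds (directive, prefix) pairs such as ("disallow", "/admin/").
    The longest matching prefix wins; on a tie, Allow wins.
    """
    best = None  # (prefix length, is_allow) of the best match so far
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            candidate = (len(prefix), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]   # no matching rule: allowed


RULES = [
    ("disallow", "/admin/"),
    ("allow", "/admin/public/"),
]

print(is_allowed("/admin/users", RULES))            # blocked by /admin/
print(is_allowed("/admin/public/logo.png", RULES))  # longer Allow rule wins
print(is_allowed("/blog/", RULES))                  # no rule matches
```

Comparing `(length, is_allow)` tuples keeps the tie-break in one place: a longer prefix always beats a shorter one, and Allow beats Disallow only when the lengths are equal.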

Question 3

Does Google respect Crawl-delay?

Accepted Answer

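No. Google ignores Crawl-delay. The crawl rate setting that used to be available in Search Console was retired in early 2024, so it is no longer an option either. Googlebot adjusts its rate automatically based on how your server responds; to slow it down in an emergency, Google's documentation suggests temporarily returning 500, 503, or 429 responses. Crawl-delay is still respected by Bing and some smaller crawlers. Yandex has said it no longer uses the directive and points site owners to the crawl rate setting in Yandex Webmaster instead.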
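For crawlers that do honor the directive, reading the value might look like the sketch below, which uses Python's standard `urllib.robotparser`. The robots.txt content, the helper name `crawl_delay_for`, and the `examplebot` agent are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 5

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


print(crawl_delay_for(ROBOTS_TXT, "bingbot"))     # delay set in the bingbot group
print(crawl_delay_for(ROBOTS_TXT, "examplebot"))  # falls back to the * group
```

A polite crawler would sleep for the returned number of seconds between requests and pick its own default when the result is `None`.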

Question 4

Can I have multiple sitemap URLs?

Accepted Answer

Yes. List each one on its own `Sitemap:` line, using the full absolute URL. Google reads all of them. This is useful when you split a large sitemap into multiple files or keep separate sitemaps per content type (blog, products, etc.).
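As a rough illustration, Python's standard `urllib.robotparser` (3.8 or later) collects every `Sitemap:` line it finds. The URLs and the helper name `sitemap_urls` below are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/

Sitemap: https://example.com/sitemap-blog.xml
Sitemap: https://example.com/sitemap-products.xml
"""


def sitemap_urls(robots_txt: str) -> list[str]:
    """Return all Sitemap URLs declared in a robots.txt body, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []   # site_maps() returns None when there are none


for url in sitemap_urls(ROBOTS_TXT):
    print(url)
```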

Question 5

What's the wildcard syntax?

Accepted Answer

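`*` matches any sequence of characters, including none. `$` anchors the match to the end of the URL. So `Disallow: /*.pdf$` blocks all URLs ending in `.pdf`. Both characters started out as search-engine extensions, but they have been part of the Robots Exclusion Protocol standard (RFC 9309) since 2022 and are supported by all major search engines.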
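One way to reason about these patterns is to translate them into regular expressions. The Python sketch below does that under simplifying assumptions: `robots_pattern_to_regex` is a made-up helper, it treats only a trailing `$` as the anchor, and real crawlers also normalize percent-encoding before matching.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]                    # strip the end-of-URL marker
    # Escape the literal pieces and let each * match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


PDF_RULE = robots_pattern_to_regex("/*.pdf$")

print(bool(PDF_RULE.match("/files/report.pdf")))             # ends in .pdf
print(bool(PDF_RULE.match("/files/report.pdf?download=1")))  # does not end in .pdf
```

Without the trailing `$`, a pattern behaves as a prefix match, so `/*.pdf` would also match the second URL.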

Question 6

What if I block a page that's already indexed?

Accepted Answer

Counter-intuitively, blocking a page in robots.txt does NOT remove it from Google's index; it only stops Google from crawling it again. To remove a page, add `<meta name="robots" content="noindex">` to the page itself and leave it crawlable, so that Google can fetch it again and see the noindex. Once the page has dropped out of the index, you can disallow it.
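Before disallowing a page, it can help to confirm that it really serves the noindex tag. The following Python sketch checks an HTML string using only the standard library; the class and function names are invented for the example, and it looks only at the generic `robots` meta tag, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(PAGE))
```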
