
Components of a robots.txt File - User-Agent, Disallow, Allow & Sitemap

Last Updated : 23 Jul, 2025

In SEO, the robots.txt file acts as a gatekeeper: before a well-behaved bot crawls your website, it first visits the robots.txt file to read which pages it is allowed to crawl and which it is not.

A robots.txt file tells crawlers, such as Google's Googlebot, which URLs they may access on your website.

Example of a robots.txt File

You can also view our robots.txt file at this URL: https://www.geeksforgeeks.org/robots.txt

User-agent: *
Disallow: /wp-admin/
Disallow: /community/
Disallow: /wp-content/plugins/
Disallow: /content-override.php
User-agent: ChatGPT-User
Disallow: /

Components of a robots.txt File

Now let's explain the code above:

  • User-agent specifies which bot (crawler) the rules that follow apply to.
    • * means all bots.
  • Disallow tells the bot not to crawl any URL whose path begins with the given value.

For example:

The rule Disallow: /wp-admin/ blocks crawling of https://www.geeksforgeeks.org/wp-admin/image.jpg, and of every other URL whose path begins with /wp-admin/.

Matching starts at the root of the path, so a URL such as https://www.geeksforgeeks.org/news/wp-admin/image.jpg is not blocked by this rule, and https://www.geeksforgeeks.org/news/ itself remains crawlable. To block wp-admin wherever it appears in a path, Googlebot also supports wildcard rules such as Disallow: /*/wp-admin/, as the snippet below shows.
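Here is a small illustrative sketch of how this matching plays out (the URLs in the comments are examples, not a complete list; in robots.txt, lines starting with # are comments):

User-agent: *
Disallow: /wp-admin/
# Blocks:         https://www.geeksforgeeks.org/wp-admin/image.jpg
# Does not block: https://www.geeksforgeeks.org/news/wp-admin/image.jpg
Disallow: /*/wp-admin/
# Wildcard rule: also blocks https://www.geeksforgeeks.org/news/wp-admin/image.jpg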

  • User-agent: ChatGPT-User
    • Together with Disallow: /, this blocks the ChatGPT bot from crawling the whole website.

User-agent: *
Disallow: /

The code above blocks all web crawlers from visiting any page of the website.
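Two more directives commonly appear in a robots.txt file: Allow, which explicitly permits a path that would otherwise fall under a broader Disallow rule, and Sitemap, which tells crawlers where your XML sitemap lives. Below is a minimal sketch of how they are used; the admin-ajax.php path and the sitemap URL are illustrative assumptions, not taken from the live GeeksforGeeks file:

User-agent: *
Disallow: /wp-admin/
# Allow overrides the broader Disallow for this one file
Allow: /wp-admin/admin-ajax.php
# Sitemap is a standalone directive and takes a full URL
Sitemap: https://www.geeksforgeeks.org/sitemap.xml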

Note: If you want a URL deindexed from Google Search quickly, you can submit a removal request with the Removals tool in your Google Search Console (GSC) account.

