robots.txt
What is it and how to configure it correctly
The robots.txt file is a plain text file placed in the root directory of a website. It gives search engine bots (e.g., Googlebot) instructions about which pages or directories they may or may not crawl.
The file format is simple and consists of the following directives:
User-agent: *
Disallow: /admin/
Allow: /public/
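To check how such rules are interpreted in practice, Python's standard library includes urllib.robotparser, which can parse a robots.txt file and answer whether a given URL may be fetched. The following minimal sketch parses the example above in memory; the bot name and URLs (e.g., /admin/users) are made up purely for illustration.

from urllib import robotparser

# Parse the example rules directly; set_url() + read() would fetch a live robots.txt instead.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
    "Allow: /public/",
])

# Hypothetical bot name and URLs, used only for illustration.
print(rp.can_fetch("MyCrawler", "https://example.com/admin/users"))       # False: blocked by Disallow
print(rp.can_fetch("MyCrawler", "https://example.com/public/page.html"))  # True: explicitly allowed
print(rp.can_fetch("MyCrawler", "https://example.com/about.html"))        # True: no rule matches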
Block All Bots from Indexing the Entire Website
User-agent: *
Disallow: /
Allow All Bots to Fully Index the Website
User-agent: *
Disallow:
Block Specific Directories
User-agent: *
Disallow: /private/
Disallow: /temp/
Allow Specific Pages While Blocking the Rest
User-agent: *
Disallow: /
Allow: /index.html
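Crawlers that follow the current standard (RFC 9309, which Googlebot implements) resolve conflicts between Allow and Disallow by taking the most specific, i.e., longest, matching rule, which is why Allow: /index.html wins over Disallow: / for that one page. The hypothetical Python helper below sketches that longest-match logic in a simplified form; it ignores wildcards and tie-breaking details that real parsers handle.

# Simplified sketch of longest-match precedence; real parsers also handle
# wildcards ('*', '$') and break ties between equally long rules in favor of Allow.
def is_allowed(path, rules):
    # rules: list of (directive, pattern) tuples in file order
    best_len, allowed = -1, True   # no matching rule means the path is allowed
    for directive, pattern in rules:
        if pattern and path.startswith(pattern) and len(pattern) > best_len:
            best_len, allowed = len(pattern), directive.lower() == "allow"
    return allowed

rules = [("Disallow", "/"), ("Allow", "/index.html")]
print(is_allowed("/index.html", rules))    # True:  "Allow: /index.html" is the longer match
print(is_allowed("/private.html", rules))  # False: only "Disallow: /" matches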
Specific Instructions for Googlebot Only
User-agent: Googlebot
Disallow: /no-google/
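A group like this applies only to the user agents named in its User-agent line, so bots that match no group remain unrestricted. The sketch below checks that with urllib.robotparser; the second bot name and the page URL are illustrative.

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /no-google/",
])

# Only Googlebot matches this group; a bot with no matching group gets the default (allowed).
print(rp.can_fetch("Googlebot", "https://example.com/no-google/page.html"))  # False
print(rp.can_fetch("Bingbot", "https://example.com/no-google/page.html"))    # True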
If you have an XML sitemap, you can reference it in the robots.txt file so that search engines know where to find it:
Sitemap: https://example.com/sitemap.xml
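Crawlers that read robots.txt can pick the Sitemap line up automatically; in Python 3.8 and later, urllib.robotparser exposes it through site_maps(). A small sketch, again parsing the lines in memory rather than fetching a live file:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
    "Sitemap: https://example.com/sitemap.xml",
])

print(rp.site_maps())  # ['https://example.com/sitemap.xml'] (Python 3.8+)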