Web development

Search engine crawlers are essential for bringing an audience to your website, but if they hit it continuously they can hurt its performance. If you want to limit what search engine crawlers are allowed to crawl, the first and most basic way to do that is robots.txt.

robots.txt is a plain-text file that tells crawlers which parts of your website they may crawl and which parts or directories they may not. It follows a simple written format that crawler bots read before they work on your website, and it can also block or allow individual bots by their names.

The steps below show how robots.txt works.

1. If your website name is yourdomain.com, then robots.txt should be placed here:

http://yourdomain.com/robots.txt
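
Crawlers look for the file at the root of the host, whichever page they start from. A minimal Python sketch of that lookup, using only the standard library (the function name robots_url and the sample page URL are illustrative assumptions, not part of any crawler):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page on the example domain used in this post.
print(robots_url("http://yourdomain.com/blog/post.html?id=7"))
```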

2. These are some of the top bots:

Googlebot
Slurp (Yahoo)
bingbot
AhrefsBot 
Baiduspider 
Ezooms 
MJ12bot 
YandexBot

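3. robots.txt has two main directives:

  • User-agent - defines the name of the bot the rules apply to (* means every bot)
  • Disallow - defines which directory the bot is not allowed to access (an empty value blocks nothing, / blocks the whole site)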
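
To make the two directives concrete, here is a small Python sketch that renders them from a dictionary. The helper name build_robots_txt and the sample rules are assumptions made for illustration; writing the file by hand works just as well.

```python
def build_robots_txt(rules: dict[str, list[str]]) -> str:
    """Render {user_agent: [disallowed paths]} as robots.txt text."""
    groups = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        # An empty Disallow value means nothing is blocked for this bot.
        lines += [f"Disallow: {p}" for p in paths] or ["Disallow:"]
        groups.append("\n".join(lines))
    # One group per bot, with a blank line between groups.
    return "\n\n".join(groups) + "\n"


print(build_robots_txt({"AhrefsBot": ["/"], "*": ["/scripts/", "/themes/"]}))
```

Keeping each User-agent line directly above its Disallow lines matters in practice, because some parsers treat a blank line as the end of a group.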

Examples:

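1) If you want to allow only Googlebot

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

The empty Disallow leaves Googlebot unrestricted, while Disallow: / blocks every other bot from the whole site.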
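
One way to check a rule set like this before uploading it is Python's standard urllib.robotparser module. The sketch below assumes rules that leave Googlebot unrestricted and block every other bot; the page URL is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: Googlebot may crawl everything, all other bots nothing.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://yourdomain.com/index.html"
print("Googlebot:", parser.can_fetch("Googlebot", page))  # has its own group
print("bingbot:", parser.can_fetch("bingbot", page))      # falls back to the * group
```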

2) If you want to allow all bots

User-agent: *
Disallow:

3) If you want to disallow a particular bot

User-agent: *
Disallow:

User-agent: AhrefsBot
Disallow: /
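
As a rough check of which bots a rule set shuts out, the sketch below runs several of the bot names listed earlier through Python's urllib.robotparser. It assumes rules that block only AhrefsBot; the helper name blocked_bots is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: everyone is allowed except AhrefsBot.
RULES = """\
User-agent: *
Disallow:

User-agent: AhrefsBot
Disallow: /
"""


def blocked_bots(rules_text, bots, url="/"):
    """Return the bots from the list that may not fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


bots = ["Googlebot", "bingbot", "AhrefsBot", "MJ12bot", "YandexBot"]
print(blocked_bots(RULES, bots))
```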

4) If you want to prevent some directories from being crawled

User-agent: *
Disallow: /scripts/
Disallow: /themes/
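
Disallow values are matched as path prefixes, so everything under a listed directory is covered. A short sketch with Python's urllib.robotparser illustrates this, assuming rules that block /scripts/ and /themes/ for every bot; the file names are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: two directories are closed to all bots.
RULES = """\
User-agent: *
Disallow: /scripts/
Disallow: /themes/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Hypothetical paths on the example domain.
for path in ["/index.html", "/scripts/app.js", "/themes/dark/style.css"]:
    url = "http://yourdomain.com" + path
    print(path, "allowed" if parser.can_fetch("Googlebot", url) else "blocked")
```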

 


Please comment with your thoughts and feedback below, and add anything useful you have found elsewhere to help others.

Hit the Like button if you liked the post.

Many thanks.