A robots.txt file sits in the root directory of a domain and tells a spider / search engine bot which parts of the site it may crawl and list, and which parts it should leave alone. So if you have folders you do not want to appear in Google, you can tell the bot with a robots.txt file.
You can also set rules per search engine using the User-agent line of a robots.txt file.
User-agent: * applies to all robots that scan your site.
You can tell bots not to list folders in Google, but anyone can still read your robots.txt file over HTTP and see which paths it names. So if you really want to hide something from the outside world, do not rely on robots.txt for this.
To make a robots.txt file, just open Notepad and save the file as robots.txt, then upload it to the root directory of your website, that is, the public folder for your site. A robot will look for the file at mywebsite.com/robots.txt
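To illustrate where a robot looks, here is a small Python sketch that works out the robots.txt address for any page on a site. The function name robots_url and the example page URL are made up for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: robots.txt lives at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page on the placeholder domain used in this article.
print(robots_url("https://mywebsite.com/blog/post.html"))
# prints https://mywebsite.com/robots.txt
```

Whatever page the robot starts from, it ends up at the same file in the root, which is why a robots.txt placed in a subfolder has no effect.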
Hiding folders
So let's say we want to hide folder1 and folder2 but we want robots to list everything else on our website. We would insert the below into the robots.txt file, with each directive on its own line.
User-agent: *
Disallow: /folder1
Disallow: /folder2
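If you want to check rules like these before uploading them, one option is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper name allowed, the bot name ExampleBot and the mywebsite.com URLs are placeholders, and real search engines may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The folder rules from above, one directive per line.
RULES = """\
User-agent: *
Disallow: /folder1
Disallow: /folder2
"""


def allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# ExampleBot and the URLs are hypothetical.
print(allowed(RULES, "ExampleBot", "https://mywebsite.com/folder1/page.html"))  # False
print(allowed(RULES, "ExampleBot", "https://mywebsite.com/about.html"))         # True
```

Note that a Disallow path is matched as a prefix, so /folder1 also covers a path such as /folder10. Writing /folder1/ with a trailing slash limits the rule to that one folder.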
Hiding files
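You can also use robots.txt to stop certain files from being listed. Insert the below into your robots.txt file and edit it to match the file name you wish to hide. Each path starts with a slash and is given relative to the root of the site.
User-agent: *
Disallow: /filename.html
# this works the same for a .php or .html file
Disallow: /folder2/file.php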
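The same kind of check works for single files. This is a rough sketch using Python's standard urllib.robotparser, assuming each path is written with a leading slash. ExampleBot, the mywebsite.com domain and other.php are placeholders for the illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /filename.html",
    "Disallow: /folder2/file.php",
])

# Hypothetical URLs on the placeholder domain.
for path in ("/filename.html", "/folder2/file.php", "/folder2/other.php"):
    url = "https://mywebsite.com" + path
    verdict = "allowed" if parser.can_fetch("ExampleBot", url) else "blocked"
    print(path, verdict)
```

Only the two named files should come back as blocked. The other file in folder2 stays open to robots because no rule covers it.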
Allow all
So if you have no folders to hide, just use the below.
User-agent: *
Disallow:
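It is easy to mix up an empty Disallow with a Disallow of a single slash, and they do opposite things. The sketch below uses Python's standard urllib.robotparser to show the difference. ExampleBot and the URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

URL = "https://mywebsite.com/any/page.html"  # hypothetical page

# Empty Disallow: nothing is blocked.
allow_all = RobotFileParser()
allow_all.parse(["User-agent: *", "Disallow:"])

# Disallow with a single slash: the whole site is blocked.
block_all = RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])

print(allow_all.can_fetch("ExampleBot", URL))  # True
print(block_all.can_fetch("ExampleBot", URL))  # False
```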
Including sitemaps
If you have a sitemap, you can also add it to your robots.txt file. Use the full URL of the sitemap rather than just the file name.
Sitemap: https://mywebsite.com/mysitemap.xml
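To see how a program picks up the sitemap line, here is a short sketch with Python's standard urllib.robotparser. It assumes Python 3.8 or newer (where site_maps() was added) and the full URL form of the Sitemap line. The mywebsite.com address is a placeholder.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow:",
    "Sitemap: https://mywebsite.com/mysitemap.xml",
])

# site_maps() returns a list of the Sitemap URLs, or None if there are none.
print(parser.site_maps())
# prints ['https://mywebsite.com/mysitemap.xml']
```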
Disallow a bot
And if you don’t want to allow a certain bot, use the below. To block multiple bots, repeat the same two lines for each bot name.
# replace googlebot with the bot name you want to disallow
User-agent: googlebot
Disallow: /
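As a quick check that a rule like this only affects the named bot, here is a sketch using Python's standard urllib.robotparser. ExampleBot and the URL are placeholders, and this parser's name matching is only an approximation of what each real search engine does.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: googlebot", "Disallow: /"])

url = "https://mywebsite.com/index.html"  # hypothetical page

# The bot name is compared without regard to upper or lower case.
print(parser.can_fetch("Googlebot", url))   # False
# A bot with no matching group and no * group is allowed by default.
print(parser.can_fetch("ExampleBot", url))  # True
```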
That is pretty much all you need to know about a robots.txt file. It is easy to set up and can be useful for a variety of things. We have included a short list of bots used by common search engines.
Top 3 agents
Googlebot
Bingbot (Bing, formerly MSNBot)
AskJeeves