Robots Meta Tag For SEO
Sometimes you want to exclude a single page, but you don’t want to name that page in robots.txt, since anyone can read that file.
The best way, of course, is to implement real file-level security on the file. But failing that, you have two other options:
1. Create a folder and put the files you wish to exclude in the folder. Then exclude that folder in the robots.txt file. This will automatically exclude everything under it, but you will not need to actually name the files.
2. Use the robots metatag on the page(s) in question. What will happen is that the search engine will look at your robots.txt, see that the page is not mentioned, and go to it. But then it will see that robots are actually excluded from the page in the robots metatag, and it will obey it.
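To make the second option concrete, here is a minimal Python sketch of the two checks a well-behaved robot makes: robots.txt first, then the robots metatag on the page itself. The names `RobotsMetaFinder`, `crawl_decision` and the agent `examplebot` are made up for illustration, and real search engines are certainly more involved than this.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def crawl_decision(robots_txt, url, html, agent="examplebot"):
    """Hypothetical helper: robots.txt is checked first, then the metatag."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        return "blocked by robots.txt"
    # The page is not mentioned in robots.txt, so the robot fetches it
    # and only then sees the metatag.
    finder = RobotsMetaFinder()
    finder.feed(html)
    if finder.directives & {"noindex", "none"}:
        return "fetched, not indexed"
    return "fetched and indexed"
```

Note that the page still has to be fetched before the metatag can be read, which is why the metatag only works on pages that robots.txt does not block.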
I’ve gone into the robots.txt file in detail elsewhere, so let’s talk about the robots metatag.
Robots Meta Tag Code Example
<meta name="robots" content="noindex,nofollow">
-or-
<meta name="robots" content="none">
This example will tell all robots not to index this page, and not to follow links from this page.
Attributes and Directives
There are other options, of course:
content = all | none | directives
all = "ALL"
none = "NONE"
directives = directive ["," directives]
directive = index | follow
index = "INDEX" | "NOINDEX"
follow = "FOLLOW" | "NOFOLLOW"
This results in the following choices:
- <meta name="robots" content="index,follow">
- <meta name="robots" content="noindex,follow">
- <meta name="robots" content="index,nofollow">
- <meta name="robots" content="noindex,nofollow">
Plus the two attributes:
- <meta name="robots" content="all">
- <meta name="robots" content="none">
ALL is the equivalent of “index, follow”
NONE is the equivalent of “noindex, nofollow”
You can find more information here: http://www.robotstxt.org/wc/meta-user.html. These attributes are NOT CASE SENSITIVE – you can use upper or lower case.
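As a rough illustration of the grammar above, the following Python sketch turns a content value into two yes/no answers. The function name `parse_robots_content` is hypothetical, and the sketch assumes that a missing directive means permission and that unknown tokens are simply ignored.

```python
def parse_robots_content(content):
    """Return (index, follow) booleans for a robots metatag content value."""
    index, follow = True, True  # assumption: a missing directive means permission
    for token in content.split(","):
        token = token.strip().lower()  # values are not case sensitive
        if token == "none":
            index, follow = False, False
        elif token == "all":
            index, follow = True, True
        elif token == "noindex":
            index = False
        elif token == "index":
            index = True
        elif token == "nofollow":
            follow = False
        elif token == "follow":
            follow = True
        # anything else is ignored in this sketch
    return index, follow
```

For example, `parse_robots_content("NONE")` and `parse_robots_content("noindex,nofollow")` give the same answer, which is the equivalence listed above.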
Important Note: Robots will treat the absence of a specific disallowing tag (“noindex” and/or “nofollow”) as permission. Therefore, if you do not use “noindex”, for example, robots will assume you mean “index”.
For this reason, it’s usually a waste of time to use the “index,follow” or “all” directives, since they are already assumed by default. They are only useful when you want to tell all robots to noindex,nofollow but allow a specific robot (e.g. Googlebot) to index and follow. In this case, you would use something like this:
<meta name="robots" content="noindex,nofollow">
<meta name="googlebot" content="index,follow">
If you did not specifically tell Googlebot to index and follow, it would in this case assume that you did not want it to. This is the only scenario where the index and follow (and ALL) directives are useful. SEOs who misuse metatags are in danger of losing both the respect of their peers and savvy clients who check their basic SEO abilities before hiring.
It won’t hurt you to use the ALL or index,follow robots meta directives by themselves, but it makes you look bad. You should know how to use your own tools.
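The override scenario described above can be sketched in a few lines of Python. The function `effective_policy` and the agent name `otherbot` are hypothetical. The rule that a robot-specific tag replaces the general one is a simplifying assumption based on the description above, and a real engine may combine the two tags differently.

```python
def effective_policy(meta_tags, agent):
    """meta_tags maps a meta name (e.g. "robots", "googlebot") to its content.

    Returns (index, follow) for the given robot.
    """
    tags = {name.lower(): content for name, content in meta_tags.items()}
    # Assumption: a tag naming the robot wins over the generic "robots" tag.
    content = tags.get(agent.lower(), tags.get("robots", ""))
    tokens = {token.strip().lower() for token in content.split(",")}
    index = not (tokens & {"noindex", "none"})
    follow = not (tokens & {"nofollow", "none"})
    return index, follow
```

With both tags from the example present, `effective_policy(tags, "googlebot")` allows indexing and following while any other robot is shut out. Remove the googlebot tag and Googlebot falls back to the generic rule.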
Engine Specific Commands
All four major search engines (Google, Yahoo, MSN, Ask) also support the NOARCHIVE attribute.
NOARCHIVE: Google uses this to prevent archiving (caching) of a page. See http://www.google.com/bot.html
Google will follow this whether it is addressed to all robots or to Googlebot alone. If you only want to affect Google’s cache, the best method is to specify “googlebot” rather than “robots” in general.
Example:
<meta name="googlebot" content="noarchive">
You can also combine it:
<meta name="googlebot" content="nofollow,noarchive">
-or-
<meta name="googlebot" content="noindex,nofollow,noarchive">
etc.
NOSNIPPET: A snippet is a text excerpt that appears below a page’s title in Google’s search results and describes the content of the page.
To prevent Google from displaying snippets for your page, place this tag in the <HEAD> section of your page:
<META NAME="GOOGLEBOT" CONTENT="NOSNIPPET">
Note: removing snippets also removes cached pages (acts as NOARCHIVE,NOSNIPPET).
Yahoo
Yahoo also obeys the NOARCHIVE attribute.
Ask/Teoma
Ask obeys the NOARCHIVE attribute.
MSN Search
MSN Search obeys the NOARCHIVE attribute. If you use the NOCACHE attribute instead, MSN Search treats it the same as NOARCHIVE.
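The engine-specific behavior described in this section can be summarized in a small Python sketch. The function `display_policy` and the engine labels "google", "yahoo", "ask" and "msn" are made up for illustration. The sketch only encodes what is stated above: NOSNIPPET is treated as Google-only and also removes the cached copy, and NOCACHE is treated as NOARCHIVE for MSN only.

```python
def display_policy(engine, content):
    """Return whether a cached copy and a snippet would be shown."""
    tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
    if engine == "msn" and "nocache" in tokens:
        tokens.add("noarchive")  # MSN treats NOCACHE as NOARCHIVE
    google_nosnippet = engine == "google" and "nosnippet" in tokens
    if google_nosnippet:
        tokens.add("noarchive")  # removing snippets also removes the cache
    return {
        "show_cache": "noarchive" not in tokens,
        "show_snippet": not google_nosnippet,
    }
```

So `display_policy("google", "NOSNIPPET")` hides both the snippet and the cached copy, while `display_policy("msn", "nocache")` hides only the cached copy.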
Free Robots Metatag Generator
My free meta tag generator includes a simple section for the generation of a robots metatag for your use.
Conclusion
The robots metatag is a simple method of telling robots what you want and don’t want indexed or followed on your site. Like the robots.txt file, it is NOT secure, nor is it a requirement. Rogue robots can ignore it if they wish to. Further, a human is under no restrictions at all.
The robots metatag has the advantage of being very granular – it allows you to restrict the spidering behavior of a single page. That same granularity is also a disadvantage – it’s a real pain to put one on a lot of pages, and to keep track of what pages you used it on. It’s best used for small numbers of specific pages.
Rule of thumb: If you want to restrict robots from entire websites and directories, use the robots.txt file. If you want to restrict robots from a single page, use the robots metatag. If you are looking to restrict the spidering of a single link, you would use the link “nofollow” attribute.
Granularity | Best Method |
Websites or Directories | robots.txt |
Single Pages | robots metatag |
Single Links | nofollow attribute |