Robots Meta Tag For SEO

Sometimes you want to exclude only a single page, but you don’t want to name that page in robots.txt, which anyone can read.

The best way, of course, is to implement real file-level security on the file. But failing that, you have two other options:

1. Create a folder and put the files you wish to exclude in it, then exclude that folder in the robots.txt file. This automatically excludes everything under it, without your having to name the individual files.

2. Use the robots metatag on the page(s) in question. The search engine will look at your robots.txt, see that the page is not mentioned, and fetch it. It will then see that the robots metatag on the page excludes robots, and obey it.

I’ve gone into the robots.txt file in detail elsewhere, so let’s talk about the robots metatag.

Robots Meta Tag Code Example

<meta name="robots" content="noindex,nofollow">

-or-

<meta name="robots" content="none">

This example tells all robots not to index this page, and not to follow links from this page.
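
As a rough sketch of where the tag goes, the robots metatag belongs inside the <HEAD> section of the page. The page title and body text below are placeholders, not taken from any real site:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example private page</title>
  <!-- Ask all compliant robots not to index this page or follow its links -->
  <meta name="robots" content="noindex,nofollow">
</head>
<body>
  <p>Placeholder page content.</p>
</body>
</html>
```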

Attributes and Directives

There are other options, of course:

content = all | none | directives

all = "ALL"
none = "NONE"
directives = directive ["," directives]

directive = index | follow

index = "INDEX" | "NOINDEX"
follow = "FOLLOW" | "NOFOLLOW"

This results in the following choices:

<meta name="robots" content="index,follow">
<meta name="robots" content="noindex,follow">
<meta name="robots" content="index,nofollow">
<meta name="robots" content="noindex,nofollow">

Plus the two attributes:

<meta name="robots" content="all">
<meta name="robots" content="none">

ALL is the equivalent of “index, follow”

NONE is the equivalent of “noindex, nofollow”

You can find more information here: http://www.robotstxt.org/wc/meta-user.html. These attributes are NOT CASE SENSITIVE – you can use upper or lower case.

Important Note: Robots will treat the absence of a specific disallowing tag (“noindex” and/or “nofollow”) as permission. Therefore if you do not use “noindex” for example, the robots will assume you mean “index”.

For this reason, it’s usually a waste of time to use the “index,follow” or “all” directives, since they are already assumed by default. They are only useful when you wish to specify, for example, that all robots are to noindex,nofollow, but a specific robot (e.g. googlebot) is allowed to index and follow. In this case, you would put in something like this:

<meta name="robots" content="noindex,nofollow">
<meta name="googlebot" content="index,follow">

If you did not specifically tell Googlebot to index and follow, it would in this case assume that you did not want it to. This is the only scenario where the index and follow (and ALL) directives are used. SEOs who misuse metatags are in danger of losing both respect from their peers and savvy clients who are checking their basic SEO abilities before hiring.

It won’t hurt you to use the ALL or index,follow robots meta directives by themselves, but it makes you look bad. You should know how to use your own tools.

Engine Specific Commands

All 4 major search engines (Google, Yahoo, MSN, Ask) also support the NOARCHIVE attribute.

Google

NOARCHIVE: Google uses this to prevent archiving (caching) of a page. See http://www.google.com/bot.html

Google will follow this directive when it is addressed to all robots, but if you only mean to control Google’s cache, the best method is to specify “googlebot” rather than “robots” in general.

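Example:

<meta name="googlebot" content="noarchive">

You can also combine it with other directives, for example:

<meta name="googlebot" content="noarchive,nofollow">

-or-

<meta name="googlebot" content="noindex,noarchive">

etc.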

NOSNIPPET: A snippet is a text excerpt that appears below a page’s title in Google’s search results and describes the content of the page.

To prevent Google from displaying snippets for your page, place this tag in the <HEAD> section of your page:

<meta name="googlebot" content="nosnippet">

Note: removing snippets also removes cached pages (acts as NOARCHIVE,NOSNIPPET).

Yahoo

Yahoo also obeys the NOARCHIVE attribute.

Ask/Teoma

Ask obeys the NOARCHIVE attribute.

MSN Search

MSN Search obeys the NOARCHIVE attribute, and if you use the NOCACHE attribute, it acts as the NOARCHIVE attribute.
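
Since all four engines listed above are described as honoring NOARCHIVE, a single tag addressed to every robot is one way to ask them all not to cache a page. This is only a sketch; how each engine treats the tag today is an assumption worth checking against its own documentation:

```html
<!-- Addressed to all robots: the page may be indexed and followed, but not cached -->
<meta name="robots" content="noarchive">
```

Use name="googlebot" instead of name="robots" if the request is meant for Google alone.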

Free Robots Metatag Generator

My free meta tag generator includes a simple section for the generation of a robots metatag for your use.

Conclusion

The robots metatag is a simple method of telling robots what you want and don’t want indexed or followed on your site. Like the robots.txt file, it is NOT secure, nor is it a requirement. Rogue robots can ignore it if they wish to. Further, a human is under no restrictions at all.

The robots metatag has the advantage of being very granular – it allows you to restrict the spidering behavior of a single page. That same granularity is also a disadvantage – it’s a real pain to put one on a lot of pages, and to keep track of what pages you used it on. It’s best used for small numbers of specific pages.

Rule of thumb: If you want to restrict robots from entire websites and directories, use the robots.txt file. If you want to restrict robots from a single page, use the robots metatag. If you are looking to restrict the spidering of a single link, you would use the link “nofollow” attribute.

Granularity	Best Method
Websites or Directories	robots.txt
Single Pages	robots metatag
Single Links	nofollow attribute
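
For the last row of the table, the link “nofollow” attribute is written as rel="nofollow" on the individual link. A minimal sketch, where the URL and link text are placeholders:

```html
<!-- Ask robots not to follow this one link; other links on the page are unaffected -->
<a href="http://www.example.com/private-page.html" rel="nofollow">Private page</a>
```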

Unless otherwise noted, all articles written by Ian McAnerin, BASc, LLB. Copyright © 2002-2004 All Rights Reserved. Permission must be specifically granted in writing for use or reprinting anywhere but on this site, but we do allow it and don’t charge for it, other than a backlink. Contact Us for more information.