Using META Tags
On to more important issues, like how to actually implement META tags in
your Web pages. If you’ve ever had readers tell you that they’re seeing
an old version of your page when you know that you’ve updated it, you may
want to make sure that their browser isn’t caching the Web pages. Using
META tags, you can tell the browser not to cache files, and/or when to
request a newer version of the page. In this article, we’ll cover some of
the META tags, their uses, and how to implement them.
Expires
This tells the browser the date and time when the document will be
considered "expired." If a user is using Netscape Navigator, a
request for a document whose time has "expired" will initiate a
new network request for the document. An illegal Expires date such as
"0" is interpreted by the browser as "immediately."
Dates must be in the RFC 1123 format and expressed in GMT:
<META HTTP-EQUIV="expires" CONTENT="Wed, 26 Feb 1997 08:21:57 GMT">
Pragma
This is another way to control browser caching. To use this tag, the value
must be "no-cache". When this is included in a document, it
prevents Netscape Navigator from caching a page locally.
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
These two tags can be used together as shown to keep your content
current, but beware. Many users have reported that Microsoft’s Internet
Explorer ignores the META tag instructions and caches the files anyway. So
far, nobody has been able to supply a fix for this "bug." As of the
release of MSIE 4.01, this problem still existed.
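A minimal sketch of the combination described above, assuming a browser that honors both tags. The Expires value of "0" is the "immediately" shorthand mentioned earlier, and the page title is only a placeholder:

```html
<HEAD>
<TITLE>Always-fresh page</TITLE>
<!-- Ask the browser not to keep a local copy -->
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<!-- Treat the page as already expired -->
<META HTTP-EQUIV="expires" CONTENT="0">
</HEAD>
```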
Refresh
This tag specifies the time in seconds before the Web browser reloads the
document automatically. Alternatively, it can specify a different URL for
the browser to load.
<META HTTP-EQUIV="Refresh" CONTENT="0;URL=http://www.newurl.com">
Be sure to remember to place quotation marks around the entire CONTENT
attribute’s value, or the page will not reload at all.
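The example above redirects immediately. As a sketch of the other uses, the first tag below reloads the current page every 300 seconds, and the second waits 10 seconds before loading the new URL. Both intervals are arbitrary values chosen for illustration:

```html
<!-- Reload this page every 300 seconds -->
<META HTTP-EQUIV="Refresh" CONTENT="300">

<!-- Wait 10 seconds, then load a different URL -->
<META HTTP-EQUIV="Refresh" CONTENT="10;URL=http://www.newurl.com">
```

Use one form or the other on a given page, not both.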
Set-Cookie
This is one method of setting a "cookie" in the user’s Web
browser. If you use an expiration date, the cookie is considered permanent
and will be saved to disk (until it expires), otherwise it will be
considered valid only for the current session and will be erased upon
closing the Web browser.
<META HTTP-EQUIV="Set-Cookie" CONTENT="cookievalue=xxx;expires=Wednesday, 21-Oct-98 16:14:21 GMT; path=/">
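For comparison, here is a sketch of a session-only cookie, assuming a browser that honors this tag. Leaving out the expires field means the cookie is discarded when the browser closes. The name and value are the same placeholders used above:

```html
<!-- No expires field: the cookie lasts only for the current session -->
<META HTTP-EQUIV="Set-Cookie" CONTENT="cookievalue=xxx; path=/">
```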
Window-target
This one specifies the "named window" of the current page, and can
be used to prevent a page from appearing inside another framed page. Usually
this means that the Web browser will force the page to go to the top frameset.
<META HTTP-EQUIV="Window-target" CONTENT="_top">
PICS-Label
Although you may not have heard of PICS-Label (PICS stands for
Platform for Internet Content Selection), you probably will soon. At the
same time that the Communications Decency Act was struck down, the World
Wide Web Consortium (W3C) was working to develop a standard for labeling
online content (see www.w3.org/PICS/). This standard became the Platform for
Internet Content Selection (PICS).
The W3C’s standard left the actual creation of labels to the "labeling
services." Anything which has a URL can be labeled, and labels can be
assigned in two ways. First, a third-party labeling service may rate the
site, and the labels are stored at the actual labeling bureau which resides
on the Web server of the labeling service. The second method involves the
developer or Web site host contacting a rating service, filling out the
proper forms, and using the HTML META tag information that the service
provides on their pages. One such free service is the PICS-Label
generator that Vancouver-Webpages provides. It is based on the Vancouver
Webpages Canadian PICS ratings, version 1.0, and can be used as a guideline
for creating your own PICS-Label META tag.
Although PICS-Label was designed as a ratings label, it also has
other uses, including code signing, privacy, and intellectual property
rights management. PICS uses what are called generic and specific labels.
Generic labels apply to each document whose URL begins with a specific
string of characters, while specific labels apply only to a given file. A
typical PICS-Label for an entire site would look like this:
<META http-equiv="PICS-Label" content='(PICS-1.1
"http://vancouver-webpages.com/VWP1.0/" l gen true comment
"VWP1.0" by "scott@hisdomain.com" on
"1997.10.28T12:34-0800" for "http://www.hisdomain.com/"
r (P 2 S 0 SF -2 V 0 Tol -2 Com 0 Env -2 MC -3 Gam -1 Can 0 Edu -1 ))'>
Keyword and Description attributes
Chances are that if you manually code your Web pages, you’re aware of the
"keywords" and "description" attributes.
These allow the search engines to easily index your page using the keywords
you specify, along with a description of the site that you
yourself get to write. Couldn’t be simpler, right? You use the keywords
attribute to tell the search engines which keywords to use, like this:
<META NAME="keywords" CONTENT="life, universe, mankind, plants, relationships, the meaning of life, science">
By the way, don’t think you can spike the keywords by using the same
word repeated over and over, as most search engines have refined their
spiders to ignore such spam. Using the META description attribute, you add
your own description for your page:
<META NAME="description" CONTENT="This page is about the meaning of life, the universe, mankind and plants.">
Make sure that you use several of your keywords in your description.
While you are at it, you may want to include the same description enclosed
in comment tags, just for the spiders that do not look at META tags. To do
that, just use the regular comment tags, like this:
<!-- This page is about the meaning of life, the universe, mankind and plants. -->
More about search engines can be found in our special
report.
ROBOTs in the mist
On the other hand, there are probably some of you who do not wish your pages
to be indexed by the spiders at all. Worse yet, you may not have access to
the robots.txt file. The robots META attribute was designed with this
problem in mind.
<META NAME="robots" CONTENT="all | none | index | noindex | follow | nofollow">
The default for the robots attribute is "all". This would allow
all of the files to be indexed. "None" would tell the spider not
to index any files, and not to follow the hyperlinks on the page to other
pages. "Index" indicates that this page may be indexed by the
spider, while "follow" would mean that the spider is free to
follow the links from this page to other pages. The inverse is also true,
thus this META tag:
<META NAME="robots" CONTENT="noindex">
would tell the spider not to index this page, but would allow it to
follow subsidiary links and index those pages. "nofollow" would
allow the page itself to be indexed, but the links could not be followed. As
you can see, the robots attribute can be very useful for Web developers. For
more information about the robot attribute, visit the W3C’s
robot paper.
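The values can also be combined. As a sketch, assuming the spider accepts a comma-separated list (the usual convention for this tag), the following two tags cover the remaining cases described above:

```html
<!-- Do not index this page and do not follow its links (same effect as "none") -->
<META NAME="robots" CONTENT="noindex, nofollow">

<!-- Index this page, but do not follow its links -->
<META NAME="robots" CONTENT="index, nofollow">
```

Only one robots tag should appear on a page.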
Placement of META tags
META tags should always be placed in the head of the HTML document between
the actual <HEAD> tags, before the BODY tag. This is very important
with framed pages, as a lot of developers tend to forget to include them on
individual framed pages. Remember, if you only use META tags on the frameset
pages, you'll be missing a large number of potential hits.
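As an illustration of the placement, here is a minimal page skeleton with the META tags inside the head. The title and the shortened keyword list are placeholders, not a recommendation:

```html
<HTML>
<HEAD>
<TITLE>The Meaning of Life</TITLE>
<!-- META tags go here, inside HEAD and before BODY -->
<META NAME="keywords" CONTENT="life, universe, mankind, plants">
<META NAME="description" CONTENT="This page is about the meaning of life, the universe, mankind and plants.">
</HEAD>
<BODY>
<!-- Visible page content starts here -->
</BODY>
</HTML>
```

On a framed site, each individual framed page would carry its own head section like this one, not just the frameset page.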