Search engines aren't much concerned with your lime-green
background or eye-popping graphics. In essence, they're blind. So, when
optimizing your pages for search engines, you need to look past the presentation
and down into exactly how your code is structured.
It's surprising to me that many people still don't validate their code before
publishing. An HTML or XHTML document, whether created by hand or by some
script or content management system, is a set of instructions to the web
browser on what (and how) to display content.
By validating your code, you ensure that your page conforms to an accepted
set of standards. There are a number of different specifications for HTML that
you can choose from. The W3C has made a number of recommendations, but the two
that are most important are HTML 4.01 and XHTML 1.0.
Very roughly speaking, HTML 4.01 is the last non-XML based dialect still in
broad use. XHTML, quoting the W3C, is "a reformulation of HTML 4.01 in XML, and
combines the strength of HTML 4 with the power of XML."
Further, there are actually three "flavours" of XHTML 1.0: Strict,
Transitional, and Frameset. You will likely want to be working with XHTML
Transitional most of the time, because of its backwards compatibility with
older browsers.
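As a rough sketch, a minimal XHTML 1.0 Transitional page might start like the following. The DOCTYPE declaration is what tells a validator which specification to check the page against. The title and body text are placeholders, not part of any real site.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Placeholder page title</title>
</head>
<body>
  <!-- Placeholder content; replace with your own markup -->
  <p>Hello, world.</p>
</body>
</html>
```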
Now that we've covered the specifications that we will be validating against,
the next step is to actually run the code through a validation service. Again,
the W3C is the first stop, with its Markup Validation Service. Why validate?
Search engines are built to parse standard HTML. Markup that doesn't conform to
the standard may be misread or only partially understood, and we want to make
the job of understanding your pages as easy as possible for them. As well,
validation often points out problem areas that you may miss if you only look at
the visual representation of your page.
As part of validation, you may have noticed warnings if you failed to add alt
attributes to your img tags. The alt attribute isn't there just to conform to a
standard - it's actually useful! It stands for "alternative" and can be used to
add a textual description to any image. As discussed earlier, search engines
are blind, so this description is vital for letting search engines know what
your image represents.
A picture of a kitten might have an alt attribute as follows:
<img src="kitten.jpg" alt="A white Persian kitten" />
A search engine would then associate the file kitten.jpg with
the text "A white Persian kitten".
Additionally, the title attribute can be used to add text that
acts as a caption for an image. Continuing with the previous example, we might
use the following text:
<img src="kitten.jpg" alt="A white Persian kitten" title="Allie's Kitten" />
Now we've given the image a short descriptive phrase (alt) as
well as a caption (title). The title attribute can actually be
used for other elements as well, including the anchor tag. Many browsers will
show the text of the title attribute in a floating "tool tip" box
when the mouse pointer is placed over the element.
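For instance, a title on an anchor tag might look like the sketch below. The file name kittens.html and both pieces of text are invented for illustration.

```html
<!-- Hypothetical link: the href and the title text are made up -->
<a href="kittens.html" title="More photos of Allie's kitten">See the kitten gallery</a>
```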
If you really want to go overboard, you
can consider using the longdesc attribute, which points to a separate page
holding descriptive text that is too long to fit into the alt attribute.
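In HTML 4.01 and XHTML 1.0, the value of longdesc is a URL rather than the description itself. A minimal sketch, assuming a hypothetical page named kitten-description.html that contains the full text:

```html
<!-- kitten-description.html is a hypothetical page holding the long description -->
<img src="kitten.jpg" alt="A white Persian kitten"
     longdesc="kitten-description.html" />
```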
But back to the wise use of images. Just as a page filled with images won't
be very useful for a blind person, a search engine won't care much either. The
critical portions of a page that should never be replaced with images are the
various headings, company names, and pull quotes. While it is understandable to want to make these areas visually appealing, a
search engine won't be able to "read" any text that appears in these
graphics.
Two options exist. One is to use stylesheets to apply styling to text to
make it stand out. The second is to use one of a number of CSS tricks to replace
the text with graphics (the most common one probably being the
Fahrner Image Replacement technique).
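A rough sketch of the image replacement idea follows: the heading keeps its real text in the markup, wrapped in a span, while CSS hides the span and shows a background image in its place. The id, the image file name, the dimensions, and the company name are all assumptions made up for this example.

```html
<style type="text/css">
  /* The heading box displays the graphic as a background image. */
  #company-name {
    width: 300px;   /* assumed image width */
    height: 60px;   /* assumed image height */
    background: url(company-logo.gif) no-repeat;
  }
  /* The real text stays in the markup but is hidden from visual display. */
  #company-name span {
    display: none;
  }
</style>

<h1 id="company-name"><span>Acme Kitten Supplies</span></h1>
```

One caveat worth weighing: text hidden with display: none may also be skipped by some screen readers, so this trick trades some accessibility for appearance.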
The concept of the Semantic Web has been equally panned and applauded. At
its core is the concept of adding more markup to what is today mainly plain
text content. In the context of optimizing content for search engines, semantic
markup means thinking about the use of the various heading tags:
h1, h2, etc.
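As an illustration, a page outline built from heading tags might look like this sketch, with one h1 for the page topic and lower-level headings for the sections beneath it. All of the heading and paragraph text is placeholder content.

```html
<h1>Caring for Persian Kittens</h1>
<p>Introductory paragraph about the page topic.</p>

<h2>Feeding</h2>
<p>Section text about feeding.</p>

<h3>Feeding schedule</h3>
<p>Subsection text about the schedule.</p>

<h2>Grooming</h2>
<p>Section text about grooming.</p>
```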
If your content management system doesn't support clean URLs, or at the very
least the creation of human-readable aliases, ditch it immediately. I have come
to feel so strongly about this that I've formulated it as a
maxim:
URLs used or otherwise generated by a content management system
should not reveal the underlying code or technology used.
In addition, it should be possible to create human-readable aliases,
either by hand or automatically.
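To make the contrast concrete, here is a sketch of two links to the same page. Both URLs are hypothetical: the first exposes the scripting language and an internal ID, while the second is the kind of human-readable alias the maxim asks for.

```html
<!-- Reveals the underlying technology (.php) and an internal ID -->
<a href="/index.php?module=article&amp;id=123">Optimizing your code</a>

<!-- Human-readable alias for the same page -->
<a href="/articles/optimizing-your-code">Optimizing your code</a>
```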