~ Dancing With Crawlers ~

Published @ searchlores in December 2002

Dancing With Crawlers
by Dan Ciammaichella


Well... I received this essay via anonymous mailer yesterday, and I have decided to publish it because some of the matters it deals with (for instance site accessibility for ANY browser) are of UTMOST interest for seekers. I checked and found out that the original article (subdivided into 4 parts) dwells (without date) at http://www.chipcom.net/searchengine1.php and that Dan Ciammaichella has also written many more interesting essays about webdesign.


Dancing With Crawlers

An essay about making websites both search-engine friendly and accessible, without sacrificing 'cool' design.
by Dan Ciammaichella


An effective method of optimizing your website for search engines, that includes the added benefit of also making it accessible to the widest range of viewers, is to display your content in a presentation layer that best fits the user-agent. Accomplishing this does not require compromising your 'cutting edge' design or basic accessibility guidelines.

Sounds complicated, doesn't it? The first thing many misinformed 'experts' claim when presented with this idea is that it is too much trouble to maintain multiple versions of a website to accommodate different browsers. They miss the point entirely. You do not need to maintain multiple versions of your site content - the key is to use the SAME content, placed within custom design templates at runtime. At its most basic, this strategy involves only two design templates: one for graphical web browsers, the other text-only. Both templates can be coded into the same page, with the content coming from either a database or a separate include file. While I have used the very cool extreme of a model-view-controller architecture to dynamically generate presentation layers based on the requesting user-agent, I will only cover a very basic method to accomplish similar results here.
The basic method is effective for most sites, if it is incorporated into the build. It's much more work as an afterthought.

Before we get much farther, I need to mention that this method requires the use of server-side scripting. The flavor doesn't really matter, as long as it can read the User-Agent string in the HTTP header. I've used Microsoft Active Server Pages (ASP), JavaServer Pages (JSP) and ColdFusion. The examples here use my favorite, PHP, which is freely available and already installed on the servers of many web hosts. PHP is very easy to learn, so even a relative beginner to building websites can get up to speed fairly quickly.

Assuming that you have PHP installed on your host or local machine, and a basic understanding of how to write PHP pages, we can begin building our template.

Detecting the User-Agent

The first thing we will do is detect the user-agent from the HTTP_USER_AGENT string in the header of the request. We are not going to get crazy trying to determine the exact browser, version or operating system though - there is no need. For our purposes we simply want to know if the user-agent is a fairly modern graphical browser, or something else. Most of the major modern graphical browsers provide one common string - 'Mozilla'. If 'Mozilla' is in the User-Agent string, you can be pretty darn sure the browser is Netscape, Internet Explorer, or one of the lesser-known browsers. One exception is Opera, which allows the user to pick from a choice of strings to identify it. Most of the available Opera strings contain 'Mozilla', but one contains only the word 'Opera'. I am quite happy to settle for detecting these two strings alone for our purposes. Below is a simple PHP script to do just that:
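<?php
# assume a non-graphical user-agent until proven otherwise
$browser = false;
# getenv() returns false when the header is missing, so cast to a string first
$userAgent = (string) getenv("HTTP_USER_AGENT");
# strstr() returns false when the search string is not found
if (strstr($userAgent, "Mozilla") !== false
|| strstr($userAgent, "Opera") !== false)
{ $browser = true; }
?>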
The script initially sets the variable $browser to false, then IF the words 'Mozilla' OR 'Opera' are part of the HTTP_USER_AGENT string, we set $browser to true, indicating a graphical web browser.

Obviously, you could create a more sophisticated function to detect the user-agent down to the version, but again, for our purposes here there is no need. Of course feel free to come up with what works best for you.
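
As a starting point for such a function, the same check can be wrapped so that it takes the user-agent string as a parameter, which makes it easy to try against sample strings. This is only a sketch: the function name isGraphicalBrowser is hypothetical and not part of the original script, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// the user-agent string contains 'Mozilla' or 'Opera'.
function isGraphicalBrowser(string $userAgent): bool
{
    foreach (array('Mozilla', 'Opera') as $token) {
        if (strpos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}

// getenv() returns false when the header is missing, so cast to a string first.
$browser = isGraphicalBrowser((string) getenv('HTTP_USER_AGENT'));
```

Adding another token to the array is then the only change needed to recognize a further family of browsers.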

Some will say that a user-agent string can be 'spoofed' or faked, and they are 100% correct. Of course if someone does spoof their user-agent string, what is the worst that can happen? They will simply see one of the two versions of your site - not a big deal, unless you are trying to spam the search engines using this method - in that case you're busted and deserve to be exposed. The purpose of this technique is to make your site search-engine friendly and accessible, not to spam and mislead.

Now that we have a simple user-agent detection script built, save it as browserDetect.php. I usually create an 'includes' directory and save all of my included code snippet files there. From a maintenance standpoint, include files will make your life much simpler. If you decide to make changes, you only need to edit one file, instead of every file that uses the code.

Building Page Templates

Building a page template that incorporates both a graphical design and a text-only design using the same content is fairly simple. The first step is to create your well-designed graphical version in straight HTML. While doing so, keep in mind that you want the actual content of the page to be easily separated from the design elements. The simplest method of seeing how this works is to think of your page as having three distinct parts; header, footer and content.

Header - includes everything from (and including) the closing HEAD tag of your page, to the beginning of the actual content. Using this site as an example, the logo, top navigation bar and breadcrumb trail are all part of the header. If your site includes a left-hand navigation, this would usually be part of the header as well.
Footer - includes everything from the end of the content to the closing BODY tag. Again using this site as an example, the bottom navigation bar and copyright notice would make up the footer. If you use a right-hand column for navigation and/or promotions, that could be part of the footer as well.
Content - this is the actual content of the site. All text for the most part, but it can also include graphical and other elements that are required to convey your information. If parts of your content require client-side scripting, Flash or other multimedia elements, ensure they also include an alternative version for user-agents that do not support them. Use of the NOSCRIPT tag is one example.

Once you have the graphical page laid out to your satisfaction (and validated to whatever version of HTML you are using), save it. Now recreate the page, replacing the graphical header and footer with text-only versions. You may want to incorporate an alternate style sheet as well. This will be the version of the page that must be accessible to any user-agent that is not a Mozilla-compatible browser. When you have this page completed and validated to your satisfaction, save it as well.

The next step is to incorporate both versions of your page into one page template. This is much easier to accomplish and maintain if you cut out each respective header and footer into include files. Of course there will only be one include file for the content itself. I usually create a separate 'content' directory to save these. If you are using dynamic pages and pulling your content from a database, there are many ways to streamline the template, but for the purposes of this article we will stick with manually created content include files.

Now that you have your headers, footers and content separated into include files, we can build our template. The template logic will display the default graphical template and content if the $browser variable is true, or the alternate text-only template with the same content if the $browser variable is false. The template elements will include the following:

The following example contains all of the elements listed above.

An Example Template

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html dir="ltr" xml:lang="en" lang="en">
<?php
# set content file name as variable for ease of creating templates
$content = "includes/myContent.php";
# this is where we include our browser detection script
include ("includes/browserDetect.php");
?>
<head>
<title>Page Title</title>
<meta name="description" content="Page Description" />
<meta name="keywords" content="Page Keywords" />
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />
<?php # if $browser is true then display the default template
if ($browser == true) {
?>
<!-- this is the stylesheet for the default template only -->
<link rel="stylesheet" type="text/css" href="css/mainStyleSheet.css" />
<!-- include any javascripts for the default template here -->
</head>
<body>
<!-- include the default header include file -->
<?php include ("includes/defaultHeader.php"); ?>
<!-- include the content include file-->
<?php include ($content); ?>
<!-- include the default footer include file -->
<?php include ("includes/defaultFooter.php"); ?>
<?php } # end default template; begin text-only template ($browser is false)
else { ?>
<!-- this is the stylesheet for the text-only template only -->
<link rel="stylesheet" type="text/css" href="css/alternateStyleSheet.css" />
</head>
<body>
<!-- include the text-only header include file -->
<?php include ("includes/altHeader.php"); ?>
<!-- include the content include file-->
<?php include ($content); ?>
<!-- include the text-only footer include file -->
<?php include ("includes/altFooter.php"); ?>
<?php } # end text-only template ?>
</body>
</html>
     
Feel free to copy/paste this template and modify for your own use.
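
One possible way to tighten the template is to look up the stylesheet, header and footer from the $browser flag in a single place, so the page body is written only once. The sketch below is an illustration under assumptions, not the method used in the template above: the function name templateParts is hypothetical, the file names are the ones from the example template, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: maps the $browser flag to the files used by the
// template. File names are taken from the example template.
function templateParts(bool $browser): array
{
    $prefix = $browser ? 'default' : 'alt';
    return array(
        'stylesheet' => $browser
            ? 'css/mainStyleSheet.css'
            : 'css/alternateStyleSheet.css',
        'header'     => "includes/{$prefix}Header.php",
        'footer'     => "includes/{$prefix}Footer.php",
    );
}

// Usage inside the template, after browserDetect.php has set $browser:
//   $parts = templateParts($browser);
//   include ($parts['header']);
//   include ($content);
//   include ($parts['footer']);
```

With this arrangement the conditional lives in one function, and the content include appears only once in the page.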

Wrap Up

Notice how the same content include file is used in both the default graphical portion of the template and the text-only portion. As I stated earlier, the key to making the site content easily maintainable is not having to edit multiple versions of the same content. The use of include files also makes the site easier to maintain. If you are a more experienced web developer, you will of course find better and more inventive ways of cutting up your templates to incorporate more functionality and a more sophisticated design than in this very basic example. You'll also note how easily this can be modified to use dynamic content from a database or content management system.

Also note the use of two different style sheets. Just because the alternative presentation is text-only does not mean it has to look drab if viewed by a more capable user-agent. Using CSS you can give even a plain-text page a more attractive appearance.

Incorporating the concept of serving the same content in a presentation layer customized for the user-agent is the most effective way I have found to:

Of course this technique alone will not guarantee you anything. It is not a magic bullet that removes the need to apply other essential elements in your website design and coding. The following best practices still apply

One question I often get asked is - "So how do you test your alternative presentations?" Easy. In development you can simply change the initial $browser conditional statement to if ($browser == false) {. Now when viewed in a graphical browser you will see the alternate version. (Note: when validating your graphical page you should set this conditional to false as well, otherwise the validator will be run against your alternative page.) Also, if you leave the $browser conditional set to true and use the Lynx-Me viewer, you will see your alternative version as Lynx, and most search engines, will see it.
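
An alternative to editing the conditional by hand is a small development-only switch that lets a query parameter force either template. This is a sketch under assumptions, not part of the original technique: the function name resolveBrowserFlag, the parameter name 'view' and its two values are all hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical development aid: lets a query parameter force either template.
// The parameter name 'view' and its values are assumptions for illustration.
function resolveBrowserFlag(bool $detected, array $query): bool
{
    if (isset($query['view'])) {
        if ($query['view'] === 'text') {
            return false;   // force the text-only template
        }
        if ($query['view'] === 'graphical') {
            return true;    // force the default graphical template
        }
    }
    return $detected;       // otherwise keep the detected value
}

// Usage in the template, after browserDetect.php has set $browser:
//   $browser = resolveBrowserFlag($browser, $_GET);
```

Requesting a page with ?view=text would then show the alternate version in any browser. On a live site you would probably want to remove or restrict such a switch.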

Other elements of architecting successful websites will be covered in future articles. Good Luck!

(c) Dan Ciammaichella 2002




(c) III Millennium: [fravia+], all rights reserved