Compiling a Working Class
Our requirement for the Link Factory is to create absolute links from any page in the website to any other page in the website. Ultimately, we'll need to continue this work by building a data access layer (DAL), which is covered separately. However, in planning this project, it became clear that this would be too much to test intelligently at one time. That's an excellent cue that the project is too big to implement as one class, so we're implementing the Link Factory in two phases. This page creates and tests the business logic of the Link Factory itself; then we'll go after the data access layer to support it.
Why this order? If we create the DAL first, how do we check the output? We'd have to create an output mechanism just to test it. Conveniently, the business object is the best possible real-world test of whether the DAL output is correct. By contrast, the output of the business object can be checked by doing a view source on the web page, and we can easily create some input by hard-coding data as if it were coming from a DAL. Then we do testing, testing, testing, until that output meets the W3C specifications for a well-formed absolute link. A simple and very real-world test.
The goal of every programmer is to construct elegant, efficient code that does what it's supposed to. The secret goal of every programmer trying to get there is to first get the damn thing to compile. That'd be nice. Here's what we got up and running, then we'll break it down.
using System;
using System.Configuration;
using System.Text.RegularExpressions;
using System.Web;
/// <summary>
/// Summary description for BetaTesting2LinkFactory
///
/// BetaTesting2LinkFactory is a testing class for creating absolute links with golden keywords embedded in the URLs for SEO optimization. It can be called throughout the Earth Chronicle application.
///
/// </summary>
public static class BetaTesting2LinkFactory
{
    .
    .
    .
}
At the top of the file are the necessary using statements and the summary. Properly commented code is the only way anyone will ever know what you're doing, including you when you revisit a file a couple years after you wrote it. This starts with the summary of why you're writing the class and what it does. Next we define the entire class.
public static class BetaTesting2LinkFactory
{
    // Remove all punctuation from URLs to protect against possible errors
    private static string stripPunctuation(string possiblyInvalidUrl)
    {
        .
        .
        .
    }
    // Build SEO optimized absolute links from anywhere in the application
    public static string insertLink(string pageName)
    {
        .
        .
        .
    }
}
Note that everything is tagged with the keyword static. Once a class is declared static, every member of it must be declared static as well. When I forgot to include the static keyword on one variable declaration, the entire application folded. A static class or member belongs to the type itself rather than to any particular object; by contrast, non-static classes can be instantiated, and their non-static members belong to the individual instances. Exactly why I'm using it here, I'm not entirely sure. Steve McConnell doesn't address technical details at that low a level, Jesse Liberty says it's "magic" and that he'll explain later (but then doesn't), while Christian Darie says to do this in one of the few moments where he forgets to explain why. My rationale here is basically that Christian did it, so I'm building this as a static class too. I believe it's probably for performance reasons that I'll come to appreciate later.
I've also defined the class as public so that I can access it throughout the application. The insertLink() method I want to use is also defined as public. However, the helper method that removes any punctuation accidentally included in the URL is defined as private, so that it's not accessible except to methods within the class. Finally, the class keyword declares that, yes, BetaTesting2LinkFactory is a class.
Scratch that. It may not be listed in the index, but Jesse Liberty's Programming C# does sneak in some discussion of static classes. C# doesn't allow global methods. However, because they are so useful, C# uses static methods of static classes to provide the same functionality in a safer, more object-oriented way. Therefore, Christian Darie chose to make this a static class so that he doesn't have to (and in fact can't) create an instance of BetaTesting2LinkFactory; he can just use it.
[chroniclemaster1, 2009/10/10]
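To make the point about static classes concrete, here is a minimal sketch. The class PageNameSketch and its Normalize() method are hypothetical names invented for this illustration; they are not part of the Link Factory.

```csharp
using System;

// Hypothetical helper, named only for this illustration.
public static class PageNameSketch
{
    // A static method belongs to the type itself, not to an instance.
    public static string Normalize(string pageName)
    {
        return pageName.Trim().ToLowerInvariant();
    }
}

public class Program
{
    public static void Main()
    {
        // Called through the class name; no object is created first.
        string page = PageNameSketch.Normalize("  Volcanoes ");
        Console.WriteLine(page); // volcanoes

        // The following line would not compile, because a static
        // class cannot be instantiated:
        // PageNameSketch helper = new PageNameSketch();
    }
}
```

This is the same calling pattern the Link Factory relies on: any page can write BetaTesting2LinkFactory.insertLink(...) without constructing anything.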
Of course, my methods don't take the class keyword; instead, they declare the type of their output. If they produced no output and simply ran code, they'd take the void keyword. However, both of my methods return strings, so they take the string keyword. Then I define the names of the methods, insertLink() and stripPunctuation(). Inside the parentheses, I'll be passing parameters, so I need to declare the variables for them. For insertLink() my variable is pageName, which is a string, so it's declared insertLink(string pageName). For stripPunctuation() my variable is possiblyInvalidUrl, which is a string, so it's declared stripPunctuation(string possiblyInvalidUrl). I've also included comments to explain what each method does. Now that my structure is defined, I can get busy coding.
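As a small sketch of how the pieces of a method signature fit together, the example below puts a void method next to string-returning ones and mixes public and private access. SignatureSketch and all three method names are hypothetical, made up for this illustration only.

```csharp
using System;

// Hypothetical class and method names, used only for illustration.
public static class SignatureSketch
{
    // private: callable only from inside SignatureSketch.
    // string: the method hands a string back to its caller.
    private static string Tidy(string pageName)
    {
        return pageName.Trim();
    }

    // public: callable from anywhere in the application.
    // void: the method runs code but returns nothing.
    public static void PrintPage(string pageName)
    {
        Console.WriteLine("Page: " + Tidy(pageName));
    }

    // public and string: callable from anywhere, returns a string.
    public static string DescribePage(string pageName)
    {
        return "Page: " + Tidy(pageName);
    }
}

public class Program
{
    public static void Main()
    {
        SignatureSketch.PrintPage("  volcanoes ");
        string description = SignatureSketch.DescribePage("  volcanoes ");
        Console.WriteLine(description);
    }
}
```

Both calls print the same line; the difference is that DescribePage() gives the caller a value to work with, while PrintPage() does not.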
// Remove all punctuation from URLs to protect against possible errors
private static Regex invalidUrlCharacters = new Regex("[^/a-zA-Z0-9]", RegexOptions.Compiled);
private static string validUrl;
private static string stripPunctuation(string possiblyInvalidUrl)
{
    .
    .
    .
}
First, let's look at my private helper method, stripPunctuation(). Christian Darie includes a very complicated link building routine that includes lots of problematic content, and so he needs to build a robust regex-based method to clean every link. My application is going to use links much more extensively in a portal-like application; there's no way it can survive the kind of complicated multi-parameter link building that he does. Earth Chronicle requires a much simpler method and that's why I have to build a database to effect the full functionality that I want. However, that also means the regex requirements are merely a safety issue for situations where someone accidentally includes punctuation in the page name. Therefore, this implementation is much lighter. You will note that contrary to most of my examples, I use extra variables for better semantic clarity; instead of working with a variable named url, I operate on a variable named possiblyInvalidUrl which is "fixed" and passed into the variable validUrl.
I begin by declaring two variables. Since this is a static class, the variables have to be declared static as well, and I've made them private because nothing outside the class needs them. My first variable is the set of invalid characters that I won't accept in my page names. It is easiest to declare this as a Regex, which I've named invalidUrlCharacters. I then specify the exact set of characters I don't want. Note that this is a problem down the line, because I want the website to be truly multilingual. At present, I don't have the capacity to do that, so I'm going to move ahead until I can research the set of UTF-8 characters which are invalid for use in URLs. Then this statement can be updated appropriately. For now, I'm using the caret, ^, to specify "anything except" the forward slash and English alphanumeric characters, as defined by the regex character class [^/a-zA-Z0-9]. Since I've isolated this issue to this one location, I should be safe, and this will be a simple change when I have the revised set. I also declare a second variable, a string named validUrl. This is the variable that receives the output after the variable possiblyInvalidUrl is cleaned.
// Remove all punctuation from URLs to protect against possible errors
private static string stripPunctuation(string possiblyInvalidUrl)
{
    validUrl = invalidUrlCharacters.Replace(possiblyInvalidUrl, "");
    return validUrl;
}
The method stripPunctuation() declares the final variable, which is the method's single input parameter, a string named possiblyInvalidUrl. Now we're ready to get some work done. Because of the well-defined OOP principles we've used, this is a classic short piece of code. We define validUrl as the result of running possiblyInvalidUrl through the regex. Note that we're calling the Replace() method of the regex we created and passing it two parameters: possiblyInvalidUrl to process, and an empty string. This tells the regex to replace each invalid character it finds in possiblyInvalidUrl with an empty string; i.e., it removes them. Finally, we return validUrl as the output of the method. This is how we protect against any unsafe punctuation accidentally making it through to the link.
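To see what the character class actually removes, here is a standalone sketch that applies the same pattern to a few sample strings. The wrapper class StripPunctuationSketch, its Strip() method, and the sample inputs are assumptions made for this illustration; only the pattern and the Replace() call come from the code above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper; only the pattern is taken from the article.
public static class StripPunctuationSketch
{
    // Matches any character that is NOT a forward slash,
    // an ASCII letter, or a digit.
    private static readonly Regex invalidUrlCharacters =
        new Regex("[^/a-zA-Z0-9]", RegexOptions.Compiled);

    public static string Strip(string possiblyInvalidUrl)
    {
        // Replace every match with an empty string, i.e. delete it.
        return invalidUrlCharacters.Replace(possiblyInvalidUrl, "");
    }
}

public class Program
{
    public static void Main()
    {
        Console.WriteLine(StripPunctuationSketch.Strip("about-us!"));          // aboutus
        Console.WriteLine(StripPunctuationSketch.Strip("news/2009, October")); // news/2009October
        Console.WriteLine(StripPunctuationSketch.Strip("index.html"));         // indexhtml
        Console.WriteLine(StripPunctuationSketch.Strip("caf\u00e9"));          // caf
    }
}
```

The last two samples are worth noticing: dots are treated as punctuation and removed along with everything else, and accented letters fall outside a-zA-Z, which is the multilingual limitation mentioned earlier.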
// Build SEO optimized absolute links from anywhere in the application
public static string insertLink(string pageName)
{
    // Grab the elements needed to build the link
    string goldenKeywords = "";
    string website = "";
    string folder = "";
    // Purify the linkWithAbsoluteUrl components
    website = stripPunctuation(website);
    folder = stripPunctuation(folder);
    pageName = stripPunctuation(pageName);
    // Create and insert the link tag
    string linkWithAbsoluteUrl = String.Format("<a title='{0}' href='{1}{2}{3}'>", goldenKeywords, website, folder, pageName);
    return HttpUtility.UrlPathEncode(linkWithAbsoluteUrl);
}
First comes the section that hints at the future. In the live version, the first thing I'll need to do is hit the database with the pageName variable and retrieve all the information I need to build the link. For now, without the DB, I'm serving the same purpose by hard-coding the results here, to test that the BetaTesting2LinkFactory works. We want to test at every step along the way, and this lets us defer all the DB stuff until later.
Next we remove the punctuation from all the variables that will be incorporated into the URL. This is where our helper method stripPunctuation() comes in. Finally, we construct the output for the link. I create a new string variable, linkWithAbsoluteUrl, using the static Format() method of the String class. The format string is the literal text to display on the page, with numbered placeholders specifying where each variable will go in the final output. Last but not least, I pass the result through HttpUtility.UrlPathEncode() and return the value so it can be output to the page.
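The placeholder substitution can be tried in isolation. The sketch below uses the same format string as insertLink(), but the class FormatSketch, the method BuildAnchor(), and every sample value (including the example.com address) are hypothetical stand-ins for data that would eventually come from the DAL. It deliberately leaves out the stripPunctuation() and UrlPathEncode() steps so that only String.Format() is on display.

```csharp
using System;

// Hypothetical wrapper around the article's format string.
public static class FormatSketch
{
    public static string BuildAnchor(string keywords, string website,
                                     string folder, string pageName)
    {
        // {0}..{3} are replaced, in order, by the four arguments.
        return String.Format("<a title='{0}' href='{1}{2}{3}'>",
                             keywords, website, folder, pageName);
    }
}

public class Program
{
    public static void Main()
    {
        // Made-up values standing in for what the DAL would supply.
        string tag = FormatSketch.BuildAnchor(
            "earth history news",
            "http://www.example.com/",
            "articles/",
            "volcanoes");

        Console.WriteLine(tag);
        // <a title='earth history news' href='http://www.example.com/articles/volcanoes'>

        // One possible automated check that the href is an absolute URL.
        string href = "http://www.example.com/articles/volcanoes";
        Console.WriteLine(Uri.IsWellFormedUriString(href, UriKind.Absolute));
    }
}
```

A check like Uri.IsWellFormedUriString() is only one option; doing a view source on the rendered page, as described at the start, tests the same thing by eye.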
Shockingly, this compiles. ;) So let's look at phase 2, making the code produce the output we want.