Main Points

History is Now

Welcome to happily ever after. It does not look like the brochures. With any third-party software you've ever purchased, and with any in-house project you've ever undertaken, someone had to pitch an idea or pull out a glossy brochure to convince everyone that it was a good idea. Now that project is complete, or that software has been configured and is up and running. The entire rest of the life of the project will look nothing like the advertising.

Maintenance is the most difficult, most demanding, most expensive, and most resource-intensive phase of the FSDLC by all measures. This is because it is the longest phase of the FSDLC... by years. That calls for some significant differences in how you approach the maintenance cycle of a project, and you can only hope that the original developers were thinking about you when they put things together. If they weren't, you are in a lot of trouble... which is especially embarrassing if you were part of the original development team. So let's dive into the details of maintenance.

Diagnosis

It happens to every piece of software. No matter how far-sighted the developers, no matter how heavily over-built the code, every application gets sick once in a while. Functionality that end users were enjoying suddenly stops working. Someone added a new module which turned out to be less thoroughly tested than everyone thought; now it's not working or, worse, it's causing strange errors in a supposedly unrelated portion of the application. If the development team which built the software did not follow the kinds of best practices we've been discussing with regard to the FSDLC, that kind of thing happens far more than it should and eats up huge amounts of time fixing things that were supposed to be working automatically. There are even places which prioritize hiding the symptoms from customers instead of diagnosing the problem and stopping the flood of problems. Only once you stop the application from generating more work than you can handle can you begin to have time in your day to clean up the mess.

The first step in the process is to sit down with the testing site and see if you can duplicate the error. If not, there may be a screw loose in the production version, and the thing to do is wipe the production site and reload it. That may be all that's necessary to correct the problem. In one of the worst cases, you can't reproduce the error in testing, but you've reloaded the site a time or two and the error is consistent. That means you have an error due to differences between the testing and production sites. On one level, if your architecture is well designed, this leaves only a few places to look. However, you have two strikes against you. One, you're going to be testing on the production site (always a horrible idea), and two, these errors have a tendency to be really nasty.

More typically, you can reproduce the error on the testing server, and you immediately sit down to see if you can simply roll back to a version that works. In almost all cases this is possible, and once you identify your fallback position, you can fix the production site by rolling it back to a working version. Test the rolled-back site to make sure everything is OK; you have now at least bought yourself some time to fix the problem.

Obviously, halting development to embrace an old version can only be a very temporary solution. Diagnosis is the first step in the long-term solution of your problem. You don't want to panic, but anyone who can be reasonably helpful in this process should be yanked from whatever they are working on until the problem is solved. You don't wind up with a strong working code base by accident. You build a strong code base by planning it well, implementing it professionally, and guarding it... gee, as if your life depended on it. Therefore, you need to diagnose the problem. There are two key things to do when making a diagnosis.

First, use your error handling to spot errors and figure out what's going on. Sometimes errors are misleading, but certainly the first step is to read your errors. See what they tell you and experiment with resolving the problem assuming that the error is correct, and if that fails, take a stab or two in that general vicinity. You will note that the success rate of this technique is directly proportional to the value of the error handling which has been built into the application. What seemed like a huge waste of time in construction is now the only thing that can save your life.
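As a minimal sketch of error handling that pays off during diagnosis, the Java example below (Java being close in syntax to the C# discussed later) wraps a low-level failure in an exception that records which module failed and what input triggered it. The names `OrderImportException`, `OrderImport`, and `parseQuantity` are hypothetical and exist only for illustration.

```java
// Hypothetical example: every name here is an illustrative assumption.
final class OrderImportException extends RuntimeException {
    OrderImportException(String module, String input, Throwable cause) {
        // Record where the failure happened and what data triggered it,
        // and keep the original exception attached as the cause.
        super("[" + module + "] could not process input '" + input + "': "
                + cause.getMessage(), cause);
    }
}

final class OrderImport {
    // Parses a quantity field; on failure, reports the module and the bad value.
    static int parseQuantity(String raw) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new OrderImportException("OrderImport.parseQuantity", raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity(" 42 ")); // prints 42
        try {
            parseQuantity("4z");
        } catch (OrderImportException e) {
            // The message names the module and the offending input.
            System.out.println(e.getMessage());
        }
    }
}
```

The extra constructor arguments cost a few minutes during construction; during diagnosis they mean the first error you read already points at a module and a value to experiment with.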

Second, testing is not a random or gut-feeling process, even if that's how you pursue construction. If you don't understand the results that are coming back from a test, you have wasted all the time it took and learned nothing. As I said, if my error message seems like it's off base, I will take a stab or two to see if the error message may at least lead me to the answer indirectly. However, my patience with the error message is now very short. These deep tests, white-box tests that check specifically implemented functionality, are usually the kind that finally break the problem open. However, because they're so specific, you may have to run a handful just to check one function, and that means it takes a lot of time to locate a difficult problem. If I feel like I'm starting to wander around in the dark, it's time to take a new tack.

Assuming neither of the first two approaches worked, it's time to test systematically, which implies a pattern or method. I now switch from deep tests to shallow, black-box tests. The disadvantage is that a shallow test can't solve the problem by itself; it can only tell me where the problem lives. The advantage is that I can test a portion of the application of virtually any size, which means I can check large sections of the application relatively quickly. You start by taking your best guess at which application path is creating the problem (if you're wrong, you have to make your next best guess and start over). Now, start taking a look at the steps in that application path. Again, shallow, black-box tests don't test the inside of the section being examined, but all I want to do is locate where the problem is. So, step by step, I hardcode input to the steps and check the output; usually I start with the final step to see if that produces the output I expect. If it does, then I back up a step, and so on; this allows you to use the application as your testing harness. This lets me identify a section of the program where the problem is, and I can now dive a little bit deeper and do slightly deeper tests on the steps of that portion of the program. This process allows me to create a window around the problem code and narrow it level by level until I can nail it down and then solve the problem.
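The backward, step-by-step probing described above might be sketched like this in Java. It is only an illustration under assumptions: the `StepProbe` and `PathNarrower` classes, the three string-processing steps, and the deliberately broken "upper" step are all hypothetical stand-ins for a real application path.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: step names and string-based steps are assumptions.
final class StepProbe {
    final String name;
    final Function<String, String> step;
    final String hardcodedInput;
    final String expectedOutput;

    StepProbe(String name, Function<String, String> step,
              String hardcodedInput, String expectedOutput) {
        this.name = name;
        this.step = step;
        this.hardcodedInput = hardcodedInput;
        this.expectedOutput = expectedOutput;
    }

    // Black-box check: feed a known input and look only at the output.
    boolean passes() {
        return expectedOutput.equals(step.apply(hardcodedInput));
    }
}

final class PathNarrower {
    // Start at the final step and back up one step at a time; return the
    // name of the first step whose output is wrong, or null if all pass.
    static String firstFailingStep(List<StepProbe> path) {
        for (int i = path.size() - 1; i >= 0; i--) {
            if (!path.get(i).passes()) {
                return path.get(i).name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<StepProbe> path = List.of(
            new StepProbe("trim", s -> s.trim(), "  ab  ", "ab"),
            new StepProbe("upper", s -> s.toLowerCase(), "ab", "AB"), // deliberately broken
            new StepProbe("wrap", s -> "[" + s + "]", "AB", "[AB]"));
        System.out.println(firstFailingStep(path)); // prints "upper"
    }
}
```

Once a step is flagged, the same idea can be repeated one level down by breaking that step into its own smaller probes, which is how the window around the problem code narrows.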

As you can see, depending on what the problem is, diagnosis can be pretty involved. However, using these techniques, you have a reasonable chance of diagnosing what's going on.

Modifications

Code modifications are the largest issue in maintenance. Even if you diagnose a problem, your main goal is to make the necessary modifications so that it doesn't happen again. If you are concerned, you might even beef up the error handling in this section a little. However, errors are hardly the only time you will undertake modifications. So while modification may come after diagnosis, it's certainly not the only time you will want to be modifying code.

Because of this, making modifications is where you spend most of your time in software development, not just in maintenance. This is why it is so important to optimize your code for the maintenance cycle rather than for construction. Numerous studies have demonstrated that the maintenance cycle is the most time-consuming, most resource-intensive, most expensive phase of the FSDLC. The dirty little secret of software development is that this is only the tip of the iceberg. When we talk about writing code that's optimized for saving time under maintenance, what we're really talking about is code that is optimized to be easy to change, easy to modify. There are three major times in the FSDLC when you have to modify code, and maintenance is only one of them.

  • Modification Under Maintenance: When something goes wrong, or you need to upgrade a feature in a finished application, you need to modify existing code. This is the most obvious form of modification and the one typically referred to in studies on the importance of and expense of the maintenance cycle.
  • Modification Under Construction: Here is where most studies begin to fall apart and underestimate the importance of optimizing code for modification. Code that is not optimized for modification is a frequent cause of project cost overruns and missed deadlines. Anytime testing finds an error during construction, whether through a developer testing their own code, peer reviews, QA, or user acceptance testing (UAT), you're going to have to locate and fix the error. Anytime the application design changes, you're going to want to save all the code you can; some will need to be thrown out, some new code will need to be written, but as much as possible you would prefer to save time by modifying the existing code. The feasibility of that goal will depend dramatically on how effectively you've optimized for modification. Anytime a manager allows your project to become a victim of scope creep, you will have to dump old code, write new code, and hopefully modify as much code as you can. The fact is that as soon as you type in the last character of new code, it will henceforth be under modification, even though, strictly speaking, it is not yet under maintenance.
  • Reusing Modules / Code in Construction: Assuming you don't like reinventing the wheel, once you've built your first module, you are starting to build your code base. As your code base grows, there should be more and more modules of new projects which can be constructed simply by applying code you've already written. Each of these modules saves you time because you don't have to write it, but this assumes that you've properly optimized your code for modification. Otherwise, it may literally be faster (though not wiser) to write new code from scratch.

Thus the importance of modification is much larger than any study has ever attempted to measure. But it should at least be apparent now just how important it is to write code for easy modification, and why.

Optimizing for Modification

Once you understand the importance of code modification and its overriding priority in the FSDLC, how do you address that? What are the practices which make code simple to change and easy to modify under construction, under maintenance, etc.?

  • Write DRY code: The single most disastrous thing you can do to your code is to write something in two (or more) places. No major computer network is designed without a database, because the consequences of storing your data in two places are so destructive. The costs are just as serious when you have a variable value hard-coded throughout the system, or when you repeat a particular piece of code throughout your application instead of encapsulating the functionality in a helper function. That is writing code on the WET model (Write Everything Twice). Instead you should follow the DRY model (Don't Repeat Yourself). Create well-abstracted utility classes that contain a lot of basic functionality that you can use in any application. Encapsulate your helper functions to consolidate code in one place.
  • Write Flexible Code: Use composition to provide access to important helper classes. If you need to modify the program or add functionality, it is easy to modify that helper class or to write new helper classes which can be composed with your run-time code to accomplish the modifications that you need. Use interfaces and generics to better parametrize code. This allows the consolidation of similar but not identical pieces of code; e.g. instead of writing a function to find the max value for a set of integers, and another for a set of doubles, and another for a set of floats, you can create one generic function which can serve all those data types. When necessary, you can even use abstract classes to consolidate function implementation for a number of subclasses, though composition is usually the more flexible choice.
  • Good Error Handling: One of the most problematic parts of fixing errors is diagnosing what they are and where they are. Well-written error handling can be a lifesaver for identifying problems. When a program module you haven't touched in two years throws an error, just figuring out how the module works again can be a major task. Locating the errors is the extra nasty icing on the cake. How much better is it to spend a couple of hours during construction writing really intelligent error handling and identification code while you are still intimately familiar with the module's workings? Then when something goes wrong later, you don't burn days, weeks, or months on the problem; the error messages can tell you exactly what's going on.
  • Encapsulate Change: I can't stress the importance of this enough; it's at the root of information hiding. Encapsulate information / data / code away from the rest of the program in one place; it's one of the few proven techniques for dramatically improving your software. Steve McConnell notes, "(this) has been true for a long time (Boehm 1987a). Large programs that use information hiding were found years ago to be easier to modify - by a factor of 4 - than programs that don't (Korson and Vaishnavi 1986)." (Code Complete 2, 96). But this goes beyond simple encapsulation and writing DRY code. Build your application so that as much as possible, it does its work through utility functions, utility classes, and utility modules. For one thing, this utility code base can serve as a generalized base for each new project. Under modification, however, you can organize your utilities into large sections of code that do typical grunt programming work and a few specialized classes / modules which contain code that is likely to change and / or need reconfiguration to work for a new application. Now when you need to modify your code, large sections of the application don't care because they're utility code. You don't need to look at them either, so you can focus and become really expert at understanding and manipulating a relatively small part of your program. Only a certain limited subset of the code in the application requires your attention when performing the majority of your modifications.
  • Throw Compiler Errors: It's hard to overstate this. If your code is organized so that you throw compiler errors instead of run-time errors, you have far fewer cases where you're surprised by errors. Compiler errors are like monsters that jump out and grab you. Given the amount of time we've spent talking about the difficulty of locating most errors, you will immediately appreciate why this is awesome. You don't have to go looking for these errors... they come looking for you! In .NET they come looking for you with location information about line number, file name, and path so you can quickly find exactly where the problem is. While these can sometimes be misleading, this is golden information that can make error corrections almost a snap. OOP gives you significant abilities to use properties or functions to set internal data; however, these don't throw compiler errors if data is missing. I prefer to pass custom types to parameters in function calls for exactly this reason. Parameters with incorrect data types that don't match the defined function signature(s) throw compiler errors, which makes it easy to straighten things out. Inappropriate values could still cause run-time errors; however, knowing this allows you to test value ranges thoroughly and provide error handling where necessary to eliminate these problems.
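To make the generics point from "Write Flexible Code" concrete, here is a minimal sketch in Java (chosen because its generics resemble the C# the text has in mind). The class name `GenericMax` is hypothetical; the idea is simply one function serving integers, doubles, and floats instead of three near-identical copies.

```java
import java.util.List;

// Hypothetical utility class; the name GenericMax is an assumption.
final class GenericMax {
    // One generic function for any element type that has a natural ordering.
    static <T extends Comparable<? super T>> T max(List<T> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("max() needs at least one value");
        }
        T best = values.get(0);
        for (T candidate : values) {
            if (candidate.compareTo(best) > 0) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(List.of(3, 9, 4)));     // prints 9
        System.out.println(max(List.of(2.5, 1.0)));    // prints 2.5
        System.out.println(max(List.of(1.5f, 7.25f))); // prints 7.25
    }
}
```

Java's standard library already provides `Collections.max` for this exact job; the hand-written version is shown only because it mirrors the example in the text. The design point is that a fix or improvement to the loop now happens in one place for every element type.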

These are some of the key techniques you can use to make it easier to modify your code. Given the importance of modification, this then is a list of some of the most important best practices in programming.
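The "Throw Compiler Errors" practice of passing custom types as parameters could look roughly like the Java sketch below. The `CustomerId`, `OrderId`, and `Shipping` names are hypothetical. The sketch assumes two ids that would otherwise both be plain integers, so that swapping them would compile silently.

```java
// Hypothetical types; all names are illustrative assumptions.
final class CustomerId {
    final int value;

    CustomerId(int value) {
        // The compiler cannot catch a bad value, so the range is checked at run time.
        if (value <= 0) {
            throw new IllegalArgumentException("CustomerId must be positive, got " + value);
        }
        this.value = value;
    }
}

final class OrderId {
    final int value;

    OrderId(int value) {
        if (value <= 0) {
            throw new IllegalArgumentException("OrderId must be positive, got " + value);
        }
        this.value = value;
    }
}

final class Shipping {
    // The signature states exactly which kind of id goes in which position.
    static String describe(CustomerId customer, OrderId order) {
        return "order " + order.value + " for customer " + customer.value;
    }

    public static void main(String[] args) {
        CustomerId customer = new CustomerId(7);
        OrderId order = new OrderId(1001);
        System.out.println(describe(customer, order)); // prints "order 1001 for customer 7"
        // describe(order, customer); // would not compile: the argument types are swapped
    }
}
```

With two `int` parameters the swapped call would compile and fail, if at all, somewhere far away at run time. With the wrapper types the mistake is reported at the call site, and the constructors cover the remaining value-range errors that types alone cannot express.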

Maintenance Best Practices

There are other things you can do to make your maintenance cycle as smooth as possible. Never work on the production site. Never. Never. Never. Developers work on the development site, QA works on the test site, and the production site is for end users; the three should never mix. Changes begin on the dev site, where they can be worked out thoroughly. At each checkpoint the application is uploaded to the test site for QA to test thoroughly. In an emergency, work can be done on the test site and checked out at test.mysite.com. If that works OK, the changes can be pushed down to the repository and up to the production site. But you should never write a change into the production site and see if it works. Anything you see on the production site should have at least passed muster on the test site, even under the most dire conditions.

Because locating relevant code is a large issue in maintenance, keeping your application structure organized is critical. There are many important benefits to this. For example, C# is one of the languages that note the file which throws an error, and the path where it resides, in the default error messages you see on the dev site. This makes it easy to find code files. I therefore take advantage of this and base all my application namespaces on the folder structure noted in these errors. This coding standard makes it very easy for me to debug namespace errors because I simply compare the namespace to the folders in the path; if the error notes a particular path and my code is calling a different namespace, then I know that's part of my problem. It also makes it very easy to create a new namespace no matter how long and complicated; just copy the path and replace the backslashes with dots. That helps eliminate a lot of spelling errors in namespace names, which are simple mistakes but nasty to diagnose.

Nor are my folders arbitrary; they're based on the website folder structure. Therefore, I may have pages for the C# testing section of my beta website at "Testing/AspNet/CSharpProgramming". If a page, e.g. Test1.aspx, in that folder requires custom code, it appears in the code-behind file or in a C# library file called Test1.cs which lives at the path "App_Code/Testing/AspNet/CSharpProgramming". Further, the code in the Test1.cs file lives in the namespace Testing.AspNet.CSharpProgramming.Test1. It doesn't get any easier than that. Because everything matches, there is that much less complexity in my applications, and that means I never have to look far to find the namespace I should use in a using statement, the location of a webpage, or the location of a webpage's code files.
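The convention above is mechanical enough to write down as code. The Java sketch below derives the library file location and the namespace from a page path. The `SiteLayout` class and its method names are hypothetical; only the App_Code root, the .aspx and .cs extensions, and the sample path come from the text.

```java
// Hypothetical helper; the class and method names are assumptions.
final class SiteLayout {
    static final String CODE_ROOT = "App_Code";

    // "Testing/AspNet/CSharpProgramming/Test1.aspx" -> "Testing/AspNet/CSharpProgramming/Test1"
    private static String stripExtension(String pagePath) {
        String normalized = pagePath.replace('\\', '/'); // accept Windows-style paths too
        int dot = normalized.lastIndexOf('.');
        int slash = normalized.lastIndexOf('/');
        return dot > slash ? normalized.substring(0, dot) : normalized;
    }

    // Location of the C# library file that backs the page.
    static String codeFileFor(String pagePath) {
        return CODE_ROOT + "/" + stripExtension(pagePath) + ".cs";
    }

    // Namespace built from the same folders, with separators replaced by dots.
    static String namespaceFor(String pagePath) {
        return stripExtension(pagePath).replace('/', '.');
    }

    public static void main(String[] args) {
        String page = "Testing/AspNet/CSharpProgramming/Test1.aspx";
        System.out.println(codeFileFor(page));  // App_Code/Testing/AspNet/CSharpProgramming/Test1.cs
        System.out.println(namespaceFor(page)); // Testing.AspNet.CSharpProgramming.Test1
    }
}
```

Because both results come from the same input, a mismatch between a namespace in the code and the path in an error message is immediately suspicious, which is the debugging shortcut the convention is meant to provide.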

Managing maintenance is the most important phase of the FSDLC. As you can see, the techniques for managing maintenance well are based on making it easier to modify your code, and that is anything but a purely maintenance issue. These are critical ways that you can make code maximally modifiable, and although they are not usually the fastest ways to write code, they will pay you back the time invested a hundred times over.