Main Points

History is Now

OK, you've been waiting for it, and here it is. We have real, live developers writing real, live code. We've already taken a good shot at writing the application during the design phase; we just didn't use actual code. Construction is where we sit down and really write, test, and learn from our programs.

Now for the Easy Part

Everything you've done up to this point was to encapsulate out the processes which go along with coding. Of particular interest is the use of OOP best practices and design principles to assess the strength and flexibility of the application. Those steps allow you to address those higher level issues before you actually have to muck around in code. That's a great gift which makes those portions of the process much easier to deal with. It also improves construction directly because when we sit down to write code, all we have to worry about is writing code. Take a look...

Everything we've done up to this point has been to create the right environment in order to best prepare us for success. The requirements phase has given us an excellent outline of our goals and what we're trying to do. Therefore, we shouldn't run into the problem of not knowing what we're trying to accomplish. Our goals are defined in enough detail, including why they are important, that when we need to make design changes or fill in low level technical details, we can make good choices that still align with them.

The design phase, especially if executed in PPP, has given us the ability to do a quick "no code" build of our entire application. We can see the high level abstractions of the application doing no work but controlling the lower level abstractions which are getting the job done. We can assess whether we could make better use of utility functions, and really think about whether the functions we currently have are assigned to the correct classes. We can see if our levels of abstraction are consistent across the program, or if there are any other improvements we want to make. We've also determined the appropriate order of construction and planned out our testing increments. Moreover, everything is written in plain language, so it is much harder to mask poor design choices or create confusion about what the design is. There are no variable assignments, function calls, loops, or recursion to get in the way of our view of the application design.

All of these things are pieces that we no longer have to consider when we sit down to write code. Moreover, we have radically reduced the chances of project failure. If there's a deal breaker on this application, we have probably found it already. If the project has no goal or no audience, it should have died before now. If the project's design is fundamentally flawed, it should have been reworked... or died before now. When projects properly follow the FSDLC, very few of the ones that fail ever reach the construction phase. Construction and deployment are the lowest risk components of application development. As good risk managers, this is exactly why we've put off this work for the final phases of the project.

Save the Right Time

It is so easy to "save time" during development. But it's seductively easy to "save time" on a task that will require you to waste much more time later. Therefore, beware of exactly how you are saving time, especially when you are writing code. While many people think of "writing code" as the "job" of a programmer, saving time when you are writing code can be disastrous.

  • Always prefer read time optimization to write time optimization when coding. You write a piece of code one time, and will read it at least several times (possibly dozens) after that. If you save yourself a minute writing the function, but have to spend an extra five minutes deciphering the function each time you go back, you've made a ridiculous tradeoff. (This is a key flaw of inheritance.)
  • Always prefer maintenance time optimization to construction time optimization when coding. Maintenance is the most critical, longest, least predictable, most difficult, and most expensive phase of the FSDLC. Anything you do that makes maintenance more difficult (even unknowingly, even by accident) is a ridiculous tradeoff. This is one of the key reasons for the design phase, to create good application designs that simplify maintenance.
  • Declare and initialize all variables immediately before they're needed. There are situations where you have to write longer code (especially during development, before you've been able to identify all the functions you're going to encapsulate out). Searching for variables at the beginning of a long function (50-100 lines) is a nightmare.
  • Keep longer functions organized into shorter blocks as much as possible. Preferably, encapsulate those blocks out. When this isn't possible, organize the function into blocks that are as well abstracted as you can make them; this lets you imitate the look and advantages of encapsulated functions.
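
Here is a minimal sketch of the last two points, written in Python for brevity. The function and variable names are hypothetical and only there for illustration. Each variable is declared immediately before the block that needs it, and the function is organized into short, commented blocks that could each be encapsulated out later.

```python
def summarize_orders(order_amounts):
    """Return (order_count, total, average) for a list of order amounts."""
    # Block 1: count the orders. The variable is declared right where it is used.
    order_count = len(order_amounts)

    # Block 2: total the amounts. 'total' does not exist until this block needs it.
    total = 0.0
    for amount in order_amounts:
        total += amount

    # Block 3: compute the average, guarding against an empty list.
    average = total / order_count if order_count else 0.0

    return order_count, total, average
```

Each commented block has one job, so pulling any of them out into its own function later is a mechanical change rather than a redesign.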

Binding Time

Binding time is a key concept in programming; it is the point at which a variable's value is set. You can bind the value of a variable when you're writing code, when the program is compiled, or when the program is running. Binding time has important site design implications, but it's most important to keep it in mind during construction. That's when last minute modifications to the design can make it so tempting to take shortcuts.

I had frequently heard about binding time and its importance, but I had a very difficult time learning even this much about it, much less good rules for it. I had heard people describe early vs. late binding time as a trade off between ease (early binding) vs. flexibility (late binding). Now that I understand binding time and its role in an application, I respectfully disagree with this analysis. The "complexity" involved in late binding is so negligible, and the benefits derived from it so important, that I consider late binding a best practice. Rather than recap the theoretical discussions that gave me such problems, here is a much better way to understand binding time.

  • Coding Time - The earliest form of binding time is to initialize a variable when you write the code. This means using a magic number or a magic string, i.e. literal numbers and text, e.g. myVar = 21; or employeeSalary = baseSalary * 3.286. This is foolish. Even advocates of early binding time don't go so far as to recommend magic numbers. They are classic examples of WET (Write Everything Twice) code, they're completely inflexible, and they provide no semantics to explain what the number stands for.
  • Compile Time - This is the earliest usable form of binding time. In best practices, this means using named constants and / or enumerations, e.g. myVar = freezingPointOfWater or myVar = (int)Temperature.FreezingPointOfWater. This is a useful technique because it's highly semantic and prevents the value from being changed. When working with an item whose value does not change, this is the way to go, but obviously it's not feasible for variables. Initializing constants at compile time also means that you benefit from compiler errors if something goes wrong. Misspelling the name of a constant or enumeration is a common error, for example, but the compiler finds all those problems for us automatically so we can eliminate them.
  • Load Time - This is one form of late binding previously lumped together as "Run Time". Binding at any phase of run time provides much more flexibility than the alternatives and should be used except in cases like named constants where we don't want that flexibility. Load time is when the application first loads. Any variables defined from external resources or configuration files are defined at load time, e.g. myVar = WebConfigurationManager.ConnectionStrings[inputConnectionNameInWebDotConfig].ConnectionString; Load time is an excellent choice for application variables that need to be set once and never again. Load time variables are also excellent choices to store in caches (usually static variables that are permanent for the life of the application) to avoid having to fetch the values from external resources more than once.
  • Instantiation Time - This is another form of late / run time binding. Instantiation time occurs when a particular object is created. While the application will only load once, it may initialize (create) an object many times. This usually means encapsulating initialization of the variable in a separate function. e.g. myVar = getValueOfMyVar(); or employeeSalary = getSalary(employeeId); Instantiation time is an excellent choice because function calls are so beautifully flexible. Moreover, variables that a user configures at the start of a program or the beginning of an action must be defined at instantiation time or later. If you try to initialize them earlier, they must be reinitialized later (probably at instantiation time anyway). Instantiation time is one of the two key ways of binding to permit user personalization and other dynamic functionality in an application.
  • Just-in Time - This is the latest binding time. Just-in time occurs when a particular object is used. This means encapsulating initialization of the variable in a separate function and calling that function at the time of use. e.g. myVar = getValueOfMyVar(); or employeeSalary = getSalary(employeeId); The technique for binding at just-in time is identical to binding at instantiation time; the only difference is when the function which determines it is called. If the variable is initialized once when the object is created (e.g. in an object's constructor) then it's initialized at instantiation time; if it's initialized repeatedly (e.g. in an event handler) then it's initialized at just-in time. Just-in time is one of the two key ways of binding to permit dynamic functionality in an application.
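
The five binding times above can be sketched side by side. This is a rough illustration in Python rather than the C#-style snippets used in the list, and every name in it (SALARY_MULTIPLIER, CONFIG_TEXT, Employee, salary_table) is hypothetical. Note one assumption loudly: Python has no compile step that enforces constants, so the "compile time" case is only approximated by a named, upper-case constant.

```python
import json

# Coding time (avoid): a magic number bound directly in the logic.
def salary_with_magic_number(base_salary):
    return base_salary * 3.286

# Named constant: the closest Python equivalent of compile-time binding.
# The upper-case name is a convention; the language does not enforce it.
SALARY_MULTIPLIER = 3.286

def salary_with_constant(base_salary):
    return base_salary * SALARY_MULTIPLIER

# Load time: bound once at startup from external configuration.
# CONFIG_TEXT stands in for the contents of a configuration file.
CONFIG_TEXT = '{"connection_name": "main_db"}'
SETTINGS = json.loads(CONFIG_TEXT)

class Employee:
    def __init__(self, employee_id, get_salary):
        self.employee_id = employee_id
        self._get_salary = get_salary
        # Instantiation time: bound once, when the object is created.
        self.salary_at_creation = get_salary(employee_id)

    def current_salary(self):
        # Just-in time: bound again on every use, by the same function call.
        return self._get_salary(self.employee_id)

# Usage: the same lookup function gives different results depending on
# when it is called.
salary_table = {7: 50000}
employee = Employee(7, salary_table.get)
salary_table[7] = 55000  # the underlying value changes after creation
```

After the last line, employee.salary_at_creation still holds the value from creation, while employee.current_salary() picks up the change. The technique is the same in both cases; only the moment of the call differs.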

Encapsulating variable initialization is the most important form of late binding and a best practice; it's at the heart of the factory pattern, one of the workhorse design patterns in software development. While the factory pattern has a number of formal requirements, the primary purpose is to delay binding time until the last possible moment. Inside the factory function (which actually generates the variable's value), I find enumerations to be especially useful, since factory items are typically highly related and have only a limited number of options – which is practically a definition of when enumerations are a best practice. This is also a great pattern, because code throughout your project calls the factory and thereby gains instantiation or just-in time flexibility (and information hiding). Yet inside the factory, you gain the advantage of throwing compile time errors, such as if you misspell an enumeration. You literally get the best of both worlds.
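
As a rough sketch of that idea (not a full, formal factory pattern), here is a small factory function driven by an enumeration. It is written in Python, and ReportFormat and make_formatter are hypothetical names invented for the example. One caveat to state plainly: Python reports a misspelled enumeration member with an AttributeError at run time, not at compile time; the compile time check described above applies to compiled languages, or to Python code run through a static type checker.

```python
from enum import Enum

class ReportFormat(Enum):
    # A small, closed set of related options: a natural fit for an enumeration.
    PLAIN = 1
    CSV = 2

def make_formatter(report_format):
    """Factory: callers ask for a formatter late; the choice is hidden in here."""
    if report_format is ReportFormat.PLAIN:
        return lambda fields: " ".join(fields)
    if report_format is ReportFormat.CSV:
        return lambda fields: ",".join(fields)
    # Fail loudly if a new enumeration member is added but not handled here.
    raise ValueError(f"Unsupported format: {report_format!r}")

# Usage: the calling code only knows about the factory and the enumeration.
formatter = make_formatter(ReportFormat.CSV)
line = formatter(["id", "name", "salary"])
```

Code elsewhere in the project never sees how a formatter is built, so adding a third format means touching the enumeration and the factory, and nothing else.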

Programming Syntax

One of the key things to focus on during construction is paying attention to language syntax; this is usually not much fun, but it's nonetheless critical. Watch your statement terminators, off by one errors, use of whitespace, the organization of your braces / brackets / parentheses, and beware the perils and pitfalls of working with specific types of data...

Strings

  • Magic strings are frequently (but not always) more semantic than magic numbers, but when used repeatedly cause the same maintenance problems. If you are defining something that is truly one of a kind, it may be efficient to use a string literal, but if it's strewn throughout an application, you will hate your application. Store the literal in a variable and then reference the variable in all the necessary locations; you'll be much happier.
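
A minimal sketch of storing the literal once, in Python; the constant and function names are hypothetical.

```python
# The literal appears exactly once; everything else references the name.
STATUS_SHIPPED = "shipped"

def mark_shipped(order):
    """Set the order's status using the named constant."""
    order["status"] = STATUS_SHIPPED
    return order

def is_shipped(order):
    """Compare against the same constant, so the spelling cannot drift."""
    return order.get("status") == STATUS_SHIPPED
```

If the label ever has to change, it changes in one place.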

Booleans

  • Booleans have the maddening habit of turning out to be inappropriate because you actually have more than two states. A boolean is either or, true or false, one or the other. On the one hand, the binary nature of early computing fits very well with boolean logic (remember we are not that far removed from the early programmers who wrote in binary or assembler; my father has written assembler code). However, applying that logic to real world problems usually fails. Most cases usually have three (or more) states; that's one reason why the real world is messy. Enumerated numbers (usually an integer, but not always) provide multiple defined states and have better semantics; they've functionally replaced booleans in my code. Whenever possible, replace boolean variables with an enumerated number – hint: it is always possible.
  • If you do use a boolean variable, make sure its existence is unknown to the outside world. Never pass it as a parameter or expose it to change (even indirectly) from the outside. Booleans are not that big a problem if you make sure that no outside function knows of their existence. That makes it simple to change later, by switching to enumerated numbers. However, as soon as you pass the boolean as a parameter or allow a property or function to reset the boolean value, it is known to the outside world. If you have to change it, you have to walk your entire code base looking for references when you discover that you actually have a third and fourth state. Ouch! That's why I just use enumerated numbers in the first place.
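
To make the "third state" problem concrete, here is a small Python sketch. AccountStatus and the two helper functions are hypothetical names used only for illustration. An is_active boolean could only say active or not active, while the enumeration leaves room for the state that shows up later.

```python
from enum import Enum

class AccountStatus(Enum):
    ACTIVE = 1
    SUSPENDED = 2
    CLOSED = 3  # the third state that a single boolean could not express

def can_log_in(status):
    """Only active accounts may log in."""
    return status is AccountStatus.ACTIVE

def can_be_reopened(status):
    """Suspended accounts can come back; closed ones cannot."""
    return status is AccountStatus.SUSPENDED
```

Adding a fourth state later means adding one member and reviewing the functions that take an AccountStatus, rather than hunting down every true / false in the code base.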

All Numbers

  • Always be careful of division; if the bottom of the fraction goes to zero, everything goes to hell. You must always break out an if condition to test whether the bottom goes to zero. Set the bottom of the equation to 0, and solve for the variable you're going to test. Then define what should happen in that case.
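
A short Python sketch of that procedure, using a hypothetical average_speed function. The bottom of the fraction is end_time - start_time; setting it to 0 and solving gives end_time == start_time, so that is the condition to test. Returning None in that case is an assumed policy for the example, not a recommendation for every program.

```python
def average_speed(distance, start_time, end_time):
    """Return distance / elapsed time, or None when no time has elapsed."""
    # The denominator is zero exactly when end_time == start_time.
    if end_time == start_time:
        return None  # assumed policy: "no speed available" for zero elapsed time
    return distance / (end_time - start_time)
```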

Approximate Numerical Data Types (ANDTs)

  • Watch your accuracy with ANDTs: floats, singles, doubles, etc. (Decimal types store base-10 fractions exactly, but they still have finite precision and will round results such as 1/3, so similar care applies.) ANDTs all have serious issues with rounding errors. This does not make them unusable; it only means that you are taking on a major testing burden to ensure they work properly.
  • All ANDTs are subject to rounding errors. The problem gets worse as the values in an equation differ more in magnitude. If all values are in the same general ballpark, e.g. 1000369.72 + 430289.9 = 1430659.62, you're probably OK. However, 1000369.72 + 0.0000056009 = 1000369.7200056009 will almost certainly have rounding errors, and 200060003290000.54 - 200060003290000.0001 = 0.5399 will almost certainly not produce the results you expect either.
  • It's frequently not possible to test ANDTs for exact equality because of the rounding errors, so use delta values to compare them. Build a utility function that subtracts two numbers and checks whether the absolute value of the result is less than the delta value in order to determine equality. By tuning the delta value, it's usually possible to produce a reliable means of comparing ANDTs even under very difficult circumstances.
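
Here is a minimal Python sketch of such a utility function. The name nearly_equal and the default delta of 1e-9 are assumptions chosen for the example; the right delta depends on the magnitudes in your own data. Python's standard library also offers math.isclose, which compares with a relative tolerance.

```python
import math

def nearly_equal(a, b, delta=1e-9):
    """Treat a and b as equal when they differ by less than delta."""
    return abs(a - b) < delta

# A classic small case: binary floating point cannot store 0.1 or 0.2 exactly.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)           # False
close_enough = nearly_equal(total, 0.3)  # True

# The large-magnitude subtraction from the text: at this size a double cannot
# hold the fractional digits, so the result is not 0.5399.
large_difference = 200060003290000.54 - 200060003290000.0001

# The standard-library alternative, using a relative tolerance.
builtin_check = math.isclose(total, 0.3, rel_tol=1e-9)
```

A fixed delta works when all values are of similar size. When magnitudes vary widely, a relative tolerance like the one math.isclose uses is usually easier to tune.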

Integers

  • Integers hold values more consistently than ANDTs, and are fundamentally simpler to work with. That makes them your first choice when you need to use numbers in your program. However, if you need ANDTs, they're available.
  • For many operations, integers are much easier to work with. However, when dividing integers there's one extra thing to watch out for. Integer division usually does not round to the nearest integer; it discards the fractional part (most C-family languages truncate toward zero) so that the result is still an integer. Therefore, rounding errors can be much more of a problem. Factor this in and ensure the rounding errors won't be a problem, consider using an ANDT instead, or create a fractional data type that preserves the integers on top and bottom and work with that instead (this is probably the best solution, but it requires the most programming).
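
A brief Python sketch of both the pitfall and the fractional data type. Python happens to ship such a type in its standard library (fractions.Fraction), so no custom class is needed here; in other languages you may have to write your own. Note that Python's // operator rounds toward negative infinity, which differs from the truncation toward zero used by C-family languages for negative operands.

```python
from fractions import Fraction

# Integer division discards the fractional part rather than rounding.
positive_quotient = 7 // 2    # 3, not 4
negative_quotient = -7 // 2   # -4 in Python (floor); C-family languages give -3

# Repeated integer division compounds the loss: three "thirds" of 10.
lossy_total = (10 // 3) * 3   # 9, one unit has gone missing

# A fraction type keeps the integers on top and bottom, so nothing is lost.
third_of_ten = Fraction(10, 3)
exact_total = third_of_ten * 3  # exactly 10
```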

As you can see, most "real" construction issues are pretty much mechanical: "Watch out for division", "Avoid repetition of numerical and string literals", "Don't bind your variables too soon", etc. However, this helps to elucidate the primary value of the FSDLC. By isolating the other facets of software development, you are free to concentrate on these concrete details here. The process for accomplishing this is very similar to the iterative development methodologies which govern project management: break everything down into the smallest testable pieces that you can. This approach, known as incremental development, is a highly effective way to approach construction.