Main Points

  • Managing Complexity: The most important goal in software development is to reduce complexity.
  • Abstraction: The most important concept in programming and the heart of OOP.
  • Other Design Tips: OOP allows a number of other best practices which affect the design process.
  • Prioritizing Construction: Taking your design and developing a plan for how to construct the project.
History is Now

Once your general architecture is laid out, it is time to start filling in those details; it's time to begin the formal design. This covers two main bases. One, the web designers (i.e. the people who create the visual "graphic design" of the website) need to finish the static design for the website / web application. Two, the static code from the web designers and the user experience design from the UX designers become the start of the software design. Some of the bigger user experience / user interface (UX / UI) pieces have been under construction since the Requirements phase; but now is the time to finish those up and nail down as many User Acceptance Testing (UAT) checkpoints as possible before we begin writing code.

In the 1970s software design was a huge endeavor. Programming teams would work out the entire application in advance (as part of the Waterfall method); any deviation from the initial design was viewed as failure and blame was laid on the designers and / or the developers. In a backlash against this, many agile methodologies were developed with little to no design at all. Agile methodologies have gained such wide traction because the process of building the application is a tremendous education in how to solve the problem, and the evolution of design on agile projects is a major advantage over the old Waterfall method.

However, a hybrid model is undoubtedly superior even to the new agile methodologies. By preserving the design phase you can reap numerous benefits, especially if you take advantage of the elegant Pseudocode Programming Process (PPP) advocated by Steve McConnell. Yet I whole-heartedly embrace the agile idea that you can and should modify your design based on the lessons you learn throughout construction. By treating the design generated during this phase as a good but provisional design, it's possible to meld the value of a significant design phase with the flexibility and improvements which agile methodologies provide.

Managing Complexity

There are so many things you need to do in programming that it can become overwhelming. This is one of the reasons I admire Steve McConnell so much: he's not only a great developer, he also has an amazing ability to understand and communicate what's really important in software development. Before reading his work, I'd been heavily influenced by agile methodologies such as Scrum, and I still find their minimalist design and emphasis on coding and experimentation most natural. However, I can attest to the sense of "hacking" that McConnell talks about in relation to agile projects.

The design phase encapsulates important software development steps that are difficult to cope with while you're writing code; that's why it was given its own phase in the Waterfall method and the Full Software Development Life Cycle (FSDLC). Software is entirely logical and completely unforgiving of mistakes. Writing code demands attention to details and language syntax, which is a different kind of mental work from the higher level thinking that design requires. The two are not conveniently combined, and skimpy design is why agile methodologies can frequently under-perform. Retaining a significant design phase lets you catch design mistakes before they ever reach your code. Then you can code during the construction phase without being distracted by design issues.

This is critical, because the sooner you catch a bad idea and correct it, the less work you have to do. A bad idea that exists only in the design is relatively easy to fix. Once you write a bad piece of code, you have to throw it out along with any code that depends on it; that wastes code and time which prudent design could have saved.

Good design can be very hard work and it requires intense focus; it also requires a very different skillset from good coding. It's easy to find fault with design (especially with such a good whipping boy as the Waterfall method), but agile methodologies do not promote the same kind of professionalism and discipline, and that discipline helps solve the major design questions at the center of the primary technical imperative of software development: "Manage Complexity" (McConnell Code Complete 2, 80). In general, this means reducing complexity, but we're talking about the total complexity of the system. That means we may increase the complexity of one part of the system (for example, by using inheritance) in order to simplify other parts and / or achieve mission critical tasks (like security). OOP was developed specifically for this purpose by programmers who wanted to simplify their work. They wanted their programs to be simpler and easier to construct, and they wanted the work the program performed to be easier to comprehend and design.

OOP Historically

Object Oriented Programming (OOP) has been described as standing on three pillars: Inheritance, Encapsulation, and Polymorphism. I disagree with this for a couple of important reasons. First, take polymorphism. It's unquestionably important, but it's not a technique; it's a programming goal. That's like saying my three favorite fruits are 1. oranges, 2. apples, and 3. because they're healthy for you. Second, inheritance is highly problematic. Reams and reams have been written about inheritance because it's so dangerous, and the reason is simple. Inheritance violates the primary technical imperative: it increases the complexity of the program, especially in deeply nested inheritance trees. Inheritance is a supremely elegant technique from a design perspective; however, it is not the design that we are trying to simplify. From a construction standpoint, inheritance hides the program code throughout the inheritance tree. This means you have to remember lots of "invisible" code (the opposite of self-documenting code), and you don't even know where the code is. Inheritance is difficult to use, increases the complexity of the program, makes maintenance a nightmare, and helps errors hide throughout your code. So with two out of three pillars knocked out, what's left?

Encapsulation: The Primary Technique of OOP

Actually, this is beautiful because it allows you to focus on the single most important technique in OOP, maybe in all of programming. The simple fact is that when you get confused or overwhelmed, you need to break things down into smaller, simpler pieces that you can cope with. Ta da! Encapsulation. Encapsulation is the technique of taking a small section out of a piece of code, and placing it in its own function. Your code now simply calls that function to do the work required, but instead of having one large complicated piece of code, you have two smaller and more understandable pieces of code.
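
As a minimal sketch of that extraction, here is a before-and-after in Python. The checkout example and every name in it are hypothetical, chosen only for illustration: a validation step is pulled out of one larger function and placed in its own.

```python
# Before: one function both validates the prices and computes the total.
def checkout_before(prices, tax_rate):
    for price in prices:
        if price < 0:
            raise ValueError("price cannot be negative")
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)


# After: the validation step lives in its own small function...
def validate_prices(prices):
    """Raise ValueError if any price is negative."""
    for price in prices:
        if price < 0:
            raise ValueError("price cannot be negative")


# ...and the original function simply calls it.
def checkout_after(prices, tax_rate):
    validate_prices(prices)
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)
```

Both versions behave the same way; the difference is that each of the two smaller pieces can now be read and understood on its own.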

Encapsulation's ability to reduce program complexity is unbeatable, especially since it has no downside. Inheritance is the most powerful tool in any programming language, but the complexity it creates usually makes it too problematic to use. Under certain circumstances, inheritance is the lesser of two evils, but that's hardly a ringing endorsement. In contrast, there is no penalty to encapsulation, and the inherent simplification that comes from breaking one big piece of code into two smaller ones is always a major benefit, whatever your reason for applying it. And there are a number of other reasons to encapsulate functions.

One of the most important is code reuse. When you're programming, there is code that you need to use in many places around the program. If you write the same code in all those places, then when you need to change something, you need to make the exact same changes everywhere. This is a nightmare, and you never really know how many different places you need to change, or where they are. Therefore, never copy and paste or rewrite code. Instead, you should encapsulate the code as a function. Then all of the locations in your program simply call the encapsulated function. Now, if you need to make a change, you just update the code in one place, inside the function, and you're done.

Another important reason to encapsulate is to isolate parts of the code that are likely to change. Future-proofing an application can be a waste of time; however, if you already know that particular pieces of application logic will change, it would be foolish not to encapsulate those sections of code in order to hide them. In fact, this is why the principle is called information hiding, and McConnell reports that large programs which use it were found to be easier to modify by a factor of four. Because that code is hidden away in one place (where the rest of the program can't see it), changes can be made in just one place and the rest of the program never even knows that anything has happened; no changes are needed outside of the function. Encapsulation is the technique which makes information hiding work.
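
To make the last two points concrete, here is a small Python sketch. The password policy, the function names, and the return strings are all assumptions made up for this illustration. The rule that is likely to change lives in one function, and two different callers reuse it without knowing how it works.

```python
MIN_PASSWORD_LENGTH = 8  # hypothetical policy value, expected to change


def is_acceptable_password(password):
    """The single home of the password policy; callers see only True/False."""
    return len(password) >= MIN_PASSWORD_LENGTH


def register(username, password):
    # The caller depends on the answer, not on how the answer is computed.
    if not is_acceptable_password(password):
        return f"{username}: password rejected"
    return f"{username}: registered"


def change_password(username, new_password):
    # A second caller reuses the same function instead of copying the rule.
    if not is_acceptable_password(new_password):
        return f"{username}: password rejected"
    return f"{username}: password changed"
```

If the policy later requires a digit or a longer minimum length, only is_acceptable_password() changes; register() and change_password() stay exactly as they are.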

A slightly more technical reason for encapsulating code, but a very important one, is to code more accurately to your abstractions. If you're not familiar with OOP, this probably won't mean anything to you, but it's such an important concept that we need to examine it in detail.

Abstraction

OK, so we have this fabulous tool, encapsulation, but we're still staring at an empty architecture. How do we start filling in the details? One of the truly great achievements in OOP is how it simplifies design (and by extension construction). Historically, programmers attacked design in linear steps: what's the first thing the program has to do? What next? And what after that? You had to program in strict logical order. Sometimes this worked well, but other times it was nightmarish. This was one of the key issues that OOP was developed to resolve. OOP allows you to design (and code) by abstraction. If managing complexity is the most important technical imperative in programming, abstraction is the most important concept in programming because it's the best way to manage complexity.

Abstraction frees you from talking about a particular piece of data or a block of code, and allows you to talk about a particular object, frequently a real-world object. These objects are abstractions, i.e. things that you know and understand much more fundamentally than programming constructs like inheritance, virtual functions, interfaces, etc. OOP is much more understandable if we look at a concrete example. It may be the first step, the last step, or anywhere in between (after all, OOP frees us from a linear order of operations), but one frequent abstraction that developers use in their programs is called "user". This idea in your head will be translated into the machine world when you program the abstraction "user" as a class called "User". Now that abstraction lives in your mind and translates directly into the world of your code: it is part of your software architecture, and it is utility code encapsulated in one location, the User class.

Now you can use that class to create instances of User objects throughout your run-time code; and not only is that code encapsulated neatly, not only do you get all the benefits of information hiding, but it is also packaged in this clear and understandable mental construct. You've tied that code to the abstraction "user" which has real meaning for you.

As the primary OOP tool for realizing abstractions in your code, classes allow you to organize data and code for your program into these mental constructs that you better understand. For example, users in the real world have data like names, email addresses, security roles, etc. Users can also perform real-world actions like logging in to a website (through a login() function) or editing their personal data (through an editUserInfo() function). The class User defines that abstraction and is a convenient and clear way to organize all that data and code in one place. So far, our User class would store the firstName, lastName, emailAddress, and securityRoles data and contain the code for the login() and editUserInfo() functions.
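
A minimal Python sketch of that class might look like the following. The names are adapted to Python conventions (firstName becomes first_name, editUserInfo() becomes edit_user_info()), and the is_logged_in flag and the method bodies are placeholders assumed for illustration, not part of the design described above.

```python
class User:
    """The "user" abstraction: its data and its actions in one place."""

    def __init__(self, first_name, last_name, email_address, security_roles):
        self.first_name = first_name
        self.last_name = last_name
        self.email_address = email_address
        self.security_roles = list(security_roles)
        self.is_logged_in = False  # assumption: not mentioned in the text

    def login(self):
        # Placeholder: a real login would verify credentials first.
        self.is_logged_in = True

    def edit_user_info(self, first_name=None, last_name=None, email_address=None):
        # Update only the fields the caller actually supplied.
        if first_name is not None:
            self.first_name = first_name
        if last_name is not None:
            self.last_name = last_name
        if email_address is not None:
            self.email_address = email_address


# Usage: create an instance and work with it through the abstraction.
ada = User("Ada", "Lovelace", "ada@example.com", ["member"])
ada.login()
ada.edit_user_info(email_address="ada.lovelace@example.com")
```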

However, abstraction not only allows you to design and organize code much more flexibly, it also gives you an amazing ability to mentally manage complexity. Let's say I'm trying to figure out how I'm going to handle login operations for the website. I can dive into the User class, seek out the login() function and deal just with that code. What happens in the rest of the program? I don't know and I don't care, except for the specific ways in which the login function depends on other pieces of code in the program. For example, I probably need to interact with a UI class that defines the XHTML for the login form. Those pieces are important to me, but quite literally nothing else is. I can use the abstraction of a User's login with the UI component; the rest of the program is something I can clear out of my mind. This clarity of mind and focus on just the things that matter is a key to managing complexity more easily in your program; it's the major innovation of OOP.

You can see how reducing the dependency of any piece of code thus makes it easier to program as well. If login() is dependent on three other functions, I have a lot to think about vs. if it is only dependent on one. It's also more difficult to manage the complexity of login() if a function it depends on is lying around in some other class vs. if it's a function (possibly even a private function) that's in the same User class.

Consistent Levels of Abstraction

Abstraction helps you narrow your focus to consider just the issue you're trying to cope with. However, it also helps you in another very important way when you are trying to mentally wrap your head around your program. When we've finished with our login() function, let's say we want to complete an email list to send out a newsletter to all our users (if you do this, be sure that this is an opt-in feature users can choose when they sign up, otherwise you'll have a lot of people reporting you as a spammer).

I'm going to create a sendNewletter() function which will accept an email message that I've composed and send it to each user. This will involve creating a queue of user objects and sending out the email to each one. Note that nowhere in this code am I concerned about details inside the User class. All that stuff which was so important to me just moments ago when I was writing the login() function, I'm now free to put out of my head. The only thing I care about is that the User class exists and my sendNewletter() function is going to send email to a bunch of them. All other details about users are unimportant and so OOP gives me permission to forget all about that for now. This concept is known as levels of abstraction.

Levels of abstraction mean that if one piece of code just grabs an object, I can work at a "high level" of abstraction where I don't need to know any details about it. Then in another piece of code, I can deal with very intricate details of the object, because I'm working at a lower, more detailed level of abstraction. Thus, abstraction works on multiple levels, which gives you the luxury of thinking about your users in many ways and at multiple levels of detail. In one moment, I can work with a User in all its glorious detail to take full advantage of the power of that code. In the next moment, if I don't need those details I can ignore them, saving my attention for more relevant details in the code that I'm working with. Between the concept of abstractions and the ability of abstractions to work on many levels of detail, you have a tremendously powerful way to mentally "turn off" any complexity that is not mission critical for each piece of code at the time that you're working with it. This is the single most powerful tool ever developed in software, because it manages complexity in your mind and improves your ability to understand what you're doing.

Design by Abstraction

If you appreciate the importance of abstraction now, then we're ready to start filling in that empty architecture. The concept of abstraction and OOP's ability to realize mental abstractions as classes give you a powerful combination for conceptualizing application design and then realizing it in your architecture. If you have ever been writing code and had difficulty figuring out what your code was trying to do, you have wanted to design by abstraction. Design by abstraction is the technique of writing out your first function at a very high level of abstraction. This basically describes everything that your program will need to do. In writing out these high level steps, you will identify the high level abstractions for your program, things like login() and sendNewletter(), which are very high level real world tasks. Then you can tackle each of these abstractions and repeat the process.

For example, we need to write out the steps for sendNewletter(). It's going to grab a list of users, getAllUsers(), create the email message, buildEmail(), and then send it to each user, sendEmail(); these functions are lower level tasks and lower level abstractions that we've now identified. These are the next set of abstractions that we have to write. We proceed to progressively lower and more detailed layers of abstraction until we've written the final run time code which actually copes with nitty gritty implementation details.
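
The steps above can be sketched in Python as follows. The function names are snake_case versions of the ones in the text, and the bodies of the three lower level functions are stand-ins assumed for illustration (hard-coded users, an in-memory outbox instead of a mail server). The point is that the high level function reads as a list of steps and delegates every detail.

```python
def get_all_users():
    # Stand-in data; a real version would query a database.
    return [
        {"name": "Ada", "email": "ada@example.com"},
        {"name": "Grace", "email": "grace@example.com"},
    ]


def build_email(message):
    # Stand-in: wrap the composed message in a simple template.
    return f"Newsletter\n\n{message}"


def send_email(address, body, outbox):
    # Stand-in: record the email instead of actually sending it.
    outbox.append((address, body))


def send_newsletter(message, outbox):
    """High level of abstraction: only the steps, none of the details."""
    users = get_all_users()
    email = build_email(message)
    for user in users:
        send_email(user["email"], email, outbox)
```

Each of the three helpers is now the next abstraction to design, one level of detail down.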

The most sensible way to perform this is using pseudocode. Indeed Steve McConnell recommends a very formal Pseudocode Programming Process (PPP) which I like, though I've modified it to my taste. PPP is excellent because not only does it achieve the goal of design by abstraction, it's in plain language (English, Chinese, or whatever you are most comfortable reading). You are basically coding the application without having to write lines of code; you explain each step along the way in enough detail that you can execute the code for each step simply. If you don't know how to write the code to implement a step that you have written, that's where you delegate the job to a lower level function that handles the specifics of that lower level of abstraction.

At its best, pseudocode is completely technology and language agnostic; this is an excellent goal to keep you focused on the issues which the design phase handles best. Pseudocode, too, is ideal because the plain language doesn't get in the way of application design questions the way that code can. If an important step has been missed, it's not so clear when you're looking at a set of variables being initialized and run through a loop. However, it's glaringly obvious when you're trying to explain what's happening in plain language. This is why abstraction, levels of abstraction, and pseudocode are such a powerful combination working through a formal design, and filling in the detailed design inside your architecture.
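
As a rough illustration of how plain-language steps carry over into construction, here is a hypothetical login check in Python. The plain-language steps were written first and survive as the comments; the code under each one was filled in afterwards. The function names, the dictionary of stored hashes, and the use of SHA-256 are assumptions for this sketch only (a real system would use a salted, deliberately slow password hash).

```python
import hashlib


def hash_password(password):
    # Stand-in hash for the sketch; not suitable for real password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def login(email_address, password, stored_hashes):
    # Look up the stored password hash for this email address
    stored_hash = stored_hashes.get(email_address)

    # If there is no such user, refuse the login
    if stored_hash is None:
        return False

    # Hash the password that was typed into the login form
    hashed_from_form = hash_password(password)

    # Allow the login only if the two hashes match
    return hashed_from_form == stored_hash
```

Read top to bottom, the comments alone still describe the whole routine in plain language, which is what makes a missing step easy to spot.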

Other Design Tips

A good design process allows you to "write" a complete application before even beginning the construction phase. You can see where custom objects are used, and determine all the data that needs to be in them. You can examine if there are any tasks you've missed which could be encapsulated into helper functions to provide good code reuse. You can look over your classes and determine if your abstractions are good and the data and functions all make sense in relation to that abstraction. You would like the design to be as real world as possible to front-load those nasty "Ooops" moments; better to lose some design work than a lot of code. You always have a certain amount of redesign that occurs throughout the project, but anytime you have to rework the design, you'd really rather do it during... design. This prevents you from throwing out good code.

The most powerful pieces of OOP we have already looked at in depth. Managing complexity is the number one goal of OOP. Abstraction is the number one concept of OOP. And encapsulation is the number one technique in OOP. Yet, there are a number of other good OOP best practices which are relevant when you're designing your program too.

  • Realize Your Abstractions Through Semantics: Abstractions are key mental constructs that reduce the complexity of your software. However, once they are lost from your mind (which happens every time you change levels of abstraction) you have to rediscover them again unless they're documented. Nor do you want to switch out of your code editor to look up development documents; you want the abstractions documented in code. This is known as self-documenting code and it's a best practice. So how do you achieve this? Semantics. It was not that long ago that programmers created variables named x, y, z, i, and j, and functions named asxvbNumLimit() and HandleFlag(). What the hell are those things? What do those functions do? Semantics means communicating what a variable is and what a function does through its name. If you have a small program with a single class to manage the UI, you call the class UiManager. If it contains a function to create a login form, you name the function createLoginForm(). If it contains two variables, one for a password you've hashed from the login form and the other a hashed password that was stored in the database, you name the variables hashedPasswordFromLoginForm and hashedPasswordFromDatabase. Now when you're reading your code you can clearly identify the abstractions in an instant; the semantics of the names in your code preserve those abstractions and create self-documenting code.
  • Program to Interfaces Not Implementations: When you write your run-time code, you are dealing with concrete classes and telling them exactly what to do. This is how real work gets done. However, this is not very flexible. That's why good programs maximize code reuse through helper classes and functions, generics, etc. These architectural elements are a much more powerful way to design your program, and helper functions and utility classes should be as flexible and widely applicable as possible so you can reuse them over and over again. There is always a tightrope between writing code that is specific and powerful and gets a lot done vs. code that is generalized and flexible and can be used just about anywhere. Therefore, the secret is to write code that uses "just enough" power to accomplish the function's task and no more (anything more would make the code unnecessarily inflexible). Enter interfaces. Interfaces are the perfect way to walk this tightrope: an interface specifies the exact function names and signatures necessary to support a certain task. You can then write any number of concrete classes that implement that interface (which means each has all the public functions and data defined by the interface). You then write the utility functions based on the interface and the functions that it defines. Now you can pass any concrete class which implements that interface into the utility function. Thus you get exactly the power you require while maintaining maximum flexibility otherwise.
  • Favor Composition Over Inheritance: We've looked at some of the drawbacks of inheritance; whenever possible you want to avoid it. So what do you use? Composition. Composition defines a relationship between two classes in which one class declares an instance of the other. This creates a dependence; this isn't bad, it's powerful. It just limits your flexibility, so don't depend on anything you don't have to. Composition lets you pull off some neat tricks. For example, I've written factory functions to select a class for composition and then used the factory in run time code; this lets you easily switch which class your current function composes itself with. Thus composition allows you to choose your functionality in one place and then choose different functionality in another, and if necessary, change both later. In contrast, at the moment you write the code which inherits from a class, you are forever after stuck with it. More importantly, composed classes keep all working code visible, which is clear and self-documenting. Inheritance hides important steps away in parent classes. Deeply nested inheritance trees can do amazing amounts of work, but at the expense of making errors very difficult to trace.
  • Practice Information Hiding: Give functions and classes exactly the information that they need, but never tell them how you get it. This is a huge benefit of encapsulation. It's easy to write a line of code that depends on the lines around it. This creates highly dependent code. When you need to make a small change to such code, the consequences start to cascade through all of the rest of the code, and small fixes become big complicated fixes. That's bad. By encapsulating a piece of code in a function, you not only get code reuse, you hide away the implementation of how that code happens. You no longer have to worry or care how the function does what it does; those details are unimportant and you can simply operate on the result. What happens when you make a change to the code that's encapsulated? Nothing! The code which calls the function never knew what happened in the first place; it doesn't even know anything has changed! It just keeps asking the function for the information that it needs and uses the result. This is a very effective way to gain the power of another function but limit your dependence on it. This is also why information hiding provides such profound and demonstrated cost savings under maintenance, because it so powerfully simplifies program modifications.
  • Higher Level Abstractions are Masters to Lower Level Abstractions: High level abstractions are masters. They do very little actual work, if any, but they control a lot of lower level work and call the appropriate functions when that work needs to be done. Low level abstractions are where the rubber meets the road and things get done, but they have virtually no control over when or how they get called. They simply get called by a higher level abstraction and then do what they're supposed to. While most abstractions live in between these two extremes, every action still follows this distinction religiously based on its relative level of abstraction. Code should never need to know where it can be called from. This gives the higher level abstractions total power over our code. However, when that same code deals with lower level abstractions, it acts only as a master: it calls them and controls when they do what they do. Designs that obey this rule are much simpler to work with.
  • Strive for Unity: A function should do only one thing. A class should have only one clearly defined area of responsibility, i.e. it should conform to one well-defined abstraction. Any changes that require you to modify your class should not weaken the abstraction it models. Any function you change should still only do one thing. Otherwise you need to redesign, not just rewrite your code. You may need to expand the scope of a class to cope with the extended functionality you're building, but that's it. If you're adding a whole new set of responsibilities to an application it's time to create a new class. If you find a function that does not match the abstraction presented by a class, it's time to move it to a more appropriate place. And if you find a function that does two or more things, it's time to split it up so that one function takes care of each task.
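
Two of the tips above, programming to interfaces and favoring composition, can be shown together in one short Python sketch. Everything here is hypothetical and invented for illustration: the MessageSender interface, the two sender classes, the make_sender() factory, and the Notifier class. Python's typing.Protocol stands in for what other languages call an interface.

```python
from typing import Protocol


class MessageSender(Protocol):
    """The interface: anything with this send() signature qualifies."""

    def send(self, address: str, body: str) -> str: ...


class EmailSender:
    def send(self, address: str, body: str) -> str:
        return f"email to {address}: {body}"


class SmsSender:
    def send(self, address: str, body: str) -> str:
        return f"sms to {address}: {body}"


def make_sender(kind: str) -> MessageSender:
    """Factory: the one place that picks a concrete class."""
    senders = {"email": EmailSender, "sms": SmsSender}
    return senders[kind]()


class Notifier:
    """Composes a sender (holds an instance) instead of inheriting from one."""

    def __init__(self, sender: MessageSender):
        self.sender = sender

    def notify(self, address: str, body: str) -> str:
        # Written against the interface only; any MessageSender will do.
        return self.sender.send(address, body)


# Usage: swap the composed class without touching Notifier.
email_notifier = Notifier(make_sender("email"))
sms_notifier = Notifier(make_sender("sms"))
```

Notifier knows nothing about email or SMS. Switching behavior means changing the argument to the factory, not rewriting or subclassing Notifier.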

This is a quick catalog of programming wisdom created over decades of industry and university experience in programming best practices. Some of it is OOP specific, but many principles were practiced long before OOP even existed, and are good ideas anywhere. These are all ideas that you can check in the design before a single line of code is written which is why design is such an important phase of development. It gives you the ability to assess if you are violating best practices and fix it early in the process; moreover it's easier to identify those problems in pseudocode because they're written in plain language rather than in formal programming syntax.

Prioritizing Construction

In my discussion of project development, I mentioned writing two versions of the project list. That allows you to sort out the best approach to each iteration of development. I conduct a similar process with my project increments prior to construction.

First, I order each function and custom object by risk, based on the difficulty and challenge of executing the item. The riskier it is, the closer to the front of the list it goes. If code is going to fail and take down the project, you want it to do so quickly. The less time expended on a dead project the better. If code demonstrates a flaw in your design that's going to require major restructuring, better to deal with it as soon as possible. The logic is simple: you don't want to write a lot of code that you're just going to have to throw out. So the order of the first list of functions and objects is solely designed with risk management in mind.

The second list you create is based on the first, but modified for testing. This is where you define your construction increments and each checkpoint in the application. A checkpoint is where you hand over what's been built so far for detailed testing, end user testing, and / or QA testing. You don't want checkpoints to be too abundant because each new checkpoint creates an additional testing burden (though too few and you can hit nasty surprises). Therefore, checkpoints need to be balanced between too few and too many. Increments on the other hand are as small and numerous as you can pull off, and these are what define the order and process of your construction phase. I won't go into too much detail here because incremental development is an important methodology and deserves its own discussion. Nevertheless, it is this second list that defines the order in which we're actually going to execute construction.

Now that we have our program planned out and our construction prioritized, we are finally ready to dive into construction. Before looking ahead to that, however, it is worth taking a look at PPP as a design tool in more detail. Steve McConnell presents an excellent explanation of how it works and why it is so useful. I've taken the liberty of summarizing those arguments and adding my own thoughts. Maybe even more importantly, I will demonstrate exactly how I create a pseudocode design, including a full sample, which is something I would have liked to see when I was first studying application design.