Higher Level Languages
One notch up from assembly language are what might be called higher level languages, such as BASIC, C, C++, and Java. Programs are composed of statements, or lines of code, each of which may translate into many machine instructions. Translator programs at this level are called “compilers”, because they must compile each statement into many machine instructions.
These languages are a great labor saver for programmers. It is much easier and more readable to write a statement like:
X = Y + 3 * Z
rather than a stack of assembly code like this:
LOAD R1, Y (Loads Y into register 1)
LOAD R2, Z (Loads Z into register 2)
MULT R2, 3 (Multiplies register 2 by 3)
ADD R1, R2 (Adds register 2 to register 1)
STOR R1, X (Stores register 1 into X)
I don’t even want to think about what the equivalent binary bit strings of machine code would look like.
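To make the five-line listing a little more concrete, here is a minimal sketch in Python that steps through it the way a very simple machine might. The `run` function, the tuple layout, and the exact meaning of each instruction are my assumptions, inferred from the comments in the listing; no real processor is being modeled.

```python
# Toy simulator for the five-instruction listing above.
# Instruction meanings are assumptions taken from the listing's comments.

def run(program, memory):
    """Execute (opcode, register, operand) tuples against a memory dict."""
    registers = {}
    for opcode, reg, operand in program:
        if opcode == "LOAD":      # copy a memory variable into a register
            registers[reg] = memory[operand]
        elif opcode == "MULT":    # multiply a register by a constant
            registers[reg] *= operand
        elif opcode == "ADD":     # add a second register into the first
            registers[reg] += registers[operand]
        elif opcode == "STOR":    # copy a register back into memory
            memory[operand] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory

# The same five steps as the assembly listing for X = Y + 3 * Z.
PROGRAM = [
    ("LOAD", "R1", "Y"),
    ("LOAD", "R2", "Z"),
    ("MULT", "R2", 3),
    ("ADD", "R1", "R2"),
    ("STOR", "R1", "X"),
]

memory = run(PROGRAM, {"Y": 4, "Z": 5})
print(memory["X"])  # prints 19, the same as 4 + 3 * 5
```

The one-line statement and the five-instruction version give the same answer; the difference is how much bookkeeping the programmer has to do by hand.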
This automatic compilation of a fairly readable programming language into almost totally unreadable binary machine code is both the strength and weakness of the compiler. Most of the programs being developed today would be totally impossible without the high level languages. Assembly code is simply not up to that level of complexity, and writing such a program in it would take 10 to 20 times longer. And nobody in their right mind writes real programs in machine code (binary). However, being able to read machine code is sometimes a help in debugging.
The downside of a compiler is that you (the programmer) and your program are now separated from the instructions that the computer is actually executing. You assume/hope that the machine code is an accurate representation of your program, and usually this is a fairly good assumption. However, the compiler may change the order of operations or substitute “equivalent” operations if it thinks this may improve things. This is especially true of optimizing compilers, which can move large chunks of code around to try to make the program run faster or use less memory. When the compiler does a good job, it creates code that runs very fast and doesn't use a lot of memory. If the compiler makes a bad guess, it can create bugs that are the very dickens to find and fix.
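Here is a small illustration of why changing the order of operations is not always harmless. Python does not reorder arithmetic on its own, so in this sketch I do the regrouping by hand to mimic what an optimizer might do; the three values are picked only to make the rounding visible. Compilers are generally careful with floating-point reordering unless told otherwise, so take this as a demonstration of the risk, not of what any particular compiler does.

```python
# Values chosen so that floating-point rounding shows up; illustrative only.
a, b, c = 1e16, -1e16, 1.0

as_written = (a + b) + c   # a and b cancel first, then c is added
reordered = a + (b + c)    # c is rounded away when added to the much larger b

print(as_written)               # 1.0
print(reordered)                # 0.0
print(as_written == reordered)  # False
```

On paper the two groupings are the same sum. In floating-point arithmetic they are not, which is why “equivalent” deserves its quotation marks.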
The output from a compiler can be any of a number of things, depending on how you set the options:
- A file of machine instructions, binary, ready to run, like an exe file.
- A file of assembly code, which may be “tweaked” by hand and then passed to an assembler. Many FORTRAN, C, and C++ compilers will generate either of these.
- A file of intermediate level code that is always the same, no matter what type of computer it is compiled on. Then you need another program, called a “run time environment” (RTE), that is written just for that type of machine. The RTE acts like a virtual computer to execute the intermediate code. This is how the internet programming language, Java, works. The theory is that, once you compile a Java program on any machine (PC, Mac, Sun, etc.), it will run on any other computer, as long as the person who runs it has the RTE for that machine. It never works quite that well in actual practice, of course, but that is the theory. If you're interested in more on Java and internet-related languages, we have it.
- Then there are the “interpreters”, which don’t produce any output file at all. There is an interpreter program (kind of like the RTE) which scans your program and performs the operations immediately. The downside is that each statement is scanned each time it is executed. If a statement is inside a loop and executed 1000 times, it is scanned 1000 times. Many of the cheaper/older flavors of BASIC are like this. They are R – e – a – l – l – y S – l – o – o – o – o – w.
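The difference between scanning a statement every time and translating it once can be sketched in a few lines of Python. Everything here is a toy of my own invention: `scan`, `to_postfix`, and `execute` are hypothetical helpers, the “intermediate code” is just a list of postfix tokens, and only `+` and `*` are supported. Real interpreters and run time environments are far more elaborate, and many modern interpreters avoid the repeated scan altogether.

```python
import re

def scan(expr):
    """Scan the text of an expression such as 'Y + 3 * Z' into tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*]", expr)

def to_postfix(tokens):
    """Translate tokens into stack-machine code, with * binding tighter than +."""
    precedence = {"+": 1, "*": 2}
    output, operators = [], []
    for token in tokens:
        if token in precedence:
            while operators and precedence[operators[-1]] >= precedence[token]:
                output.append(operators.pop())
            operators.append(token)
        else:
            output.append(token)
    while operators:
        output.append(operators.pop())
    return output

def execute(code, variables):
    """Play the role of the RTE: run the intermediate code on a small stack."""
    stack = []
    for item in code:
        if item in ("+", "*"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if item == "+" else left * right)
        elif item.isdigit():
            stack.append(int(item))
        else:
            stack.append(variables[item])
    return stack.pop()

def run_interpreted(expr, variables, times):
    """Re-scan and re-translate the statement on every pass through the loop."""
    result, scans = None, 0
    for _ in range(times):
        code = to_postfix(scan(expr))
        scans += 1
        result = execute(code, variables)
    return result, scans

def run_compiled(expr, variables, times):
    """Scan and translate once, then only execute inside the loop."""
    code = to_postfix(scan(expr))
    result, scans = None, 1
    for _ in range(times):
        result = execute(code, variables)
    return result, scans

values = {"Y": 4, "Z": 5}
print(run_interpreted("Y + 3 * Z", values, 1000))  # (19, 1000)
print(run_compiled("Y + 3 * Z", values, 1000))     # (19, 1)
```

Both versions compute the same result. The interpreted version simply pays the scanning cost 1000 times instead of once, which is where the slowness comes from.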