Learning Objectives
By the end of this section, you will be able to:
- Define low-level programming languages, including assembly language
- Define middle-level and high-level programming languages, such as C and JavaScript
- Compare and contrast the various programming paradigms
Algorithms are used to solve computational problems and create computational models. A computational model is a system that defines what an algorithm does and how to run it. Examples of such computational models include physical devices that can run software, programming languages, or a design specification of such. A programming language is a linguistic application of an algorithm, which uses computational models focused on defining notations for writing code.
Many computational models have been devised for a host of other applications. There are many different roles and perspectives within the worlds of computer science and software development. The end goal of software development is to create working software that can run on a hardware model, which itself uses a (hardware) realization of an algorithm that enables specific physical computers to execute software programs. A hardware model is designed for the convenience of a machine, not a human software author, so hardware models are poorly suited to writing code. Computer scientists have created programming languages which are designed specifically for programmers to develop practical applications. These languages are usually classified into high-level (Java, Python) and low-level languages (assembly language). A high-level programming language operates at a high level of abstraction, meaning that low-level details such as the management of memory are automated. In contrast, a low-level programming language operates at a low level of abstraction. Languages like C and C++ can perform both high-level and low-level tasks.
Most software is designed, written, and discussed in terms of how a program should work. Such a description is essentially a series of steps that directs how the program must be executed. An example is the MapReduce model, which is used in distributed systems such as the Google search engine to produce search results from large data sets using a complex algorithm. Moving even further away from hardware models, computer scientists have also defined an abstract model, which is a technique that derives simpler high-level conceptual models for a computer while exploring the science of what new algorithms can or cannot do.
Modern computers are equipped with a central processor, referred to as a central processing unit (CPU), which is a computer chip capable of executing programs. A CPU’s hardware model relies on a specific CPU instruction set architecture (ISA) that defines a list of operations that the CPU can execute, such as storing the results of calculations (Figure 4.2). With the advancements of technology, computer engineers have designed computer architectures with increasing sophistication and power. Examples of hardware models include the MOS Technology 6502 architecture used by the Nintendo Entertainment System, the ARM architecture used by mobile phones, and the x86-64 architecture (also known as AMD64) used by modern personal computers. Computer engineers design architectures with hardware specifications, such as execution speed or energy use, in mind. Therefore, hardware models are not suitable for humans to use for communicating algorithms.
A programming model is designed for humans to read and write. A programming model focused on defining a notation for writing code can also be called a programming language. A programming model can be used to implement a software algorithm using a strict set of syntactical rules. A language’s syntax can define keywords such as “if.” The syntax may include a mathematical operator, a fundamental programming operation that combines values, such as “+.” The syntax can define punctuation such as “;”. Essentially, the syntax defines how these elements may be written, and the language’s semantics give the precise meaning of what each element directs a computer to do. The text of a program written in a programming language is called source code. Software engineers have created practically all software by writing source code in various programming languages. Since a programming model cannot directly execute a program, a compiler or interpreter must translate source code from a middle-level or high-level language into something a computer can execute.
As mentioned, abstract models are computational models used to think about algorithms in the abstract, rather than being used to create and run software. The goal of an abstract model is to make it easy for people to devise and convey algorithms. Computer scientists use abstract models to create new algorithms, analyze the efficiency of algorithms, and prove facts about what algorithms can and cannot do. An abstract model is not concerned with the details of computer architectures, which makes it easier to focus on these sorts of deep questions. Examples of abstract models include the Random Access Machine, the Turing machine, and the Lambda calculus. The Random Access Machine (Figure 4.3) is an abstract model of a CPU with an unlimited number of memory cells, each of which can store an arbitrary value. As in a real CPU, a program counter (PC) determines the statement to be executed next. A Random Access Machine can be used to analyze the efficiency of algorithms. A Turing machine (Figure 4.4) is a mathematical model that can implement any algorithm. The Lambda calculus is a theoretical model of computation based on lambda functions. It was defined by Alonzo Church and inspired the functional programming paradigm, which you will learn more about in Programming Language Paradigms.
A function defines how to convert an input into an output, and functional programming is a paradigm in which algorithms are written as mathematical functions. A technique that is common in functional programming is recursion (refer to the following code snippet), in which a function calls itself. The factorial of an integer can be computed using a recursive function.
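import java.util.Scanner;

public class Recursion {
    // Returns val! for val >= 0. The result overflows an int when val > 12.
    public static int factorialRecursion(int val) {
        if (val < 0) {
            throw new IllegalArgumentException("Input must be non-negative.");
        }
        if (val == 0) {
            return 1; // base case: 0! = 1
        }
        return val * factorialRecursion(val - 1); // recursive case
    }

    public static void main(String[] args) {
        Scanner s = new Scanner(System.in);
        System.out.println("Enter an input value: ");
        int val = s.nextInt();
        System.out.println("The factorial of " + val + " is: " + factorialRecursion(val));
    }
}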
An algorithm described in an abstract model cannot be run directly. First, a software developer must implement the algorithm, which means translating the abstract algorithm into source code in a programming language.
It may be surprising that so many different programming models can exist, and that algorithms can be translated from one model to another. However, this translation is always possible in principle; computer science established the Church-Turing Thesis, a widely accepted hypothesis stating that an algorithm can be converted from any reasonable computational model to another. The Church-Turing Thesis provides a lens through which computer scientists can invent many computer architectures and programming languages, all of which can run algorithms.
Computer scientists have created terminology to make sense of the similarities and differences among all these programming languages. For example, any programming language can be low level or high level, or it can fall anywhere on the spectrum.
Low-Level Programming
A programming language’s level of abstraction is the degree to which a computational model, programming language, or piece of software relates to computer hardware. A low-level language has a low level of abstraction, while a high-level language has a high level of abstraction. In a low-level language, the programmer must describe an algorithm in terms that the computer hardware can understand directly. Source code must describe details such as the location of data in memory and the particulars of how the computer calculates arithmetic.
Generally, low-level programs execute faster but are more labor-intensive to create and maintain. In a low-level language, the programmer is forced to think deliberately about how the computer hardware executes, so the finished program usually executes efficiently. However, that deliberate thought takes time and effort. In a high-level language, the programmer is not burdened with thinking about so many details and can finish their work faster while preventing certain types of programming errors from occurring. A compiler automates converting high-level code to low-level code, but that automated process can introduce some inefficiency. In some settings code performance is more important, and in other settings programmer productivity is more important, which is why we have both kinds of languages.
We can think of low-level programming languages in terms of cooking: when you cook a meal from scratch, you control every ingredient and every detail of preparation, so the finished meal has precisely the taste and nutrition that you desire. An alternative is to prepare a meal that uses some prepared ingredients, and when you do that, you lose a lot of control over details, but the process is significantly faster and easier.
There are many examples of low-level programming languages, but the most fundamental language understood by computers is made up of a sequence of binary digits.
Machine Code
The sequence of binary digits (bits) that can be understood and executed directly by a computer is called machine code (Figure 4.5). Machine code is the lowest-level language. It is also known as binary code. It is a program in the native format that can be understood by a CPU, in the form of a long series of 0s and 1s. Machine code, or binary code, is the only computational model that a computer can execute; a program written in any other language must be compiled or interpreted into machine code before the program can run. The CPU of a computer is a computer chip capable of executing machine code programs (Figure 4.6). It is impractical for humans to work with machine code directly because a machine code program is not designed to be human-readable. The patterns of 0s and 1s are designed to be convenient for a CPU to decode, not for humans to manipulate; and such programs are long, typically millions or billions of bits long. Another obstacle is that machine code is hardware dependent. As discussed in 5.3 Machine-Level Information Representation, every processor architecture has its own machine language, so machine language written for one architecture (for example, Intel x86) cannot work on any other architecture (such as ARM). When the very first digital computers were built, and programming languages had yet to be invented, programmers had no choice but to write machine code by hand. However, this is extremely time-consuming and prone to errors, so it is almost never done today.
Assembly Language
The low-level language in which every statement corresponds directly to a machine instruction is called assembly language. Assembly language is a small step above machine code but is still a very low-level language. Assembly language is a textual representation of machine code. Just like a machine code program, an assembly language program is a series of instructions that a CPU will execute. However, rather than writing the instructions in a binary format of 0s and 1s, each instruction has a textual name such as “ADD” or “MOVE.” An assembler is a program that translates assembly language source code into machine code. As shown in Figure 4.7, an assembler translates each textual instruction into the corresponding list of 0 and 1 bits.
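To make the assembler's job more concrete, the following is a minimal sketch of a toy assembler written in Java. The three instruction names and their 4-bit codes are hypothetical, invented only for illustration; they do not correspond to the instruction set of any real CPU, and a real assembler handles far more detail.

```java
import java.util.HashMap;
import java.util.Map;

class ToyAssembler {
    // Hypothetical opcode table: these bit patterns are invented for
    // illustration and do not match any real instruction set.
    private static final Map<String, String> OPCODES = new HashMap<>();
    static {
        OPCODES.put("LOAD", "0001");
        OPCODES.put("ADD", "0010");
        OPCODES.put("STORE", "0011");
    }

    // Translates one line such as "ADD 5" into 8 bits:
    // a 4-bit opcode followed by a 4-bit operand (0 to 15).
    static String assembleLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected an instruction and an operand: " + line);
        }
        String opcode = OPCODES.get(parts[0]);
        if (opcode == null) {
            throw new IllegalArgumentException("Unknown instruction: " + parts[0]);
        }
        int operand = Integer.parseInt(parts[1]);
        if (operand < 0 || operand > 15) {
            throw new IllegalArgumentException("Operand must fit in 4 bits: " + operand);
        }
        // Pad the binary form of the operand with leading zeros to 4 bits.
        String bits = String.format("%4s", Integer.toBinaryString(operand)).replace(' ', '0');
        return opcode + bits;
    }

    public static void main(String[] args) {
        String[] program = {"LOAD 2", "ADD 5", "STORE 9"};
        for (String line : program) {
            System.out.println(line + " -> " + assembleLine(line));
        }
    }
}
```

Running the sketch prints one line of bits per instruction, for example "ADD 5 -> 00100101", which mirrors the one-to-one correspondence between an assembly statement and a machine instruction.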
While it is practically impossible for a human to write a complete program in machine code, writing programs in assembly language is viable. Because assembly language is extremely low-level, these programs tend to run quickly, but they are labor-intensive to write and are machine-dependent. This type of programming was sufficient in the 1960s, 1970s, and 1980s when software was written for one-off capital-intensive machines, such as multimillion-dollar mainframes or space vehicles. Programmer labor was comparatively cheap then, and there was no need to move programs to different hardware. But today, we expect applications to be compatible with multiple kinds of platforms, including phones, computers, and gaming systems. Programmer labor is more expensive than computer hardware, so writing entire programs in assembly language is uneconomical. Consequently, programs are often written predominantly in a high-level language, with assembly language used to write short excerpts on an as-needed basis. Writing code in a higher-level language makes it easier to write correct code that does not have defects.
Link to Learning
Assembly language has been used in high-profile, high-budget projects, such as Apollo 11, the NASA spaceflight that first landed humans on the moon. You can examine the assembly code for the embedded computers in the space vehicles, which has been released publicly. Notice how it is quite low-level, perhaps difficult to follow, and reflects an immense amount of fastidious work.
Middle-Level and High-Level Programming
As the name implies, a middle-level programming language is at a level of abstraction in between low-level and high-level language, and allows for direct hardware access. The C programming language is a middle-level language that has been in wide use since the 1970s. The C++ programming language is a middle-level object-oriented language based on C. In general, the trade-off between low-level and high-level languages is that writing low-level code is laborious and error-prone, but the finished code executes very quickly; high-level code is faster, easier, and safer to write but does not run quite as quickly. Middle-level code is a compromise that executes nearly as fast as low-level code yet has some of the productivity benefits of high-level code. Like low-level languages, middle-level languages allow direct access to computer hardware, making it possible to write hardware-specific programs such as operating systems and device drivers. An operating system is the software that provides a platform for applications and manages hardware components. A device driver is a piece of code responsible for connecting to a hardware component, such as a video card or keyboard. Figure 4.8 summarizes the trade-offs between low-level, middle-level, and high-level programming languages.
Middle-level languages are ideally suited to writing systems software, programs that provide infrastructure and platforms that other programs rely upon. The core part of an operating system that is responsible for managing and interfacing with hardware components is called the kernel. Kernels need direct hardware access, so high-level languages are inadequate. Practically all widespread kernels are written in C and/or C++ (such as those of Windows, macOS, Linux, iOS, Android, Xbox, and PlayStation). Compilers, interpreters, and runtime systems for high-level languages, such as Python, Java, and C#, are often themselves implemented in middle-level languages such as C and C++.
Figure 4.9 summarizes the various types of programming languages and how middle-level languages overlap with low-level and high-level programming languages.
A high-level language is farther from a hardware model, and closer to an abstract model. Source code in a high-level language does not address low-level details, and instead focuses on how an algorithm proceeds, such as “visit every item in a list.” A high-level language is like viewing Earth at a high altitude, revealing large features such as the contours of rivers and highways, whereas a low-level language is like viewing Earth at ground level, which allows for focusing on fine details such as the activity of individual people and animals.
Web applications built with frameworks and runtimes (e.g., React, Node) are written in high-level languages, principally JavaScript. A web framework is a tool for building and managing web applications. Common technologies used in web clients are HTML5, CSS, and JavaScript. Native Android apps are primarily written in the high-level language Java, and iOS apps are primarily written in the high-level language Swift.
Programming Language Paradigms
So far, we have categorized programming languages according to their level of abstraction into low-level, middle-level, and high-level languages. A different approach is to categorize languages into paradigms. A programming language paradigm is a philosophy and approach for organizing code, the ideas in a program, and the layout of its source code. Real-world programs involve many thousands of lines of source code, which is too much for a human to digest without some kind of organizational structure. Computer scientists have developed several different paradigms for creating this structure.
Unlike level of abstraction, paradigms do not fall on a spectrum. Instead, a particular programming language either adheres to the philosophy of a paradigm, or it does not. For example, C is a structured procedural language and not an object-oriented language. Without getting into too many details, C is procedural because it allows programmers to place code in functions that can be called from various places in a program. However, C is not object-oriented because, unlike Java, it does not allow the creation of objects that are instances of classes. We will broadly explore these different types of paradigms later in this section, and Chapter 7 High-Level Programming Languages elaborates on these and other paradigms in more detail.
The Imperative Programming Paradigm
In imperative programming, the programmer writes a series of steps that must be followed in order. Source code spells out a precise series of operations that the computer must execute in order. Since the computer is told to take specific actions and execute these statements, the language is referred to as “imperative.” An imperative is an order or command. Low-level languages are imperative languages, and middle-level languages, such as C, are imperative while also supporting other paradigms, such as structured and procedural programming. While low-level languages can mimic the style of a structured language, these properties are not inherent in the language itself and must be imposed by the programmer as a practice. Assembly code can easily be written in a non-structured way.
Declarative and Functional Programming
Another type of programming, declarative programming, is a paradigm in which code dictates a desired outcome without specifying how that outcome is achieved. Declarative languages are an alternative to imperative languages. In a declarative language, the programmer declares the desired outcome, and it is the compiler’s job to create a series of imperative steps that obtains that outcome. For example, the Structured Query Language (SQL) used to query database systems makes it possible to specify what data should be retrieved from a database, without specifying how the database system should retrieve that data.
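The difference between the two styles can be sketched in Java, with the caveat that Java is not a declarative language; its stream library only approximates the declarative style. In this illustrative example (the class and method names are hypothetical), both methods select the even numbers from a list. The first spells out every step, while the second states the condition the results must satisfy and leaves the iteration to the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class EvenNumbers {
    // Imperative style: spell out each step the computer must take.
    static List<Integer> evensImperative(List<Integer> numbers) {
        List<Integer> result = new ArrayList<>();
        for (int n : numbers) {
            if (n % 2 == 0) {
                result.add(n);
            }
        }
        return result;
    }

    // Declarative style: state the condition that results must satisfy
    // and let the stream library decide how to visit the items.
    static List<Integer> evensDeclarative(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(evensImperative(numbers));  // prints [2, 4, 6]
        System.out.println(evensDeclarative(numbers)); // prints [2, 4, 6]
    }
}
```

The declarative version is closer in spirit to an SQL query: it says which items are wanted, not how to loop over them.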
Functional languages are another alternative to imperative languages. Recall that a function is a mathematical object that defines how to convert an input into an output. For example, a function such as f(x) = x + 3 converts the input x = 4 into the output 7. Functions can be defined in most programming languages and correspond to small sections of code that perform a specific task such as a calculation.
Functional programming is a programming paradigm in which algorithms are written as mathematical functions (Figure 4.10). In functional programming, practically every part of the program is written as a function. The programmer writes functions that convert inputs to outputs, and it is the compiler’s job to create imperative steps to evaluate the functions.
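As a small illustration of treating functions as values, the following sketch uses Java's standard Function interface. Java is not a purely functional language, and the two functions here (one that adds 3 and one that doubles) are hypothetical examples chosen for simplicity. The program is built by combining functions rather than by listing steps.

```java
import java.util.function.Function;

class FunctionComposition {
    // A function that converts an input into an output by adding 3.
    static final Function<Integer, Integer> ADD_THREE = x -> x + 3;

    // A function that converts an input into an output by doubling it.
    static final Function<Integer, Integer> DOUBLE = x -> x * 2;

    // A new function built by composing the two: first add 3, then double.
    static final Function<Integer, Integer> ADD_THREE_THEN_DOUBLE = ADD_THREE.andThen(DOUBLE);

    public static void main(String[] args) {
        System.out.println(ADD_THREE.apply(4));             // prints 7
        System.out.println(ADD_THREE_THEN_DOUBLE.apply(4)); // prints 14
    }
}
```

Neither function changes any stored data; each simply maps an input to an output, which is the property that functional programming emphasizes.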
Declarative and functional languages are considered high-level languages because the compiler is creating these steps on behalf of the programmer. Functional languages are discussed in more detail in Chapter 7 High-Level Programming Languages.
Structured Programming
In low-level languages and early high-level languages such as BASIC, conditional statements (“if/then”) and iteration (“loops”) are programmed using an operation called GOTO, a non-structured operation that instructs a computer to jump to an entirely different part of the program. In large programs, these jumps from one spot to another interact in complex ways, so the flow of execution is difficult to understand when attempting to read the code. These sorts of programs are criticized for being messy “spaghetti code” (Figure 4.11).
Newer languages were developed to help avoid spaghetti code. In a structured programming language, control flow is expressed with conditionals (e.g., “if-then”) and iteration statements (e.g., “while” or “do while”) rather than GOTO statements. For example, C is a structured language that includes the conditional statements “if” and “switch,” and the iteration statements “for,” “while,” and “do.” In proper C, all the code sections that involve conditionals and iteration are written with these statements, and GOTO should not be used. Note that C nevertheless provides “goto” as a keyword: because C supports low-level programming, programmers are given the choice of using unstructured programming if necessary, although it is not recommended.
Link to Learning
Read this seminal article by Edsger Dijkstra about how GOTO statements can be considered harmful.
The benefit of using these statements is that they make the flow of execution clear in the source code. When writing an “if,” for instance, it is clear which code is inside the “if” and which is outside. And when mixing an “if” with a “while” loop, it is clear whether the “if” is inside the “while” or vice versa. These sorts of inside/outside relationships are difficult to perceive in unstructured code. In a conditional statement like “if”, the computer executes the enclosed statements if the condition is true. Otherwise, it skips them and moves to the “else” branch, if there is one, or to the next statement:
import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner s = new Scanner(System.in);
        System.out.println("Enter an input value: ");
        int val = s.nextInt();
        int currVal = 10;
        // Exactly one of the two branches runs, depending on the condition.
        if (val > currVal) {
            System.out.println("The value that you entered is greater than the current value.");
        } else {
            // Reached when val is less than or equal to currVal.
            System.out.println("The value that you entered is not greater than the current value.");
        }
    }
}
Inside a loop, like “while”, the statements are executed repeatedly for as long as the loop condition is true. Once the condition is false, the loop terminates, and the computer moves to the statements after the loop:
import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner s = new Scanner(System.in);
        System.out.println("Enter an input value: ");
        int val = s.nextInt();
        int prod = 1;
        // Testing val > 0 (rather than val != 0) ensures that the loop
        // also terminates when the input is negative.
        while (val > 0) {
            prod = prod * val;
            val--;
        }
        System.out.println("The factorial is " + prod);
    }
}
There is a substantial upside to making a language structured, and the only significant downside is that it makes the language a bit more high-level. Therefore, among programming languages that are currently in widespread use, all the middle-level and high-level languages are structured.
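The inside/outside relationships that structured statements make visible can be sketched with an “if” nested inside a “for” loop. The method name and the sample values below are hypothetical; the point is that the braces and indentation show at a glance which statements belong to the loop, which belong to the conditional, and which run after both.

```java
class NestedControlFlow {
    // Counts how many values in the array are greater than the threshold.
    static int countGreaterThan(int[] values, int threshold) {
        int count = 0;
        for (int value : values) {       // the loop is the outer structure
            if (value > threshold) {     // the "if" is inside the loop
                count++;                 // runs only when the condition is true
            }
        }
        return count;                    // runs once, after the loop ends
    }

    public static void main(String[] args) {
        int[] values = {3, 12, 7, 25, 10};
        System.out.println(countGreaterThan(values, 10)); // prints 2
    }
}
```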
Procedural Programming
In a procedural language, each part of the program is a procedure, which is a function in the context of programming. Known as procedural programming, this is a paradigm in which code is organized into procedures (Figure 4.12). It is a sub-type of imperative programming. All procedural languages, then, are imperative, but not all imperative languages are procedural. A programmer designs each procedure to accomplish a specific task and gives it a descriptive name. This allows the programmer to break a large and complicated program into smaller, more manageable pieces, which are easier to write and easier for other programmers to understand. This property of code being divided into small, reusable pieces is called modularity, and it is considered a virtue.
For example, in a C program that uses GLib’s GIO networking library, a procedure that opens a socket, or an Internet connection between two computers, is called g_socket_connect() (Figure 4.13). This procedure involves executing a series of imperative commands that use operating system features (such as the transport layer), system calls, and networking hardware (such as network interface cards, or NICs), to set up a socket connection. To close that connection and end communication, the program calls the g_socket_close() procedure, which executes a series of commands that shut down the connection. These procedures use the imperative approach to accomplish their respective tasks, so each function contains a series of imperative statements.
A procedural programming language provides syntax for defining procedures but cannot force individual programmers to follow through with breaking their code up into small procedures and giving the procedures descriptive names. So, even when a program is written in a procedural language, the source code may not necessarily be written in a procedural style.
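As a minimal sketch of the procedural style, the following Java program breaks a small task into three procedures with descriptive names. The task, the names, and the passing mark of 60 are all hypothetical choices made for illustration. Each procedure does one job, and the main procedure reads almost like a summary of the algorithm.

```java
class GradeReport {
    // Adds up every score in the array.
    static int sum(int[] scores) {
        int total = 0;
        for (int score : scores) {
            total += score;
        }
        return total;
    }

    // Computes the average by reusing the sum procedure.
    static double average(int[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("At least one score is required.");
        }
        return (double) sum(scores) / scores.length;
    }

    // Classifies an average; the passing mark of 60 is an assumption.
    static String describe(double average) {
        return average >= 60.0 ? "pass" : "fail";
    }

    public static void main(String[] args) {
        int[] scores = {70, 80, 90};
        double avg = average(scores);
        System.out.println("Average: " + avg + " (" + describe(avg) + ")"); // prints Average: 80.0 (pass)
    }
}
```

The same program could be written as one long block of statements inside main, which the language would accept; dividing it into procedures is a choice the programmer makes for the sake of modularity.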
Object-Oriented Programming
Object-oriented code is organized around objects. An object has both data, or variables, and procedures that work together to represent a specific human concept. For example, in a desktop or mobile application, every button on the screen is an object. Each button has variables to represent information, such as the location and color of the button, and procedures that perform tasks, such as clicking, hiding, or displaying the button. This programming paradigm is known as object-oriented programming. It is a programming paradigm in which code is organized into objects, where each object has both data and procedures. It is a sub-type of procedural programming. All object-oriented languages, then, are procedural (and by extension, imperative), but not all procedural languages are object-oriented. A simple example of an object is a rectangle. Objects can also represent meaningful real-life concepts, such as the rooms in a house, or a person or robot and what it can do. Different rooms may have different attributes, representing features that are specific to a kitchen, a living room, or a bedroom. A robot can have a name and age and can receive input commands and respond or print a greeting, as illustrated in Figure 4.14.
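The robot described above can be sketched as a Java class. This is only an illustrative sketch: the class name, the method names, the command “greet,” and the wording of the messages are assumptions and are not taken from the figure. The class bundles data (a name and an age) with the procedures that operate on that data.

```java
class Robot {
    // Data: each Robot object stores its own name and age.
    private final String name;
    private int age;

    Robot(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Procedure: builds a greeting from this object's data.
    String greet() {
        return "Hello, my name is " + name + " and I am " + age + " years old.";
    }

    // Procedure: responds to an input command (only "greet" is understood here).
    String respond(String command) {
        if (command.equals("greet")) {
            return greet();
        }
        return "Sorry, I do not understand: " + command;
    }

    // Procedure: changes this object's data.
    void haveBirthday() {
        age++;
    }

    int getAge() {
        return age;
    }

    public static void main(String[] args) {
        Robot robot = new Robot("Robbie", 3); // create an object (an instance of the class)
        System.out.println(robot.respond("greet"));
        robot.haveBirthday();
        System.out.println(robot.getAge()); // prints 4
    }
}
```

Two Robot objects created from the same class would each carry their own name and age, while sharing the same procedures.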
Object-oriented programming was invented to help programmers organize their code, and it has been very successful. Many of the most widely used languages, including C++, C#, Java, JavaScript, and Swift, are object-oriented. Object-oriented programming is discussed further in Chapter 7 High-Level Programming Languages.