Dr. Jean-Claude Franchitti

Learning Objectives

By the end of this section, you will be able to:

Discuss and compare HLL data types
Demonstrate the use of variables
Examine HLL expressions and statements
Describe the implementation of flow of control in HLLs
Introduce the concept of functions
Classify well-structured programs
Explain the concept of exception handling
Summarize files and input/output

HLLs exist to communicate to a computer the logical steps for approaching a given task or application, and many HLLs share the same underlying concepts. Because of this, once you have mastered a modern HLL, it becomes easier to learn additional languages since you now know the correct questions to ask. For example, a starting point might be to find out how to produce simple program output, which lets you see how to run a program and test the concepts we are about to learn.

In this section we will describe the structural concepts of HLLs to give us the tools with which to compare them and learn them in a consistent way. A good starting point to examine programming language constructs is to demonstrate the fundamental building blocks of HLLs. These include the data types that languages can legally manipulate, how they store such data, how they structure the expressions and statements by which they communicate, and the control of the programming flow.

HLL Data Types

The data types of a language form the legal set of the kinds of data which an HLL may manipulate. These data types may be very simple, or they may be more complex. The simplest data type of a language is a primitive data type (also, basic data type), for example, integers and char in the C programming language. Data corresponding to variables of these types can usually be represented and manipulated directly using the machine hardware both in memory and via registers.

However, languages usually contain complex data types as well. A complex data type consists of multiple primitive types that are used as their building blocks. An example of this is the string data type which represents a sequence of characters. In the C programming language, character strings are complex data types represented using arrays of characters. In JavaScript, string is a primitive data type. Figure 7.9 relays the various data types.

Chart of Data Types divided into Primitive (Number, String, Boolean, Undefined, Null) and Non primitive (Object, Array, Function).

Figure 7.9 JavaScript data types are divided into primitive and complex types. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

In general, a data type is a collection of values from a given domain: the JavaScript number data type covers the domain of floating-point number values that can be represented in 64 bits. A data type also includes a legal set of well-defined operations that may be performed on the values of its domain. The operations on numbers in JavaScript are defined by the arithmetic operators of the language.

Some languages separate numbers into integer types (whole numbers) and floating point types (decimal numbers). Even among similar languages such as C++ and Java, there may be different numbers of primitive data types.

Primitive Data Types

These data types are considered primitive because they relate very closely to the machine hardware. This means that the format, or bit pattern, of the actual values can be recognized by the registers and arithmetic-logic unit (ALU) of the computer. Some examples of primitive data types are number, character, and Boolean (which holds the value true or false). Table 7.2 contrasts the integer data types of Java and C++.

Java	C++	Size	Value Range
short	short int	2 bytes	–32,768 to +32,767
(none)	unsigned short int	2 bytes	0 to +65,535
int	int	4 bytes	–2,147,483,648 to +2,147,483,647
(none)	unsigned int	4 bytes	0 to +4,294,967,295
(none)	long int	4 bytes	–2,147,483,648 to +2,147,483,647
(none)	unsigned long int	4 bytes	0 to +4,294,967,295
long	long long int	8 bytes	–9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
(none)	unsigned long long int	8 bytes	0 to +18,446,744,073,709,551,615

Table 7.2 Contrast Between Integer Data Types in Java and C++

We learned that strong typing refers to the characteristic of an HLL in which a variable is restricted to holding values of the type with which it is defined. The concept of coercion refers to a value of one data type being converted so that it can be held by a variable of a different data type. In other words, coercion rules are a relaxation of type checking. For example, a Java int holds a whole number of four bytes, while a Java short holds a whole number of two bytes. We can legally assign the value of the short to the int: it is coerced by the assignment, which makes sense because the short value always fits into the larger int.

On the other hand, we cannot assign a Java 8-byte long data type to a Java 4-byte int; it is too big to fit, and if we try, we will get a compile time error that will not allow the program to run. However, we can force the assignment by using a mechanism called a type cast. This is a mechanism in many HLLs which allows us to force the larger value into the smaller space given to us by the smaller variable. This can have side effects, which must be known by the programmer to use the mechanism effectively. The side effect of the Java long to int example is truncation: the four high-order bytes are dropped, so a value outside the int range is changed by the cast.
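As a minimal sketch of the coercion and type cast rules described above, the following Java program widens a short into an int without a cast and then narrows a long into an int with an explicit cast. The class name CastDemo, the method names, and the sample values are illustrative assumptions.

```java
class CastDemo {
    // Widening: a 2-byte short always fits in a 4-byte int, so no cast is needed.
    static int widen(short small) {
        return small;
    }

    // Narrowing: the explicit (int) cast keeps only the low-order 4 bytes of the long.
    static int narrow(long big) {
        return (int) big;
    }

    public static void main(String[] args) {
        short small = 1200;
        System.out.println(widen(small));            // prints 1200: the value is preserved
        System.out.println(narrow(42L));             // prints 42: small values survive the cast
        System.out.println(narrow(4_294_967_297L));  // prints 1: 2^32 + 1 loses its high-order bytes
    }
}
```

Removing the (int) cast from narrow would turn the truncation into a compile time error, which is the behavior the text describes.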

Complex Data Types

We have learned that some of the data types of a language are primitive types, meaning that data of that type can be directly represented in the registers and memory locations of the machine. However, languages usually contain complex data types as well.

A complex data type is one consisting of multiple primitive types used as its building blocks which is why we also call them composite types. These multiple types may be of the same type, as in a complex data type known as an array, or they may consist of collections of different data types in one construct, such as a C# class.

Arrays

An array is a typical composite type that is used as a data container. A great way to visualize an array is as a shelf unit, a connected structure where we can place items on each element. An instance of an array in this case could be a bookshelf that is meant to contain books (a book would be another composite type).

An array is a named variable that references a block of contiguous memory locations, and each “shelf” of the array is an element, which occupies exactly as many of the contiguous bytes as it takes to accommodate a value of the data type being stored. In the simplest type of array structure, an indexed array, the shelves are numbered with an index, starting at zero, or the lowest memory location. In a strongly typed language, all elements of an array must be of the same data type which means that every element will be of a uniform length in bytes.

Figure 7.10 illustrates an array as it appears in a number of HLLs, including Java, C, and C#. We start off with the array declaration, which gives the data type of each of its elements, names the array variable numbers, indicates it is an array with the opening and closing square brackets ([]), and assigns five values to it with what is known as an array initializer (values separated by commas placed between curly braces). We can see that the length of the array is 5, the indexes run from 0 to 4, and the elements are contiguous in memory and are each 4 bytes in length. The following statement assigns the last element of the array to a variable:

int myNumber = numbers[4];

Illustration of array with an array declaration, Values, Indexes, Memory addresses (with contiguous memory locations).

Figure 7.10 Each value in an array is assigned an index and a memory address. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)
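The declaration, initializer, and indexing just described can be sketched in Java as follows. The five values are hypothetical stand-ins (the figure's actual values are not reproduced here), and the class and method names are assumptions made for illustration.

```java
class ArrayDemo {
    // Hypothetical values: five int elements, indexes 0 through 4.
    static int[] numbers = {10, 20, 30, 40, 50};

    // Returns the element stored at the given index.
    static int elementAt(int index) {
        return numbers[index];
    }

    public static void main(String[] args) {
        System.out.println(numbers.length);                 // prints 5
        System.out.println(elementAt(0));                   // prints 10, the first element
        System.out.println(elementAt(numbers.length - 1));  // prints 50, the last element
        try {
            elementAt(5); // one past the last valid index
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index 5 is out of bounds");
        }
    }
}
```

Java checks every index at run time, so reading index 5 of a five-element array raises an exception rather than returning whatever happens to follow the array in memory.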

Strings

Strings, another composite data type, represent sequences of characters. Some HLLs implement a string as an array that holds individual characters as its elements. Object-oriented languages usually implement strings as objects with built-in functions (methods), which makes them quite a bit more complex. Here is an example of a string in Java:

String myUniversity = "Union Technical College";

Reference Types and Pointers

In our study of variables and data types so far, we have examined the concept of a variable being a name-value pair. With primitive types, the memory location referenced by the name stores the actual value of the variable directly at that spot.

In the case of complex data types, things are not quite so simple. Let us take the example of a string. We can store a name in the string such as “Jimmy” to start. Now let us say that we change the value to “Johnathan” at some later time. The memory required to store the value has now changed and perhaps it will no longer fit in the original location. We have learned that the C language has a primitive data type known as a pointer. It is used for that reason.

Pointers are variables that hold actual computer memory addresses (references). They match the word size of the machine, which is typically 64 bits. More generally, we call a variable that holds a memory address a reference variable. Java and C# do not expose pointers in ordinary code the way C does; they use reference variables instead. Therefore, when we create an array or a string in these languages, the value that is actually stored in the variable is the memory address of the place where the complex object exists. So, in the case of our string example, if we change the name, we can just change the value in its variable to be a different memory address; refer to Figure 7.3.
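A small Java sketch can make the behavior of reference variables visible. It assumes a hypothetical class ReferenceDemo with a helper method setFirst; the point is only that copying an array variable copies the reference, not the elements.

```java
class ReferenceDemo {
    // Changes the first element through whichever reference is passed in.
    static void setFirst(int[] target, int value) {
        target[0] = value;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = a;              // copies the reference; both names refer to one array
        setFirst(b, 99);
        System.out.println(a[0]); // prints 99: the change is visible through a

        String name = "Jimmy";
        name = "Johnathan";       // name now refers to a different String object
        System.out.println(name); // prints Johnathan
    }
}
```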

Variables

We learned in 7.1 Programming Language Foundations that a variable is a container that is used in an HLL to hold a value. In computer science we have many instances of this type of construct, which we call a name-value pair: a construct, like a variable, that is named and can hold values. The types of values that variables may hold consist of the legal data types of the language.

Identifiers

A variable name is called an identifier. Different HLLs have different rules about legal identifier syntax. For example, in C# rules are as follows:

An identifier cannot be a keyword, which is a word reserved by the language and that has a special meaning.
A letter, the @ symbol, or an underscore must start an identifier, while the remaining portion may consist of digits, underscores, and/or letters (by contrast, an identifier in PHP starts with a dollar sign ($)).
Identifiers are case-sensitive, where uppercase and lowercase letters are treated as distinct. Therefore, the C# identifier myAge is a different entity than myAGE (Fortran, BASIC, and Pascal are not case-sensitive).

Global Issues in Technology

Learning About Programming: A Language Barrier for Non-English Speakers Learning HLLs?

Have you ever struggled to understand something because it was explained in a language that you do not speak? That is the challenge that many non-English speakers face when learning HLLs. Most HLLs use English keywords that make sense to the compiler but not necessarily to someone unfamiliar with the language. Since programming has become a worldwide endeavor, English keywords can be a stumbling block for non-English speakers learning HLLs. Fortunately, there is a bright side! While keywords are in English, they comprise a relatively small set of words in a program. The real power of programming lies in its ability to work with data and instructions in any language, which is made possible via Unicode. Unicode can represent most international character sets, allowing programmers to use characters from almost any language alongside the English keywords.

But what about the future? As technology evolves, will programming languages find ways to become even more natural language-independent? Perhaps future HLLs will offer interchangeable keywords or entirely new approaches that do not rely on any given language.

Variable Declarations

A variable must be made known to a compiler or an interpreter before it may be used by a computer program. This process is called variable declaration and/or definition. In strongly typed languages, a variable declaration consists of a statement which specifies the variable name and data type. Weakly typed languages omit the data type from the declaration, and the variable may hold values of different types at different times.

Variable definitions in various languages are as follows:

Java: int myAge;

JavaScript: var myAge;

PHP: $myAge = 21;

Assigning its first value to a variable is known as initialization, which may be done at any time after declaration, such as in the following Java snippet:

int myAge;
myAge = 21;

It is a best practice to always initialize variables when they are declared. This is known as declaration and initialization. This keeps the value that is stored from being undefined at any time, which can have grave consequences in code in various situations. For example, in the C programming language, if a program fails to initialize a pointer to an array of characters and later copies a string of characters to the (uninitialized) memory location referred to by that pointer, the program can crash. Here is an example of declaration and initialization in Java:

int myAge = 21;

Assignment

A literal is a value of one of the legal data types of an HLL that can be written directly into the code. For example, in JavaScript, one of the data types is number, which may be represented by either the literal whole number 2 or by the floating-point number 2.0. In C++, a literal of the type char may be written as the single quoted sequence 'a'.

Storing a value in a variable is carried out by creating an assignment statement. The value assigned may be a literal, or it may be the value that has been placed in another variable or the result of an expression. The value in a variable may also be replaced by using assignment. Therefore, variables may hold different values at different times.

In a PHP expression that makes up an assignment as shown, the variable is located at the left. Notice that the identifier starts with the dollar sign ($), complying with the identifier rules of PHP. The equals sign (=) is known as the assignment operator, as in most languages with C-like syntax (C/C++, Java, C#, Python, JavaScript, PHP).

$myAge = 21;

In programming languages, we refer to the left hand of a variable assignment statement (the variable) as the lvalue. The right-hand value (the literal) is referred to as the rvalue. The assignment operator is a binary operator, meaning it is surrounded by two operands. The operand is the lvalue or the rvalue on either side of the operator. The rvalue of a variable assignment statement may be the value of another variable as shown here or the result of an expression. An example of this in Java is as follows:

myAge = yourAge;

Let us examine the concept a little more deeply. Variables are named memory locations. We give a variable a name so that it is easy for humans to deal with it. It is a best practice to use names that are indicative of both the purpose of the variable (what it will be used for) and the data type that it will hold. The variable myAge in the previous example meets both criteria. One HLL best practice is to use an agreed upon convention for variable names. An example is camelCase, a naming convention that eliminates spaces and punctuation in favor of capitalization of specific words; in this case, the first letter of the first word is lowercase and if the name has multiple words, the later words start with a capital letter (e.g., firstName and lastName). Other conventions exist such as snake case (e.g., first_name, last_name), kebab case (e.g., first-name, last-name), and Pascal case (e.g., FirstName, LastName).

When a program is compiled, the compiler allocates a memory location to hold variables’ values and reserves the amount of memory necessary to hold such value based on the data type of the variable. The addresses of the memory locations that the compiler assigns to variables are relocatable, meaning that the linker may change these addresses when creating the executable version of the program, and the program loader will also change them when running the program to match actual memory addresses in the machine memory. Figure 7.11 illustrates what this looks like.

Illustration of Identifier (myNumber) and its location in Memory (Address - 0012CD9CA0; Value – 53).

Figure 7.11 In JavaScript, the variable, in this case, myNumber, has a value that is assigned to a memory address. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

Named Constants

A variable may hold different values within the data type restrictions of the language at various times during program execution. There are times when we would like a value to be assigned to a specific memory location and not allow it to be changed thereafter. Most modern HLLs and associated DLLs (dynamic link libraries) provide a construct for this purpose called a named constant. It is a best practice in most language cultures to name constants using capital letters, with underscores separating multiple words. The following statement in C++ creates a named constant:

const int PAY_RATE = 15;

The word const is a keyword, int is a C++ integer data type followed by the name of the constant, equals (=) is the assignment operator followed by the rvalue, and the statement ends with a terminator. It is a best practice in programming to use named constants whenever indicated and possible. In this example, if an employee pay rate was to change, it would only have to be changed in one spot in the program rather than having to locate its usages in many lines of code.
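For comparison, here is a hedged sketch of the same idea in Java, where the keyword final (rather than const) marks a value that cannot be reassigned. The class PayrollDemo and the method weeklyPay are hypothetical names introduced only for this illustration.

```java
class PayrollDemo {
    // static final: assigned once and never reassigned afterward.
    static final int PAY_RATE = 15;

    // Every calculation refers to the constant by name, so a rate change
    // needs to be made in only one place.
    static int weeklyPay(int hoursWorked) {
        return hoursWorked * PAY_RATE;
    }

    public static void main(String[] args) {
        System.out.println(weeklyPay(40)); // prints 600
        // PAY_RATE = 16; // would not compile: cannot assign a value to a final variable
    }
}
```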

Operators

Much like in algebra, an operator in HLLs performs various types of operations (processes) on values. Different HLLs may support operators differently. Some examples of the types of operations by which values may be manipulated are arithmetic operators (mathematical), relational operators (comparison), logical operators, and string operators (sequences of characters). They perform these operations within expressions of the language. To review, an expression is a construct in a programming language that evaluates to a value. Operators may act upon different numbers of operands, usually one (a unary operator), two (a binary operator), or three (a ternary operator). Operators perform under certain rules that dictate the order of operations, or precedence. Although many HLLs use the same operators, when learning a new language, it is necessary to research its operators to identify the few exceptions.

Arithmetic Operators

The arithmetic operators perform the four familiar mathematical operations on their operands: addition (+), subtraction (-), multiplication (*), and division (/). There is one less familiar operation, the modulo operator (%), which evaluates to the remainder left after division. Some languages also employ an exponentiation operator (**), the operator that raises the value of one operand to the power of the second operand. The increment operator (++) and decrement operator (--) raise or lower the value in a variable by one, respectively, and are used in the vast majority of HLLs. The basic Java arithmetic operators are outlined in Table 7.3.

Operator	Name	Example Expression	Meaning
*	Multiplication	a * b	a times b
/	Division	a / b	a divided by b
%	Remainder (modulus)	a % b	remainder after dividing a by b
+	Addition	a + b	a plus b
-	Subtraction	a - b	a minus b

Table 7.3 The Java Arithmetic Operators
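The operators in Table 7.3, together with increment and decrement, can be tried out with a short Java sketch. The class name, method names, and operand values (17 and 5) are assumptions chosen for illustration; note that dividing two int values in Java discards the fractional part.

```java
class ArithmeticDemo {
    // Integer division: the fractional part of the result is discarded.
    static int quotient(int a, int b) {
        return a / b;
    }

    // Remainder left over after integer division.
    static int remainder(int a, int b) {
        return a % b;
    }

    public static void main(String[] args) {
        int a = 17;
        int b = 5;
        System.out.println(a * b);            // prints 85
        System.out.println(quotient(a, b));   // prints 3
        System.out.println(remainder(a, b));  // prints 2
        System.out.println(a + b);            // prints 22
        System.out.println(a - b);            // prints 12

        int counter = 0;
        counter++;                            // increment: counter is now 1
        counter--;                            // decrement: counter is back to 0
        System.out.println(counter);          // prints 0
    }
}
```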

Relational Operators

The relational operators compare their operands, and expressions using them evaluate to the Boolean values true or false. Table 7.4 lists the relational operator symbols that are used in the vast majority of HLLs.

Operator	Name	Example Expression	Result
<	Less than	1 < 2	True
>	Greater than	1 > 2	False
<=	Less than or equal to	1 <= 2	True
>=	Greater than or equal to	1 >= 2	False
==	Equal to	1 == 2	False
!=	Not equal to	1 != 2	True

Table 7.4 The C# Relational Operators

Logical Operators

The logical operators are used to connect two or more expressions. They evaluate the entire expression to the Boolean values true or false. Table 7.5 lists the logical operator symbols that are used in the vast majority of HLLs.

Operator	Name	Example Expression	Result
&&	Logical AND	(1 < 2) && (2 > 1)	True
||	Logical OR	(1 < 2) || (2 > 1)	True
!	Logical NOT	!(1 < 2)	False

Table 7.5 The JavaScript Logical Operators

Expressions using logical operators are evaluated based on a truth table, a chart that shows what the resulting value would be given each combination of operands. Table 7.6 shows the truth tables for the three logical operations we have studied.

A	B	A && B	A || B	!A
True	True	True	True	False
True	False	False	True	False
False	True	False	True	True
False	False	False	False	True

Table 7.6 Truth Table for Logical AND, OR, and NOT Operators
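One way to check a truth table such as Table 7.6 is to let a program generate it. The following Java sketch does so; the class TruthTableDemo and the method row are hypothetical names, and the output uses Java's lowercase true and false.

```java
class TruthTableDemo {
    // Builds one row of the table: A, B, A && B, A || B, !A.
    static String row(boolean a, boolean b) {
        return a + " " + b + " " + (a && b) + " " + (a || b) + " " + (!a);
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        System.out.println("A B A&&B A||B !A");
        // Visit every combination of the two operands.
        for (boolean a : values) {
            for (boolean b : values) {
                System.out.println(row(a, b));
            }
        }
    }
}
```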

Combined Assignment

The assignment operator may be combined with other operators, usually mathematical, as a shortcut notation. This is called combined assignment. In this construct, the operation is carried out first, followed by the assignment. Table 7.7 shows the most common combined assignment expressions.

Operator	Example	Equivalent to
+=	x += 7	x = x + 7
-=	y -= 4	y = y - 4
*=	z *= 2	z = z * 2
/=	a /= b	a = a / b
%=	c %= 9	c = c % 9

Table 7.7 Combined Assignment Operators
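To see the combined assignment operators in action, the following Java sketch applies each one in turn to a single variable. The class and method names and the starting value 10 are assumptions for illustration.

```java
class CombinedAssignmentDemo {
    // Applies each combined assignment operator once and returns the final value.
    static int apply(int start) {
        int x = start;
        x += 7;   // same as x = x + 7
        x -= 4;   // same as x = x - 4
        x *= 2;   // same as x = x * 2
        x /= 3;   // same as x = x / 3 (integer division)
        x %= 9;   // same as x = x % 9
        return x;
    }

    public static void main(String[] args) {
        // Starting from 10: 17, then 13, then 26, then 8, then 8.
        System.out.println(apply(10)); // prints 8
    }
}
```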

Documentation and Comments

All programming languages allow a programmer to document their code to improve code readability. This is a best practice in programming and a prospective job applicant may not get a job without demonstrating documentation skills.

Documentation is carried out by using a structure called a comment, a container used to hold documentation in code. A comment begins with a defined symbol or set of symbols of the language followed by the comment text itself. Comments may be single line or multiline depending upon the symbol used. Single line comment symbols indicate that whatever follows on the line is ignored by the compiler or interpreter. Multiline comments have an opening and a closing symbol or set of symbols. Examples are as follows:

// single line comment in C#
int myAge = 21;
int myAge = 21; // another single line comment in C#
/* this is a multiline comment in JavaScript
myAge will be used to hold the age of a student
*/
var myAge = 21;

Other languages may use different syntax for comments; in Python the symbol is the pound sign (#). In BASIC it is the REM keyword, which is short for “remark.”

HLL Expressions and Statements

Commands in HLLs are structured as statements, and statements are made up of expressions. We have learned that an expression in a programming language evaluates to a value. Examine the following statement in a C-like language:

int a = b + c * d;

If it is given that b = 4, c = 6, and d = 2, a simple left-to-right scan of the statement would evaluate the addition first, giving 10, and then evaluate 10 * 2, giving a value of 20. However, this calculation is incorrect because expression evaluation within statements follows order of precedence, the rules that determine the order in which operators in statements are evaluated. Multiplication has higher precedence than addition, so 6 * 2 will be evaluated first, giving the value 12, and then 4 + 12 will be evaluated, giving a final result of 16.

Parentheses may be used to modify the order of precedence in statements. Expressions within parentheses are always evaluated first. If we were to rewrite the example:

int a = (b + c) * d;

we would come up with the result of 20. Table 7.8 shows the results of various expressions when following the rules of precedence and applying parentheses to them.

Expression	Value	Parenthesized Expression	Value
5 + 2 * 4	13	(5 + 2) * 4	28
10 / 5 - 3	-1	10 / (5 - 3)	5
8 + 12 * 2 - 4	28	8 + 12 * (2 - 4)	-16
4 + 17 % 2 - 1	4	(4 + 17) % 2 - 1	0
6 - 3 * 2 + 7 - 1	6	(6 - 3) * (2 + 7) - 1	26

Table 7.8 Expressions and Their Values
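The effect of precedence and parentheses can be confirmed with a short Java sketch that uses the same values for b, c, and d as the example in the text. The class and method names are illustrative assumptions.

```java
class PrecedenceDemo {
    // Multiplication binds more tightly than addition: c * d is evaluated first.
    static int withoutParentheses(int b, int c, int d) {
        return b + c * d;
    }

    // Parentheses force the addition to be evaluated first.
    static int withParentheses(int b, int c, int d) {
        return (b + c) * d;
    }

    public static void main(String[] args) {
        System.out.println(withoutParentheses(4, 6, 2)); // prints 16
        System.out.println(withParentheses(4, 6, 2));    // prints 20
        System.out.println(4 + 17 % 2 - 1);              // prints 4
        System.out.println((4 + 17) % 2 - 1);            // prints 0
    }
}
```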

In some HLLs (such as Java, C++, C#, and JavaScript), expressions that use the logical operators do not have to be completely evaluated for their results to be known, a concept called short circuiting. Referring to the truth tables of the logical && and || operators, we can see that in an && operation, the only way a result of true can be obtained is if both operands are true. Therefore, if the first operand is false, the expression does not have to be evaluated further: it is false. This is short circuiting. Here is an example in JavaScript that uses the || operator:

if (x == y || today == "Tuesday") {
    // do_something
}

If the operand on the left side of the logical operator || evaluates to true, then the expression is true, and the expression on the right side of the || operator does not need to be evaluated. Short circuiting can improve performance because the skipped operand is never evaluated.
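Short circuiting can be observed directly by counting how many operands are actually evaluated. The Java sketch below assumes a hypothetical helper, check, that records each call before returning its argument.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Records that it was evaluated, then returns the given value.
    static boolean check(boolean value) {
        calls++;
        return value;
    }

    public static void main(String[] args) {
        calls = 0;
        boolean orResult = check(true) || check(false);  // right operand is skipped
        System.out.println(orResult + " after " + calls + " call(s)");  // true after 1 call(s)

        calls = 0;
        boolean andResult = check(false) && check(true); // right operand is skipped
        System.out.println(andResult + " after " + calls + " call(s)"); // false after 1 call(s)

        calls = 0;
        boolean both = check(false) || check(true);      // left is false, so the right must run
        System.out.println(both + " after " + calls + " call(s)");      // true after 2 call(s)
    }
}
```

Because the skipped operand never runs, any side effect it would have had (here, incrementing calls) does not happen, which is worth keeping in mind when an operand is a method call.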

Flow of Control

As we learned in Introduction to Data Structures and Algorithms, we need to define the steps to be taken to solve a problem or complete a task. The order in which, or if, the statements of a program are executed is called flow of control. By default, program statements execute in sequential order from an established starting point; however, the flow of control can be modified. This is necessary in order to have the ability to model the situations of the real world that our algorithms represent.

Sequential Execution

Executing statements in the order in which they appear, sequential execution, is a linear ordering of statements in which one statement directly follows another. This is the default flow of control; it is automatic. An example of sequential execution would be the following:

int myAge = 35;
String myName = "Johnathan";
boolean isStudent = true;

These statements will execute one at a time in the order in which they are written.

A code block is a statement that consists of one or more statements that are structured in a sequential group and delineated as such. This is necessary so that an entire group of commands may be executed as a single sequential structure. In the C-like languages, opening and closing curly braces ({ }) are usually used to denote the beginning and end of a code block. We can modify the preceding example into a code block as follows in Java:

{
   int myAge = 18;
   String myName = "Johnathan";
   boolean isStudent = true;
}

Note that variables defined within code blocks only exist in the context of that code block, which makes it possible to manage the scope of variables within code more precisely. It is a best practice in programming to indent code contained within structures so that the code is both readable and maintainable.

Selection

Decision making in programming is called selection. A decision-making construct allows us to choose from one or many paths of execution in a program. To see how they work, one must understand the concept of conditional expressions.

Conditional Expressions

Sometimes called a Boolean expression, a conditional expression evaluates to the Boolean value true or false. Based on a Boolean result, a computer may decide which path of execution to take. Some examples of conditional expressions follow:

value1 == value2
value1 != value2
value1 < value2
value1 <= value2

Selection Statements

The simplest form of selection statement is a decision structure known as a one-way branch, as displayed in Figure 7.12. Most languages represent a selection statement that is a one-way branch with a construct known as an if statement. Other languages may use a slightly different syntax. In Java, a relatively simple if statement using a code block looks like this (it is recommended to use code blocks for single-line selection statements to avoid errors subsequently when/if more statements are added):

short temp = 90;
(…)
if (temp >= 95) {
   System.out.println("It is hot");
}

Illustration of one-way branch. Circle illustration displays arrows to diamond, then arrow to a circle, with arrow from diamond labeled with “temp>=95”, pointing to Print “It is hot” then to same circle.

Figure 7.12 This unified modeling language (UML) activity diagram represents a selection statement that is a one-way branch. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

The only type of conditional statement that a language needs is the simple one-way branch, and any number of them may be placed in sequential order to carry out any decision-making task. However, a variety of decision structures have evolved that make programs easier to code, more readable, and more maintainable. One of these is the two-way branch, which chooses between two paths of execution. Its logic is shown in Figure 7.13.

Two-way branch statement displaying (1) number % 2 == 0 going to Print “Number is even”; (2) number % 2 != 0 leading to Print “Number is odd”.

Figure 7.13 A unified modeling language (UML) activity diagram representing a selection statement that is a two-way branch. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

In Java, this might be coded with a two-way branch using an if…else statement:

int myNumber = 4;
if (myNumber % 2 == 0)
{
   System.out.println("The number is even.");
}
else
{
   System.out.println("The number is odd.");
}

Sometimes decisions might result in three or more possible pathways. In the C-like languages, this situation might be coded with a multi-way branch using an if…else…if statement:

if (myGrade >= 90)
{
   System.out.println("You got an A.");
}
else if (myGrade >= 80)
{
   System.out.println("You got a B.");
}
else if (myGrade >= 70)
{
   System.out.println("You got a C.");
}
else
{
   System.out.println("You need to work harder.");
}

Many programming languages provide an additional selection statement for times when decisions result in three or more possible pathways. This is known as a switch or case statement, and it can be much more readable and maintainable than many if statements strung together. Each case label matches a single value of the tested expression rather than a range of values. In Java, this looks like the following:

switch (myGrade)
{
   case 90: System.out.println("You got an A.");
   break;
   case 80: System.out.println("You got a B.");
   break;
   default: System.out.println("You need to work harder.");
   break;
}
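Because each case label matches one exact value, the switch above reacts only to grades of exactly 90 or 80. One possible way to cover ranges, sketched here under the assumption that grades are whole numbers from 0 to 100, is to switch on the grade divided by 10. The class GradeDemo and the method messageFor are hypothetical names.

```java
class GradeDemo {
    // Integer division maps 90-99 to 9, 80-89 to 8, 70-79 to 7, and 100 to 10.
    static String messageFor(int grade) {
        switch (grade / 10) {
            case 10: // falls through so that 100 is treated like the 90s
            case 9:
                return "You got an A.";
            case 8:
                return "You got a B.";
            case 7:
                return "You got a C.";
            default:
                return "You need to work harder.";
        }
    }

    public static void main(String[] args) {
        System.out.println(messageFor(95)); // prints You got an A.
        System.out.println(messageFor(83)); // prints You got a B.
        System.out.println(messageFor(42)); // prints You need to work harder.
    }
}
```

Each case here ends with return, so no break statements are needed; without a return or break, execution would fall through into the next case, as it deliberately does from case 10 to case 9.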

Iteration

Also known as looping, iteration is the repeated execution of a sequence of statements. One of the great strengths of computers is their ability to repeat actions over and over. Iteration structures allow us to write the statements to be repeated just once, with the repetition managed by flow of control structures.

The number of iterations is controlled by conditional expressions much like selection. The type of conditional statements used to control loops are categorized as either condition-controlled or count-controlled. In the condition-controlled scenario, the loop will continue to iterate until a condition is met. In a count-controlled situation, the loop repeats a specific number of times.

Iteration structures may also be categorized as top-tested or bottom-tested. In a top-tested loop, a condition is tested at the entrance to the code block. If the condition is met, the loop executes and will continue to repeat until the condition becomes false. Note that if the condition in a top-tested loop is false the first time it is tested, the loop will not execute at all. Conversely, a condition that is always true (for example, “while True:” as the top test of a loop in Python) will result in the loop being executed forever.

In the bottom-tested loop, the sentinel, the expression that sets the condition at the entry or exit of a loop for continued iteration, is at the exit after the code block. A loop such as this is guaranteed to execute at least one time. The sentinel decides if the loop runs again. It is useful to guarantee at least one repetition of a loop.

As with selection statements, most languages have a variety of structures to choose from and just like in selection, there is only one that is necessary to do any kind of iteration: the top-tested condition-controlled loop. Its logic is represented by Figure 7.14. Note that it is assumed in this example that the variable “raining” can change as a result of an external event in reaction, say, to the output of a sensor that detects rain and issues a callback to an event handler (not shown here) that changes the value of the “raining” variable to being true.

Figure 7.14 This unified modeling language (UML) activity diagram represents a while loop. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

In most modern languages, a top-tested loop is known as a while loop. Its syntax in C# is as follows:

bool happy = true;
bool raining = false;
while (happy == true)
{
   Console.WriteLine("I am happy");
   if (raining == true) // raining value controlled by event handler
   {
      Console.WriteLine("I am frowning");
      happy = false;
   }
}

The sentinel in this loop is the variable happy, and if this condition remains true the loop will iterate.

This type of loop is non-determinative: it is a condition-controlled loop for which we cannot predict the number of iterations. A determinative loop, by contrast, is predictable in its number of iterations. One that executes exactly five times could be constructed in C# as follows:

int count = 1;
while (count <= 5)
{
   Console.WriteLine("I am smiling");
   count = count + 1;
}

In most modern languages, a bottom-tested loop is known as a do…while loop, as highlighted in Figure 7.15. Its syntax in C# is as follows:

bool happy = true;
bool raining = false;
do
{
   Console.WriteLine("I am smiling");
   if (raining == true) // raining value controlled by event handler
   {
      Console.WriteLine("I am frowning");
      happy = false;
   }
} while (happy == true);

The sentinel in this loop is still the variable happy; the loop will iterate at least once, and if this condition remains true the loop will continue to iterate.

Figure 7.15 This unified modeling language (UML) activity diagram represents a do…while loop. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)
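The difference between the two test positions shows up most clearly when the condition is false from the start. The following Java sketch is only an illustration; the class and method names are hypothetical and are not part of the examples above. Each method counts how many times its loop body runs.

```java
// Hypothetical illustration: counts how often each loop body executes.
class LoopTestPosition {
    // Top-tested: the condition is checked before the body runs.
    static int runTopTested(boolean condition) {
        int executions = 0;
        while (condition) {
            executions++;
            condition = false; // stop after one pass so the sketch terminates
        }
        return executions;
    }

    // Bottom-tested: the body runs once before the condition is checked.
    static int runBottomTested(boolean condition) {
        int executions = 0;
        do {
            executions++;
            condition = false; // stop after one pass so the sketch terminates
        } while (condition);
        return executions;
    }

    public static void main(String[] args) {
        System.out.println(runTopTested(false));    // prints 0
        System.out.println(runBottomTested(false)); // prints 1
    }
}
```

With a condition that starts out false, the top-tested loop never enters its body, while the bottom-tested loop still executes it once.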

Like the first while loop, this do…while loop is non-determinative because we cannot predict when it will rain and, therefore, how many times the loop will execute. We could construct it in such a way as to make it count-controlled. For count-controlled loops, another style is a for loop. This structure is top-tested and count-controlled. It is concise, elegant, easy to maintain, and efficient. An example of a for loop in PHP is as follows:

for ($count = 1; $count <= 5; $count = $count + 1) {
   print("the count is: " . $count . "\n");
}

The rules are as follows: inside the parentheses, the first expression is executed only once, at the first entrance to the loop. The second expression is the sentinel, and it is tested before every iteration of the loop. The last expression is executed at the bottom of each iteration and sets the stage for the next iteration.
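These rules can be made concrete by writing the same count-controlled loop twice, once as a for loop and once as the equivalent while loop. The Java sketch below is only an illustration (Java's for statement follows the same three-expression layout as PHP's); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a for loop and the while loop it corresponds to.
class ForLoopRules {
    static List<Integer> withFor() {
        List<Integer> seen = new ArrayList<>();
        for (int count = 1; count <= 5; count = count + 1) {
            seen.add(count);
        }
        return seen;
    }

    static List<Integer> withWhile() {
        List<Integer> seen = new ArrayList<>();
        int count = 1;            // first expression: executed once, on entry
        while (count <= 5) {      // second expression: the sentinel, tested before each pass
            seen.add(count);
            count = count + 1;    // third expression: executed at the bottom of each pass
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(withFor());   // prints [1, 2, 3, 4, 5]
        System.out.println(withWhile()); // prints [1, 2, 3, 4, 5]
    }
}
```

Both methods visit the same values, which is why the for loop is often described as a compact form of the top-tested, count-controlled while loop.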

Functions

Functions are sometimes referred to as, or replaced by, subroutines, subprograms, procedures, or methods. Some computer scientists consider them to be flow of control constructs because calling upon a function to do a job pulls the execution out of the regular sequence. In general, a function call is code that invokes the function and passes to it any values it needs to do its job; the function, in turn, is designed to return values to the place at which it was called. However, in some programming languages such as C, a function can only return a single value of a given type, or no value at all. In that case, function parameters may be passed as references (i.e., by passing a pointer as a parameter) so that the values associated with these parameters may be changed by the function.

A JavaScript alert is an API function that is designed to send a message to a web page and wait for the user to acknowledge it via an OK button, for example. The following illustrates code that invokes such an alert function:

alert("This message is for you, click OK to continue");

Modularity

This refers to the need for a program to be broken into various tasks. An employee program would probably be carrying out many tasks, one of which might be to compute the weekly salary. Modular design allows us to specify these tasks, name them, and call upon them without the finished code. We can define the tasks as functions and build the stub of them without all the details. Then we can call them from our main flow of control and have them respond to the call.
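A stub can be sketched in Java as follows. This is only an illustration under assumed names: computeWeeklySalary and formatPayslip are hypothetical tasks, and the placeholder return value stands in for code that has not been written yet.

```java
// Hypothetical illustration: tasks are named and callable before they are finished.
class PayrollStubs {
    // Stub: the name and parameters are fixed, but the real computation is still missing.
    static double computeWeeklySalary(double basePay, double hours) {
        return 0.0; // placeholder until the task is implemented
    }

    // A second task that already works and can be developed independently.
    static String formatPayslip(String empName, double salary) {
        return empName + ": " + salary;
    }

    public static void main(String[] args) {
        // The main flow of control can already call both tasks.
        double salary = computeWeeklySalary(15.0, 40.0);
        System.out.println(formatPayslip("John Doe", salary)); // prints John Doe: 0.0
    }
}
```

The main flow compiles and runs even though one task is unfinished, so the overall design can be tested before the details are filled in.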

Maintainability

If we had to change the way we pay our employees, perhaps due to a change in deductions, and the pay computation were repeated throughout the program, we would have to change that code in a multitude of places. By using a function to carry out the job for all employees, changes can be limited to just one place in the code.

Reusability

Building functions to carry out tasks means that we can either reuse the code in other software and/or call upon the same code from other locations. An API that is shipped with most languages is a perfect example. Our code calling upon the PHP print routine shows the efficacy of this: we did not write its implementation ourselves; rather, it is shipped with the language.

Function Signature

To build a function in an HLL, one must define its function signature. The signature defines a function and informs the compiler or interpreter of details that it needs in order to call upon that function to execute. It also defines what it may return to the code which called the function. The following illustrates a function signature in Java (in this case, the function may be a method) for the task of paying an employee:

public double payEmp(String empName, double empBasePay, double empHours);

The function can be called upon to do its job with the following code:

myPay = payEmp("John Doe", 15.0, 40.0);

The pay amount would be computed by the function and the resulting value would be returned to the point of the call and placed in the variable.

The word public is a keyword in Java that is called an access modifier; it denotes where in code this method may be accessed from. In this case, it is available from anywhere in the program code.

The keyword double indicates that this method will return a value of that data type to the place where the method was called. In this case, we want it to return the total pay for the employee. In Java, if we use the keyword void for the return, it tells the compiler that we will not be returning any value.

The identifier payEmp is the name we have given to the method. Its syntax follows the same rules and best practices that are used for a variable identifier in this language.

The parentheses indicate that this is a function (method), thereby delineating it from a variable identifier.

A formal parameter represents a value that the function needs to receive to do its job. Parameters are not required, but if they are specified, the call of the function must pass actual parameter values for them into the function. Not doing so would result in a compile error, which will not allow the program to run. In fact, not complying with any of the signature items results in a compile error.

Function Call

A function call in an HLL must comply with the function signature. A Java call to the function looks like this:

double thisSalary = payEmp(thisName, thisPayRate, thisHrs);

Starting from left to right, we declare a variable thisSalary and assign to it the value that will be the return of the function. We then call the function, passing the values that are required for its defined parameters. The term argument is used for the values of a function call which must match the parameters in both number and data type.

Parameter Passing

Parameters are passed using one of two methods. In pass by value, a copy is made of the value and the copy is passed as an argument to the parameter. The function works only with that copy, so the value in the original variable cannot be affected by any changes that may occur inside the function.

On the other hand, pass by reference is employed in Java to pass a complex data type, meaning that the value stored inside the variable is a pointer to the memory address of the actual complex object. The value copied into the parameter in this case gives the code access to the object itself within the scope of the function, so any changes made to the object's contents will be reflected in the actual object that was passed to the function, as shown in Figure 7.16. (Strictly speaking, Java copies the reference itself, so assigning a new object to the parameter inside the function does not change which object the caller's variable refers to.)

Illustration of passing a parameter by the value 7 and one passing a parameter by reference Address 0f91hh0c.

Figure 7.16 Compare passing a parameter by value versus passing it by reference in C++. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

Let’s code the entire function in Java to get a complete view of its components:

public double payEmp(String empName, double empBasePay, double empHours) {
    double empPay = empBasePay * empHours;
    System.out.println(empName + " is being paid " + empPay);
    return empPay;
}

In the example, empBasePay and empHours are primitive data types and are passed by value. On the other hand, empName is passed by reference because String is a complex data type.
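The practical difference between the two passing methods can be observed from the caller's side. The Java sketch below is only an illustration with hypothetical method names; it uses a double for the primitive case and a one-element array for the complex case.

```java
// Hypothetical illustration: what the caller sees after each kind of call.
class ParameterPassingDemo {
    // Primitive parameter: the function changes only its own copy.
    static void raiseCopy(double pay) {
        pay = pay + 100.0;
    }

    // Array parameter: the function reaches the same array the caller holds.
    static void raiseShared(double[] pays) {
        pays[0] = pays[0] + 100.0;
    }

    // Reassigning the parameter only redirects the local copy of the reference.
    static void replaceArray(double[] pays) {
        pays = new double[] {0.0};
    }

    public static void main(String[] args) {
        double pay = 600.0;
        raiseCopy(pay);
        System.out.println(pay);     // prints 600.0

        double[] pays = {600.0};
        raiseShared(pays);
        System.out.println(pays[0]); // prints 700.0

        replaceArray(pays);
        System.out.println(pays[0]); // prints 700.0
    }
}
```

The primitive is untouched, the array's contents change, and reassigning the parameter has no effect on the caller's variable.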

Call Stack

The call stack (execution stack) is a data structure that controls the calling and return of functions. Other duties include storing local variables, which dictates scope, defined as the visibility and lifetime of variables. It also has control of parameter passing.

Describing the call stack is usually done by comparing it to a cafeteria tray unit. Clean trays are pushed onto the unit as they come in and are popped off the unit as required. This concept is known as last in, first out (LIFO). Thus, when a subroutine is called, it is pushed onto the call stack. Its presence and all of its parts on the stack are called a stack frame. When it finishes executing, it is popped off. The flow of control of the program follows the current state of the call stack. A call stack is outlined in Figure 7.17.

Depiction of a call stack from Higher addresses down to Lower addresses (“top of stack”).

Figure 7.17 This is representative of a call stack. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

In this implementation, function1 is called first. It is pushed onto the stack in stack frame format: the parameters with their values are pushed first, followed by the return address so that the flow of control can revert to the location of the function call after the stack frame has been popped off the stack. Lastly, the local variables are pushed onto the stack frame as they are defined. In this example, function1 has called function2, so the stack frame for function2 has been pushed onto the stack as well. When function2 is popped off the stack, control will revert to the function1 stack frame.

Looking at function1, we see that the parameters come into existence when the function is pushed onto the stack and disappear when the function is popped from the stack. The same is true of any local variables declared in the function. This is an illustration of the scope of a variable, which controls both the visibility and lifetime of a variable (i.e., a variable that is local to a function is only visible within the scope of that function for the duration of execution of that function). This scope is local because the visibility and lifetime are local to the function; the variables do not exist and cannot be seen from anywhere else in the program.
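The last in, first out order of the call stack can be observed by logging when each function is entered and left. The Java sketch below is only an illustration; it reuses the names function1 and function2 from the discussion above, while the class name, the log list, and the local variables are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: records the order in which frames are pushed and popped.
class CallStackOrder {
    static void function1(List<String> log) {
        int local1 = 1; // exists only while function1's frame is on the stack
        log.add("enter function1");
        function2(local1 + 1, log);
        log.add("exit function1");
    }

    static void function2(int param, List<String> log) {
        int local2 = param * 10; // local1 is not visible here
        log.add("enter function2 with local2=" + local2);
        log.add("exit function2");
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        function1(log);
        return log;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

The function entered last (function2) is the first to finish, and control then returns to function1, matching the push and pop order described above.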

Recursion

Recursion is the technique in which a function calls itself. If a function calls itself unconditionally, then the next instance of the function will also call itself, and so on. Unless there is a means within the function to shut down the process, this proceeds onward until the call stack runs out of memory, a condition known as stack overflow.

Certain problems or tasks that we want a program to engage with lend themselves to recursion, such as with a rocket launch. If we want to program the display of a countdown sequence to launch, we can do it recursively. The signature, known as a prototype in C++, might look like this:

void countDown(long);

As indicated by the void, the function will return nothing to its calling point. It has a parameter that represents the starting point of the countdown. Therefore, if we want to start the countdown at 100, the call will look like this:

countDown(100);

Building out the recursive function could be done as follows:

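#include <iostream>
using namespace std;

// Prints count, count - 1, ..., 0, one value per line.
void countDown(long count) {
   if (count < 0) // stop recursing once the countdown has passed zero
   {
      return;
   }
   cout << count << endl;
   count = count - 1;
   countDown(count);
}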
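The same recursive structure can be sketched in Java. This version is only an illustration: it collects the countdown values in a list instead of printing them, so the effect of the stopping test is easy to inspect. The class name and the list parameter are assumptions that are not part of the C++ example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a recursive countdown that records each value.
class CountDownDemo {
    static void countDown(long count, List<Long> out) {
        if (count < 0) {
            return; // stopping test: without it the recursion would never end
        }
        out.add(count);
        countDown(count - 1, out); // each call works on a smaller count
    }

    static List<Long> countDownFrom(long start) {
        List<Long> out = new ArrayList<>();
        countDown(start, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(countDownFrom(3)); // prints [3, 2, 1, 0]
    }
}
```

Each call pushes a new stack frame, and the frames are popped one by one once the count drops below zero.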

Well-Structured Programs

A major challenge with writing well-structured programs is the complexity of today’s systems, especially when so many programmers contribute to them. A common approach is to divide and conquer the task of writing programs through modularization, or splitting the large job into independent units which may then be called upon. We can now expand that concept to include entities that exist in completely separate pieces of source code.

Programs are very efficiently built out of modules, as shown in Figure 7.18. A module is a component of a program and has a public interface. For example, in Java the interface only includes documentation of what a user needs to access an object derived from a class, along with its attributes and methods. The interface documents exactly how components can be used by other programmers: what they are named, what, if anything, they require to do their jobs, and what they may or may not return. Documentation is necessary for another programmer to know how to employ them.

Main Program depiction with Module 1 and Module 2 both leading to two separate functions.

Figure 7.18 This diagram shows a program made up of multiple modules, each of which contains multiple functions. (attribution: Copyright Rice University, OpenStax, under CC BY 4.0 license)

This allows us to engage in information hiding: the hiding of bits and bytes of implementation that the user does not need to know to make use of the module. A good example is the PHP print function; its implementation is not located in any place that we can see. Rather it comes from a module that somebody has programmed and is shipped with the language API, such as in Figure 7.19.

Figure 7.19 This portion of the PHP manual gives the documentation for the PHP print construct. (credit: The PHP Documentation Group, CC BY 3.0)
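Information hiding can be sketched in Java with a small public interface in front of a private implementation. The example is only an illustration; PayModule, PayCalculator, HourlyCalculator, and newCalculator are hypothetical names chosen for this sketch.

```java
// Hypothetical illustration: callers see the interface, not the implementation.
class PayModule {
    // Public interface: the name, the required values, and the returned value.
    interface PayCalculator {
        double pay(double basePay, double hours);
    }

    // Hidden implementation: it can change without affecting callers.
    private static final class HourlyCalculator implements PayCalculator {
        @Override
        public double pay(double basePay, double hours) {
            return basePay * hours;
        }
    }

    // The only way for other code to obtain a calculator.
    static PayCalculator newCalculator() {
        return new HourlyCalculator();
    }

    public static void main(String[] args) {
        PayCalculator calculator = newCalculator();
        System.out.println(calculator.pay(15.0, 40.0)); // prints 600.0
    }
}
```

Code that uses the module depends only on what the interface documents, so the implementation behind it could be replaced without changing any caller.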

Exception Handling

No code is perfect; therefore, any computer program can generate runtime errors. This type of error usually indicates that an exceptional condition has occurred. These errors are generated by a hardware-detected run-time error or an unusual condition detected by software. Some of these errors include arithmetic overflow, division by zero, end-of-file on input, and user-defined conditions (not necessarily hardware or software errors).

Most modern languages divide these conditions by category: a runtime error is an exception that is serious enough that it cannot or should not be handled by the software, whereas an exception is an unusual behavior from which the program can recover using the exception handling support that the programming language provides (e.g., the try/catch clause in Java). The proper terminology is that an error or exception has been thrown.
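A try/catch clause in Java can be sketched as follows, using integer division by zero as the exceptional condition. The class and method names are hypothetical, and returning the text "undefined" is just one possible way to recover.

```java
// Hypothetical illustration: recovering from a division by zero.
class ExceptionDemo {
    static String safeDivide(int numerator, int denominator) {
        try {
            // Integer division by zero throws ArithmeticException in Java.
            return String.valueOf(numerator / denominator);
        } catch (ArithmeticException e) {
            // The exception is caught here, so the program keeps running.
            return "undefined";
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(10, 2)); // prints 5
        System.out.println(safeDivide(1, 0));  // prints undefined
    }
}
```

Without the catch block, the thrown exception would travel up the call stack and, if nothing handled it, end the program.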

Files and Input/Output

File handling features and the input/output (I/O) capabilities (facilities that allow a program to communicate with the outside world) are highly dependent on both the operating system which is hosting the program and the HLL. All HLLs differ in the way that these are handled. The following are some examples of different output statements in various HLLs:

Java: System.out.println("Hello");
C#: Console.WriteLine("Hello");
C: printf("Hello");
C++: cout << "Hello";
JavaScript: alert("Hello");
PHP: print("Hello");

There are as many ways of handling file I/O as there are HLLs. This is a subject that requires research with every new language.
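As one example of what such research might turn up, the Java sketch below writes a few lines to a temporary file and reads them back using the standard java.nio.file.Files class. The class name, method name, and file prefix are hypothetical; other languages, and other Java APIs, handle the same task differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration: write lines to a temporary file, then read them back.
class FileRoundTrip {
    static List<String> writeThenRead(List<String> lines) {
        try {
            Path file = Files.createTempFile("hll-demo", ".txt");
            try {
                Files.write(file, lines);        // one list element per line
                return Files.readAllLines(file); // read the whole file back
            } finally {
                Files.deleteIfExists(file);      // clean up the temporary file
            }
        } catch (IOException e) {
            // File operations can fail for reasons outside the program's control.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(List.of("Hello", "World"))); // prints [Hello, World]
    }
}
```

The operating system decides where the temporary file lives, which is one reason file handling depends on the host system as well as on the language.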

Link to Learning

Aside from being one of the most used programming languages in the world, Java is a very teachable language. As such, the Basic Language Constructs of Java serves as a great reference for learning this new language.

7.2 Programming Language Constructs