178 Chapter 14 ■ The basics
Programming languages should also display a high degree of orthogonality. This
means that it should be possible to combine language features freely; special cases and
restrictions should not be prevalent. Java and similar languages distinguish between two
types of variables – built-in primitive types and proper objects. This means that these two
groups must be treated differently, for example, when they are inserted into a data
structure. A lack of orthogonality in a language has an unsettling effect on programmers; they
no longer have the confidence to make generalizations and inferences about the language.
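As a small illustration of this distinction in Java, the sketch below stores integers in a list. It is only a sketch: the names OrthogonalityDemo and sumOf are invented for the example. A primitive int cannot be used as the element type of the list, so each value is wrapped in an Integer object on the way in and unwrapped on the way out.

```java
import java.util.ArrayList;
import java.util.List;

class OrthogonalityDemo {
    // Sums a list of Integer objects; each element is unwrapped to a primitive int.
    static int sumOf(List<Integer> values) {
        int total = 0;
        for (Integer boxed : values) {
            total += boxed.intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // List<int> would be rejected by the compiler: a type argument must be an object type.
        List<Integer> values = new ArrayList<Integer>();
        values.add(Integer.valueOf(3)); // the primitive 3 is wrapped in an object
        values.add(Integer.valueOf(4));
        System.out.println(sumOf(values));
    }
}
```

An array of int, by contrast, holds the primitive values directly, so the same data is handled in two different ways depending on the structure chosen.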
It is no easy matter to design a language that is simple, clear and orthogonal. Indeed, in
some cases these goals would seem to be incompatible with one another. A language designer
could, for the sake of orthogonality, allow combinations of features that are not very useful.
Simplicity would be sacrificed for increased orthogonality! While we await the simple,
clear, orthogonal programming language of the future, these concepts remain good measures
with which the software engineer can evaluate the programming languages of today.
14.4 ● Language syntax
The syntax of a programming language should be consistent, natural and promote the
readability of programs. Syntactic flaws in a language can have a serious effect on
program development.
One syntactic flaw found in languages is the use of begin-end pairs or bracketing
conventions, {}, for grouping statements together. Omitting an end or closing bracket
is a very common programming error. The use of explicit keywords, such as endif and
endwhile, leads to fewer errors and more readily understandable programs. Programs
are also easier to maintain. For example, consider adding a second statement to the
Java if statement shown below.
    if (integerValue > 0)
        numberOfPositiveValues = numberOfPositiveValues + 1;
We now have to group the two statements together into a compound statement using
a pair of braces.
    if (integerValue > 0) {
        numberOfPositiveValues = numberOfPositiveValues + 1;
        numberOfNonZeroValues = numberOfNonZeroValues + 1;
    }
Some editing is required here. Compare this with the explicit keyword approach in the
style of Visual Basic. Here the only editing required would be the insertion of the new
statement.
    if (integerValue > 0)
        numberOfPositiveValues = numberOfPositiveValues + 1;
    endif
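The editing hazard in the Java form can be made concrete. In the sketch below (the class and method names are invented for illustration), the second statement is added without the braces: the indentation suggests that it belongs to the if, but the compiler treats it as an ordinary statement that is always executed.

```java
class BraceOmissionDemo {
    // Braces forgotten: only the first statement is controlled by the if.
    static int[] countWithoutBraces(int[] values) {
        int numberOfPositiveValues = 0;
        int numberOfNonZeroValues = 0;
        for (int integerValue : values) {
            if (integerValue > 0)
                numberOfPositiveValues = numberOfPositiveValues + 1;
                numberOfNonZeroValues = numberOfNonZeroValues + 1; // runs on every pass
        }
        return new int[] { numberOfPositiveValues, numberOfNonZeroValues };
    }

    // Braces present: both statements are controlled by the if.
    static int[] countWithBraces(int[] values) {
        int numberOfPositiveValues = 0;
        int numberOfNonZeroValues = 0;
        for (int integerValue : values) {
            if (integerValue > 0) {
                numberOfPositiveValues = numberOfPositiveValues + 1;
                numberOfNonZeroValues = numberOfNonZeroValues + 1;
            }
        }
        return new int[] { numberOfPositiveValues, numberOfNonZeroValues };
    }

    public static void main(String[] args) {
        int[] sample = { 5, -2, 0, 7 };
        System.out.println(countWithoutBraces(sample)[1]); // second counter, braces missing
        System.out.println(countWithBraces(sample)[1]);    // second counter, braces present
    }
}
```

Both versions compile without complaint, which is why this kind of slip can go unnoticed.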
BELL_C14.QXD 1/30/05 4:23 PM Page 178
In addition, explicit keywords eliminate the classic “dangling else” problem prevalent
in many languages – see the discussion of selection statements below.
Ideally the static, physical layout of a program should reflect as far as is possible the
dynamic algorithm which the program describes. There are a number of syntactic concepts
which can help achieve this goal. The ability to format a program freely allows the
programmer the freedom to use such techniques as indentation and blank lines to highlight
the structure and improve the readability of a program. For example, prudent indentation
can help convey to the programmer that a loop is nested within another loop. Such
indentation is strictly redundant, but assists considerably in promoting readability. Older
languages, such as Fortran and Cobol, impose a fixed formatting style on the programmer.
Components of statements are constrained to lie within certain columns on each
input source line. For example, Fortran reserves columns 1 through 5 for statement labels
and columns 7 through 72 for program statements. These constraints are not intuitive to
the programmer. Rather they date back to the time when programs were normally presented
to the computer in the form of decks of 80-column punched cards and a program
statement was normally expected to be contained on a single card.
The readability of a program can also be improved by the use of meaningful identifiers
to name program objects. Limitations on the length of names, as found in early versions
of Basic (two characters) and Fortran (six characters), force the programmer to use
unnatural, cryptic and error-prone abbreviations. These restrictions were dictated by the
need for efficient programming language compilers. Arguably, programming languages
should be designed to be convenient for the programmer rather than the compiler, and
the ability to use meaningful names, irrespective of their length, enhances the
self-documenting properties of a program. More recent languages allow the programmer to
use names of unrestricted length, so that program objects can be named appropriately.
Another factor which affects the readability of a program is the consistency of the syntax
of a language. For example, operators should not have different meanings in different
contexts. The operator “=” should not double as both the assignment operator and the
equality operator. Similarly, it should not be possible for the meaning of language keywords to
change under programmer control. The keyword if, for example, should be used solely for
expressing conditional statements. If the programmer is able to define an array with the
identifier if, the time required to read and understand the program will be increased as we
must now examine the context in which the identifier if is used to determine its meaning.
14.5 ● Control structures
A programming language for software engineering must provide a small but powerful
set of control structures to describe the flow of execution within a program unit.
In the late 1960s and 1970s there was considerable debate as to what control structures
were required. The advocates of structured programming have largely won the
day and there is now a reasonable consensus of opinion as to what kind of primitive
control structures are essential. A language must provide primitives for the three basic
structured programming constructs: sequence, selection and repetition. There are,
however, considerable variations both in the syntax and the semantics of the control
structures found in modern programming languages.
Early programming languages, such as Fortran, did not provide a rich set of control
structures. The programmer used a set of low-level control structures, such as the
unconditional branch or goto statement and the logical if to express the control flow
within a program. For example, the following Fortran program fragment illustrates
the use of these low-level control structures to simulate a condition-controlled loop.
      n = 10
   10 if (n .eq. 0) goto 20
      write (6,*) n
      n = n - 1
      goto 10
   20 continue
These low-level control structures provide the programmer with too much freedom
to construct poorly structured programs. In particular, uncontrolled use of the goto
statement for controlling program flow leads to programs which are, in general, hard
to read and unreliable.
There is now general agreement that higher level control abstractions must be provided
and should consist of:
■ sequence – to group together a related set of program statements
■ selection – to select whether a group of statements should be executed or not based
on the value of some condition
■ repetition – to execute repeatedly a group of statements.
This basic set of primitives fits in well with the top-down philosophy of program
design; each primitive has a single entry point and a single exit point. These primitives
are realized in similar ways in most programming languages. For brevity, we will look
in detail only at representative examples from common programming languages. For
further details on this subject refer to Chapter 7 on structured programming.
14.6 ● Selection
Java, in common with most modern languages, provides two basic selection constructs.
The first, the if statement, provides one- or two-way selection, and the second, the case
statement, provides a convenient multiway selection structure.
Dangling else
Does the language use explicit closing symbols, such as endif, thus avoiding the
“dangling else” problem? Nested if structures of the form shown below raise the
question of how ifs and elses are to be matched. Is the “dangling” else associated
with the outer or inner if? Remember that the indentation structure is of no
consequence.
    if (condition)
        if (condition)
            statement1
    else
        statement2
Java resolves this dilemma by applying the rule that an else is associated with
the most recent non-terminated if lacking an else. Thus, the else is associated
with the inner if. If, as the indentation suggests, we had intended the else to be
associated with the outer if, we have to resort to clumsy fixes. But the clearest and
cleanest solution is afforded by the provision of explicit braces (or keywords) as
follows.
    if (condition) {
        if (condition) {
            statement1
        }
    }
    else {
        statement2
    }
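The matching rule can be checked with a small Java sketch. The class and method names are invented for the example, and the strings simply record which statement, if any, was executed. In the first method the else is laid out under the outer if, yet it is paired with the inner one.

```java
class DanglingElseDemo {
    // No braces: the else belongs to the inner if, whatever the layout suggests.
    static String withoutBraces(boolean outer, boolean inner) {
        String result = "neither";
        if (outer)
            if (inner)
                result = "statement1";
        else
            result = "statement2";
        return result;
    }

    // Braces make the else belong to the outer if.
    static String withBraces(boolean outer, boolean inner) {
        String result = "neither";
        if (outer) {
            if (inner) {
                result = "statement1";
            }
        } else {
            result = "statement2";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(false, true));
        System.out.println(withBraces(false, true));
    }
}
```

When the outer condition is false, the unbraced version executes neither statement, whereas the braced version executes statement2.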
Nesting
Nested if statements can quite easily become unreadable. Does the language provide
any help? For example, the readability of “chained” if statements can be improved by
the introduction of an elsif clause. In particular, this eliminates the need for multiple
endifs to close a series of nested ifs. Consider the following example, with and without
the elsif form. Java does not provide an elsif facility, but some languages do,
for example, Visual Basic.Net.
    if condition1 then              if condition1 then
        statement1                      statement1
    else if condition2 then         elsif condition2 then
        statement2                      statement2
    else if condition3 then         elsif condition3 then
        statement3                      statement3
    else if condition4 then         elsif condition4 then
        statement4                      statement4
    else                            else
        statement5                      statement5
    endif                           endif
    endif
    endif
    endif
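Java has no endif, so a chain written with else if does not accumulate closing keywords and reads much like the elsif form on the right. A minimal sketch follows; the method classify and its mark thresholds are invented purely for illustration.

```java
class ElseIfChainDemo {
    // Each condition is tested only if all the earlier ones were false.
    static String classify(int mark) {
        String grade;
        if (mark >= 70) {
            grade = "distinction";
        } else if (mark >= 60) {
            grade = "merit";
        } else if (mark >= 40) {
            grade = "pass";
        } else {
            grade = "fail";
        }
        return grade;
    }

    public static void main(String[] args) {
        System.out.println(classify(65));
    }
}
```

Strictly, each else if here is still an if statement nested inside the preceding else; it is only the absence of closing keywords that keeps the layout flat.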
Case
Like other languages, Java provides a case or switch statement. Here it is used to find
the number of days in each month:
    switch (month) {
        case 1:
        case 3:
        case 5:
        case 7:
        case 8:
        case 10:
        case 12:
            days = 31;
            break;
        case 4:
        case 6:
        case 9:
        case 11:
            days = 30;
            break;
        case 2:
            days = 28;
            break;
        default:
            days = 0;
            break;
    }
The break statement causes control to be transferred to the end of the switch
statement. If a break statement is omitted, execution continues onto the next case and
generally this is not what you would want to happen. So inadvertently omitting a break
statement creates an error that might be difficult to locate. If the default option is
omitted, and no case matches, nothing is done.
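The effect of a missing break can be seen in the following Java sketch. It is a cut-down variant of the days-in-month example, with invented class and method names, and it makes the simplifying assumption that every month not listed has 31 days.

```java
class SwitchFallThroughDemo {
    // Every group of cases ends with a break.
    static int daysWithBreak(int month) {
        int days = 0;
        switch (month) {
            case 4: case 6: case 9: case 11:
                days = 30;
                break;
            case 2:
                days = 28;
                break;
            default:
                days = 31; // simplification: all other values count as 31-day months
                break;
        }
        return days;
    }

    // The break after "days = 30" has been left out by mistake.
    static int daysMissingBreak(int month) {
        int days = 0;
        switch (month) {
            case 4: case 6: case 9: case 11:
                days = 30;
                // no break: execution falls through into case 2
            case 2:
                days = 28;
                break;
            default:
                days = 31;
                break;
        }
        return days;
    }

    public static void main(String[] args) {
        System.out.println(daysWithBreak(4));
        System.out.println(daysMissingBreak(4));
    }
}
```

The faulty version compiles cleanly and gives the right answer for most months, which is what makes this kind of error hard to locate.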
The expressiveness of the case statement is impaired if the type of the case selector
is restricted. It should not have to be an integer (as above), but in most languages it is.
Similarly, it should be easy to specify multiple alternative case choices (e.g. 1|5|7
meaning 1 or 5 or 7) and a range of values as a case choice (e.g. 1..99). But Java does
not allow this.
The reliability of the case statement is enhanced if the case choices must specify
actions for all the possible values of the case selector. If not, the semantics should, at
least, clearly state what will happen if the case expression evaluates to an unspecified
choice. The ability to specify an action for all unspecified choices through a default
or similar clause is appealing.
There is something of a controversy here. Some people argue that when a case statement
is executed, the programmer should be completely aware of all the possibilities that
can occur. So the default statement is redundant and just an invitation to be lazy and
sloppy. Where necessary, the argument goes, a case statement should be preceded by
if statements that ensure that only valid values are supplied to the case statement.
if-not
It would be reasonable to think that there would no longer be any controversy over language
structures for selection. The if-else is apparently well established. However,
the lack of symmetry in the if statement is open to criticism. While it is clear that the
then part is carried out if the condition is true, the else part is rather tagged on at the
end to cater for all other situations. Experimental evidence suggests that significantly
fewer bugs arise if the programmer is required to restate the condition (in its negative
form) prior to the else as shown below:
    if condition
        statement1
    not condition else
        statement2
    endif
14.7 ● Repetition
Control structures for repetition traditionally fall into two classes. There are loop structures
where the number of iterations is fixed, and those where the number of iterations
is controlled by the evaluation of some condition. Fixed length iteration is often implemented
using a form similar to that shown below:
    for control_variable = initial_expression to final_expression step step_expression do
        statement(s)
    endfor
The usefulness and reliability of the for statement can be affected by a number of
issues, as now discussed.
Should the type of the loop control variable be limited to integers? Perhaps any ordinal
type should be allowed. However, reals (floats) should not be allowed. For example,
consider how many iterations are specified by the following:
    for x = 0.0 to 1.0 step 0.33 do
Here it is not at all obvious exactly how many repetitions will be performed, and things
are made worse by the fact that computers represent real values only approximately.
(Note how disallowing the use of reals as loop control variables conflicts with the aim
of orthogonality.)
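Java does allow a floating-point control variable, so the difficulty can be demonstrated directly. The sketch below uses a step of 0.1 rather than 0.33, and invented names, purely for illustration. One might expect the first loop to run 10 times, but 0.1 has no exact binary representation: after ten additions the running value is still slightly below 1.0, so an eleventh pass is made.

```java
class FloatLoopDemo {
    // Loop controlled by a double that is stepped by 0.1.
    static int countWithDoubleControl() {
        int iterations = 0;
        for (double x = 0.0; x < 1.0; x += 0.1) {
            iterations++;
        }
        return iterations;
    }

    // Loop controlled by an int; the count is exact.
    static int countWithIntControl() {
        int iterations = 0;
        for (int i = 0; i < 10; i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(countWithDoubleControl());
        System.out.println(countWithIntControl());
    }
}
```

A common remedy is to control the loop with an integer and derive the real value inside the body, for example as i * 0.1.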
The semantics of the for is greatly affected by the answers to the following questions.
When and how many times are the initial expression, final expression and step
expressions evaluated? Can any of these expressions be modified within the loop? What
is of concern here is whether or not it is clear how many iterations of the loop will be
performed. If the expressions can be modified and the expressions are recomputed on
each iteration, then there is a distinct possibility of producing an infinite loop. Similar
problems arise if the loop control variable can be modified within the loop.
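In Java the answer to the first question is that the condition of a for statement is tested afresh before every repetition. The sketch below (names invented for the example) changes the limit inside the body and so changes the number of repetitions; declaring the limit final makes the compiler reject any such change.

```java
class LoopBoundDemo {
    // The limit is altered inside the loop, and the altered value is the one tested.
    static int iterationsWhenLimitChanges() {
        int limit = 3;
        int iterations = 0;
        for (int i = 0; i < limit; i++) {
            if (i == 1) {
                limit = 5; // extends the loop while it is running
            }
            iterations++;
        }
        return iterations;
    }

    // A final limit cannot be assigned to in the body, so the count is fixed on entry.
    static int iterationsWithFixedLimit() {
        final int limit = 3;
        int iterations = 0;
        for (int i = 0; i < limit; i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(iterationsWhenLimitChanges());
        System.out.println(iterationsWithFixedLimit());
    }
}
```

Had the body increased the limit on every pass, the loop would not terminate in any useful sense, which is the infinite loop risk noted above.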
The scope of the loop control variable is best limited to the for statement, as in
Java. If it is not, then what should its value be on exit from the loop, or should it be
undefined?
Condition-controlled loops are simpler in form. Almost all modern languages provide
a leading decision repetition structure (while-do) and some, for convenience, also
provide a trailing decision form (repeat-until).
    while condition do              repeat
        statement(s)                    statement(s)
    endwhile                        until condition
The while form continues to iterate while a condition evaluates to true. Since the
test appears at the head of the form, the while performs zero or many iterations of the
loop body. The repeat, on the other hand, iterates until a condition is true. The test
appears following the body of the loop, ensuring that the repeat performs at least one
iteration. Thus the while statement is the more general looping mechanism of the two,
so if a language provides only one looping mechanism, it should therefore be the while.
However, the repeat is more appropriate in some programming situations.
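The difference in the minimum number of iterations can be checked in Java. Java has no repeat-until; its trailing decision form is do-while, which continues while the condition is true. The following sketch, with invented names, counts the repetitions made by each form.

```java
class LoopFormsDemo {
    // Leading decision: the body may not be executed at all.
    static int whileIterations(int start) {
        int n = start;
        int iterations = 0;
        while (n > 0) {
            n--;
            iterations++;
        }
        return iterations;
    }

    // Trailing decision: the body is executed at least once.
    static int doWhileIterations(int start) {
        int n = start;
        int iterations = 0;
        do {
            n--;
            iterations++;
        } while (n > 0);
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(whileIterations(0));
        System.out.println(doWhileIterations(0));
    }
}
```

The two forms agree whenever the condition is true on entry; they differ only when it is false from the start.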
SELF-TEST QUESTION
14.1 Identify a situation where repeat is more appropriate than while.
Some languages provide the opposites of these two loops:
    do
        statement(s)
    while condition
and:
    until condition do
        statement(s)
    end until
C, C++, C# and Java all provide while-do and do-while structures. They also provide
a type of for statement that combines together several commonly used ingredients.
An example of this loop structure is:
    for (i = 0; i < 10; i++) {
        statement(s)
    }
in which:
■ the first statement within the brackets is done once, before the loop is executed
■ the second item, a condition, determines whether the loop will continue
■ the third statement is executed at the end of each repetition.
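The three ingredients listed above can be made explicit by writing the same loop as a while. The sketch below uses invented method names; both methods add up the integers from 0 to n - 1.

```java
class ForAsWhileDemo {
    static int sumWithFor(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    static int sumWithWhile(int n) {
        int sum = 0;
        int i = 0;          // first item: done once, before the loop
        while (i < n) {     // second item: tested before each repetition
            sum += i;
            i++;            // third item: done at the end of each repetition
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFor(10));
        System.out.println(sumWithWhile(10));
    }
}
```

One difference remains: in the for version the variable i is not visible after the loop, whereas in the while version it is.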
We will meet yet another construct for repetition – the foreach statement – in the
chapter on object-oriented programming language features (Chapter 15). This is convenient
for processing all the elements of a data structure.
The while and repeat structures are satisfactory for the vast majority of iterations
we wish to specify. For the most part, loops which terminate at either their beginning
or end are sufficient. However, there are situations, notably when encountering some
exceptional condition, where it is appropriate to be able to branch out of a repetition
structure at an arbitrary point within the loop. Sometimes it is necessary to break out
of a series of nested loops rather than a single loop. In many languages, the programmer
is limited to two options. The terminating conditions of each loop can be enhanced
to accommodate the “exceptional” exit, and if statements can be used within the loop
to transfer control to the end of the loop should the exceptional condition occur. This
solution is clumsy at best and considerably decreases the readability of the code. A second,
and arguably better, solution is to use the much-maligned goto statement to
branch directly out of the loops. Ideally, however, since there is a recognized need for
n and a half times loops, the language should provide a controlled way of exiting from
one or more loops. Java provides the following facility where an orderly break may be
made but only to the statement following the loop(s).
    while (condition) {
        statement(s)
        if (condition) break;
        statement(s)
    }
In the example above, control will be transferred to the statement following the loop
when the condition tested by the if statement is true. This may be the only way of
exiting from this loop.
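A concrete instance of such a loop in Java is sketched below, using invented names. Values are added up until a negative sentinel is met; the exit test sits in the middle of the body, after a value has been fetched but before it is used.

```java
class BreakDemo {
    // Sums the values that precede the first negative value (the sentinel).
    static int sumUntilNegative(int[] values) {
        int sum = 0;
        int index = 0;
        while (index < values.length) {
            int value = values[index];   // first half of the body
            if (value < 0) break;        // exit from the middle of the loop
            sum += value;                // second half of the body
            index++;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumUntilNegative(new int[] { 4, 5, -1, 9 }));
    }
}
```

Here the loop has two exits: the break, and the test at the head of the loop, which guards against an array that contains no sentinel.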
    here:
    while (condition) {
        while (condition) {
            statement(s)
            if (exitCondition) break here;
            statement(s)
        }
    }
In the second example above, control will be transferred out of both while loops
when exitCondition is true. Note how the outer while loop is labeled here: and
how this label is used by the if statement to specify that control is to be transferred to
the end of the while loop (not the beginning) when exitCondition is satisfied.
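A runnable Java sketch of the labeled form follows. The names, including the label search, are invented for the example, which scans a two-dimensional array and leaves both loops as soon as a negative entry is found.

```java
class LabeledBreakDemo {
    // Returns the row index of the first negative entry, or -1 if there is none.
    static int rowOfFirstNegative(int[][] grid) {
        int foundRow = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] < 0) {
                    foundRow = row;
                    break search; // leaves the inner and the outer loop together
                }
            }
        }
        return foundRow;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, -4 }, { -5, 6 } };
        System.out.println(rowOfFirstNegative(grid));
    }
}
```

An unlabeled break at the same point would leave only the inner loop, and the scan would carry on with the next row.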
SELF-TEST QUESTION
14.2 Sketch out the code for a method to search an array of integers to find
some desired integer. Write two versions – one using the break mechanism
and one without break.
The languages C, C++, Ada and Java provide a mechanism such as the above for
breaking out in the middle of loops.
There is some controversy about using break statements. Some people argue that it
is simply too much like the notorious goto statement. There is a difference, however,
because break can only be used to break out of a loop (or a switch statement), not
enter into a loop. Neither can a plain, unlabeled break be used to break out of an if
statement. Thus it might be argued that break is a goto that is under control.
Handling errors or exceptional situations is a common programming task. In
the past, such an eventuality was handled using the goto statement. Nowadays features
are built into programming languages to facilitate the more elegant handling of such
situations. We discuss the handling of exceptions in Chapter 17.
14.8 ● Methods
Procedural or algorithmic abstraction is one of the most powerful tools in the programmer’s
arsenal. When designing a program, we abstract what should be done
before we specify how it should be done. Before OOP, program designs evolved as
layers of procedural abstractions, each layer specifying more detail than the layer
above. Procedural abstractions in programming languages, such as procedures and
functions, allow the layered design of a program to be accurately reflected in the
structure of the program text. Even in relatively small programs, the ability to factor
a program into small, functional modules is essential; factoring increases the readability
and maintainability of programs. What does the software engineer require
from a language in terms of support for procedural abstraction? We suggest the
following list of requirements:
■ an adequate set of primitives for defining procedural abstractions
■ safe and efficient mechanisms for controlling communication between program units
■ simple, clearly defined mechanisms for controlling access to data objects defined
within program units.
Procedures and functions
The basic procedural abstraction primitives provided in programming languages are
procedures and functions. Procedures can be thought of as extending the statements of
the language, while functions can be thought of as extending the operators of the language.
A procedure call looks like a distinct statement, whereas a function call appears
as or within an expression.
The power of procedural abstraction is that it allows the programmer to consider the
method as an independent entity performing a well-described task largely independent
of the rest of the program. When a procedure is called, it achieves its effect by modifying
the data in the program which called it. Ideally, this effect is communicated to the
calling program unit in a controlled fashion by the modification of the parameters
passed to the procedure. Functions, like their mathematical counterparts, return only
a single value and must therefore be embedded within expressions. A typical syntax for
writing procedures and functions is shown below:
    void procedureName(parameters) {
        declarations
        procedure body
    }
    resultType functionName(parameters) {
        declarations
        function body
        return value;
    }
It is critical that the interface between program units be small and well defined if we
are to achieve independence between units. Ideally both procedures and functions
should only accept but not return information through their parameters. A single result
should be returned as the result of calling a function.
For example, to place text in a text box, use a procedure call as illustrated by the following
code:
    setText("your message here");
and a function call to obtain a value:
    String text = getText();
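To show the two kinds of call side by side in a complete program, here is a minimal Java sketch. The class MessageBox is hypothetical: it stands in for a text box and is not taken from any GUI library. Its setText method is used as a statement, while its getText method is used within an expression.

```java
class MessageBox {
    private String text = "";

    // Procedure-like method: returns nothing and is called as a statement.
    void setText(String newText) {
        text = newText;
    }

    // Function-like method: returns a single value for use in an expression.
    String getText() {
        return text;
    }

    public static void main(String[] args) {
        MessageBox box = new MessageBox();
        box.setText("your message here");              // procedure call
        int length = box.getText().length();           // function call inside an expression
        System.out.println(box.getText() + " has " + length + " characters");
    }
}
```

Note that setText accepts information through its parameter but returns none, and that getText returns its single result as the value of the call.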