Please Do Not Spare the Parentheses

Parentheses often seem an afterthought. Readable code is deemed important, but often the emphasis seems to focus on typographical concerns such as indentation and spacing. The same can be said for good commenting, which is non-operative commentary. These practices are part of every introductory programming class. Documentation and readability are important, yet correctness of code is even more critical. The proper use of parentheses often does not receive similar attention. This is a severe shortcoming.

Such parenthetical parsimony is misplaced. Failure to adequately parenthesize expressions is a poor practice that often leads to errors. Some of these errors are subtle; others are not. It is a problem that occurs far too often.

Ever since I attended elementary school, I have been using parentheses to group elements of a computation. Programming is an even more demanding regime. This is particularly true when writing macros. The complication often appears when macros are nested, or potentially nested, within other macros. Relying on order of operations is certainly safe in much code, but there are areas which are particularly prone to accidents. The other week, I ran into another such case.

I have been writing macros in a variety of languages since I first started programming. I started with FORTRAN statement function definitions, then macros in a variety of assembler languages on a wide range of architectures,[1] and subsequently wrote macros in LITTLE[2] and C/C++. In all cases, I found that omitting parentheses is a linguistic blunder that often has catastrophic consequences.

Insufficient parentheses can create a syntactically invalid macro expansion. This is the best case. More pernicious cases involve syntactically valid macro expansions that do not operate as intended. Hidden in layers of macros, such errors can appear transient. In reality the error depends on the precise nature of the surrounding text. Problems like these often cause developers to avoid using or writing macros. Avoidance is unnecessary. A few simple rules prevent these problems in all cases.

Regardless of the language, macro processors are simple. They expand a simple input into a different text. As their name implies, generally this is expected to be a longer string, but this need not be the case.[3] As a powerful, but admittedly simplistic case, macro processors are often used to implement symbolic constants. Macro expansion is done at the earliest stages of processing during lexical analysis, the front end of the compilation process.

As an example, consider the C macro amacro (Figure 1):

#define amacro(x)     x * 5 + y
Figure 1 – C macro amacro

When the macro is expanded during lexical analysis, the text “x * 5 + y” is substituted each time amacro occurs in the program text, with the actual argument of the invocation replacing x. In typical C compilers, the output of the lexical pass can be examined and the results of the expansion seen directly. A careful examination of the macro illustrates why the example above is poorly written.
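One hedged way to see this substitution without leaving the program is to have the preprocessor turn the expanded text into a string literal. The sketch below assumes the Figure 1 definition. STR, XSTR, and the two helper functions are illustrative names that do not appear in the original, and the exact spacing inside the strings may vary between preprocessors.

```c
/* Figure 1 definition, copied verbatim. */
#define amacro(x)     x * 5 + y

/* Hypothetical helpers: XSTR expands its argument first, then STR
   turns the resulting tokens into a string literal. */
#define STR(text)  #text
#define XSTR(text) STR(text)

/* Text that the compiler proper sees for amacro(c). */
static const char *expansion_of_simple_argument(void)
{
    return XSTR(amacro(c));
}

/* Text that the compiler proper sees for amacro(c+4). */
static const char *expansion_of_expression_argument(void)
{
    return XSTR(amacro(c+4));
}
```

With a typical compiler, the first function returns the text c * 5 + y and the second returns c+4 * 5 + y. Neither c nor y needs to be declared here, since the expansion is only turned into a string and never evaluated.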

When x is a simple variable, the results are as expected. However, when an invocation of amacro is within a more complicated expression, the first error comes to light.

z = ~amacro(c);
Figure 2 – Invocation of amacro with bitwise complemented result

Figure 2 appears innocuous. amacro could be a function call, or it could be a macro. If it is a function call, amacro is invoked with a parameter of c, and the bitwise complement of the result is assigned to z.

If amacro is a macro, there is a potential problem. The problem becomes apparent when one looks at the expansion of amacro(c). Bitwise complement has a higher precedence than multiplication and addition, thus the actual expression generated by the above is: “z = ~c * 5 + y”, which is evaluated as ((~c) * 5) + y. Most likely, this was not the expression that was intended by the author. It is also not stable. Some parameters and contexts expand to produce one formula; others produce a mathematically different formula. Small changes in the unexpanded source code can have surprising and most likely unintended side effects. This is particularly true when macros are nested. These problems are difficult to identify, as they are often not apparent on a casual reading of the text. This is one of the fears when working with macros. This weakness is curable by adding a pair of parentheses (Figure 3).

#define amacro(x) (x * 5 + y)
Figure 3 – amacro computation wrapped in parentheses

Wrapping the entire expansion in parentheses ensures that the text surrounding the macro invocation is irrelevant. The extra parentheses assure us that the expansion of the macro will be treated as a single entity for evaluating expressions.
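The effect of the outer parentheses on the Figure 2 invocation can be checked with a small sketch. It places the Figure 1 and Figure 3 definitions side by side under the hypothetical names amacro_bare and amacro_wrapped, and compares them with an ordinary function, afunc, that computes the same formula. The variable y, its value of 3, and the helper functions are assumptions made for illustration only.

```c
/* Assumed for illustration: the macros expect a variable y in scope. */
static int y = 3;

#define amacro_bare(x)     x * 5 + y      /* Figure 1 definition */
#define amacro_wrapped(x)  (x * 5 + y)    /* Figure 3 definition */

/* Ordinary function computing the same formula, for comparison. */
static int afunc(int x)
{
    return x * 5 + y;
}

/* Expands to ~c * 5 + y, that is ((~c) * 5) + y. */
static int complement_bare(int c)
{
    return ~amacro_bare(c);
}

/* Expands to ~(c * 5 + y). */
static int complement_wrapped(int c)
{
    return ~amacro_wrapped(c);
}

/* Complements the value returned by the function. */
static int complement_func(int c)
{
    return ~afunc(c);
}
```

With y set to 3 and c equal to 2, the bare macro yields -12, while the wrapped macro and the function both yield -14.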

The second flaw in amacro is the lack of a containing set of parentheses around each use of the formal parameter (here, x). This omission creates a second latent hazard.

Consider what happens if the parameter is an expression (Figure 4).

s = amacro(c+4);
z0 = Radius * s * 2;
z1 = Radius * amacro(c+4) * 2;
Figure 4 – amacro with expression as parameter

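The expansion of z1 in Figure 4, using the definition in Figure 3, is: “z1 = Radius * (c+4 * 5 + y) * 2”. Looked at algebraically, one would expect the macro to compute (c + 4) * 5 + y. A look at the textual expansion reveals that only the 4 is multiplied by 5, because multiplication binds more tightly than addition; s, z0, and z1 all use c + 20 + y instead of the intended value. With the unparenthesized definition in Figure 1 the situation is worse: z1 expands to “Radius * c+4 * 5 + y * 2”, so z0 and z1 are no longer even interchangeable. Once again, the problem is a concealed problem involving operator precedence.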
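The arithmetic behind Figure 4 can be made concrete with a short sketch. The names amacro_bare and amacro_wrapped are hypothetical copies of the Figure 1 and Figure 3 definitions, afunc is an ordinary function with the intended formula, and the value of y is assumed for illustration.

```c
/* Assumed for illustration: the macros expect a variable y in scope. */
static int y = 3;

#define amacro_bare(x)     x * 5 + y      /* Figure 1 definition */
#define amacro_wrapped(x)  (x * 5 + y)    /* Figure 3 definition */

/* The intended formula, written as a function: (x * 5) + y. */
static int afunc(int x)
{
    return x * 5 + y;
}

/* z0 from Figure 4 with the Figure 1 macro: s = c+4 * 5 + y. */
static int z0_bare(int Radius, int c)
{
    int s = amacro_bare(c+4);
    return Radius * s * 2;
}

/* z1 with the Figure 1 macro: Radius * c+4 * 5 + y * 2. */
static int z1_bare(int Radius, int c)
{
    return Radius * amacro_bare(c+4) * 2;
}

/* z1 with the Figure 3 macro: Radius * (c+4 * 5 + y) * 2. */
static int z1_wrapped(int Radius, int c)
{
    return Radius * amacro_wrapped(c+4) * 2;
}

/* z1 with the function: the argument c+4 is evaluated first. */
static int z1_func(int Radius, int c)
{
    return Radius * afunc(c + 4) * 2;
}
```

With Radius equal to 10, c equal to 1, and y equal to 3, the function version gives 560. The Figure 3 macro gives 480, and the Figure 1 macro gives 480 for z0 but 36 for z1.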

The problem of concealed operator precedence hazards is resolved using the same tool as was used earlier: parentheses. In this case, the hazard is eliminated by wrapping each use of a formal parameter in its own set of parentheses. This ensures that implicit operator precedence will not create a problem when the macro is expanded.

With both oversights eliminated, the revised definition of amacro (Figure 5) is safe. There is no need for concern when the parameter to the macro is an expression, or when the macro itself is used in an expression.

#define amacro(x) ( (x) * 5 + y)
Figure 5 – Corrected C macro amacro
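As a rough check, the Figure 5 definition can be compared with an ordinary function in both of the contexts discussed earlier. In the sketch below, afunc, the helper functions, and the value of y are assumptions introduced for illustration; only the amacro definition comes from Figure 5.

```c
/* Assumed for illustration: the macro expects a variable y in scope. */
static int y = 3;

#define amacro(x) ( (x) * 5 + y)          /* Figure 5 definition */

/* Ordinary function computing the same formula, for comparison. */
static int afunc(int x)
{
    return x * 5 + y;
}

/* Figure 2 context: expands to ~( (c) * 5 + y). */
static int complement_macro(int c)
{
    return ~amacro(c);
}

static int complement_func(int c)
{
    return ~afunc(c);
}

/* Figure 4 context: expands to Radius * ( (c+4) * 5 + y) * 2. */
static int scaled_macro(int Radius, int c)
{
    return Radius * amacro(c+4) * 2;
}

static int scaled_func(int Radius, int c)
{
    return Radius * afunc(c + 4) * 2;
}
```

With y set to 3, complement_macro(2) and complement_func(2) both yield -14, and scaled_macro(10, 1) and scaled_func(10, 1) both yield 560.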

Many examples of this approach can be found within the standard macro collections contained in the C/C++ header files.

When writing macros of any flavor, sparing parentheses is a dangerous practice fraught with hazard. The parsimony is misplaced. Extra parentheses do not have any deleterious effects. The overhead during compilation is negligible, and there is no cost during execution.

Notes

[1] IBM System/360/370, PDP-11 MACRO-11, and VAX MACRO-32 among others
[2] A portable systems implementation language conceived by Jack Schwartz of NYU
[3] Symbolic constants are often defined with names far longer than their actual numeric value (e.g., The ANSI C standard defines that the constant FILENAME_MAX shall be defined as the implementation-defined maximum length of a filename. On one of the author's OpenVMS systems, it is defined as 256).

Robert Gezelter, CDP, Software Consultant
http://www.rlgsc.com
+1 (718) 463 1079