Rewriting Naur's "Programming as Theory Building"

October 23, 2020 - 9 minutes read - 1711 words

Peter Naur was undoubtedly a great computer scientist: Turing Award winner, co-creator of the influential Algol language, and co-inventor of the Backus-Naur form, the notation used to describe the syntax of programming languages. The significance of Naur’s contributions is beyond question. One of my favorite works of his is “Programming as Theory Building.”

However, the language, style, and density of the original make the article less accessible than it deserves to be, and that is a shame. In this post I’d like to humbly submit my edit: “Programming as Theory Building 1.01.”

Programming as Theory Building (1.01)

Introduction

Programming feels like many different activities as we do it: meditation, craft, equation-solving, conversing. All of these can fit under a common name: “theory building.” This idea comes from Gilbert Ryle’s 1949 book, The Concept of Mind.

“Theory,” as Ryle and I use it, means a kind of insight into the matters at hand such that the thinker can expound, to themselves or to the world, the whole theory or any part of it.

Theory building is the activity by which programmers obtain the theory. They “rehearse” it in their minds or with others, verbally or in writing, and find dimly understood areas. They understand that today’s theory might have some dim areas and that it is preparation for a fuller understanding tomorrow.

We don’t talk much about “theory building” as the work of programmers. The typical view is the “production view,” which holds that the purpose of programming is to produce a block of text (“code”). This is to our detriment.

Communication suffers when this view rules. It leads team members to explain “what code to type” rather than share “what this part is about” or “why this design accommodates changes and extensions well.” It also creates frustration in interactions with non-technical stakeholders. If a program is seen as its code, stakeholders might well ask why it takes so long to produce a program. “Write one word then another then another! What’s so hard?”

With “theory building” as a shared framework, more meaningful communication is possible. With it, we may hope to have more productive teams, more fruitful discussions, and more empathy for one another.

Against the Production View

The production view holds that access to well-written code and documentation (the “products”) is sufficient for extending codebases in a well-integrated fashion. Real-world experiences suggest this view to be incorrect.

For example, Team 1 wrote a product and released it for a given platform. Team 2 sought to port that product to a new platform. To help in this effort, Team 2 contracted to get both the code and documentation as well as a certain amount of review hours from Team 1.

In post-delivery analysis, the review process was identified as the most valuable practice. Team 1 frequently told Team 2 that one of its proposals bypassed a carefully added, elegant, and documented feature or interface. By using Team 1’s API, Team 2 introduced fewer lines of code and avoided bypasses (or “hacks”) on well-tested code paths.

Ten years after Team 2’s successful delivery, a Team 1 member gained access to the maintained code. They found a mess. The Team 1/2 hybrid’s original, powerful, clean structure was buried under heaps of bypasses. In many cases, these bypasses undermined the power and simplicity of the original releases. Beyond mere aesthetics, because the intended code paths had been circumvented, error-handling code had been circumvented too, and new bugs were introduced which, in turn, necessitated new layers of testing.

Whatever prevented such degradation was passed from Team 1 to Team 2, but not from Team 2 to the maintenance programmers, even though the code and documentation were of similar quality in both handoffs.

Access to perfect documentation failed to prevent inelegant extensions and updates.

This suggests that the production view is missing a product.

This missing item is the theory of the program. It seems that the theory can be readily (and perhaps accidentally) communicated in interpersonal collaboration. The production view makes no accommodation for it. Likely explanations for this omission are:

It is not recognized as an artifact
The understood artifact is not deemed worth capturing / an experience worth facilitating
It is impossible to capture such a thing as an artifact

The remainder of this document describes the artifact that results from “theory building” and explains how it helps us avoid unnecessary costs beyond the anecdotes above.

The Theory To Be Built by the Programmer

In terms of Ryle’s notion of theory, what has to be built by the programmer is a theory of how certain affairs of the world will be handled by a computer program.

But code created by those without the theory shows the same syntactic elements (if, collection[2]) as code created by those with it. Code written by theory-havers would superficially appear the same as code written by theory-deficient programmers.

To detect theory possession, one must interact with the programmers at a human level. Those who possess the theory are able to articulate:

How the program represents the known interactions in the world

For example, when a user inserts a valid credit card, presses the “Dr Pepper” icon, and “Dr Pepper” cans are present, an electrical signal is sent to the actuator on the “Dr Pepper” bay and the machine releases one can of soda. Consequently, this group of conditionals serves as a gate to executing the code that opens the bay.

How the program’s choices are defensible by means of appealing to the theory

For example, when asked “Why is there an actuateBay(int bayId) method?” they can answer: because vending machines of this mechanical design actuate bays to release cans, which, subject to gravity, fall into the retrieval bin.

How the program can be modified to match new affairs of the world in a sensible fashion

Supposing a new model of vending machine is developed which uses pull-out drawers, theory-holders might immediately suppose an unlockDrawer(int drawerId) routine.
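The three abilities above can be made concrete with a minimal Java sketch. Only actuateBay(int bayId) and unlockDrawer(int drawerId) come from the text; every other name here (ReleaseMechanism, BayMechanism, DrawerMechanism, VendingMachine, vend, stock) is a hypothetical invented for illustration, and the hardware is simulated by recording the last slot touched. It is one way a theory-holder might arrange the code, not the only defensible one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical boundary: the one place that knows how a product is physically released.
interface ReleaseMechanism {
    void release(int slotId);
}

// Gravity-fed design from the example: actuating a bay drops one can into the bin.
class BayMechanism implements ReleaseMechanism {
    int lastActuatedBay = -1; // recorded so the sketch can be observed without hardware

    void actuateBay(int bayId) {
        lastActuatedBay = bayId;
    }

    @Override
    public void release(int slotId) {
        actuateBay(slotId);
    }
}

// The newer design: the customer pulls out a drawer once it has been unlocked.
class DrawerMechanism implements ReleaseMechanism {
    int lastUnlockedDrawer = -1;

    void unlockDrawer(int drawerId) {
        lastUnlockedDrawer = drawerId;
    }

    @Override
    public void release(int slotId) {
        unlockDrawer(slotId);
    }
}

class VendingMachine {
    private final ReleaseMechanism mechanism;
    private final Map<Integer, Integer> stockBySlot = new HashMap<>();

    VendingMachine(ReleaseMechanism mechanism) {
        this.mechanism = mechanism;
    }

    void stock(int slotId, int count) {
        stockBySlot.merge(slotId, count, Integer::sum);
    }

    int remaining(int slotId) {
        return stockBySlot.getOrDefault(slotId, 0);
    }

    // The gate: release only when the card is valid and the selected slot has stock.
    boolean vend(boolean cardIsValid, int slotId) {
        if (!cardIsValid || remaining(slotId) == 0) {
            return false;
        }
        stockBySlot.merge(slotId, -1, Integer::sum);
        mechanism.release(slotId);
        return true;
    }
}

class VendingSketch {
    public static void main(String[] args) {
        VendingMachine gravityFed = new VendingMachine(new BayMechanism());
        gravityFed.stock(3, 2);
        System.out.println("vended: " + gravityFed.vend(true, 3));
    }
}
```

In this arrangement the drawer model is an extension with an obvious home: a second ReleaseMechanism. The gate in vend does not change, because the theory says that payment and stock are independent of how the can leaves the machine.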

Theory possessors can be said to possess a superior insight into the domain. Accordingly, their code mirrors the real-world phenomena at hand more naturally. Extensions will have obvious-seeming places to be inserted. When interviewed, they will be able to determine whether a procedure call is a misplaced bypass (“hack”) or fits with the harmonious entirety of the code driven by theory.

Due to their inferior insight, programmers without a theory of the program can only appeal to the mechanistic operation of the code, operating as a human implementer of the programming language’s rules and operation. At best they’re competent, but slow, manipulators of abstract data. It cannot, therefore, be surprising that their code would be less robust or thoughtful: they cannot see around corners, intuit edge cases, or ask why instead of how.

Lack of comprehension, that is, lack of theory, creates one of the biggest costs in software development: costs of modification.

Problems and Costs of Program Modification

Programs will be modified. We seek to ensure that modified code 1) satisfies new requirements and 2) does not degrade any pre-existing capability.

ASIDE: The expectation that program modifications ought to be possible at low cost merits a closer look. We don’t expect this of buildings: the design of a sports arena is not judged a success by the ease with which it supports the functions of a hospital (“ease of modification”). Likewise, a building that’s built to function as both a sports arena and a hospital (“flexible”) is likely to excel at neither purpose. To support a flexible purpose, additional costs will have to be incurred or custom, flexible-use structures will have to be built (refrigerators that store both blood bags and ice cream, in our hospital / arena example). Additionally, if one of the flexible “modes” is rarely used, the returns on those investments are never realized.

The belief that software should be easily modified and / or flexible might constitute a critical cognitive error in thinking about software.

In order to modify existing code, the present code must be analyzed and its shortcomings against an ideal must be found. To “project” an ideal requires theory-possession. Theory-possessors can then turn this evaluation into an insight that allows them to identify 1) the natural place where a new path must be followed and 2) the natural place where that path’s results ought to be reintegrated into understood, pre-existing code flows.

NOTE: We would not say that the modification requires a new theory. As stated earlier, possession of the theory includes the ability to make changes.

But what of those without the theory? They interpret the code and, since correct outcome is their only criterion for “correct modification,” they edit the code to behave in such a way as the modification requires without reference to an external, harmonizing criterion: the theory. A desired modification may usually be realized in many different ways, all correct, per the rules of the programming language, yet it is only the theory-holder who knows the right place to make that change. Indeed it is only a theory-possessor who could suppose this notion of “rightness.”

Modifications made in deafness to the harmony of the theory will be viewed by theory-possessors as “hacks” or “patches.” Coding “rightly” is oftentimes called “taste” or “elegance” among software developers.

When “hacks” become integrated into the code, error, confusion, and unforeseen costs enter. Edits lacking foundation in theory obscure and confuse. Single paths suddenly become multiple; magic parameters appear in one case but in no others; and so on. It is these asymmetries and special cases that create the bugs and costs that flummox those who expect software modification to be cost-effective, swift, and easy.
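To illustrate how a single path becomes multiple, here is a small Java sketch under stated assumptions. The scenario (a technician who wants to test-vend a can) and all names other than actuateBay are hypothetical and do not come from Naur’s article. Both modifications satisfy the new requirement when the bay is stocked; only one of them keeps the original gate intact.

```java
import java.util.HashMap;
import java.util.Map;

class Dispenser {
    final Map<Integer, Integer> stockByBay = new HashMap<>();
    int actuations = 0; // stands in for the physical actuator

    void actuateBay(int bayId) {
        actuations++;
    }

    // Original single path: every release passes through the same gate.
    boolean vend(boolean authorized, int bayId) {
        int stock = stockByBay.getOrDefault(bayId, 0);
        if (!authorized || stock == 0) {
            return false;
        }
        stockByBay.put(bayId, stock - 1);
        actuateBay(bayId);
        return true;
    }

    // The "hack": a second path that reaches the actuator directly.
    // It skips both the empty-bay check and the stock bookkeeping.
    void hackTestVend(int bayId) {
        actuateBay(bayId);
    }

    // Theory-aligned: a technician key is one more way of being authorized,
    // so the modification reuses the existing gate instead of going around it.
    boolean testVend(boolean technicianKeyPresent, int bayId) {
        return vend(technicianKeyPresent, bayId);
    }
}

class BypassSketch {
    public static void main(String[] args) {
        Dispenser dispenser = new Dispenser();
        dispenser.stockByBay.put(1, 1);
        dispenser.hackTestVend(1);
        // The bay was actuated, yet the recorded stock is unchanged.
        System.out.println("stock after hack: " + dispenser.stockByBay.get(1));
    }
}
```

After hackTestVend, the machine’s records and its physical contents disagree, which is the kind of asymmetry that later surfaces as a bug far from the edit that caused it.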

When all changes are made in accordance with the theory, they each contribute to strengthening the quality of the program. The text reinforces the theory; the theory nourishes the code.

Program Life, Death and Revival

If the continued existence of a theory is the determinant of whether code will be “full of hacks” or “an elegant and internally-consistent system,” it is natural to ask how the theory comes to be, how it ceases to exist, and how it might be reinstated. I name these phases “life,” “death,” and “revival” to match the normal phases of a life.

END

Motivation

In sharing the original article with colleagues, several noted that the work runs against our general guidelines of readability, friendliness, and approachability.

Footnotes

Naur, Peter. “Programming as Theory Building.” Microprocessing and Microprogramming, Volume 15, Issue 5 (May 1985): 253-261.