Long Live the GOTO Statement

Introduction: Infamous GOTO

Sure, since Dijkstra’s letter outlining the harmful aspects of the goto statement, few have voiced even modest tolerance for the statement, let alone condoned its use. Even those who’ve described practical uses of the goto statement have questioned its existence in higher-level languages (e.g., although Donald Knuth noted some utility for goto, he also suggested that he would likely never use it in a language that had sufficiently capable iteration and event constructs).

Today you can find a myriad of online resources that set the goto statement ablaze. The 5.3 release of PHP provided a unique look at the perception of the goto statement, as prior to that release, PHP lacked the statement. It’s one thing to work with languages that have grandfathered in the statement but deprecated its use, but adding goto to a language that lacked the construct fueled the flames of a thousand suns.

However, even Dijkstra was careful to limit his critique to the basic form of the goto statement prevalent during that era of programming:

“The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one’s program.”

So, what could the goto statement possibly do to benefit today’s programmers? Even though we don’t need the goto statement to implement our programs, does that mean we can’t benefit from its thoughtful use?

Hypothesis: GOTO Can Be the Right Tool

Let me state a bold, audacious hypothesis: The goto statement can significantly facilitate the creation of new code and the readability/editability of existing code whilst maintaining a high level of performance.

Problem: Deeply Nested Logic

When writing new code, a programmer works from a mental model of a problem, and the mental model drives the code generation. In contrast, when reading or editing existing code, the programmer attempts to synthesize a mental model from existing code, and the code facilitates generation of the mental model. The differences between these two underlying processes are significant, and they can lead to code that was easily written, but proves very difficult to read.

Deeply nested conditionals, a code structure some refer to as arrow code (see Arrow Anti Pattern, Flattening Arrow Code), provide an example of code that suffers from poor readability. When crafting new code, I’ve whipped out five levels of nested conditionals without breaking a sweat. Such is the power of working from a comprehensive mental model. However, returning to deeply nested code (even after a short break) has often led to hours of exasperating re-learning and refactoring.

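Fortunately, there are techniques that can be used to limit the depth of the conditionals, including:

  • Guard clauses that bail out of the function as early as possible
  • Extracting blocks of logic into separate functions
  • Grouping conditions to avoid nesting conditionals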

These techniques can be very effective, as they all help flatten the display of the program flow and limit the levels of nesting. However, they are not without their own potential issues: they can hurt readability (related logic may end up farther apart, and the flow of operations might not match the natural mental schema), complicate maintainability, and degrade performance (e.g., by adding function calls to the stack).
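
As a minimal sketch of one common flattening technique, the guard clause (an early return), the two functions below compute the same result. The names `discount_nested` and `discount_guarded` and the discount rules are hypothetical, invented purely for illustration.

```php
<?php
// Hypothetical example: names and discount rules are invented for illustration.

// Nested form: each precondition adds a level of indentation.
function discount_nested($total, $is_member)
{
	if ($total > 0) {
		if ($is_member) {
			if ($total >= 100) {
				return 0.15;
			} else {
				return 0.05;
			}
		} else {
			return 0.0;
		}
	} else {
		return 0.0;
	}
}

// Guard-clause form: bail out early so the main path stays flat.
function discount_guarded($total, $is_member)
{
	if ($total <= 0) {
		return 0.0;
	}

	if (!$is_member) {
		return 0.0;
	}

	if ($total >= 100) {
		return 0.15;
	}

	return 0.05;
}
```

The guarded form trades one deep structure for several exits, which is exactly where the readability trade-offs mentioned above start to appear.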

Example: Deeply Nested Value Store Function

Let’s look at an example of some code that’s deeply nested. We’re going to use PHP to craft our example because PHP offers a form of the goto statement that significantly restricts its use:

  • Goto targets must point somewhere within the same file and context, so goto cannot jump out of the current function/method.
  • Goto targets cannot be used to jump into a control structure, so goto cannot jump into a loop or switch statement.
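
To make these rules concrete, here is a small sketch of a goto that stays within a single function. The function `first_negative` is a hypothetical example, not part of the value store that follows. It jumps out of a nested loop, which PHP permits; a jump into the loop body from outside would be rejected when the script is compiled.

```php
<?php
// Hypothetical example: find the first negative number in a 2-D array.
function first_negative(array $rows)
{
	foreach ($rows as $row) {
		foreach ($row as $n) {
			if ($n < 0) {
				// Jumping out of a loop to a label in the same function is allowed.
				goto found;
			}
		}
	}

	// Reached only when no negative number exists.
	return null;

	found:
		return $n;
}
```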

Let’s work through a simple function that works as a value store. It allows you to set individual values, retrieve a single value or the entire set of stored values, and mark a value as mutable when it is first set. Values are otherwise immutable, so subsequent updates to them throw an exception. I’ve omitted comments so the characteristics of the code flow and structures remain the focal point of the example.

function val_nested($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		return $values;
	} else {
		if ($value === null) {
			if (isset($values[$name])) {
				return $values[$name];
			} else {
				return null;
			}
		} else {
			if (isset($values[$name])) {
				if (!in_array($name, $mutables)) {
					throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
				} else {
					return $values[$name] = $value;
				}
			} else {
				if ($is_mutable) {
					$mutables[] = $name;
				} 

				$values[$name] = $value;
				return $value;
			}
		}
	}
}

Refactored Example: Nesting Reduced Using Standard Practices

Below, I’ve provided a refactored version of the value store function that utilizes a combination of the refactoring approaches outlined earlier. Guard clauses have been utilized to bail out as early as possible in the function. The get and set operations have been pulled out into separate functions. And, conditions have been grouped to avoid nesting conditionals. The result is code that requires no more than one level of if-blocks.

My concerns with the refactored version include:

  • The flow of the code neither matches the mental model I have when writing new code nor promotes the acquisition of a mental model when I’m reading existing code.
  • The proximity of relevant information seems to suffer.
  • Adding functions purely for organization (i.e., functions that are unlikely to be reused) costs performance. Some may say that this concern is tantamount to premature optimization. I disagree, as we’re talking about the techniques that will be used to organize every single function/method block in the code base, and doubling the function calls used in any code base (which in this example roughly doubles the execution time of the refactored function) is a meaningful performance concern.

function val_refactor_get($name, $values)
{
	if (isset($values[$name])) {
		return $values[$name];
	} else {
		return null;
	}
}

function val_refactor_set($name, $value, $is_mutable, &$values, &$mutables)
{
	$val_already_set = isset($values[$name]);

	if (!$val_already_set && $is_mutable) {
		$mutables[] = $name;
		return $values[$name] = $value;
	}

	if (!$val_already_set && !$is_mutable) {
		return $values[$name] = $value;
	}

	$stored_val_is_mutable = in_array($name, $mutables);

	if (!$stored_val_is_mutable) {
		throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
	}

	return $values[$name] = $value;
}

function val_refactor($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		return $values;
	}

	if ($value === null) {
		return val_refactor_get($name, $values);
	}

	return val_refactor_set($name, $value, $is_mutable, $values, $mutables);
}
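
The cost of the extra function calls is easy to measure rather than assume. The following is a rough micro-benchmark sketch, not the measurement behind the figure quoted above: `lookup_inline`, `lookup_helper`, `lookup_delegating`, and `time_calls` are hypothetical names, and the timings will vary with the PHP version and the machine.

```php
<?php
// Hypothetical micro-benchmark: compare a direct lookup with one that
// delegates to a helper function. All names here are illustrative.

function lookup_inline(array $values, $name)
{
	return isset($values[$name]) ? $values[$name] : null;
}

function lookup_helper(array $values, $name)
{
	return isset($values[$name]) ? $values[$name] : null;
}

function lookup_delegating(array $values, $name)
{
	// One extra function call per lookup.
	return lookup_helper($values, $name);
}

// Returns the elapsed wall-clock seconds for $iterations calls of $fn.
function time_calls($fn, array $values, $name, $iterations)
{
	$start = microtime(true);

	for ($i = 0; $i < $iterations; $i++) {
		$fn($values, $name);
	}

	return microtime(true) - $start;
}

$values = array('answer' => 42);

printf("inline:     %.4f s\n", time_calls('lookup_inline', $values, 'answer', 1000000));
printf("delegating: %.4f s\n", time_calls('lookup_delegating', $values, 'answer', 1000000));
```

Running each variant several times and comparing the ratios, rather than the absolute numbers, gives a fairer picture of the overhead.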

GOTO Example: A New (Old) Hope

The last code example utilizes PHP’s goto construct to eliminate the deep nesting. As noted earlier, PHP places some restrictions on its version of the goto construct, and we’ll add one more for the sake of our usage: every labeled block must end by returning a value, throwing an exception, or jumping to another label. This self-imposed restriction ensures that no flow of execution will fall unintentionally into any other labeled block.

I see a few primary concerns with this version. First, this example is the longest of the three. Second, IDE/debugger support for this approach likely ranges from poor to nonexistent. Third, and this is a biggie: THIS EXAMPLE USES GOTO AND SOME OF YOU WOULD NEVER, EVER LET THAT HAPPEN!!!

That said, I sincerely believe this version holds significant advantages over the previous versions thanks to the goto construct. While writing new code, I can follow the natural flow of the mental model I’ve developed without worrying about controlling the nesting of the logic through the other techniques. And, while reading and applying edits to existing code, the goto labels provide valuable meta information about the sections of code. If the need arises to add an additional check in the logic, the goto version facilitates finding the location to add the appropriate code, and the structure accommodates the changes with relative ease. Finally, because the goto version doesn’t require more function calls, its performance is on par with the version utilizing deeply-nested if-blocks.

function val_goto($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		goto get_all_values;
	} else {
		goto access_value;
	}

	get_all_values:
		return $values;

	access_value:
		if ($value === null) {
			goto get_value;
		} else {
			goto set_value;
		}

	get_value:
		if (isset($values[$name])) {
			return $values[$name];
		} else {
			return null;
		}

	set_value:
		if (isset($values[$name])) {
			goto set_existing_value;
		} else {
			goto set_new_value;
		}	

	set_existing_value:
		if (!in_array($name, $mutables)) {
			throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
		} else {
			return $values[$name] = $value;
		}

	set_new_value:
		if ($is_mutable) {
			$mutables[] = $name;
		}

		return $values[$name] = $value;
}
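
The reason for the extra self-imposed rule becomes clear in a sketch of what happens when a labeled block does not end with a return, a throw, or another goto. The functions `classify_unsafe` and `classify_safe` are hypothetical and exist only to show the hazard.

```php
<?php
// Hypothetical example: labels are not barriers, so execution that reaches
// the end of one labeled block simply continues into the next one.

function classify_unsafe($n)
{
	if ($n < 0) {
		goto negative;
	} else {
		goto non_negative;
	}

	negative:
		$result = 'negative';
		// No return here: execution falls through into the next block.

	non_negative:
		$result = 'non-negative';
		return $result;
}

function classify_safe($n)
{
	if ($n < 0) {
		goto negative;
	} else {
		goto non_negative;
	}

	negative:
		return 'negative';

	non_negative:
		return 'non-negative';
}
```

Here `classify_unsafe(-1)` returns 'non-negative' because the first block falls through and its result is overwritten, while `classify_safe(-1)` returns 'negative'.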

Conclusion: Goto Can Be Your Friend

OK, you’ve heard my case for goto, or at least for the restricted version of goto found in PHP (augmented with our one additional restriction) when used to combat deeply nested logic. I believe the use of the goto construct in the final example does in fact facilitate the creation of new code and the readability/editability of existing code whilst maintaining a high level of performance.

What do you think?

Update Feb. 17, 2012:
Here’s a great read from the Linux kernel mailing list that discusses why the goto statement is used in the code base: http://kerneltrap.org/node/553