Long Live the GOTO Statement

Introduction: Infamous GOTO

Since Dijkstra’s letter outlining the harmful aspects of the goto statement, few have voiced even modest tolerance for the statement, let alone condoned its use. Even those who’ve described practical uses of goto have questioned its place in higher-level languages. Donald Knuth, for example, noted some utility for goto, but he also suggested that he would likely never use it in a language that had sufficiently capable iteration and event constructs.

Today you can find a myriad of online resources that set the goto statement ablaze. The 5.3 release of PHP provided a unique look at the perception of the goto statement, as prior to that release, PHP lacked the statement. It’s one thing to work with languages that have grandfathered in the statement but deprecated its use, but adding goto to a language that lacked the construct fueled the flames of a thousand suns.

However, even Dijkstra was careful to limit his critique to the basic form of the goto statement prevalent during that era of programming:

“The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one’s program.”

So, what could the goto statement possibly do to benefit today’s programmers? Even though we don’t need the goto statement to implement our programs, does that mean we can’t benefit from its thoughtful use?

Hypothesis: GOTO Can Be the Right Tool

Let me state a bold, audacious hypothesis: The goto statement can significantly facilitate the creation of new code and the readability/editability of existing code whilst maintaining a high level of performance.

Problem: Deeply Nested Logic

When writing new code, a programmer works from a mental model of a problem, and the mental model drives the code generation. In contrast, when reading or editing existing code, the programmer attempts to synthesize a mental model from existing code, and the code facilitates generation of the mental model. The differences between these two underlying processes are significant, and they can lead to code that was easily written, but proves very difficult to read.

Deeply nested conditionals, a code structure some refer to as arrow code (see Arrow Anti Pattern, Flattening Arrow Code), provide an example of code that suffers from poor readability. When crafting new code, I’ve whipped out five levels of nested conditionals without breaking a sweat. Such is the power of working from a comprehensive mental model. However, returning to deeply nested code (even after a short break) has often led to hours of exasperating re-learning and refactoring.

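Fortunately, there are techniques that can be used to limit the depth of the conditionals, including:

  • Using guard clauses to bail out of a function as early as possible.
  • Extracting nested logic into separate functions.
  • Grouping conditions to avoid nesting conditionals.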

These techniques can be very effective, as they all help flatten the display of the program flow and limit the levels of nesting. However, they are not without their own potential issues: they can hurt readability (relevant factors may end up far apart, and the flow of operations might not match the natural mental schema), complicate maintainability, and degrade performance (e.g., by adding function calls to the stack).

Example: Deeply Nested Value Store Function

Let’s look at an example of some code that’s deeply nested. We’re going to use PHP to craft our example because PHP offers a form of the goto statement that significantly restricts its use:

  • Goto targets must point somewhere within the same file and context, so goto cannot jump out of the current function/method.
  • Goto targets cannot be used to jump into a control structure, so goto cannot jump into a loop or switch statement.
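
As a minimal sketch of what these restrictions still allow, the function below jumps forward out of two nested loops to a label in the same function. The function name and data are hypothetical and chosen only for illustration; they are not part of the article's value store.

```php
<?php
// Hypothetical helper: returns the [row, column] position of the first
// negative number in a two-dimensional array, or null if there is none.
function find_negative_demo(array $rows)
{
	$found = null;

	foreach ($rows as $r => $row) {
		foreach ($row as $c => $cell) {
			if ($cell < 0) {
				$found = array($r, $c);
				// Jumping OUT of loops is allowed. A goto whose label
				// sits inside a loop or switch is rejected by PHP.
				goto done;
			}
		}
	}

	done:
	return $found;
}
```

Both the goto and its label live inside the same function, which is the only arrangement PHP accepts.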

Let’s work through a simple function that works as a value store. It allows you to store or set individual values, retrieve the entire set of stored values, and declare values immutable so subsequent updates throw an exception. I’ve omitted comments so the characteristics of the code flow and structures remain the focal point of the example.

function val_nested($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		return $values;
	} else {
		if ($value === null) {
			if (isset($values[$name])) {
				return $values[$name];
			} else {
				return null;
			}
		} else {
			if (isset($values[$name])) {
				if (!in_array($name, $mutables)) {
					throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
				} else {
					return $values[$name] = $value;
				}
			} else {
				if ($is_mutable) {
					$mutables[] = $name;
				} 

				$values[$name] = $value;
				return $value;
			}
		}
	}
}

Refactored Example: Nesting Reduced Using Standard Practices

Below, I’ve provided a refactored version of the value store function that utilizes a combination of the refactoring approaches outlined earlier. Guard clauses have been utilized to bail out as early as possible in the function. The get and set operations have been pulled out into separate functions. And, conditions have been grouped to avoid nesting conditionals. The result is code that requires no more than one level of if-blocks.

My concerns with the refactored version include:

  • The flow of the code neither matches the mental model I have when writing new code nor promotes the acquisition of a mental model when I’m reading existing code.
  • The proximity of relevant information seems to suffer.
  • Adding functions merely for organization (i.e., the functions are unlikely to be reused) degrades performance solely for the sake of organization. Some may say that this concern is tantamount to premature optimization. I disagree, as we’re talking about the techniques that will be used to organize every single function/method block in the code base, and doubling the function calls used in any code base (which in this example roughly doubles the execution time of the refactored function) is a meaningful performance concern.

function val_refactor_get($name, $values)
{
	if (isset($values[$name])) {
		return $values[$name];
	} else {
		return null;
	}
}

function val_refactor_set($name, $value, $is_mutable, &$values, &$mutables)
{
	$val_already_set = isset($values[$name]);

	if (!$val_already_set && $is_mutable) {
		$mutables[] = $name;
		return $values[$name] = $value;
	}

	if (!$val_already_set && !$is_mutable) {
		return $values[$name] = $value;
	}

	$stored_val_is_mutable = in_array($name, $mutables);

	if (!$stored_val_is_mutable) {
		throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
	}

	return $values[$name] = $value;
}

function val_refactor($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		return $values;
	}

	if ($value === null) {
		return val_refactor_get($name, $values);
	}

	return val_refactor_set($name, $value, $is_mutable, $values, $mutables);
}
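
The function-call concern raised above is something you can measure rather than take on faith. The following is a rough sketch, not the benchmark behind the article's numbers: all three function names are hypothetical, and results will vary with PHP version, OPcache, and hardware.

```php
<?php
// Hypothetical functions: one does the work directly, the other adds
// one extra function call around the same work.
function add_direct_demo($a, $b)
{
	return $a + $b;
}

function add_wrapped_demo($a, $b)
{
	return add_direct_demo($a, $b);
}

// Times $iterations calls of the named function and returns seconds elapsed.
function time_calls_demo($fn, $iterations)
{
	$start = microtime(true);
	for ($i = 0; $i < $iterations; $i++) {
		$fn($i, 1);
	}
	return microtime(true) - $start;
}

$direct = time_calls_demo('add_direct_demo', 100000);
$wrapped = time_calls_demo('add_wrapped_demo', 100000);

printf("direct: %.4fs, wrapped: %.4fs\n", $direct, $wrapped);
```

Running the same kind of harness against the nested and refactored value store functions would show whether the extra calls matter for a given workload.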

GOTO Example: A New (Old) Hope

The last code example utilizes PHP’s goto construct to eliminate the deep nesting. As earlier noted, PHP provides some restrictions on its version of the goto construct, and we’ll add one more for the sake of our usage: All goto branches must return a value. This self-imposed restriction ensures that no flow of execution will fall unintentionally into any other labeled block.

I see a few primary concerns with this version. First, this example is the longest of the three. Second, IDE/debugger support for this approach likely ranges from poor to nonexistent. Third, and this is a biggie: THIS EXAMPLE USES GOTO AND SOME OF YOU WOULD NEVER, EVER LET THAT HAPPEN!!!

That said, I sincerely believe this version holds significant advantages over the previous versions thanks to the goto construct. While writing new code, I can follow the natural flow of the mental model I’ve developed without worrying about controlling the nesting of the logic through the other techniques. And, while reading and applying edits to existing code, the goto labels provide valuable meta information about the sections of code. If the need arises to add an additional check in the logic, the goto version facilitates finding the location to add the appropriate code, and the structure accommodates the changes with relative ease. Finally, because the goto version doesn’t require more function calls, its performance is on par with the version utilizing deeply-nested if-blocks.

function val_goto($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	if ($name === null) {
		goto get_all_values;
	} else {
		goto access_value;
	}

	get_all_values:
		return $values;

	access_value:
		if ($value === null) {
			goto get_value;
		} else {
			goto set_value;
		}

	get_value:
		if (isset($values[$name])) {
			return $values[$name];
		} else {
			return null;
		}

	set_value:
		if (isset($values[$name])) {
			goto set_existing_value;
		} else {
			goto set_new_value;
		}	

	set_existing_value:
		if (!in_array($name, $mutables)) {
			throw new Exception('The value "' . $name . '" is immutable and has already been set to '.$values[$name].'.');
		} else {
			return $values[$name] = $value;
		}

	set_new_value:
		if ($is_mutable) {
			$mutables[] = $name;
		}

		return $values[$name] = $value;
}
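
To make the behavior shared by all three versions concrete, here is a short usage sketch. So that the block runs on its own, it uses a condensed stand-in named val_demo (a hypothetical name, not one of the article's functions) that is assumed to mirror the same signature and rules, with a shortened exception message.

```php
<?php
// Condensed, hypothetical stand-in for the article's value store.
function val_demo($name = null, $value = null, $is_mutable = false)
{
	static $values = array();
	static $mutables = array();

	// No name: return every stored value.
	if ($name === null) {
		return $values;
	}

	// Name only: return the stored value, or null if it was never set.
	if ($value === null) {
		return isset($values[$name]) ? $values[$name] : null;
	}

	// Existing values can only be replaced if they were declared mutable.
	if (isset($values[$name]) && !in_array($name, $mutables)) {
		throw new Exception('The value "' . $name . '" is immutable.');
	}

	// New values are recorded as mutable only when requested.
	if (!isset($values[$name]) && $is_mutable) {
		$mutables[] = $name;
	}

	return $values[$name] = $value;
}

val_demo('site', 'example');  // immutable by default
val_demo('hits', 1, true);    // declared mutable on first set
val_demo('hits', 2);          // allowed, because 'hits' is mutable

$caught = false;
try {
	val_demo('site', 'other'); // rejected, because 'site' is immutable
} catch (Exception $e) {
	$caught = true;
}
```

Note that mutability is decided on the first set: passing $is_mutable when updating an existing value has no effect in any of the versions.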

Conclusion: Goto Can Be Your Friend

OK, you’ve heard my case for goto, at least the restricted version of goto found in PHP (augmented with our one additional restriction) when used to combat deeply nested logic. I believe the use of the goto construct in the final example does in fact facilitate the creation of new code and the readability/editability of existing code whilst maintaining a high level of performance.

What do you think?

Update Feb. 17, 2012:
Here’s a great read from the Linux kernel mailing list that discusses why the goto statement is used in the code base: http://kerneltrap.org/node/553

43 thoughts on “Long Live the GOTO Statement”

    1. It’s partly elegance vs readability. I refused to use GOTO in VB6 just because it formatted to the left…but it got me to do something like mentioned for C++ and use an error handling class that turned out to be very powerful.

      It is curious how class and GOTO keep getting mentioned together, and the example PHP here is a keyed static array in an access function, which is exactly a class turned on its side: the key is like the instance name, and access is data through process structure rather than method through data structure. Is there perhaps a deep underlying tie between this role reversal and GOTO?

      1. At least in the case of my specific concerns about dealing with deeply nested logic, I’m not sure that the role reversal is at play. While my example does use static arrays (it’s a real example from my functionally inspired web framework), most of the resources online dealing with deeply nested logic (a few are linked from the article) apply whether the code is imperative or OO.

        That said, some have promoted the idea that you can combat deeply nested logic through use of polymorphism:

        http://sourcemaking.com/refactoring/replace-conditional-with-polymorphism

        http://stackoverflow.com/questions/494506/dealing-with-nested-if-then-else-nested-switch-statements

        That’s an interesting approach, but I’m just not sure that adding classes to deal with this type of deeply-nested logic is the most developer-friendly approach long-term.

        Nice feedback!

    1. I’m not sure either. I’m going to give it a try for a while in limited regions of my codebase and follow-up with tracking the number of bugs I find in those sections compared to sections refactored using the standard practices. Thanks for reading and considering the idea.

      And, thanks for providing another refactoring example.

  1. When writing C code, I usually add ENTER() and LEAVE() macros at the start and exit of each function, to log the execution of the program. Often a function will have several logical points of exit. I use the goto to jump to a label at the end of the function just before the LEAVE() statement, to enforce a single point of exit, and to ensure that the function exit is always logged.

    The goto can also be useful in error situations in the function to jump to the end of the function where error handling, freeing of resources, etc can be handled, before exiting.
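
    For readers following the article's PHP examples, the single-exit and cleanup pattern described in this comment might be sketched as below. The function name is hypothetical, and in everyday PHP a try/finally block would be the more conventional way to guarantee cleanup; this is only an illustration of the goto form.

```php
<?php
// Hypothetical helper: counts the lines in a file, returning false on
// failure. Every path leaves through the labels at the bottom.
function count_lines_demo($path)
{
	$result = false;

	$handle = @fopen($path, 'r');
	if ($handle === false) {
		goto leave;    // nothing was opened, so there is nothing to free
	}

	$count = 0;
	while (fgets($handle) !== false) {
		$count++;
	}

	if (!feof($handle)) {
		goto cleanup;  // read error: skip the result, still free the handle
	}

	$result = $count;

	cleanup:
		fclose($handle);

	leave:
		// Single point of exit: exit logging would go here.
		return $result;
}
```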

      1. He said C, not C++. How would one go about “instantiating a lightweight class” in C? That being said, it’s a reasonably common practice to wrap setjmp and longjmp into fake exception handling clauses all to circumvent the use of a goto-or-two. Seems counter-intuitive to me when a “goto error” and “goto done” are reasonably self-explanatory.

  2. I think it’s important to understand that what you are trying to do by using goto and requiring that each goto branch returns a value is to create functions without the function-call overhead. It seems to me it’s a bad idea – that’s what functions are for.

    I’m not at all familiar with PHP. In C/C++/Java/C# (and just about any compiled language) the right way to do this is to make the functions real functions. The compiler is then responsible for inlining those functions where that is the right thing to do for performance reasons, though depending on the language you can give it hints, too. If PHP is JIT-compiled then I’d expect the compiler to take care of it there, too.

    1. True, the inlining offered through the compiled languages you mentioned does allow the use of more functions without degrading the performance. However, PHP (like many other scripting languages) doesn’t offer inlining.

      Whether written as standard functions or as closures within the val function (to avoid having to thread the state), I’m still not sure that the functions buy much in terms of cognitive load, readability, or facilitating the development of a mental model when reading.

      I’m putting an experiment together and hopefully I’ll be able to speak to these potential differences in the near future.

  3. Nicely written article, but I most definitely don’t agree with it. The underlying factor is that your single method is probably too big. In your GOTO example, turn each goto into a call that returns the value of a called method, so your long original method becomes

    static $values = array();
    static $mutables = array();

    if ($name === null) {
        return get_all_values($values, $mutables);
    } else {
        return access_value($values, $mutables);
    }

    Your code blocks then become much easier to read and the flow is much easier to follow.

    1. You may be right, but I can’t think of research suggesting that functions help the readability of this particular example more than the goto blocks. The goto version doesn’t require that you thread the state, it enhances the proximity of relevant information, and it does so without a performance hit.

      If you have some research, I’d love to read it. Thanks for the feedback!

  4. I absolutely agree with Tom. I’d prefer using functions rather than goto labels in your example. Using GOTO is all right, but its use is a lot more dangerous than using functions.

    Even for error situations, as meant by Paul Moore, I don’t think using goto is a good idea… that’s why exceptions exist.

  5. Hi
    First off, I like your article.

    I have found what I believe are valid uses for the GOTO statement quite a few times throughout my career, but I must admit that GOTO is not for everyone. You have to be careful not to lose oversight of your code when using it.

    See the snippet below, taken from my old C reference book:

    for ( … )             // Jump out of
        for ( … )         // nested loops.
            if ( error )
                goto handle_error;

    handle_error:         // Error handling here

    Also I have used it a few times to implement bypass/debugging functionality in my programs with success.

    Best regards

    Torving

    1. I was going to give the same example that you gave of the need to use GOTO, but you were faster.

      I always find it requires more code than it should to break two or more loops and GOTO would be a good solution for this, or maybe a new keyword “break(int x)” where x is the number of loops to break.

    2. Finite state machines are also a great example of when to use goto as opposed to function calls. While subroutine threading is an acceptable approach, the goto statement is substantially faster and less abusive on the stack. It’s also the one case where the power of PCs isn’t enough to overcome the performance hit; nearly all virtual machines are now coded this way.

  6. A co-worker came back from a seminar and told us about a programming manager who would fire any programmer who put a GOTO in his code. I guess you could say he was against GOTOs?

  7. I grew up programming technical programs in Fortran. There were a lot of go to statements and the programs were long. With the emphasis today on shorter main programs and more subroutines and functions, go to statements aren’t as necessary but can be useful.

  8. There seems to be a trend in arguments for GOTO, and your example of how it helps with complicated and difficult to read code, is one of the common ones.

    I learned long ago that I save time, if I just don’t write code that looks like that.

    As to being for or against GOTO, I find it unnecessary. It’s not needed. I forget all about it when I’m not reading arguments like these.

  9. The GOTO statement is like a knife: you can use it for good things or you can use it for bad things. And you can cut yourself really badly if you are not careful.

    I understand the concerns about it, but I use it. A simple rule is all that’s needed to not mess things up: I never issue backward GOTOs; if you have to go back in your code, you should probably use a loop. I only use it to go forward. It can be very useful, just as the BEGIN SEQUENCE / END SEQUENCE / BREAK structure was in the good old days of Clipper.

    Just don’t run with scissors…

    1. Replying just to say: Clipper! Those were the days. :) I haven’t seen a mention of Clipper in years.
      (I was one of the developers of the ProVision libraries and Exospace.)

  10. Good article! A refreshingly new look at an old construct.

    The example with gotos is definitely clearer.

    A goto saved me a few decades ago when I was programming under a deadline and a compiler bug prevented control from flowing to the next statement. Nothing worked, until a simple goto solved the problem immediately. It’s like a seat belt: You hope you don’t have to use it, but it’s invaluable when you need it.

    On another project, we were generating C code as the output from an AI program that put together mechanisms to make a “custom-written” program. Gotos provided the essential “glue” to connect these mechanisms.

    Gotos can easily be abused, but people have taken this too far.

    1. Proximity plays a role in people’s ability to build mental models and groupings, too, so I’m curious if that aspect can help the goto example in some situations.

      You certainly could be right, but I don’t believe there’s enough current research on the matter that we can reject the goto example as being inferior.

  11. One thousand times what Tom and Paul Hadfield said.

    If you are using GOTO, you can likely refactor your code to make it just as readable and fast without GOTO. And you get the added bonus making it less likely to introduce mistakes when the function is changed in the future.

    1. I’m not sure there’s research that currently allows us to make the claim that the goto is less readable (or more mistake prone.)

      If you know of some research, I’d love to read it (I’m currently putting an experiment together and I’m looking for all of the references I can find.)

      Thanks for the feedback!

  12. Life would be a lot easier if block-structured languages offered a “break” statement that takes the name of some block (not necessarily the most deeply nested one) in which the break statement is embedded.

    I use goto rarely, and almost always in a break kind of way.

    1. As far as I remember, Ada supports named loops, so one can continue or break iteration at any nesting level, not just the current one. This is a nicer way of expressing more complex control flow than using a plain goto.

      Other languages offer a break / continue feature, however this is pretty error-prone, especially when the nesting level changes due to changes after the original implementation: Goto at least jumps to an explicit label!

      I expect that there are certain algorithms which can be expressed best with one or more goto statements; however, all the examples I’ve seen for justifying a goto can be better expressed using plain
      - breaking out using “return”
      - “finally” block
      - choosing more suitable loop conditions (express why you are iterating, not how)
      - state variables (my personal favorite, works well with enums and types)
      - switch statements (sometimes including fall through; too bad there isn’t an explicit keyword for it in all languages I’m aware of, thus fall through should be documented in order to avoid someone “fixing” it away)
      - polymorphic objects
      - object / scope cleanup

      So far I haven’t felt the need to “goto” since my first programming attempts in mid-80s BASIC (ok, .cmd/.bat scripts are excluded: not needing additional script interpreters is worth more than avoiding a goto in a rudimentary batch language 😉)

  13. I think people make WAY too big a deal over the GOTO issue. It’s just a language construct folks. If you know how to write elegant code you can write it with or without using GOTO depending on how you are building your code. It’s just a tool. Like anything else it can be abused. I seldom use it anymore myself but there are times that nothing else will do quite as well. I say leave it in there!

  14. In the right circumstances a GOTO construct can be thought of as an interactive or dynamic SWITCH statement, in that the process itself can alter a variable and so change the course of logic in a more dynamic way than a simple SWITCH statement.

    Handled right it can be very powerful.

  15. I go way back to the dark ages, when Cobol was the thing for commercial apps and goto-less programming was a new concept. Even then I found that go to was useful to reduce the complexity of data validation code. It is just an example that following a good idea blindly is just that: blind.

  16. An interesting counter-example to the evils of GOTO is the use of such things in Common Lisp.

    But then, that’s more because Lisp is a weird language: while it starts out high-level, you can descend down as far as you would like–and GOTO is used to create new control structures, if you need them. Thus, GOTO is mostly limited to use in macros. Of course, you don’t have the power to do this in most other languages…

    In any case, my thoughts on GOTO are thus: if you have a problem where it simplifies things and makes the code easier to read (or provides needed efficiency), then use it; otherwise, don’t. For the most part, this means that you will mostly be using control structures in languages that provide them; in languages that don’t provide them, it is best to emulate control structures with GOTOs when you can.

  17. Hi Adam, nice article. I enjoyed reading through the argument and examples. I think your argument is strongest for systems programmers, especially when working on something like the Linux kernel. As a PHP developer, I wouldn’t spend much time thinking about the performance overhead of objects and function calls. Although I also understand that you chose PHP for your examples because it’s a popular high level language that supports GOTO. I’d be curious to know if Facebook ever considered your approach, given their massive effort to improve the performance of PHP in HHVM.

    In your example, my priority would be to provide a clean interface. I’d expose the getters and setters rather than having a single function with multiple responsibilities.
