Explain Panic/Recover Mechanism in Detail

(Note: This is the old version of this article. This version states that a general panic and a Goexit signal are independent of each other, which is reasonable in my opinion but is inconsistent with the official standard Go runtime implementation. Before releasing Go Toolchain 1.19, the Go core team clarified that the official standard Go runtime implementation is okay. In other words, according to the clarification, a Goexit signal should be viewed as a harmless unrecoverable panic. Please read the new version, which adopts the clarification.)

The panic/recover mechanism has been introduced before, and several panic/recover use cases were shown in the last article. This article explains the panic/recover mechanism in detail. The exiting phases of function calls are also explained in detail.

Exiting Phases of Function Calls

In Go, a function call may undergo an exiting phase before it fully exits. In the exiting phase, the deferred function calls pushed into the deferred call queue during the execution of the function call are executed, in the reverse of the order in which they were pushed. When all of the deferred calls have fully exited, the exiting phase ends and the function call fully exits as well.

Exiting phases might also be called returning phases elsewhere.
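
The ordering of deferred calls in an exiting phase can be observed with a small sketch. The deferOrder function below is a hypothetical helper written for this illustration, not something taken from the article. It pushes three deferred calls and records the order in which they run.

```go
package main

import "fmt"

// deferOrder is a hypothetical helper: it pushes three deferred calls
// and reports the order in which they run during the exiting phase.
func deferOrder() (order []int) {
	for i := 1; i <= 3; i++ {
		n := i // copy, so each closure sees its own value
		defer func() {
			order = append(order, n)
		}()
	}
	return nil // the call enters its exiting phase here
}

func main() {
	fmt.Println(deferOrder()) // [3 2 1]
}
```

The named result is what makes the effect visible: the deferred calls run after the return statement has set the result, so they can still append to it.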

A function call may enter its exiting phase (or exit directly) in one of three ways:
  1. after the call returns normally.
  2. when a panic occurs in the call.
  3. after the runtime.Goexit function is called and fully exits in the call.
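For example, in the following code snippet, a call to f0 or f1 enters its exiting phase after it returns normally, a call to f2 enters its exiting phase when the division-by-zero panic occurs, and a call to f3 enters its exiting phase after the runtime.Goexit call fully exits.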
import (
	"fmt"
	"runtime"
)

func f0() int {
	var x = 1
	defer fmt.Println("exits normally:", x)
	x++
	return x
}

func f1() {
	var x = 1
	defer fmt.Println("exits normally:", x)
	x++
}

func f2() {
	var x, y = 1, 0
	defer fmt.Println("exits for panicking:", x)
	x = x / y // will panic
	x++       // unreachable
}

func f3() int {
	x := 1
	defer fmt.Println("exits for Goexiting:", x)
	x++
	runtime.Goexit()
	return x+x // unreachable
}

Associating Panics and Goexit Signals of Function Calls

(Note: Again, the explanations in this section conflict with the official standard Go runtime implementation. Please read the new version which is consistent with the official standard Go runtime implementation.)

When a panic occurs directly in a function call, we say the (unrecovered) panic starts associating with the function call. Similarly, when the runtime.Goexit function is called in a function call, we say a Goexit signal starts associating with the function call after the runtime.Goexit call fully exits. A panic and a Goexit signal are independent of each other. As explained in the last section, associating either a panic or a Goexit signal with a function call will make the function call enter its exiting phase immediately.

We have learned that panics can be recovered. However, there is no way to cancel a Goexit signal.

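At any given time, a function call may associate with at most one unrecovered panic. If a call is associating with an unrecovered panic, then the call will associate with no panics once that panic is recovered, and a new panic occurring in the call will replace the old one as the unrecovered panic associating with the call. For example, in the following program, the recovered panic is panic 3, which is the last panic associating with the main function call.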
package main

import "fmt"

func main() {
	defer func() {
		fmt.Println(recover()) // 3
	}()
	
	defer panic(3) // will replace panic 2
	defer panic(2) // will replace panic 1
	defer panic(1) // will replace panic 0
	panic(0)
}

As Goexit signals can't be cancelled, arguing whether a function call may associate with at most one or more than one Goexit signal is unnecessary.

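Although it is unusual, there might be multiple unrecovered panics coexisting in a goroutine at a time. Each one associates with one non-exited function call in the call stack of the goroutine. When a nested call still associating with an unrecovered panic fully exits, the unrecovered panic will spread to the nesting call (the caller of the nested call). The effect is the same as if a panic occurred directly in the nesting call. That is, if the nesting call was already associating with an unrecovered panic, the spread panic replaces the old one; otherwise, the spread panic starts associating with the nesting call, which enters its exiting phase immediately if it hasn't entered it yet.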

So, when a goroutine finishes exiting, there may be at most one unrecovered panic in the goroutine. If a goroutine exits with an unrecovered panic, the whole program crashes. The information of the unrecovered panic will be reported when the program crashes.

When a function is invoked, neither a panic nor a Goexit signal associates with its call initially, regardless of whether its caller (the nesting call) has entered its exiting phase. Of course, panics might occur or the runtime.Goexit function might be called later while the call is executing, so panics and Goexit signals might associate with the call later.
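
As a minimal sketch of this point, the hypothetical functions add and demo below (names chosen only for illustration) show an ordinary function being called from a deferred call while its caller's panic is still unrecovered. The fresh call runs and returns normally.

```go
package main

import "fmt"

// add is an ordinary function used only for illustration.
func add(a, b int) int { return a + b }

// demo panics, then calls add from a deferred call while the panic
// is still unrecovered.
func demo() (sum int, recovered interface{}) {
	defer func() {
		sum = add(1, 2)       // a fresh call: no panic associates with it
		recovered = recover() // the panic is recovered only here
	}()
	panic("oops")
}

func main() {
	fmt.Println(demo()) // 3 oops
}
```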

The following example program will crash if it runs, because the panic 2 is still not recovered when the new goroutine exits.
package main

func main() {
	// The new goroutine.
	go func() {
		// The anonymous deferred call.
		// When it fully exits, the panic 2 will spread
		// to the entry function call of the new
		// goroutine, and replace the panic 0. The
		// panic 2 will never be recovered.
		defer func() {
			// As explained in the last example,
			// panic 2 will replace panic 1.
			defer panic(2)
			
			// When the anonymous function call fully
			// exits, panic 1 will spread to (and
			// associate with) the nesting anonymous
			// deferred call.
			func () {
				panic(1)
				// Once the panic 1 occurs, there will
				// be two unrecovered panics coexisting
				// in the new goroutine. One (panic 0)
				// associates with the entry function
				// call of the new goroutine, the other
				// (panic 1) associates with the
				// current anonymous function call.
			}()
		}()
		panic(0)
	}()
	
	select{}
}
The output (when the above program is compiled with the standard Go compiler v1.19):
panic: 0
	panic: 1
	panic: 2

goroutine 5 [running]:
...

The format of the output is not perfect. It may lead some people to think that panic 0 is the final unrecovered panic, whereas the final unrecovered panic is actually panic 2.
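
One way to check which panic is the final one is to recover it and look at the value. The sketch below uses a hypothetical function, finalPanic, which mirrors the panic structure of the new goroutine in the program above but adds an outermost deferred call that recovers.

```go
package main

import "fmt"

// finalPanic mirrors the panic structure of the goroutine in the
// program above, but recovers at the outermost level and returns
// the recovered value.
func finalPanic() (v interface{}) {
	defer func() {
		v = recover() // recovers the final unrecovered panic
	}()
	defer func() {
		defer panic(2) // replaces panic 1
		func() {
			panic(1)
		}()
	}()
	panic(0) // replaced by panic 2 when the deferred call above exits
}

func main() {
	fmt.Println(finalPanic()) // 2
}
```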

Similarly, when a nested call fully exits and it is associating with a Goexit signal, then the Goexit signal will also spread to (and associate with) the nesting call. This will make the nesting call enter (if it hasn't entered) its exiting phase immediately.
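
The spreading of a Goexit signal can be observed by recording which deferred calls run. In the following sketch, goexitTrace is a hypothetical helper introduced for this illustration. It calls runtime.Goexit in a nested call and collects a trace from the deferred calls of both the nested call and the nesting call.

```go
package main

import (
	"fmt"
	"runtime"
)

// goexitTrace is a hypothetical helper. It calls runtime.Goexit in a
// nested call and records which parts of the goroutine still run.
func goexitTrace() []string {
	var trace []string
	done := make(chan struct{})
	go func() {
		defer close(done)
		defer func() {
			trace = append(trace, "deferred call of the nesting call")
		}()
		func() {
			defer func() {
				trace = append(trace, "deferred call of the nested call")
			}()
			runtime.Goexit()
		}()
		// Never runs: the Goexit signal has spread to this call.
		trace = append(trace, "rest of the nesting call")
	}()
	<-done
	return trace
}

func main() {
	for _, s := range goexitTrace() {
		fmt.Println(s)
	}
}
```

The trace holds the nested call's deferred call first and the nesting call's deferred call second. The statement after the nested call is skipped.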

As mentioned above, a panic and a Goexit signal are independent of each other. In other words, an unrecovered panic should not cancel a Goexit signal, and a Goexit signal should neither shadow an unrecovered panic nor be cancelled. However, neither the current official standard Go compiler (gc, v1.19) nor gccgo (v10) implements this rule correctly. For example, the following program should crash, but it doesn't when compiled with the current versions of gc and gccgo.
package main

import "runtime"

func f() {
	// The Goexit signal shadows the "bye"
	// panic now, but it should not.
	defer runtime.Goexit()
	panic("bye")
}

func main() {
	go f()
	
	for runtime.NumGoroutine() > 1 {
		runtime.Gosched()
	}
}

The problem will be fixed in future versions of gc and gccgo.

The following example program should exit quickly when run, but it behaves incorrectly when compiled with the current gccgo version (v8.0) or with gc versions before Go Toolchain 1.14. In fact, it never exits when compiled with those compiler versions.
package main

import "runtime"

func f() {
	defer func() {
		recover()
	}()
	defer panic("will cancel Goexit but should not")
	runtime.Goexit()
}

func main() {
	c := make(chan struct{})
	go func() {
		defer close(c)
		f()
		for {
			runtime.Gosched()
		}
	}()
	<-c
}

Since Go Toolchain 1.14, the problem has been fixed in the standard compiler (gc).

Some recover Calls Are No-Ops

The builtin recover function must be called at proper places to take effect. Otherwise, the calls are no-ops. For example, none of the recover calls in the following example recovers the bye panic.
package main

func main() {
	defer func() {
		defer func() {
			recover() // no-op
		}()
	}()
	defer func() {
		func() {
			recover() // no-op
		}()
	}()
	func() {
		defer func() {
			recover() // no-op
		}()
	}()
	func() {
		defer recover() // no-op
	}()
	func() {
		recover() // no-op
	}()
	recover()       // no-op
	defer recover() // no-op
	panic("bye")
}

We already know that the following recover call takes effect.
package main

func main() {
	defer func() {
		recover() // takes effect
	}()

	panic("bye")
}

Then why don't the recover calls in the first example in the current section take effect? Let's read the current version of the Go specification:
The return value of recover is nil if any of the following conditions holds:
  • panic's argument was nil;
  • the goroutine is not panicking;
  • recover was not called directly by a deferred function.

There is an example showing the first condition case in the last article.

Except for the first call, each of the recover calls in the first example in the current section satisfies either the second or the third condition mentioned in the Go specification. Yes, the current descriptions in the specification are not precise yet; they are still being improved.
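
The second and third conditions can be checked directly by looking at what recover returns. The sketch below is only an illustration. The helper and demo functions are hypothetical names, and the three results correspond to a recover call made before any panic, a recover call made through a helper, and a recover call made directly by a deferred function.

```go
package main

import "fmt"

// helper calls recover, but helper itself is not a deferred function.
func helper() interface{} {
	return recover()
}

// demo collects the results of three differently placed recover calls.
func demo() (notPanicking, indirect, direct interface{}) {
	notPanicking = recover() // nil: the goroutine is not panicking yet
	defer func() {
		indirect = helper() // nil: recover is not called directly by a deferred function
		direct = recover()  // "bye": called directly by a deferred function
	}()
	panic("bye")
}

func main() {
	fmt.Println(demo()) // <nil> <nil> bye
}
```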

In fact, the current Go specification also doesn't explain well why the second recover call, which is expected to recover panic 1, in the following example doesn't take effect.
// This program exits without recovering panic 1.
package main

func demo() {
	defer func() {
		defer func() {
			recover() // this one recovers panic 2
		}()

		defer recover() // no-op

		panic(2)
	}()
	panic(1)
}

func main() {
	demo()
}

What the Go specification doesn't mention is that, at any given time, only the newest unrecovered panic in a goroutine is recoverable. In other words, each recover call is viewed as an attempt to recover the newest unrecovered panic in the current goroutine. This is why the second recover call in the above example is a no-op.

OK, now, let's try to give a short description of which recover calls will take effect:
A recover call takes effect only if the direct caller of the recover call is a deferred call and the direct caller of the deferred call associates with the newest unrecovered panic in the current goroutine. An effective recover call disassociates the newest unrecovered panic from its associating function call, and returns the value passed to the panic call which produced the newest unrecovered panic.
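
The rule can be exercised with a sketch that records the return value of each recover call in the order the calls run. The demo function below is a hypothetical example written for this purpose. It produces two panics and makes three recover calls, one of which is a no-op.

```go
package main

import "fmt"

// demo records what each recover call returns, in execution order.
func demo() (got []interface{}) {
	defer func() {
		// Runs last. Panic 2 has been recovered by now, so panic 1
		// is the newest unrecovered panic, and it associates with
		// the direct caller of this deferred call (demo).
		got = append(got, recover()) // 1
	}()
	defer func() {
		defer func() {
			// Runs second. The direct caller of this deferred call
			// no longer associates with a panic, so this is a no-op.
			got = append(got, recover()) // nil
		}()
		defer func() {
			// Runs first. Panic 2 is the newest unrecovered panic
			// and associates with the direct caller.
			got = append(got, recover()) // 2
		}()
		panic(2)
	}()
	panic(1)
}

func main() {
	fmt.Println(demo()) // [2 <nil> 1]
}
```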

The Go 101 project is hosted on GitHub. You are welcome to improve Go 101 articles by submitting corrections for all kinds of mistakes, such as typos, grammar errors, wording inaccuracies, description flaws, code bugs and broken links.

Tapir, the author of Go 101, has been writing the Go 101 series of books and maintaining the go101.org website since July 2016.