
Memory Order Guarantees in Go

About Memory Ordering

Many compilers (at compile time) and CPU processors (at run time) often make optimizations by adjusting instruction orders, so that the actual instruction execution order may differ from the order presented in code. Instruction ordering is also often called memory ordering.

Surely, instruction reordering can't be arbitrary. The basic requirement for a reordering within a specified goroutine is that the reordering must not be detectable by the goroutine itself if the goroutine doesn't share data with other goroutines. In other words, from the perspective of such a goroutine, it can assume that its instruction execution order is always the same as the order specified by its code, even if instruction reordering really happens inside it.

However, if some goroutines share some data, then instruction reordering that happens inside one of these goroutines may be observed by the other goroutines, and affect the behaviors of all these goroutines. Sharing data between goroutines is common in concurrent programming. If we ignore the effects of instruction reordering, the behaviors of our concurrent programs might be compiler and CPU dependent, and often abnormal.

Here is an unprofessional Go program which doesn't consider instruction reordering. The program is expanded from an example in the official Go 1 memory model documentation.
package main

import "log"
import "runtime"

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
	if done {
		log.Println(len(a)) // always 12 once printed
	}
}

func main() {
	go setup()

	for !done {
		runtime.Gosched()
	}
	log.Println(a) // expected to print: hello, world
}
The behavior of this program is very likely what we expect: a hello, world text will be printed. However, the behavior of this program is compiler and CPU dependent. If the program is compiled with a different compiler, or with a later compiler version, or it runs on a different architecture, the hello, world text might not be printed, or a text different from hello, world might be printed. The reason is that compilers and CPUs may exchange the execution order of the first two lines in the setup function, so that the final effect of the setup function becomes the following:
func setup() {
	done = true
	a = "hello, world"
	if done {
		log.Println(len(a))
	}
}

The setup goroutine in the above program is unable to observe the reordering, so the log.Println(len(a)) line will always print 12 (if this line gets executed before the program exits). However, the main goroutine may observe the reordering, which is why the printed text might not be hello, world.

Besides ignoring memory reordering, the program also contains data races: no synchronizations are made when using the variables a and done. So, the above program is a showcase full of concurrent programming mistakes. A professional Go programmer should not make these mistakes.

We can use the go build -race command provided in the Go toolchain to build a program, then run the resulting executable to check whether or not there are data races in the program.

Go Memory Model

Sometimes, we need to ensure that the execution of some code lines in a goroutine happens before (or after) the execution of some code lines in another goroutine (from the view of either of the two goroutines), to keep a program correct. Instruction reordering may cause some troubles in such circumstances. What should we do to prevent certain possible instruction reorderings?

Different CPU architectures provide different fence instructions to prevent different kinds of instruction reordering. Some programming languages provide corresponding functions to insert these fence instructions in code. However, understanding and correctly using the fence instructions raises the bar of concurrent programming.

The design philosophy of Go is to use as few features as possible to support as many use cases as possible, while ensuring a good enough overall code execution efficiency. So the Go built-in and standard packages don't provide direct ways to use the CPU fence instructions. In fact, CPU fence instructions are used in implementing all kinds of synchronization techniques supported in Go. So, we should use these synchronization techniques to ensure expected code execution orders.
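
For example, one possible way to fix the unprofessional program shown above is to synchronize the two goroutines with a channel (the channel related order guarantees used here are listed later in this article):
package main

import "log"

var a string
var done = make(chan bool)

func setup() {
	a = "hello, world"
	done <- true // the write to a happens before this send
}

func main() {
	go setup()
	// The send above happens before this receive completes,
	// so the write to a is guaranteed to be observed below.
	<-done
	log.Println(a) // guaranteed to print: hello, world
}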

The remainder of this article will list some guaranteed (and non-guaranteed) code execution orders in Go, some of which are mentioned in the Go 1 memory model and other official Go documentation, and some of which are not.

In the following descriptions, if we say event A is guaranteed to happen before event B, it means any of the goroutines involved in the two events will observe that any of the statements presented before event A in source code is executed before any of the statements presented after event B in source code. For other, irrelevant goroutines, the observed orders may differ from the ones just described.

The creation of a goroutine happens before the execution of the goroutine

In the following function, the assignment x, y = 123, 789 will be executed before the call fmt.Println(x), and the call fmt.Println(x) will be executed before the call fmt.Println(y).
var x, y int
func f1() {
	x, y = 123, 789
	go func() {
		fmt.Println(x)
		go func() {
			fmt.Println(y)
		}()
	}()
}
However, the execution orders of the three operations (the two print calls and the assignment) in the following function are not deterministic. There are data races in this function.
var x, y int
func f2() {
	go func() {
		// Might print 0, 123, or some others.
		fmt.Println(x)
	}()
	go func() {
		// Might print 0, 789, or some others.
		fmt.Println(y)
	}()
	x, y = 123, 789
}

Channel operations related order guarantees

The Go 1 memory model lists the following three channel related order guarantees.
  1. The nth successful send to a channel happens before the nth successful receive from that channel completes, no matter whether that channel is buffered or unbuffered.
  2. The nth successful receive from a channel with capacity m happens before the (n+m)th successful send to that channel completes. In particular, if that channel is unbuffered (m == 0), the nth successful receive from that channel happens before the nth successful send on that channel completes.
  3. The closing of a channel happens before a receive completes if the receive returns a zero value because the channel is closed.

In fact, the completion of the nth successful send to a channel and the completion of the nth successful receive from the same channel are the same event.

Here is an example showing some guaranteed code execution orders in using an unbuffered channel.
func f3() {
	var a, b int
	var c = make(chan bool)

	go func() {
		a = 1
		c <- true
		if b != 1 { // impossible
			panic("b != 1") // will never happen
		}
	}()

	go func() {
		b = 1
		<-c
		if a != 1  { // impossible
			panic("a != 1") // will never happen
		}
	}()
}
Here, for the two newly created goroutines, the following orders are guaranteed:
  1. the execution of a = 1 ends before the execution of the condition evaluation a != 1;
  2. the execution of b = 1 ends before the execution of the condition evaluation b != 1.
So the two calls to panic in the above example will never get executed. However, the panic calls in the following example may get executed.
func f4() {
	var a, b, x, y int
	c := make(chan bool)

	go func() {
		a = 1
		c <- true
		x = 1
	}()

	go func() {
		b = 1
		<-c
		y = 1
	}()

	// Many data races are in this goroutine.
	// Don't write code as such.
	go func() {
		if x == 1 {
			if a != 1 { // possible
				panic("a != 1") // may happen
			}
			if b != 1 { // possible
				panic("b != 1") // may happen
			}
		}

		if y == 1 {
			if a != 1 { // possible
				panic("a != 1") // may happen
			}
			if b != 1 { // possible
				panic("b != 1") // may happen
			}
		}
	}()
}

Here, the third goroutine is irrelevant to the operations on channel c, so it is not guaranteed to observe the orders observed by the first two newly created goroutines. So, any of the four panic calls may get executed.

In fact, most compiler implementations do guarantee the four panic calls in the above example will never get executed. However, the official Go documentation never makes such guarantees, so the code in the above example is not cross-compiler or cross-compiler-version compatible. We should stick to the official Go documentation to write professional Go code.
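
If a third goroutine really needs to observe the values of a and b, we should create happens-before relations for it explicitly. One possible way, shown by the following sketch, is to use one more channel:
func f4b() {
	var a, b int
	c := make(chan bool)
	d := make(chan bool)

	go func() {
		a = 1
		c <- true
	}()

	go func() {
		b = 1
		<-c
		d <- true
	}()

	go func() {
		// Both a = 1 and b = 1 are guaranteed to have ended
		// before the following receive operation completes.
		<-d
		if a != 1 || b != 1 {
			panic("unreachable") // will never happen
		}
	}()
}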

Here is an example using a buffered channel.
func f5() {
	var k, l, m, n, x, y int
	c := make(chan bool, 2)

	go func() {
		k = 1
		c <- true
		l = 1
		c <- true
		m = 1
		c <- true
		n = 1
	}()

	go func() {
		x = 1
		<-c
		y = 1
	}()
}
The following orders are guaranteed:
  1. the execution of k = 1 ends before the execution of y = 1;
  2. the execution of x = 1 ends before the execution of n = 1.

However, the execution of x = 1 is not guaranteed to happen before the execution of l = 1 and m = 1, and the execution of l = 1 and m = 1 is not guaranteed to happen before the execution of y = 1.

The following is an example on channel closing. In this example, the execution of k = 1 is guaranteed to end before the execution of y = 1, but not guaranteed to end before the execution of x = 1.
func f6() {
	var k, x, y int
	c := make(chan bool, 1)

	go func() {
		c <- true
		k = 1
		close(c)
	}()

	go func() {
		<-c
		x = 1
		<-c
		y = 1
	}()
}

Mutex related order guarantees

The following are the mutex related order guarantees in Go.
  1. For an addressable value m of type Mutex or RWMutex in the sync standard package, the nth successful m.Unlock() method call happens before the (n+1)th m.Lock() method call returns.
  2. For an addressable value rw of type RWMutex, if its nth rw.Lock() method call has returned, then its successful nth rw.Unlock() method call happens before the return of any rw.RLock() method call which is guaranteed to happen after the nth rw.Lock() method call returns.
  3. For an addressable value rw of type RWMutex, if its nth rw.RLock() method call has returned, then its mth successful rw.RUnlock() method call, where m <= n, happens before the return of any rw.Lock() method call which is guaranteed to happen after the nth rw.RLock() method call returns.
In the following example, the following orders are guaranteed:
  1. in function fab, the execution of a = 1 ends before the execution of b = 1;
  2. in function fmn, the execution of m = 1 ends before the execution of n = 1;
  3. in function fxy, the execution of x = 1 ends before the execution of y = 1.
func fab() {
	var a, b int
	var l sync.Mutex // or sync.RWMutex

	l.Lock()
	go func() {
		l.Lock()
		b = 1
		l.Unlock()
	}()
	go func() {
		a = 1
		l.Unlock()
	}()
}

func fmn() {
	var m, n int
	var l sync.RWMutex

	l.RLock()
	go func() {
		l.Lock()
		n = 1
		l.Unlock()
	}()
	go func() {
		m = 1
		l.RUnlock()
	}()
}

func fxy() {
	var x, y int
	var l sync.RWMutex

	l.Lock()
	go func() {
		l.RLock()
		y = 1
		l.RUnlock()
	}()
	go func() {
		x = 1
		l.Unlock()
	}()
}

Note, in the following code, according to the official Go documentation, the execution of p = 1 is not guaranteed to end before the execution of q = 1, though most compilers do make such a guarantee.
var p, q int
func fpq() {
	var l sync.Mutex
	p = 1
	l.Lock()
	l.Unlock()
	q = 1
}

Order guarantees made by sync.WaitGroup values

At a given time, assume the counter maintained by an addressable sync.WaitGroup value wg is not zero. If there is a group of wg.Add(n) method calls invoked after the given time, and we can make sure that only the last returned call among the group of calls will modify the counter maintained by wg to zero, then each of the group of calls is guaranteed to happen before the return of a wg.Wait method call which is invoked after the given time.

Note, wg.Done() is equivalent to wg.Add(-1).

Please read the explanations for the sync.WaitGroup type to learn how to use sync.WaitGroup values.
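
The following program is a small illustration (a minimal sketch) of this guarantee. Each wg.Done() call happens before the wg.Wait() call returns, so all the writes made in the new goroutines are observable after the wg.Wait() call:
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	values := make([]int, 3)

	wg.Add(3) // the counter becomes 3 before the goroutines are created
	for i := 0; i < 3; i++ {
		go func(i int) {
			values[i] = i + 1 // this write ends before the wg.Done() call below
			wg.Done()         // equivalent to wg.Add(-1)
		}(i)
	}

	wg.Wait() // returns only after the counter reaches zero
	fmt.Println(values) // always prints: [1 2 3]
}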

Order guarantees made by sync.Once values

Please read the explanations for the sync.Once type to learn the order guarantees made by sync.Once values and how to use them.
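
As a quick illustration (a minimal sketch), in the following program, the function passed to once.Do is executed only once, and the completion of that single call happens before the return of every once.Do call, so each new goroutine always prints 12:
package main

import (
	"fmt"
	"sync"
)

func main() {
	var once sync.Once
	var msg string

	setup := func() {
		msg = "hello, world"
	}

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			once.Do(setup) // setup is executed only once
			// The completion of the single setup call happens before
			// the return of every once.Do(setup) call, so msg is
			// always fully written here.
			fmt.Println(len(msg)) // always prints 12
		}()
	}
	wg.Wait()
}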

Order guarantees made by sync.Cond values

It is somewhat hard to give a clear description of the order guarantees made by sync.Cond values. Please read the explanations for the sync.Cond type to learn how to use sync.Cond values.
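
Still, the following is a usage sketch only (it doesn't try to describe the guarantees precisely): a goroutine waits, with the associated locker held, until a condition becomes true.
package main

import (
	"fmt"
	"sync"
)

func main() {
	var mu sync.Mutex
	cond := sync.NewCond(&mu)
	ready := false

	done := make(chan struct{})
	go func() {
		mu.Lock()
		for !ready { // always re-check the condition in a loop
			cond.Wait() // atomically unlocks mu and suspends the goroutine
		}
		mu.Unlock()
		fmt.Println("the condition is observed")
		close(done)
	}()

	mu.Lock()
	ready = true
	mu.Unlock()
	cond.Signal() // wake up a waiting goroutine, if there is one

	<-done
}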

Atomic operations related order guarantees

Since Go 1.19, the Go 1 memory model documentation formally specifies that all atomic operations executed in Go programs behave as though executed in some sequentially consistent order. If the effect of an atomic operation A is observed by atomic operation B, then A is synchronized before B.

By these descriptions, in the following code, the atomic write operation on the variable b is guaranteed to happen before the atomic read operation which yields 1 on the same variable. Consequently, the write operation on the variable a is also guaranteed to happen before the read operation on the same variable. So the following program is guaranteed to print 2.
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

func main() {
	var a, b int32 = 0, 0

	go func() {
		a = 2
		atomic.StoreInt32(&b, 1)
	}()

	for {
		if n := atomic.LoadInt32(&b); n == 1 {
			// The following line always prints 2.
			fmt.Println(a)
			break
		}
		runtime.Gosched()
	}
}

Please read this article to learn how to do atomic operations.

Finalizers related order guarantees

A call to runtime.SetFinalizer(x, f) happens before the finalization call f(x).
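
For illustration (a minimal sketch; note that finalizers are not guaranteed to run before a program exits):
package main

import (
	"fmt"
	"runtime"
	"time"
)

type T struct {
	name string
}

func main() {
	t := &T{name: "demo"}
	runtime.SetFinalizer(t, func(t *T) {
		// The runtime.SetFinalizer call happens before this call,
		// so the write to the name field is observable here.
		fmt.Println("finalizing:", t.name)
	})

	t = nil                     // drop the only reference to the T value
	runtime.GC()                // hint the garbage collector to run
	time.Sleep(time.Second / 2) // give the finalizer goroutine a chance to run
}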
