
Maps

In Go, the capacity of a map is unlimited in theory; it is limited only by available memory. That is why the builtin cap function doesn't apply to maps.
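
For example, the builtin len function applies to maps, but cap doesn't:

```go
package main

import "fmt"

func main() {
	aMap := map[string]int{"a": 1, "b": 2}
	fmt.Println(len(aMap)) // 2: len reports the current entry count
	// fmt.Println(cap(aMap)) // compile error: invalid argument for cap
}
```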

In the official standard Go runtime implementation, maps are implemented as hashtables internally. Each map/hashtable maintains a backing array to store map entries (key-value pairs). As more and more entries are put into a map, the backing array may become too small to hold the current entries, so a larger backing array will be allocated, the current entries will be moved to it, and the old backing array will be discarded.

In the official standard Go runtime implementation, the backing array of a map will never shrink, even if all entries are deleted from the map. This is a form of memory waste.

Clear map entries

We could use the following loop to clear all entries in a map:

	for key := range aMap {
		delete(aMap, key)
	}

The loop is specially optimized so that it executes very fast. However, please note that, as mentioned above, the backing array of the cleared map doesn't shrink after the loop. So how can the backing array of the map be released? Just do either of:

	aMap = nil
	// or
	aMap = make(map[K]V)

If the map is not referenced elsewhere, then its backing array will be collected eventually after doing this.

If many new entries will be put into the map after it is cleared, then the delete-loop way (which keeps the large backing array) is preferred; otherwise, dropping or recreating the map (which releases the backing array) is preferred.
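
For example, a complete clearing-and-releasing flow might look as follows (a minimal sketch; the variable names are made up for illustration):

```go
package main

import "fmt"

func main() {
	aMap := make(map[string]int, 1024)
	aMap["x"] = 1

	// Way 1: clear the entries but keep the large backing array,
	// which is good if many new entries will be put in soon.
	for key := range aMap {
		delete(aMap, key)
	}

	// Way 2: drop the map (or recreate it) so that, if it is not
	// referenced elsewhere, its backing array can be collected.
	aMap = nil
	aMap = make(map[string]int)

	fmt.Println(len(aMap)) // 0
}
```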

aMap[key]++ is more efficient than aMap[key] = aMap[key] + 1

In the statement aMap[key] = aMap[key] + 1, the key is hashed twice, whereas in the statement aMap[key]++, it is only hashed once.

Similarly, aMap[key] += value is more efficient than aMap[key] = aMap[key] + value.

This can be verified by the following benchmark code:

package maps

import "testing"

var m = map[int]int{}

func Benchmark_increment(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99]++
	}
}

func Benchmark_plusone(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] += 1
	}
}

func Benchmark_addition(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] = m[99] + 1
	}
}

The benchmark results:

Benchmark_increment-4  11.31 ns/op
Benchmark_plusone-4    11.21 ns/op
Benchmark_addition-4   16.10 ns/op	

Pointers in maps

If neither the key type nor the element type of a map contains pointers, then in the scan phase of a GC cycle, the garbage collector will not scan the entries of the map. This can save much time.

This tip also applies to other kinds of containers in Go, such as slices, arrays and channels.
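
For example (an illustrative sketch; the concrete savings depend on the runtime version), the entries of the first two containers below hold no pointers, so the garbage collector skips scanning them, while the last one must be fully scanned:

```go
package main

import "fmt"

func main() {
	// The entries of these containers hold no pointers,
	// so the garbage collector skips scanning them.
	pointerFreeMap := map[int][8]byte{1: {}}
	pointerFreeSlice := make([]int64, 1024)

	// String keys and pointer elements force the garbage collector
	// to scan every entry of this map.
	scannedMap := map[string]*int{"a": new(int)}

	fmt.Println(len(pointerFreeMap), len(pointerFreeSlice), len(scannedMap))
}
```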

Use byte arrays instead of short strings as keys

Internally, each string contains a pointer, which points to the underlying bytes of that string. So if the key or element type of a map is a string type, then all the entries of the map need to be scanned in GC cycles.

If we can make sure that the string values used in the entries of a map have a max length and the max length is small (or the variance of the string lengths is small), then we could use the array type [N]byte to replace the string types (where N is the max string length). Doing this will save much garbage-collection scanning time if the number of entries in the map is very large.

For example, in the following code, the entries of mapB contain no pointers, whereas the (string) keys of mapA do. So the garbage collector will skip mapB during the scan phase of a GC cycle.

	var mapA = make(map[string]int, 1 << 16)
	var mapB = make(map[[32]byte]int, 1 << 16)
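
A small helper could convert short strings into such array keys (a sketch; the toKey helper and the maxKeyLen constant are made up for illustration):

```go
package main

import "fmt"

const maxKeyLen = 32 // an assumed max string-key length

// toKey copies a short string into a pointer-free array key.
// Real code should reject (or hash) strings longer than maxKeyLen;
// here they would be silently truncated.
func toKey(s string) (k [maxKeyLen]byte) {
	copy(k[:], s)
	return
}

func main() {
	counters := make(map[[maxKeyLen]byte]int)
	counters[toKey("hello")]++
	counters[toKey("hello")]++
	fmt.Println(counters[toKey("hello")]) // 2
}
```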

Lower map element modification frequency

In the previous "strings and byte slices" article, it was mentioned that a byte-slice-to-string conversion appearing as the index key in a map element retrieval expression doesn't allocate, but such conversions in L-value map element index expressions do allocate.

So sometimes, we could lower the frequency of using such conversions in L-value map element index expressions to improve program performance.

In the following example, the B way (the pointer element way) is more performant than the A way, because the B way rarely modifies map element values. The elements in the B way are pointers; once they are created, they are never changed.

package maps

import "testing"

var wordCounterA = make(map[string]int)
var wordCounterB = make(map[string]*int)
var key = make([]byte, 64)

func IncA(w []byte) {
	wordCounterA[string(w)]++
}

func IncB(w []byte) {
	p := wordCounterB[string(w)]
	if p == nil {
		p = new(int)
		wordCounterB[string(w)] = p
	}
	*p++
}

func Benchmark_A(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncA(key[:i])
		}
	}
}

func Benchmark_B(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncB(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op

Although the B way (the pointer element way) consumes less CPU, it is inefficient from the garbage collection view. It creates many pointers, which increases the burden of pointer scanning in GC cycles.

We could use an extra counter table (a slice) and let the map record indexes to the table, to avoid making many allocations and creating many pointers, as the following code shows:

var wordIndexes = make(map[string]int)
var wordCounters []int

func IncC(w []byte) {
	if i, ok := wordIndexes[string(w)]; ok {
		wordCounters[i]++
	} else {
		wordIndexes[string(w)] = len(wordCounters)
		wordCounters = append(wordCounters, 1)
	}
}

func Benchmark_C(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncC(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op
Benchmark_C-4   1609 ns/op     0 B/op   0 allocs/op

From a short-period view, the C way is almost as performant as the B way. But as it doesn't use pointers, it is actually more efficient than the B way from a long-period view.

Please note that the above benchmark results show that the latter two ways both make zero allocations. This is actually not true. It is just that each of the latter two benchmark runs makes less than one allocation on average, which is truncated to zero. This is a deliberate design of the benchmark reports in the standard packages.

Try to grow a map in one step

If we can predict, at coding time, the max number of entries that will be put into a map, we should create the map with that number as its size hint, to avoid growing the map in multiple steps later.
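
For example, if we predict that about 10000 entries will be put into a map, we could create it like this (a minimal sketch; the entry count is made up for illustration):

```go
package main

import "fmt"

func main() {
	const expectedEntries = 10000 // an assumed prediction

	// The second argument of make is a size hint. The runtime
	// allocates a backing array large enough for this many entries
	// up front, so no intermediate grow-and-copy steps happen below.
	m := make(map[int]int, expectedEntries)
	for i := 0; i < expectedEntries; i++ {
		m[i] = i * i
	}
	fmt.Println(len(m)) // 10000
}
```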

Use index tables instead of maps whose key types have only a small set of possible values

Some programmers like to use a map with a bool key type to reduce verbose if-else blocks. For example, the following code

	// Within some function.
	var condition bool
	condition = evaluateCondition()
	...
	if condition {
		counter++
	} else {
		counter--
	}
	...
	if condition {
		f()
	} else {
		g()
	}
	...

could be replaced with

// Package-level maps.
var boolToInt = map[bool]int{true: 1, false: 0}
var boolToFunc = map[bool]func(){true: f, false: g}

	// Within a function.
	var condition bool
	condition = evaluateCondition()
	...
	counter += boolToInt[condition]
	...
	boolToFunc[condition]()
	...

If many such identical if-else blocks are used in code, using maps with bool keys will remove much boilerplate and make the code look much cleaner. For most use cases, this is generally good. However, as of Go toolchain v1.17, the map way is not very efficient from the code execution performance view. The following benchmarks show the performance differences.

package maps

import "testing"

//go:noinline
func f() {}

//go:noinline
func g() {}

func IfElse(x bool) func() {
	if x {
		return f
	} else {
		return g
	}
}

var m = map[bool]func(){true: f, false: g}

func MapSwitch(x bool) func() {
	return m[x]
}

func Benchmark_IfElse(b *testing.B) {
	for i := 0; i < b.N; i++ {
		IfElse(true)()
		IfElse(false)()
	}
}

func Benchmark_MapSwitch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		MapSwitch(true)()
		MapSwitch(false)()
	}
}

The benchmark results:

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op

From the benchmark results, we can see that the if-else block way is much more performant than the map-switch way.

For use cases which require high code performance, we can use index tables to remove if-else boilerplate while keeping the simplicity of the map-switch way, with the help of a bool-to-int function. The following code shows how to use the index table way.

func b2i(b bool) (r int) {
	if b {
		r = 1
	}
	return
}

var a = [2]func(){g, f}

func IndexTable(x bool) func() {
	return a[b2i(x)]
}

func Benchmark_IndexTable(b *testing.B) {
	for i := 0; i < b.N; i++ {
		IndexTable(true)()
		IndexTable(false)()
	}
}

From the above code, we can see that using the index table way is as simple as using the map-switch way, though it needs an extra tiny b2i function. And the following benchmark results show that the index table way is as performant as the if-else block way.

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op
Benchmark_IndexTable-4  4.135 ns/op


The Go 101 project is hosted on GitHub. Welcome to improve Go 101 articles by submitting corrections for all kinds of mistakes, such as typos, grammar errors, wording inaccuracies, description flaws, code bugs and broken links.

If you would like to learn some Go details and facts every several days, please follow Go 101's official Twitter account @go100and1 or join Go 101 slack channels.

Tapir, the author of Go 101, has been writing the Go 101 series books and maintaining the go101.org website since July 2016. New content will be continually added to the book and the website from time to time. Tapir is also an indie game developer. You can also support Go 101 by playing Tapir's games (made for both Android and iPhone/iPad).
