Memory Blocks

Go is a language which supports automatic memory management, such as memory allocation and garbage collection. So Go programmers can do programming without handling the underlying memory management. This saves Go programmers lots of time.

Although knowing the underlying memory management implementation details is not necessary for Go programmers to write Go code, understanding some concepts and being aware of some facts in the memory management implementation by the standard Go compiler and runtime is very helpful for Go programmers to write high quality Go code.

This article will explain some concepts and facts of memory block allocation and garbage collection by the standard Go compiler and runtime. Other aspects, such as memory apply and memory release in memory management, will not be touched in this article.

Memory Blocks

A memory block is a continuous memory segment to host value parts at run time. Different memory blocks may have different sizes, to host different value parts. One memory block may host multiple value parts at the same time, but each value part can only be hosted within one memory block, no matter how large the size of that value part is. In other words, for any value part, it never crosses memory blocks.

There are many reasons when one memory block may host multiple value parts. Some of them:

A Value References The Memory Block It Is Hosted On

I have known that a value part can reference another value part. Here, we extend the definition by saying a memory block is referenced by all the value parts it hosts. So if a value part v is referenced by another value part, then the other value will also reference the memory block hosting v, indirectly.

When Will Memory Blocks Be Allocated?

In Go, memory blocks may be allocated but not limited at following situations:

Where Will Memory Blocks Be Allocated?

For the standard Go compiler and runtime, in every Go program, each goroutine will maintain a stack, a memory region, at run time. The initial stack size of each goroutine is small. The stack size will grow and shrink as needed.

(Please note, for the standard Go compiler, there is a limit of stack size each goroutine can has. Ar present, Go 1.10, the default maximum stack size is 1 GB on 64-bit systems, and 250 MB on 32-bit systems. We can call the SetMaxStack function in the runtime/debug standard package to change the size.)

Memory blocks allocated on the stack of a goroutine can only be used (referenced) in the goroutine internally. They are goroutine localized resources. They are not safe to be referenced crossing goroutines. A goroutine can access a memory block allocated on the stack of the goroutine without using any data synchronization techniques.

Heap is a singleton in each program. Memory blocks allocated on heap can be accessed by multiple goroutines. In other words, they can be accessed concurrently. Their assessments should be synchronized when needed.

Heap is a conservative place to register memory blocks. If compilers detect a memory block will be referenced crossing goroutines or can't easily confirm whether or not the memory block is safe to be put on stacks, then the memory block will be allocated on heap at run time. This means some values can be safely allocated on stacks may be also allocated on heap.

In fact, stacks are not essential for Go programs. Go compiler/runtime can allocate all memory block on heap. Supporting stacks is just to make Go programs run more efficiently:

If a memory block is allocated somewhere, we can also say the value parts hosted on the memory block are allocated on the same place.

If a value part used in a function is allocated on heap, we can say the value part escapes to heap. By using the official Go SDK, we can run go build -gcflags -m to check which local values (value parts) will escape to heap at run time. As above has mentioned, the current escape analyzer in the standard Go compiler is still not perfect, many local value parts can be allocated on stacks safely will still escape to heap.

When a local variable value of type T escapes to heap, what that really means is that Go runtime also creates an implicit pointer of type *T on the stack of the current goroutine. The value of the pointer stores the address of the memory block allocated on heap (a.k.a., the address of the local variable of type T). Go compiler has already replaced all references to the variable to the deferences of the pointer value at compile time.

Direct parts of global variables (package-level variables) will never be allocated on stacks. Whether or not they will be allocated in heap is compiler dependent. A value part referenced by the direct part of any global variable and any value part allocated on heap will be allocated on heap if the value part is not the direct part of a global variable. Direct parts of global variables and value parts allocated on heap will never reference value parts allocated on stacks. Value parts allocated on a stack can only be referenced by value parts allocated on the same stack.

Some facts:

A memory block created by calling new function may be allocated on heap or stacks. This is different to C++.

When Can A Memory Block Be Collected?

Memory blocks allocated for direct parts of global variables will never be collected.

Each stack will be collected as a whole when the corresponding goroutine exits. So there is no need to collect the memory blocks allocated on stacks, individually, one by one. Stacks are not collected by the garbage collector.

For a memory block allocated on heap, it can be safely collected only when it is no longer referenced (either directly or indirectly) by any direct part of global values and any value part allocated on goroutine stacks. We call such memory blocks as unused memory blocks. Here the global values include not only explicit package-level variables, but also some runtime internal global values. Unused memory blocks on heap will be collected by the garbage collector.

Here is an example to show when some memory blocks can be collected:
package main

var p *int

func main() {
	done := make(chan bool)
	// done will be used on the main and the following
	// new goroutine, so it will be allocated on heap.
	
	go func() {
		x, y, z := 123, 456, 789
		_ = z  // z can be allocated on stack safely.
		p = &x // For x and y are referenced by global p,
		p = &y // so they will be both allocated on heap.
		// Now, x is not referenced by anyone, so
		// its memory block can be collected now.
		p = nil
		// Now, y is als not referenced by anyone,
		// so its memory block can be collected now.
		
		done <- true
	}()
	
	<- done
	// Now the above goroutine exits, the done channel
	// is not used any more, a smart compiler will
	// think it can be collected now.
	
	// ...
}

Sometimes, smart compilers, such as the standard Go compiler, may make some optimizations so that some references are removed earlier than we expect. Here is such an example.
package main

import "fmt"

func main() {
	bs := make([]int, 1000000)
	
	// A smart compiler can detect that the underlying part of the
	// slice bs will never be used later, so that the underlying
	// part of the slice bs can be garbage collected safely now.

	fmt.Println(len(bs))
}

Please read value parts to learn the internal structures of slice values.

By the way, sometimes, we may hope the slice bs is guaranteed to not being garbage collected after the fmt.Println call, then we can use a runtime.KeepAlive function call to tell garbage collectors that the slice bs is still used.

For example,
package main

import "fmt"
import "runtime"

func main() {
	bs := make([]int, 1000000)

	fmt.Println(len(bs))
	runtime.KeepAlive(&bs) // runtime.KeepAlive(bs) is also
	                       // okay for this specified example.
}

runtime.KeepAlive function calls are often needed if unsafe pointers are involved.

When Will A Memory Block Be Collected?

Go runtime will collect unused heap memory blocks (garbage) to reuse or release memory. The current standard Go compiler (up to v1.10) uses a concurrent, tri-color, mark-sweep garbage collector. Here this article will not explain the detailed implementation. Just some facts are listed below.

The collector is not always running. It will start when a threshold is satisfied. So an unused memory block may be not collected immediately after it becomes unused, it will be collected eventually. Currently, the threshold is controlled by GOGC environment variable:
The GOGC variable sets the initial garbage collection target percentage. A collection is triggered when the ratio of freshly allocated data to live data remaining after the previous collection reaches this percentage. The default is GOGC=100. Setting GOGC=off disables the garbage collector entirely.

The value of this environment variable determines the frequency of garbage collecting, and it can be modified at run time. Smaller values lead to more frequent garbage collections. Go programs can also run the collector explicitly by calling the runtime.GC function.

In the mark phase, the collector uses the tri-color algorithm mentioned above to analyze which memory blocks are unused:
At the start of a GC cycle all objects are white. The GC visits all roots, which are objects directly accessible by the application such as globals and things on the stack, and colors these grey. The GC then chooses a grey object, blackens it, and then scans it for pointers to other objects. When this scan finds a pointer to a white object, it turns that object grey. This process repeats until there are no more grey objects. At this point, white objects are known to be unreachable and can be reused.

Here, objects are either value parts or memory blocks.

The unused memory blocks will be swept at the sweep phase.

The collector is a non-compacting one, so it will not move memory blocks to rearrange them.

Welcome to improve Go 101 articles by submitting corrections for all kinds of mistakes, such as typos, grammar errors, wording inaccuracies description flaws and code bugs.