Type-Unsafe Pointers

We have learned about Go pointers from the article pointers in Go. From that article, we know that, compared to C pointers, Go pointers are subject to many restrictions. For example, Go pointers can't participate in arithmetic operations, and for two arbitrary pointer types, it is very likely that their values can't be converted to each other.

The pointers explained in that article are actually called type-safe pointers. Although the restrictions on type-safe pointers make it easy to write safe Go code, they also make it harder to write efficient code in some scenarios.

In fact, as of the current version (Go 1.10), Go also supports type-unsafe pointers, which are pointers without the restrictions imposed on safe pointers. Type-unsafe pointers are also called unsafe pointers in Go. Go unsafe pointers are much like C pointers: they are powerful and also dangerous. In some scenarios, we can write more efficient code with the help of unsafe pointers. On the other hand, when using unsafe pointers, it is easy to write invalid code whose bugs are subtle and difficult to detect.

The biggest risk of using unsafe pointers comes from the fact that the unsafe mechanism is not protected by the Go 1 compatibility guidelines. Code that depends on unsafe pointers and works today could break in a later Go version.

If you really want the efficiency improvements that unsafe pointers can bring, you should not only know the above-mentioned risks, but also follow the instructions in the official Go documentation and clearly understand the effect of each unsafe pointer use, so that you can write safe Go code with unsafe pointers, at least with the standard Go compiler in Go SDK 1.10.

About The unsafe Standard Package

Go provides a special kind of types for unsafe pointers. We must import the unsafe standard package to use unsafe pointers. The unsafe.Pointer type is defined as
type Pointer *ArbitraryType

Of course, this is not a usual type definition. Here ArbitraryType just hints that an unsafe.Pointer value can be converted to any safe pointer value in Go, and vice versa. In other words, unsafe.Pointer is like void* in C.

In Go, unsafe pointer types are the types whose underlying types are unsafe.Pointer.

The zero values of unsafe pointers are also represented with the predeclared identifier nil.
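The two statements above can be sketched with a small program. This is only an illustration: the Handle type and the wrap and unwrap helpers are hypothetical names that do not appear in the unsafe package. It assumes that a named type whose underlying type is unsafe.Pointer follows the same conversion rules as unsafe.Pointer itself.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Handle is a hypothetical named type whose underlying type is
// unsafe.Pointer, so it is also an unsafe pointer type.
type Handle unsafe.Pointer

// wrap converts a safe *int pointer to a Handle.
func wrap(p *int) Handle {
	return Handle(p)
}

// unwrap converts a Handle back to a safe *int pointer.
func unwrap(h Handle) *int {
	return (*int)(h)
}

func main() {
	var h Handle
	fmt.Println(h == nil) // true: the zero value is nil

	x := 42
	h = wrap(&x)
	*unwrap(h) = 43 // writes to x through the unsafe pointer
	fmt.Println(x)  // 43
}
```

Here h keeps x reachable the whole time, because a Handle value is a real pointer, not an integer.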

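The unsafe standard package also provides three functions:
func Alignof(x ArbitraryType) uintptr, which returns the required alignment of a value of the type of x.
func Offsetof(x ArbitraryType) uintptr, which returns the offset of the struct field denoted by x within its struct.
func Sizeof(x ArbitraryType) uintptr, which returns the size, in bytes, of a value of the type of x.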

Note that the types of the return results of the three functions are all uintptr. Below we will learn that uintptr values can be converted to unsafe.Pointer, and vice versa.

Although the results returned by the three functions are consistent within the same program, they may differ across operating systems, architectures, compilers, and compiler versions.

Calls to the three functions are always evaluated at compile time. Their results can be assigned to constants.

An example of using the three functions.
package main

import "fmt"
import "unsafe"

func main() {
	var x struct {
		a int64
		b bool
		c string
	}
	const M, N = unsafe.Sizeof(x.c), unsafe.Sizeof(x)
	fmt.Println(M, N) // 16 32

	fmt.Println(unsafe.Alignof(x.a)) // 8
	fmt.Println(unsafe.Alignof(x.b)) // 1
	fmt.Println(unsafe.Alignof(x.c)) // 8

	fmt.Println(unsafe.Offsetof(x.a)) // 0
	fmt.Println(unsafe.Offsetof(x.b)) // 8
	fmt.Println(unsafe.Offsetof(x.c)) // 16
}

Please note that the above printed results are for the standard Go compiler 1.10 on Linux with the amd64 architecture.
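How the three results relate to each other can be sketched as follows. This is only an illustrative sketch: the record type and the helpers paddingAfterB and aligned are hypothetical names introduced here. The padding value depends on the platform, so no expected output is given for it.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct used only for this illustration.
type record struct {
	a int64
	b bool
	c string
}

// paddingAfterB reports how many padding bytes sit between the
// end of field b and the start of field c.
func paddingAfterB() uintptr {
	var r record
	return unsafe.Offsetof(r.c) - (unsafe.Offsetof(r.b) + unsafe.Sizeof(r.b))
}

// aligned reports whether offset is a multiple of align.
func aligned(offset, align uintptr) bool {
	return offset%align == 0
}

func main() {
	var r record
	// The number of padding bytes is platform dependent.
	fmt.Println(paddingAfterB())
	// The offset of a field is a multiple of the field's alignment.
	fmt.Println(aligned(unsafe.Offsetof(r.c), unsafe.Alignof(r.c))) // true
}
```

The gap between b and c exists because c must start at an address that satisfies its alignment requirement.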

Unsafe Pointers Related Conversion Rules

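Go compilers allow the following explicit conversions:
A safe pointer can be explicitly converted to an unsafe pointer, and vice versa.
A uintptr value can be explicitly converted to an unsafe pointer, and vice versa.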

By using these conversions, we can convert between any two safe pointers, between any two uintptr values, and between an arbitrary safe pointer and a uintptr value. Unsafe pointers act as bridges in these conversions.

However, although these conversions are all legal at compile time, not all of them are valid (safe) at run time. These conversions defeat the memory safety the whole Go type system (except the unsafe part) tries to maintain. We must follow the instructions listed in a later section below to write valid Go code with unsafe pointers.

Some Facts In Go We Should Know

Before introducing the valid unsafe pointer use patterns, we need to know some facts in Go.

Fact 1: Unsafe Pointers Are Pointers And Uintptr Values Are Integers

Each non-nil safe or unsafe pointer references another value. However, uintptr values don't reference any values; they are just plain integers, though each of them often stores a memory address.

Go is a language that supports automatic garbage collection. When a Go program is running, the Go runtime checks, from time to time, which values are no longer used and collects the memory allocated for these unused values. Pointers play an important role in this check. If a value is unreachable from (not referenced by) any value that is still in use, then the Go runtime treats it as an unused value that can be safely garbage collected.

As uintptr values are integers, they can participate in arithmetic operations.

The example in the next subsection shows the differences between pointers and uintptr values.

Fact 2: Unused Values May Be Collected At Any Time

At run time, the garbage collector may run at any time, so once a value becomes unused, the memory block allocated for it may be collected at any time.

For example:
package main

import "fmt"
import "unsafe"

var x *int
var y unsafe.Pointer
var z uintptr

func main() {
	var vx, vy, vz int
	x = &vx
	y = unsafe.Pointer(&vy)
	z = uintptr(unsafe.Pointer(&vz))
	// At this time, even though the address of vz is stored in z,
	// vz has already become unused. The garbage collector can
	// collect the memory allocated for it now.
	// On the other hand, vx and vy are still in use,
	// for they are reachable from the x and y pointers.

	// uintptr values can be used as operands of arithmetic operators.
	z &^= 1 // <=> z = z &^ 1
	fmt.Println(x, y, z)
}

Fact 3: We Can Use A runtime.KeepAlive Function Call To Mark A Value As Still In Use (Reachable) Currently

To mark a value as still reachable, we should pass another value that references it as the argument of a runtime.KeepAlive function call. A pointer to the value is often used as such an argument.

In the following code, a small modification is made to the example in the last subsection.
func main() {
	var vx, vy, vz int
	x = &vx
	y = unsafe.Pointer(&vy)
	z = uintptr(unsafe.Pointer(&vz))
	// do other things ...
	// vz is still reachable at least up to here, so
	// it will not be garbage collected now for sure.
	runtime.KeepAlive(&vz)
}

Fact 4: *unsafe.Pointer Is A General Safe Pointer Type

Yes, *unsafe.Pointer is a safe unnamed pointer type. Its base type is unsafe.Pointer. As it is a safe pointer type, according to the conversion rules listed above, its values can be converted to type unsafe.Pointer, and vice versa.

For example:
package main

import "unsafe"

func main() {
	x := 123 // of type int
	p := unsafe.Pointer(&x)
	pp := &p // of type *unsafe.Pointer
	p = unsafe.Pointer(pp)
	pp = (*unsafe.Pointer)(p)
}

How To Use Unsafe Pointers Correctly

The unsafe standard package documentation lists six unsafe pointer use patterns. The following subsections introduce and explain them one by one.

Pattern 1: Convert *T1 To Unsafe Pointer, Then Convert The Unsafe Pointer Value To *T2

As mentioned above, by using the unsafe pointer conversion rules, we can convert a value of type *T1 to type *T2, where T1 and T2 are two arbitrary types. However, we should only do such conversions if the size of T1 is no smaller than the size of T2, and only if the conversions are meaningful.

As a result, we can also achieve the conversions between type T1 and T2 by using this pattern.

One example is the math.Float64bits function, which converts a float64 value to a uint64 value without changing any bit of the memory representation. The math.Float64frombits function does the reverse conversion.
func Float64bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func Float64frombits(b uint64) float64 {
	return *(*float64)(unsafe.Pointer(&b))
}

Please note, the return result of the Float64bits(aFloat64) function is different from the result of the explicit conversion uint64(aFloat64).
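The difference between the two conversions can be sketched with the value 1.0. The helper names reinterpret and convert are hypothetical and exist only for this illustration. math.Float64bits is the real standard library function.

```go
package main

import (
	"fmt"
	"math"
)

// reinterpret returns the IEEE 754 bit pattern of f, unchanged.
func reinterpret(f float64) uint64 {
	return math.Float64bits(f)
}

// convert returns the numeric conversion of f,
// which drops the fractional part.
func convert(f float64) uint64 {
	return uint64(f)
}

func main() {
	f := 1.0
	fmt.Printf("%#x\n", reinterpret(f)) // 0x3ff0000000000000
	fmt.Println(convert(f))             // 1
}
```

The first result keeps every bit of the float64 representation, whereas the second one is a new integer with the same numeric value.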

In the following example, we use this pattern to convert a []MyString slice to type []string, and vice versa. The result slice and the original slice share the underlying elements. Such conversions are impossible through safe ways.
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	type MyString string
	ms := []MyString{"C", "C++", "Go"}
	fmt.Printf("%s\n", ms)  // [C C++ Go]
	// ss := ([]string)(ms) // compiling error
	ss := *(*[]string)(unsafe.Pointer(&ms))
	ss[1] = "Rust"
	fmt.Printf("%s\n", ms) // [C Rust Go]
	// ms = []MyString(ss) // compiling error
	ms = *(*[]MyString)(unsafe.Pointer(&ss))
}

Pattern 2: Convert Unsafe Pointer To Uintptr, Then Use The Uintptr Value

This pattern is not very useful. Usually, we print the resulting uintptr values to check the memory addresses stored in them. However, there are other, less verbose ways to do this job.

package main

import "fmt"
import "unsafe"

func main() {
	type T struct{a int}
	var t T
	fmt.Println(&t)                                 // &{0}
	fmt.Printf("%x\n", uintptr(unsafe.Pointer(&t))) // c6233120a8
	println(&t)                                     // 0xc6233120a8
}

Obviously, it is simpler to use the built-in function println to print an address.
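Another way that avoids the unsafe package is the %p verb of the fmt package. The following sketch compares the two approaches. The helper names viaUintptr and viaVerb are hypothetical and are used only for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// viaUintptr formats the address through a uintptr conversion (pattern 2).
func viaUintptr(p *int) string {
	return fmt.Sprintf("0x%x", uintptr(unsafe.Pointer(p)))
}

// viaVerb formats the same address with the %p verb, without unsafe.
func viaVerb(p *int) string {
	return fmt.Sprintf("%p", p)
}

func main() {
	n := 0
	// Both helpers produce the same text for the same pointer.
	fmt.Println(viaUintptr(&n) == viaVerb(&n)) // true
}
```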

Pattern 3: Convert Unsafe Pointer To Uintptr, Do Arithmetic Operations With The Uintptr Value, Then Convert Back

For example:
package main

import "fmt"
import "unsafe"

type T struct {
	x bool
	y [3]int16
}

const N = unsafe.Offsetof(T{}.y)
const M = unsafe.Sizeof([3]int16{}[0])

// We use a package-level channel variable to make sure the T
// value used in the main function will be allocated on heap.
var c = make(chan *T, 1)

func main() {
	c <- &T{y: [3]int16{123, 456, 789}}
	t := <-c
	p := unsafe.Pointer(t)
	// "uintptr(p) + N + M + M" stores the address of t.y[2].
	ty2 := (*int16)(unsafe.Pointer(uintptr(p) + N + M + M))
	fmt.Println(*ty2) // 789
}

For this specific example, the conversions are not very useful. It is just a demo for educational purposes.

Please note that, for this specific example, the above line that converts from uintptr to unsafe.Pointer shouldn't be split into two lines, as the following code does. Please read the comments in the code for the reason.
	p := unsafe.Pointer(t)
	// Now the t value becomes unused, its memory may
	// be garbage collected at this time. So the following
	// use of the address of t.y[2] may be invalid!
	addr := uintptr(p) + N + M + M
	ty2 := (*int16)(unsafe.Pointer(addr))

Such bugs are very subtle and hard to detect, which is why the use of unsafe pointers is dangerous.

If we do want to split that conversion line into two lines, we should call the runtime.KeepAlive function, passing the unsafe pointer p as the argument, after the use of the address of t.y[2], like this:
	p := unsafe.Pointer(t)
	addr := uintptr(p) + N + M + M
	ty2 := (*int16)(unsafe.Pointer(addr))
	// The following line ensures that the memory of the value
	// t will not get garbage collected before this point.
	runtime.KeepAlive(p)

Another detail to note is that it is not recommended to store the end boundary of a memory block in a pointer (either a safe or an unsafe one). Doing so will prevent another memory block that closely follows the former one from being garbage collected.

Pattern 4: Convert Unsafe Pointer To uintptr When Calling syscall.Syscall

From the explanations for the last pattern, we know that the following function is dangerous.
// Assume this function will not be inlined.
func DoSomething(addr uintptr) {
	// read or write values at the passed address ...
}

The reason why the above function is dangerous is that the function itself can't guarantee the values at the passed argument address are not garbage collected yet. In fact, there may be some new values which have been allocated at the passed argument address.

However, the prototype of the Syscall function in the syscall standard package is as follows:
func Syscall(trap, a1, a2, a3 uintptr) (r1, r2 uintptr, err Errno)

How does this function guarantee that the values at the passed addresses a1, a2 and a3 have not been garbage collected while the function is running? The function itself can't guarantee this. In fact, compilers make the guarantee. It is a privilege of calls to syscall.Syscall and similar functions.

We can think of it as if compilers append a runtime.KeepAlive call for each uintptr argument that is converted from an unsafe pointer, as the following code shows:
syscall.Syscall(SYS_READ, uintptr(fd), uintptr(unsafe.Pointer(p)), uintptr(n))
// Conceptually, compilers automatically append a call
// like the following one after the above line.
runtime.KeepAlive(p)

Although the syscall.Syscall function has this privilege, there is a requirement for calling it: the conversions from unsafe pointers to uintptr must appear within the literal representations of the arguments of the call.

For example, the following call is invalid if the p is not guaranteed to be still used after the call.
u := uintptr(unsafe.Pointer(p))
syscall.Syscall(SYS_READ, uintptr(fd), u, uintptr(n))

Again, never use this pattern when calling other functions. We should append the runtime.KeepAlive calls manually when calling other functions with uintptr parameters.

Pattern 5: Convert The uintptr Result Of reflect.Value.Pointer Or reflect.Value.UnsafeAddr Method Call To Unsafe Pointer

The methods Pointer and UnsafeAddr of the Value type in the reflect standard package both return a result of type uintptr instead of unsafe.Pointer. This is a deliberate design, which prevents the results of calls to the two methods from being converted to safe pointer types without importing the unsafe standard package.

The design requires that the result of a call to either of the two methods be converted to an unsafe pointer immediately after the call is made. Otherwise, there will be a small time window in which the value allocated at the address stored in the result might lose all references and be garbage collected.

For example, the following call is valid.
p := (*int)(unsafe.Pointer(reflect.ValueOf(new(int)).Pointer()))
On the other hand, the following call is invalid.
u := reflect.ValueOf(new(int)).Pointer()
p := (*int)(unsafe.Pointer(u))

Pattern 6: Convert A reflect.SliceHeader.Data Or reflect.StringHeader.Data Field To Unsafe Pointer, And The Inverse.

For the same reason mentioned in the last subsection, the Data fields of the struct type SliceHeader and StringHeader in the reflect standard package are declared with type uintptr instead of unsafe.Pointer.

It is valid to convert a pointer to a string to a StringHeader pointer, so that we can manipulate the internals of the string. Likewise, it is valid to convert a pointer to a slice to a SliceHeader pointer, so that we can manipulate the internals of the slice.

An example of using reflect.StringHeader:
package main

import "fmt"
import "unsafe"
import "reflect"

func main() {
	a := [...]byte{'G', 'o', 'l', 'a', 'n', 'g'}
	s := "Java"
	hdr := (*reflect.StringHeader)(unsafe.Pointer(&s))
	hdr.Data = uintptr(unsafe.Pointer(&a))
	hdr.Len = len(a)
	fmt.Println(s) // Golang
	// Now s and a share the same byte sequence,
	// which makes the string s become mutable.
	a[2], a[3], a[4], a[5] = 'o', 'g', 'l', 'e'
	fmt.Println(s) // Google
}

An example of using reflect.SliceHeader:
package main

import "fmt"
import "unsafe"
import "reflect"
import "runtime"

func main() {
	bs := []byte("Golang")
	var pa *[2]byte // an array pointer
	hdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
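	pa = (*[2]byte)(unsafe.Pointer(hdr.Data))
	// Keep bs (and its elements) reachable until the
	// conversion on the above line has completed.
	runtime.KeepAlive(&bs)
	fmt.Printf("%s\n", pa) // &Go
	pa[1] = 'a'
	fmt.Printf("%s\n", bs) // Galang
}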

The runtime.KeepAlive call is essential if the last Printf line is absent.

In general, we should only get a StringHeader pointer value from an actual (already existing) string, or get a SliceHeader pointer value from an actual (already existing) slice. We shouldn't do the opposite, such as creating a string from a StringHeader or creating a slice from a SliceHeader. For example, the following code is invalid.
var hdr reflect.StringHeader
hdr.Data = uintptr(unsafe.Pointer(new([5]byte)))
// Now the just-allocated byte array has lost all
// references and it can be garbage collected now.
hdr.Len = 5
s := *(*string)(unsafe.Pointer(&hdr)) // dangerous!

The following is an example which shows how to convert byte slices to strings, by using the unsafe way, and vice versa. Different from the safe conversion from a byte slice to a string, the unsafe way doesn't allocate a new underlying byte sequence for the result string in each conversion.
package main

import "fmt"
import "unsafe"
import "reflect"
import "runtime"

func ByteSlice2String(bs []byte) (str string) {
	sliceHdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
	strHdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
	strHdr.Data = sliceHdr.Data
	strHdr.Len = sliceHdr.Len
	runtime.KeepAlive(&bs) // this line is essential.
	return
}

func main() {
	bs := []byte{'G', 'o', 'l', 'a', 'n', 'd'}
	s := ByteSlice2String(bs)
	fmt.Println(s) // Goland
	bs[5] = 'g'
	fmt.Println(s) // Golang
}

The docs of the SliceHeader and StringHeader types in the reflect standard package are similar. They say that the representations of the two struct types may change in a later release. So the above example may become invalid even if the unsafe rules stay unchanged. Fortunately, the two currently available Go compilers (the standard Go compiler and the gccgo compiler) both recognise the representations of the two types declared in the reflect standard package.

It is also possible to convert a string to a byte slice by using the unsafe way. However, we should treat the result slice as an immutable value and never modify its elements.
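A minimal sketch of the reverse conversion, following the same header-based pattern, might look like the following. The function name String2ByteSlice is a hypothetical one chosen for this illustration. The sketch assumes a toolchain in which reflect.StringHeader and reflect.SliceHeader are available and keep the representations described above.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"unsafe"
)

// String2ByteSlice returns a byte slice that shares the underlying
// bytes of str. The result must be treated as read-only.
func String2ByteSlice(str string) (bs []byte) {
	strHdr := (*reflect.StringHeader)(unsafe.Pointer(&str))
	sliceHdr := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
	sliceHdr.Data = strHdr.Data
	sliceHdr.Cap = strHdr.Len
	sliceHdr.Len = strHdr.Len
	// Keep str reachable until the header fields have been copied.
	runtime.KeepAlive(&str)
	return
}

func main() {
	bs := String2ByteSlice("Goland")
	fmt.Printf("%s\n", bs)        // Goland
	fmt.Println(len(bs), cap(bs)) // 6 6
	// Never write to bs: the bytes of a string may be
	// stored in read-only memory.
}
```

Both headers here are obtained from an actual string and an actual slice, so the Data fields are tracked by the garbage collector.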

By the way, for the standard Go compiler, currently (Go 1.10), there is a more efficient (and more unsafe) way to convert a byte slice to a string.
func ByteSlice2String(bs []byte) string {
	return *(*string)(unsafe.Pointer(&bs))
}

This is the implementation adopted by the String method of the Builder type supported since Go 1.10 in the strings standard package. It makes use of the first pattern introduced above. It is more efficient than the above one.

Final Words

From the above content, we know that, for some scenarios, the unsafe mechanism can help us write more efficient Go code. However, it is very easy to introduce some subtle bugs when using the unsafe mechanism. A program with these bugs may run well for a long time, but suddenly behave abnormally and even crash some time later. Such bugs are very hard to detect and debug.

We should only use the unsafe mechanism if we have to, and we must use it with extreme care. In particular, we should follow the instructions described above.

And again, we should be aware that the unsafe mechanism introduced above may change and even become totally invalid in later Go versions, though there is no evidence that this will happen soon. If the rules of the unsafe mechanism change, the valid unsafe pointer use patterns introduced above may become invalid. So please keep it easy to switch your code that depends on the unsafe mechanism back to safe implementations.
