Three new books, Go Optimizations 101, Go Details & Tips 101 and Go Generics 101 are published now. It is most cost-effective to buy all of them through this book bundle in the Leanpub book store.

How to take addresses of the bytes in a string?

We all know that string bytes are not addressable, so Go compilers will report for &aString[i] alike expressions.

Use unsafe.StringData

Prior to Go 1.22 (and since 1.20), the only way to take the addresses of string bytes is through the unsafe way:

package main

import (
	"unsafe"
)

var h = "Hello "
var w = "world!"

func commonUnderlyingBytes(x, y string) bool {
	return &[]byte(x)[4] == &[]byte(y)[4]
}

func main() {
	var s = h + w
	//var p = &s[4] // error: cannot take address of s[4]
	var p = unsafe.StringData(s[4:])
	println(p, string(*p)) // 0xc00004470c o
}

Use reflect.Value.UnsafePointer

Since Go 1.23, the reflect.Value.UnsafePointer method has supported string values. So we can use it to get addresses of string bytes.

package main

import (
	"reflect"
)

var h = "Hello "
var w = "world!"

func commonUnderlyingBytes(x, y string) bool {
	return &[]byte(x)[4] == &[]byte(y)[4]
}

func main() {
	var s = h + w
	//var p = &s[4] // error: cannot take address of s[4]
	var p = (*byte)(reflect.ValueOf(s[4:]).UnsafePointer())
	println(p, string(*p)) // 0xc000104eec o
}

The above two ways both involve unsafe.Pointer. The following one doesn't.

Convert string to a byte slice, then take addresses of the byte slice

Since Go toolchain 1.22, an optimization has been made so that memory allocation and bytes duplication are not needed any more in some []byte(aString) conversions (if the bytes in the conversion result are proven to be never modified).

We can make use of this optimization to take addresses of the bytes of a string, by converting the string to a byte slice then indirectly take addresses of the bytes in the byte slice. Here is an example demonstrating the usage:

package main

var h = "Hello "
var w = "world!"

func takeAddressOfFirstByte(s string) *byte {
	return &[]byte(s)[0]
}

func commonUnderlyingBytes(x, y string) bool {
	return len(x) != 0 && len(y) != 0 &&
		takeAddressOfFirstByte(x) == takeAddressOfFirstByte(y)
}

func main() {
	var s = h + w

	var p1 = &[]byte(s)[4]
	println(p1, string(*p1))
	
	var p2 = &[]byte(s)[4]
	println(p2, string(*p2))

	println(p1 == p2)
	
	var s2 = s
	println(commonUnderlyingBytes(s[4:], s2[4:]))
}

Run it with Go toolchain 1.21 and 1.22 versions:

$ gotv 1.21. run aa.go
[Run]: $HOME/.cache/gotv/tag_go1.21.12/bin/go run demo1.go
0xc00003c6c4 o
0xc00003c6a4 o
false
false
$ gotv 1.22. run aa.go
[Run]: $HOME/.cache/gotv/tag_go1.22.5/bin/go run demo1.go
0xc0000426ec o
0xc0000426ec o
true
true

Wait! Does this mean we can modify string content now? Don't worry. The answer is certainly not. The compiler can successfully detect such attempts and disable the optimimation for the invloved conversions, as the following code demonstrates:

package main

var h = "Hello "
var w = "world!"

func main() {
	var s = h + w

	var p1 = &[]byte(s)[4]
	println(p1, string(*p1))
	
	var p2 = &[]byte(s)[4]
	*p2 = 'X' // modify the byte
	println(p2, string(*p2))

	println(p1 == p2)
}

The behaviors are identical between Go toolchain v1.21 and v1.22 versions.

$ gotv 1.21. run aa.go
[Run]: $HOME/.cache/gotv/tag_go1.21.12/bin/go run demo2.go
0xc00003c6d4 o
0xc00003c6b4 X
false
$ gotv 1.22. run aa.go
[Run]: $HOME/.cache/gotv/tag_go1.22.5/bin/go run demo2.go
0xc0000426f0 o
0xc0000426cc X
false

So to make this way work, we should never change the bytes in the result byte slice, even potentially.

(GoTV is a tool used to manage and use multiple coexisting installations of official Go toolchain versions harmoniously and conveniently.)

Is this trick useful in practice? I think the answer is yes. If one of two strings is derived from another (through substring operation), then to get their common prefix, the following CommonStringPrefix1 function is more performant than the CommonStringPrefix2 function (we don't there is a derivation relation between thetwo strings).

func CommonStringPrefix1(x, y string) int {
	if len(x) == 0 || len(y) == 0 {
		return 0
	}
	if &[]byte(x)[0] == &[]byte(y)[0] {
		if len(x) <= len(y) {
			return len(x)
		} else {
			return len(y)
		}
	}
	return CommonStringPrefix2(x, y)
}

func CommonStringPrefix2(x, y string) int {
	var min = len(x)
	if len(y) < len(x) {
		min = len(y)
	}
	x2, y2 := x[:min], y[:min]
	if len(x2) == len(y2) {
		for i := 0; i < len(x2); i++ {
			if x2[i] != y2[i] {
				return i
			}
		}
	}
	return min
}


(more articles ↡)

The Go 101 project is hosted on Github. Welcome to improve Go 101 articles by submitting corrections for all kinds of mistakes, such as typos, grammar errors, wording inaccuracies, description flaws, code bugs and broken links.

If you would like to learn some Go details and facts every serveral days, please follow Go 101's official Twitter account @zigo_101.

Tapir, the author of Go 101, has been on writing the Go 101 series books and maintaining the go101.org website since 2016 July. New contents will be continually added to the book and the website from time to time. Tapir is also an indie game developer. You can also support Go 101 by playing Tapir's games (made for both Android and iPhone/iPad):
Individual donations via PayPal are also welcome.

Articles in this book: