Three new books, Go Optimizations 101, Go Details & Tips 101 and Go Generics 101 are published now. It is most cost-effective to buy all of them through this book bundle in the Leanpub book store.

# BCE (Bound Check Elimination)

Go is a memory safe language. In array/slice/string element indexing and subslice operations, Go runtime will check whether or not the involved indexes are out of range. If an index is out of range, a panic will be produced to prevent the invalid index from doing harm. This is called bounds checking.

Bounds checks make our code run safely, on the other hand, they also make our code run a little slower. This is a trade-off a safe language must made.

Since Go toolchain v1.7, the standard Go compiler has started to support BCE (bounds check elimination). BCE can avoid some unnecessary bounds checks, so that the standard Go compiler could generate more efficient programs.

The following will list some examples to show in which cases BCE works and in which cases BCE doesn't work.

We could use the `-d=ssa/check_bce` compiler option to show which code lines need bound checks.

## Example 1

A simple example:

``````// example1.go
package main

func f1a(s []struct{}, index int) {
_ = s[index] // line 5: Found IsInBounds
_ = s[index]
_ = s[index:]
_ = s[:index+1]
}

func f1b(s []byte, index int) {
s[index-1] = 'a' // line 12: Found IsInBounds
_ = s[:index]
}

func f1c(a [5]int) {
_ = a[0]
_ = a[4]
}

func f1d(s []int) {
if len(s) > 2 {
_, _, _ = s[0], s[1], s[2]
}
}

func f1g(s []int) {
middle := len(s) / 2
_ = s[:middle]
_ = s[middle:]
}

func main() {}
``````

Let's run it with the `-d=ssa/check_bce` compiler option:

``````\$ go run -gcflags="-d=ssa/check_bce" example1.go
./example1.go:5:7: Found IsInBounds
./example1.go:12:3: Found IsInBounds
``````

The outputs show that only two code lines needs bound checks in the above example code.

Note that: Go toolchains with version smaller than 1.21 failed to remove the bound checks in the `f1g` function.

And note that, up to now (Go toolchain v1.22.n), the official standard compiler doesn't check BCE for an operation in a generic function if the operation involves type parameters and the generic function is never instantiated. For example, the command `go run -gcflags=-d=ssa/check_bce bar.go` will report nothing.

``````// bar.go
package bar

func foo[E any](s []E) {
_ = s[0] // line 5
_ = s[1] // line 6
_ = s[2] // line 7
}

// var _ = foo[bool]
``````

However, if the variable declaration line is enabled, then the compiler will report:

``````./bar.go:5:7: Found IsInBounds
./bar.go:6:7: Found IsInBounds
./bar.go:7:7: Found IsInBounds
./bar.go:4:6: Found IsInBounds
``````

## Example 2

All the bound checks in the slice element indexing and subslice operations shown in the following example are eliminated.

``````// example2.go
package main

func f2a(s []int) {
for i := range s {
_ = s[i]
_ = s[i:len(s)]
_ = s[:i+1]
}
}

func f2b(s []int) {
for i := 0; i < len(s); i++ {
_ = s[i]
_ = s[i:len(s)]
_ = s[:i+1]
}
}

func f2c(s []int) {
for i := len(s) - 1; i >= 0; i-- {
_ = s[i]
_ = s[i:len(s)]
_ = s[:i+1]
}
}

func f2d(s []int) {
for i := len(s); i > 0; {
i--
_ = s[i]
_ = s[i:len(s)]
_ = s[:i+1]
}
}

func f2e(s []int) {
for i := 0; i < len(s) - 1; i += 2 {
_ = s[i]
_ = s[i:len(s)]
_ = s[:i+1]
}
}

func main() {}
``````

Run it, we will find that nothing is outputted. Yes, the official standard Go compiler is so clever that it finds all bound checks may be removed in the above example code.

``````\$ go run -gcflags="-d=ssa/check_bce" example2.go
``````

There are still some small imperfections. If we modify the `f2g` and `f2h` functions as that shown in the following code, then the compiler (v1.22.n) fails to remove the bound checks for the two subslice operations.

``````// example2b.go
package main

func f2g(s []int) {
for i := len(s) - 1; i >= 0; i-- {
//_ = s[i]
_ = s[:i+1] // line 7: Found IsSliceInBounds
}
}

func f2h(s []int) {
for i := 0; i <= len(s) - 1; i++ {
//_ = s[i]
_ = s[:i+1] // line 14: Found IsSliceInBounds
}
}

func main() {}
``````

Run it, we will get the following output:

``````\$ go run -gcflags="-d=ssa/check_bce" example2b.go
./example2b.go:7:8: Found IsSliceInBounds
./example2b.go:14:8: Found IsSliceInBounds
``````

We may give the compiler some hints by turning on the comment lines to remove these bound checks.

## Example 3

We should try to evaluate the element indexing or subslice operation with the largest index as earlier as possible to reduce the number of bound checks.

In the following example, if the expression `s[3]` is evaluated without panicking, then the bound checks for `s[0]`, `s[1]` and `s[2]` could be eliminated.

``````// example3.go
package main

func f3a(s []int32) int32 {
return s[0] | // Found IsInBounds (line 5)
s[1] | // Found IsInBounds
s[2] | // Found IsInBounds
s[3]   // Found IsInBounds
}

func f3b(s []int32) int32 {
return s[3] | // Found IsInBounds (line 12)
s[0] |
s[1] |
s[2]
}

func main() {
}
``````

Run it, we get:

``````./example3.go:5:10: Found IsInBounds
./example3.go:6:4: Found IsInBounds
./example3.go:7:4: Found IsInBounds
./example3.go:8:4: Found IsInBounds
./example3.go:12:10: Found IsInBounds
``````

From the output, we could learn that there are 4 bound checks in the `f3a` function, but only one in the `f3b` function.

## Example 4

Since Go toolchain v1.19, the bould check in the `f5a` function is successfully removed,

``````func f5a(isa []int, isb []int) {
if len(isa) > 0xFFF {
for _, n := range isb {
_ = isa[n & 0xFFF]
}
}
}
``````

However, before Go toolchain v1.19, the check is not removed. The compilers before version 1.19 need a hint to be removed, as shown in the `f5b` function:

``````func f5b(isa []int, isb []int) {
if len(isa) > 0xFFF {
// A successful hint (for v1.18- compilers)
isa = isa[:0xFFF+1]
for _, n := range isb {
_ = isa[n & 0xFFF] // BCEed!
}
}
}

func f5c(isa []int, isb []int) {
if len(isa) > 0xFFF {
// A not-workable hint (for v1.18- compilers)
_ = isa[:0xFFF+1]
for _, n := range isb {
_ = isa[n & 0xFFF] // Found IsInBounds
}
}
}

func f5d(isa []int, isb []int) {
if len(isa) > 0xFFF {
// A not-workable hint (for v1.18- compilers)
_ = isa[0xFFF]
for _, n := range isb {
_ = isa[n & 0xFFF] // Found IsInBounds
}
}
}
``````

The next section shows more cases which need compiler hints to avoid some unnecessary bound checks.

## Sometimes, the compiler needs some hints to remove some bound checks

The official standard Go compiler is still not smart enough to remove all unnecessary bound checks. Sometimes, the compiler needs to be given some hints to remove some bound checks.

In the following example, by adding one hint line in the function `f4b`, all bound checks in the loop are eliminated.

``````func f4a(is []int, bs []byte) {
if len(is) >= 256 {
for _, n := range bs {
_ = is[n] // Found IsInBounds
}
}
}

func f4b(is []int, bs []byte) {
if len(is) >= 256 {
is = is[:256] // a successful hint
for _, n := range bs {
_ = is[n] // BCEed!
}
}
}

func f4c(is []int, bs []byte) {
if len(is) >= 256 {
_ = is[:256] // a non-workable hint
for _, n := range bs {
_ = is[n] // Found IsInBounds
}
}
}

func f4d(is []int, bs []byte) {
if len(is) >= 256 {
_ = is[255] // a non-workable hint
for _, n := range bs {
_ = is[n] // Found IsInBounds
}
}
}
``````

Please note that, as of Go toolchain v1.22.n, the two hints used in the `f4c` and `f4d` functions are not workable (but they should).

In the following example, by adding a redundant `if` code block in the function `NumSameBytes_2`, all bound checks in the loop are eliminated.

``````type T = string

func NumSameBytes_1(x, y T) int {
if len(x) > len(y) {
x, y = y, x
}
for i := 0; i < len(x); i++ {
if x[i] !=
y[i] { // Found IsInBounds
return i
}
}
return len(x)
}

func NumSameBytes_2(x, y T) int {
if len(x) > len(y) {
x, y = y, x
}

// a successful hint
if len(x) > len(y) {
panic("unreachable")
}

for i := 0; i < len(x); i++ {
if x[i] != y[i] { // BCEed!
return i
}
}
return len(x)
}
``````

The above hint works when `T` is either a string type or a slice type, whereas each of the following two hints only works for one case (as of Go toolchain v1.22.n).

``````func NumSameBytes_3(x, y T) int {
if len(x) > len(y) {
x, y = y, x
}

y = y[:len(x)] // a hint, only works if T is slice
for i := 0; i < len(x); i++ {
if x[i] != y[i] {
return i
}
}
return len(x)
}

func NumSameBytes_4(x, y T) int {
if len(x) > len(y) {
x, y = y, x
}

_ = y[:len(x)] // a hint, only works if T is string
for i := 0; i < len(x); i++ {
if x[i] != y[i] {
return i
}
}
return len(x)
}
``````

Please note that, the future versions of the standard official Go compiler will become smarter so that the above hints will become unnecessary later.

## Write code in BCE-friendly ways

In the following example, the `f7b` and `f7c` functions makes 3 less bound checks than `f7a`.

``````func f7a(s []byte, i int) {
_ = s[i+3] // Found IsInBounds
_ = s[i+2] // Found IsInBounds
_ = s[i+1] // Found IsInBounds
_ = s[i]   // Found IsInBounds
}

func f7b(s []byte, i int) {
s = s[i:i+4] // Found IsSliceInBounds
_ = s[3]
_ = s[2]
_ = s[1]
_ = s[0]
}

func f7c(s []byte, i int) {
s = s[i:i+4:i+4] // Found IsSliceInBounds
_ = s[3]
_ = s[2]
_ = s[1]
_ = s[0]
}
``````

However, please note that, there might be some other factors which will affect program performances. On my machine (Intel i5-4210U CPU @ 1.70GHz, Linux/amd64), among the above 3 functions, the function `f7b` is actually the least performant one.

``````Benchmark_f7a-4  3861 ns/op
Benchmark_f7b-4  4223 ns/op
Benchmark_f7c-4  3477 ns/op
``````

In practice, it is encouraged to use the three-index subslice form (`f7c`).

In the following example, benchmark results show that

• the `f8z` function is the most performant one (in line with expectation)
• but the `f8y` function is as performant as the `f8x` function (unexpected).
``````func f8x(s []byte) {
var n = len(s)
s = s[:n]
for i := 0; i <= n - 4; i += 4 {
_ = s[i+3] // Found IsInBounds
_ = s[i+2] // Found IsInBounds
_ = s[i+1] // Found IsInBounds
_ = s[i]
}
}

func f8y(s []byte) {
for i := 0; i <= len(s) - 4; i += 4 {
s2 := s[i:]
_ = s2[3] // Found IsInBounds
_ = s2[2]
_ = s2[1]
_ = s2[0]
}
}

func f8z(s []byte) {
for i := 0; len(s) >= 4; i += 4 {
_ = s[3]
_ = s[2]
_ = s[1]
_ = s[0]
s = s[4:]
}
}
``````

In fact, benchmark results also show the following `f8y3` function is as performant as the `f8z` function and the performance of the `f8y2` function is on par with the `f8y` function. So it is encouraged to use three-index subslice forms for such situations in practice.

``````func f8y2(s []byte) {
for i := 0; i < len(s) - 3; i += 4 {
s2 := s[i:i+4] // Found IsInBounds
_ = s2[3]
_ = s2[2]
_ = s2[1]
_ = s2[0]
}
}

func f8y3(s []byte) {
for i := 0; i < len(s) - 3; i += 4 {
s2 := s[i:i+4:i+4] // Found IsInBounds
_ = s2[3]
_ = s2[2]
_ = s2[1]
_ = s2[0]
}
}
``````

In the following example, there are no bound checks in the `f9b` and `f9c` functions, but there is one in the `f9a` function.

``````func f9a(n int) []int {
buf := make([]int, n+1)
k := 0
for i := 0; i <= n; i++ {
buf[i] = k // Found IsInBounds
k++
}
return buf
}

func f9b(n int) []int {
buf := make([]int, n+1)
k := 0
for i := 0; i < len(buf); i++ {
buf[i] = k
k++
}
return buf
}

func f9c(n int) []int {
buf := make([]int, n+1)
k := 0
for i := 0; i < n+1; i++ {
buf[i] = k
k++
}
return buf
}
``````

In the following code, the function `f0b` is much more performant than `f0a`.

``````func f0a(x [16]byte) (r [4]byte){
for i := 0; i < 4; i++ {
r[i] =
x[i*4+3] ^ // Found IsInBounds
x[i*4+2] ^ // Found IsInBounds
x[i*4+1] ^ // Found IsInBounds
x[i*4]     // Found IsInBounds
}
return
}

func f0b(x [16]byte) (r [4]byte){
r[0] = x[3] ^ x[2] ^ x[1] ^ x[0]
r[1] = x[7] ^ x[6] ^ x[5] ^ x[4]
r[2] = x[11] ^ x[10] ^ x[9] ^ x[8]
r[3] = x[15] ^ x[14] ^ x[13] ^ x[12]
return
}
``````

In the following code, the function `f6b` is more performant than `f6a`, but both of them are much less performant than `f6c`.

``````const N = 3

func f6a(s []byte) {
for i := 0; i < len(s)-(N-1); i += N {
_ = s[i+N-1] // Found IsInBounds
}
}

func f6b(s []byte) {
for i := N-1; i < len(s); i += N {
_ = s[i] // Found IsInBounds
}
}

func f6c(s []byte) {
for i := uint(N-1); i < uint(len(s)); i += N {
_ = s[i]
}
}
``````

Global (package-level) slices are often unfriendly to BCE, so we should try to assign them to local ones to eliminate some unnecessary bound checks. For example, in the following code, the `fa0` function does one more bound check than the `fa1` and `fa2` functions, so the function calls `fa1()` and `fa2(s)` are both more performant than `fa0()`.

``````var s = make([]int, 5)

func fa0() {
for i := range s {
s[i] = i // Found IsInBounds
}
}

func fa1() {
s := s
for i := range s {
s[i] = i
}
}

func fa2(x []int) {
for i := range x {
x[i] = i
}
}
``````

Arrays are often more BCE-friendly than slices. In the following code, the array version functions (`fb2` and `fc2`) don't need bound checks.

``````var s = make([]int, 256)
var a = [256]int{}

func fb1() int {
return s[100] // Found IsInBounds
}

func fb2() int {
return a[100]
}

func fc1(n byte) int {
return s[n] // Found IsInBounds
}

func fc2(n byte) int {
return a[n]
}
``````

Please note that, the future versions of the standard official Go compiler will become smarter so that some BCE-unfriendly code might become BCE-friendly later.

## The current official standard Go compiler fails to eliminate some unnecessary bound checks

As of Go toolchain v1.22.n, the official standard Go compiler doesn't eliminate the following unnecessary bound checks.

``````func fb(s, x, y []byte) {
n := copy(s, x)
copy(s[n:], y) // Found IsSliceInBounds
_ = x[n:]      // Found IsSliceInBounds
}

func fd(data []int, check func(int) bool) []int {
var k = 0
for _, v := range data {
if check(v) {
data[k] = v // Found IsInBounds
k++
}
}
return data[:k] // Found IsSliceInBounds
}

// For the only bound check in the following function,
// * if N == 1, it will be always removed.
// * if N is a power of 2, Go toolchain 1.19+ can remove it.
// * for other cases, Go toolchain fails to remove it.
func fe(s []byte) {
const N = 3
if len(s) >= N {
r := len(s) % N
_ = s[r] // Found IsInBounds
}
}

func ff(s []byte) {
for i := 0; i < len(s); i++ {
_ = s[i/2] // Found IsInBounds
_ = s[i/3] // Found IsInBounds
}
}
``````

The future versions of the standard official Go compiler will become smarter so that the above unnecessary bound checks will be eliminated later.

Index↡

The Go 101 project is hosted on Github. Welcome to improve Go 101 articles by submitting corrections for all kinds of mistakes, such as typos, grammar errors, wording inaccuracies, description flaws, code bugs and broken links.

If you would like to learn some Go details and facts every serveral days, please follow Go 101's official Twitter account @zigo_101.

The digital versions of this book are available at the following places:
Tapir, the author of Go 101, has been on writing the Go 101 series books and maintaining the go101.org website since 2016 July. New contents will be continually added to the book and the website from time to time. Tapir is also an indie game developer. You can also support Go 101 by playing Tapir's games (made for both Android and iPhone/iPad):
Individual donations via PayPal are also welcome.

Index:
• Acknowledgments
• Value Parts and Value Sizes (available in the paid ebooks)
• value/type sizes
• memory alignments
• avoid larger copy costs
• Memory Allocations
• Stack and Escape Analysis (available in the paid ebooks)
• escape analysis
• how to control value allocation places
• stacks growth and shrinkage
• how to reduce stack grow times
• Garbage Collection (available in the paid ebooks)
• GC pacer
• how to reduce GC pressure
• control GC frequency
• Pointers
• Structs (available in the paid ebooks)
• 3 facts/suggestions
• Arrays and Slices (available in the paid ebooks)
• 10+ facts/suggestions
• String and Byte Slices (available in the paid ebooks)
• 10+ facts/suggestions
• BCE (Bound Check Elimination)
• the cases BCE works for
• the cases BCE doesn't work for
• the cases BCE works for when given hints
• Maps
• Channels (available in the paid ebooks)
• 3 facts/suggestions
• Functions (available in the paid ebooks)
• how to make a function inline-able
• how to make a function not inline-able
• pointer parameters/results vs. non-pointer ones
• named results vs. anonymous ones
• 10+ facts/suggestions
• Interfaces (available in the paid ebooks)
• value boxing costs
• 3+ facts/suggestions