현재 3권의 신간들인 Go Optimizations 101, Go Details & Tips 101 과 Go Generics 101이 출간되어 있습니다. Leanpub 서점에서 번들을 모두 구입하시는 방법이 비용 대비 효율이 가장 좋습니다.

Go에 대한 많은 정보들과 Go 101 책들의 최신 소식을 얻으시려면 Go 101 트위터 계정인 @Go100and1을 팔로잉 해주세요.

Maps

In Go, the capacity of a map is unlimited in theory, it is only limited by available memory. That is why the built-in cap function doesn't apply to maps.

In the official standard Go runtime implementation, maps are implemented as hashtables internally. Each map/hashtable maintains a backing array to store map entries (key-value pairs). Along with more and more entries are put into a map, the size of the backing array might be thought as too small to store more entries, thus a new larger backing array will be allocated and the current entries (in the old backing array) will be moved to it, then the old backing array will be discarded.

In the official standard Go runtime implementation, the backing array of a map will never shrink, even if all entries are deleted from the map. This is a form of memory wasting. But in practice, this is seldom a problem and and actually often good for program performances.

Clear map entries

We could use the following loop to clear all entries in a map:

	for key := range aMap {
		delete(aMap, key)
	}

The loop is specially optimized (except entries with NaN keys exist) so that its execution is very fast. However, please note that, as mentioned above, the backing array of the cleared map doesn't shrink after the loop. Then how to release the backing array of the map? There are two ways:

	aMap = nil
	// or
	aMap = map(map[K]V)

If the backing array of the map is not referenced elsewhere, then the backing array will be collected eventually after being released.

If there will be many new entries to be put in the map after it is cleared, then the former way is preferred; otherwise, the latter (release) ways are preferred.

`aMap[key]++` is more efficient than `aMap[key] = aMap[key] + 1`

In the statement aMap[key] = aMap[key] + 1, the key are hashed twice, but in the statement aMap[key]++, it is only hashed once.

Similarly, aMap[key] += value is more efficient than aMap[key] = aMap[key] + value.

These could be proved by the following benchmark code:

package maps

import "testing"

var m = map[int]int{}

func Benchmark_increment(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99]++
	}
}

func Benchmark_plusone(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] += 1
	}
}

func Benchmark_addition(b *testing.B) {
	for i := 0; i < b.N; i++ {
		m[99] = m[99] + 1
	}
}

The benchmark results:

Benchmark_increment-4  11.31 ns/op
Benchmark_plusone-4    11.21 ns/op
Benchmark_addition-4   16.10 ns/op

Pointers in maps

If the key type and element type of a map both don't contain pointers, then in the scan phase of a GC cycle, the garbage collector will not scan the entries of the map. This could save much time.

This tip is also valid for other kinds of container in Go, such as slices, arrays and channels.

Using byte arrays instead of short strings as keys

Internally, each string contains a pointer, which points to the underlying bytes of that string. So if the key or element type of a map is a string type, then all the entries of the map needs to be scanned in GC cycles.

If we can make sure that the string values used in the entries of a map have a max length and the max length is small, then we could use the array type [N]byte to replace the string types (where N is the max string length). Doing this will save much garbage collection scanning time if the number of the entries in the map is very large.

For example, in the following code, the entries of mapB contain no pointers, but the (string) keys of mapA contain pointers. So garbage collector will skip mapB during the scan phase of a GC cycle.

	var mapA = make(map[string]int, 1 << 16)
	var mapB = make(map[[32]byte]int, 1 << 16)

Lower map element modification frequency

In the previous "strings and byte slices" chapter, it has been mentioned that a byte-slice-to-string conversion appearing as the index key in a map element retrieval expression doesn't allocate, but such conversions in L-value map element index expressions will allocate.

So sometimes, we could lower the frequency of using such conversions in L-value map element index expressions to improve program performance.

In the following example, the B way (pointer element way) is more performant than the A way. The reason is the B way modifies element values rarely. The elements in the B way are pointers, once they are created, they are never changed.

package maps

import "testing"

var wordCounterA = make(map[string]int)
var wordCounterB = make(map[string]*int)
var key = make([]byte, 64)

func IncA(w []byte) {
	wordCounterA[string(w)]++
}

func IncB(w []byte) {
	p := wordCounterB[string(w)]
	if p == nil {
		p = new(int)
		wordCounterB[string(w)] = p
	}
	*p++
}

func Benchmark_A(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncA(key[:i])
		}
	}
}

func Benchmark_B(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncB(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op

Although the B way (pointer element way) is less CPU consuming, it creates many pointers, which increases the burden of pointer scanning in a GC cycle. But generally, the B way is more efficient.

We could use an extra counter table (a slice) and let the map record indexes to the table, to avoid making many allocations and creating many pointers, as the following code shows:

var wordIndexes = make(map[string]int)
var wordCounters []int

func IncC(w []byte) {
	if i, ok := wordIndexes[string(w)]; ok {
		wordCounters[i]++
	} else {
		wordIndexes[string(w)] = len(wordCounters)
		wordCounters = append(wordCounters, 1)
	}
}

func Benchmark_C(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for i := range key {
			IncC(key[:i])
		}
	}
}

The benchmark results:

Benchmark_A-4  11600 ns/op  2336 B/op  62 allocs/op
Benchmark_B-4   1543 ns/op     0 B/op   0 allocs/op
Benchmark_C-4   1609 ns/op     0 B/op   0 allocs/op

From a short-period view, the C way is as almost performant as the B way, But as it uses much less pointers, it is actually more efficient than the B way in a long-period view.

Please note that the above benchmark results show the latter two ways both make zero allocations. This is actually not true. It is just that each of latter two benchmark runs makes less than one allocation averagely, which is truncated to zero. This is a deliberate design of the benchmark reports in the standard packages.

Try to grow a map in one step

If we could predict the max number of entries will be put into a map at coding time, we should create the map with the make function and pass the max number as the size argument of the make call, to avoid growing the map in multiple steps later.

Use index tables instead of maps which key types have only a small set of possible values

Some programmers like to use a map with bool key to reduce verbose if-else code block uses. For example, the following code

	// Within a function ...
	var condition bool
	condition = evaluateCondition()
	...
	if condition {
		counter++
	} else {
		counter--
	}
	...
	if condition {
		f()
	} else {
		g()
	}
	...

could be replaced with

// Package-level maps.
var boolToInt = map[bool]int{true: 1, false: 0}
var boolToFunc = map[bool]func(){true: f, false: g}

	// Within a function ...
	var condition bool
	condition = evaluateCondition()
	...
	counter += boolToInt[condition]
	...
	boolToFunc[condition]()
	...

If there are many such identical if-else blocks used in code, using maps with bool keys will reduce many boilerplates and make code look much cleaner. For most use cases, this is generally good. However, as of Go toolchain v1.20, the map way is not very efficient from the code execution performance view. The following benchmarks show the performance differences.

package maps

import "testing"

//go:noiline
func f() {}

//go:noiline
func g() {}

func IfElse(x bool) func() {
	if x {
		return f
	} else {
		return g
	}
}

var m = map[bool]func() {true: f, false: g}
func MapSwitch(x bool) func() {
	return m[x]
}

func Benchmark_IfElse(b *testing.B) {
	for i := 0; i < b.N; i++ {
		IfElse(true)()
		IfElse(false)()
	}
}

func Benchmark_MapSwitch(b *testing.B) {
	for i := 0; i < b.N; i++ {
		MapSwitch(true)()
		MapSwitch(false)()
	}
}

The benchmark results:

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op

From the benchmark results, we could get that the if-else block way is much more performant than the map-switch way.

For the use cases which require high code performance, we can use index tables to reduce if-else boilerplates but still keep the simplicity of the map switch way, with the help of a bool-to-int function. The following benchmarks show how to use the index table way.

func b2i(b bool) (r int) {
	if b {r = 1}
	return
}
var a = [2]func() {g, f}
func IndexTable(x bool) func() {
	return a[b2i(x)]
}

func Benchmark_IndexTable(b *testing.B) {
	for i := 0; i < b.N; i++ {
		IndexTable(true)()
		IndexTable(false)()
	}
}

From the above code, we could find that the uses of the index table way is as clean as the map-switch way, though it needs to define an extra tiny b2i function. And from the following benchmark results, we know that the index table way is as performant as the if-else block way.

Benchmark_IfElse-4      4.155 ns/op
Benchmark_MapSwitch-4  47.46 ns/op
Benchmark_IndexTable-4  4.135 ns/op

Index↡

The Go 101 프로젝트는 Github 에서 호스팅됩니다. 오타, 문법 오류, 부정확한 표현, 설명 결함, 코드 버그, 끊어진 링크와 같은 모든 종류의 실수에 대한 수정 사항을 제출하여 Go 101을 개선을 돕는 것은 언제나 환영합니다.

주기적으로 Go에 대한 깊이 있는 정보를 얻고 싶다면 Go 101의 공식 트위터 계정인 @go100and1을 팔로우하거나 Go 101 슬랙 채널에j가입해주세요.

이 책의 디지털 버전은 아래와 같은 곳을 통해서 구매할 수 있습니다.

Leanpub, 최소 7.99 달러 (이곳이나 이곳에서 번들로 구매하세요).
애플 북스, 7.99 달러.
아마존 킨들 스토어, 7.99 달러.

Go 101의 저자인 Tapir는 2016년 7월부터 Go 101 시리즈 책들을 집필하고 go101.org 웹사이트를 유지 관리하고 있습니다. 새로운 콘텐츠는 책과 웹사이트에 수시로 추가될 예정입니다. Tapir는 인디 게임 개발자이기도 합니다. Tapir의 게임을 플레이하여 Go 101을 지원할 수도 있습니다. (안드로이드와 아이폰/아이패드용):

Color Infection (★★★★★), 140개 이상의 단계로 이루어진 물리 기반의 캐주얼 퍼즐 게임
Rectangle Pushers (★★★★★), 2가지 모드와 104개 이상의 단계로 이루어진 캐주얼 퍼즐 게임
Let's Play With Particles, 세가지 미니 게임이 있는 캐주얼 액션 게임

페이팔을 통한 개인 기부도 환영합니다.

색인:

Acknowledgments
About Go Optimizations 101
Value Parts and Value Sizes ^{(available in the paid ebooks)}

value/type sizes
memory alignments
struct padding
avoid larger copy costs

Memory Allocations
Stack and Escape Analysis ^{(available in the paid ebooks)}

escape analysis
how to control value allocation places
stacks growth and shrinkage
how to reduce stack grow times

Garbage Collection ^{(available in the paid ebooks)}

GC pacer
how to reduce GC pressure
control GC frequency

Pointers
Structs ^{(available in the paid ebooks)}

3 facts/suggestions

Arrays and Slices ^{(available in the paid ebooks)}

10+ facts/suggestions

String and Byte Slices ^{(available in the paid ebooks)}

10+ facts/suggestions

BCE (Bound Check Eliminate) ^{(available in the paid ebooks)}

the cases BCE works for
the cases BCE doesn't work for
the cases BCE works for when given hints

Maps
Channels ^{(available in the paid ebooks)}

3 facts/suggestions

Functions ^{(available in the paid ebooks)}

how to make a function inline-able
how to make a function not inline-able
pointer parameters/results vs. non-pointer ones
named results vs. anonymous ones
10+ facts/suggestions

Interfaces ^{(available in the paid ebooks)}

value boxing costs
3+ facts/suggestions

Maps

Clear map entries

aMap[key]++ is more efficient than aMap[key] = aMap[key] + 1

Pointers in maps

Using byte arrays instead of short strings as keys

Lower map element modification frequency

Try to grow a map in one step

Use index tables instead of maps which key types have only a small set of possible values

`aMap[key]++` is more efficient than `aMap[key] = aMap[key] + 1`