unsafe in Go
Go’s safety properties:
- During compilation, type checking detects most attempts to apply an operation to a value that is inappropriate for its type.
- Strict rules for type conversions prevent direct access to the internals of built-in types like strings, maps, slices, and channels.
- Dynamic checks ensure that the program immediately terminates with an informative error whenever a forbidden operation, such as out-of-bounds array accesses or nil pointer dereferences, occurs.
- Automatic memory management (garbage collection) eliminates “use after free” bugs, as well as most memory leaks.
- Many implementation details are inaccessible to Go programs.
- There is no way to discover the memory layout of an aggregate type like a struct, or the machine code for a function, or the identity of the operating system thread on which the current goroutine is running.
- The Go scheduler freely moves goroutines from one thread to another.
- A pointer identifies a variable without revealing the variable’s numeric address.
- Addresses may change as the garbage collector moves variables; pointers are transparently updated.
These features make Go programs more predictable and highly portable by hiding the underlying details.
- The language semantics are largely independent of any particular compiler, operating system, or CPU architecture.
- Some details leak through, such as the word size of the processor, the order of evaluation of certain expressions, and the set of implementation restrictions imposed by the compiler.
- A word is 4 bytes on a 32-bit platform and 8 bytes on a 64-bit platform.
Occasionally, we may choose to forfeit some of these helpful guarantees to achieve the highest possible performance, to interoperate with libraries written in other languages, or to implement a function that cannot be expressed in pure Go.
The unsafe package is implemented by the compiler. It provides access to a number of built-in language features that are not ordinarily available because they expose details of Go’s memory layout.
unsafe.Sizeof, unsafe.Alignof, unsafe.Offsetof
func Sizeof(x ArbitraryType) uintptr
Sizeof takes an expression x of any type and returns the size in bytes of a hypothetical variable v as if v was declared via var v = x.
- The expression is not evaluated.
- Sizeof reports only the size of the fixed part of each data structure, like the pointer and length of a string, but not indirect parts like the contents of the string.
- The size does not include any memory possibly referenced by x. For instance, if x is a slice, Sizeof returns the size of the slice descriptor, not the size of the memory referenced by the slice.
- For a struct, the size includes any padding introduced by field alignment.
- The return value of Sizeof is a Go constant if the type of the argument x does not have variable size.
- A type has variable size if it is a type parameter or if it is an array or struct type with elements of variable size.
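The points above can be checked with a small program. This is a minimal sketch; the names wordSize and fixedSizes are invented for illustration and do not come from the unsafe package.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of a machine word: 4 on 32-bit, 8 on 64-bit platforms.
const wordSize = unsafe.Sizeof(uintptr(0))

// fixedSizes returns Sizeof for a short string, a long string, and a large
// slice. Only the descriptors are measured, never the data they refer to.
func fixedSizes() (short, long, slice uintptr) {
	s1 := "hi"
	s2 := "a much longer string whose bytes are stored elsewhere"
	big := make([]int, 1000)
	return unsafe.Sizeof(s1), unsafe.Sizeof(s2), unsafe.Sizeof(big)
}

func main() {
	short, long, slice := fixedSizes()
	fmt.Println(short == long)       // true: both are pointer + length
	fmt.Println(short == 2*wordSize) // true
	fmt.Println(slice == 3*wordSize) // true: pointer + len + cap

	// For a fixed-size type Sizeof is a constant, so it can size an array.
	var buf [unsafe.Sizeof(int64(0))]byte
	fmt.Println(len(buf)) // 8
}
```

The comparisons are written against the word size rather than literal byte counts so that they hold on both 32-bit and 64-bit platforms.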
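Most computer architectures restrict how instructions may access memory. On a 32-bit platform, an instruction that accesses 4 bytes should start at an address that is a multiple of 4, and an instruction that accesses 2 bytes should start at an address that is a multiple of 2. This is called alignment. If the address an instruction accesses is not properly aligned, some platforms refuse the access and raise an exception. On x86 the access still succeeds, but a misaligned access is slower than an aligned one, so compilers take alignment into account when assigning addresses to variables.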
Computers load and store values from memory most efficiently when those values are properly aligned. For example, the address of a value of a two-byte type such as int16 should be an even number, the address of a four-byte value such as a rune should be a multiple of four, and the address of an eight-byte value such as a float64, uint64, or 64-bit pointer should be a multiple of eight. Alignment requirements of higher multiples are unusual, even for larger data types such as complex128.
For this reason, the size of a value of an aggregate type (a struct or array) is at least the sum of the sizes of its fields or elements but may be greater due to the presence of “holes”. Holes are unused spaces added by the compiler to ensure that the following field or element is properly aligned relative to the start of the struct or array.
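- For example, if addresses 8-9 are occupied by a 2-byte value, the next 8-byte value is not stored starting at address 10. The compiler fills addresses 10-15 with holes, and the next 8-byte value starts at address 16 and occupies addresses 16-23.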
An example of how the order in which a struct lists its fields affects its size:
// x = hole, b = bool, f = float64, i = int16 (layouts drawn for a 64-bit platform)
//                                 64-bit        32-bit
struct { bool; float64; int16 } // 1+1+1 word    1+2+1 word
// bxxxxxxx
// ffffffff
// iixxxxxx
struct { float64; int16; bool } // 1+1+0 word    2+1+0 word
// ffffffff
// iibxxxxx
struct { bool; int16; float64 } // 1+0+1 word    1+0+2 word
// bxiixxxx
// ffffffff
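The effect of field order can be measured directly with Sizeof. The sketch below declares the three layouts as named types; boolFirst, floatFirst, smallFirst, and layoutSizes are illustrative names, not part of any library.

```go
package main

import (
	"fmt"
	"unsafe"
)

// The same three fields in three different orders.
type boolFirst struct {
	b bool
	f float64
	i int16
}

type floatFirst struct {
	f float64
	i int16
	b bool
}

type smallFirst struct {
	b bool
	i int16
	f float64
}

// layoutSizes returns the size in bytes of each ordering.
func layoutSizes() (a, b, c uintptr) {
	return unsafe.Sizeof(boolFirst{}), unsafe.Sizeof(floatFirst{}), unsafe.Sizeof(smallFirst{})
}

func main() {
	a, b, c := layoutSizes()
	fmt.Println(a, b, c) // 24 16 16 on a typical 64-bit platform
}
```

Grouping the small fields together lets them share a word, which is why the second and third orderings are one word smaller than the first.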
func Alignof(x ArbitraryType) uintptr
Alignof takes an expression x of any type and returns the required alignment of a hypothetical variable v as if v was declared via var v = x.
- It is the largest value m such that the address of v is always zero mod m.
- Typically, boolean and numeric types are aligned to their size (up to a maximum of 8 bytes) and all other types are word-aligned.
- If a variable s is of struct type and f is a field within that struct, then Alignof(s.f) will return the required alignment of a field of that type within a struct.
- The address of x in memory must be a multiple of Alignof(x), which is typically at most 8.
func Offsetof(x ArbitraryType) uintptr
Offsetof, whose operand must be a field selector x.f, computes the offset of field f relative to the start of its enclosing struct x, accounting for holes.
var x struct {
	a bool
	b int16
	c []int
}

// Typical 32-bit platform:
Sizeof(x)   = 16  Alignof(x)   = 4
Sizeof(x.a) = 1   Alignof(x.a) = 1  Offsetof(x.a) = 0
Sizeof(x.b) = 2   Alignof(x.b) = 2  Offsetof(x.b) = 2
Sizeof(x.c) = 12  Alignof(x.c) = 4  Offsetof(x.c) = 4

// Typical 64-bit platform:
Sizeof(x)   = 32  Alignof(x)   = 8
Sizeof(x.a) = 1   Alignof(x.a) = 1  Offsetof(x.a) = 0
Sizeof(x.b) = 2   Alignof(x.b) = 2  Offsetof(x.b) = 2
Sizeof(x.c) = 24  Alignof(x.c) = 8  Offsetof(x.c) = 8
// https://github.com/golang/go/blob/go1.16.6/src/runtime/slice.go#L13
type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}
unsafe.Pointer
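unsafe.Pointer is a special kind of pointer that can hold the address of any variable.
- An unsafe.Pointer cannot be dereferenced with *, because the type of the variable at that address is unknown.
The zero value of unsafe.Pointer is nil.
Like ordinary pointers, unsafe.Pointer values can be compared with each other and with nil.
An ordinary *T pointer can be converted to an unsafe.Pointer, and an unsafe.Pointer can be converted back to an ordinary pointer type, either *T or any other ordinary pointer type.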
Convert a *float64 pointer to a *uint64 to inspect the bit pattern of a floating-point variable.
package math
import "unsafe"

func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) }

fmt.Printf("%#016x\n", Float64bits(1.0)) // "0x3ff0000000000000"
unsafe.Pointer conversions let us write arbitrary values to memory and thus subvert the type system.
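An unsafe.Pointer can be converted to a uintptr, which holds the numeric value of the pointer's address, and arithmetic can be performed on that number. The conversion is reversible, but because not every integer value is a valid address, converting a uintptr back to an unsafe.Pointer may subvert the type system.
uintptr is an unsigned integer type large enough to hold any address.
unsafe.Pointer is commonly used as the intermediary when converting between ordinary pointers and the numeric values of addresses.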
var x struct {
	a bool
	b int16
	c []int
}

// equivalent to pb := &x.b
pb := (*int16)(unsafe.Pointer(
	uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)))
*pb = 42
fmt.Println(x.b) // "42"
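As a runnable sketch of the same idea, the field write can be wrapped in a function. The names record, setB, and setBAdd are hypothetical. The second variant assumes Go 1.17 or later, where unsafe.Add performs the offset arithmetic without an explicit round trip through uintptr.

```go
package main

import (
	"fmt"
	"unsafe"
)

type record struct {
	a bool
	b int16
	c []int
}

// setB writes v into r.b using address arithmetic in a single expression,
// so no uintptr value outlives the conversion back to unsafe.Pointer.
func setB(r *record, v int16) {
	pb := (*int16)(unsafe.Pointer(uintptr(unsafe.Pointer(r)) + unsafe.Offsetof(r.b)))
	*pb = v
}

// setBAdd does the same with unsafe.Add (assumes Go 1.17 or later).
func setBAdd(r *record, v int16) {
	pb := (*int16)(unsafe.Add(unsafe.Pointer(r), unsafe.Offsetof(r.b)))
	*pb = v
}

func main() {
	var r record
	setB(&r, 42)
	fmt.Println(r.b) // 42
	setBAdd(&r, 7)
	fmt.Println(r.b) // 7
}
```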
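Do not hold the intermediate address in a temporary variable of type uintptr.
// NOTE: subtly incorrect!
tmp := uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)
// By the time this statement executes, x may already have moved.
pb := (*int16)(unsafe.Pointer(tmp))
*pb = 42
// A similar problem: no pointer refers to the variable created by new(T),
// so the garbage collector is free to reclaim it.
pT := uintptr(unsafe.Pointer(new(T))) // NOTE: wrong!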
moving GCs: Some garbage collectors move variables around in memory to reduce fragmentation or bookkeeping.
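Although the current Go implementation does not use a moving GC, Go does relocate variables in memory.
- Goroutine stacks grow on demand. When a stack grows, the variables on the old stack are moved to a new, larger stack, so the numeric values of their addresses change.
Recommendations: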
- Treat all uintptr values as if they contain the former address of a variable, and minimize the number of operations between converting an unsafe.Pointer to a uintptr and using that uintptr.
- When calling a library function that returns a uintptr, such as those from the reflect package, the result should be immediately converted to an unsafe.Pointer to ensure that it continues to point to the same variable.
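The second recommendation can be sketched with reflect.Value.Pointer, which returns a uintptr. The helper name viaReflect is hypothetical. The conversion to unsafe.Pointer happens in the same expression as the call, so no uintptr variable is ever stored.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// viaReflect returns a typed pointer to *p obtained through reflect.
// The uintptr result of Value.Pointer is converted immediately.
func viaReflect(p *int) *int {
	return (*int)(unsafe.Pointer(reflect.ValueOf(p).Pointer()))
}

func main() {
	n := 1
	q := viaReflect(&n)
	*q = 99
	fmt.Println(n, q == &n) // 99 true
}
```

Newer Go releases (1.18 and later) also provide reflect.Value.UnsafePointer, which returns an unsafe.Pointer directly and sidesteps the uintptr step.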