unsafe in Go

Go’s safety properties:

During compilation, type checking detects most attempts to apply an operation to a value that is inappropriate for its type.
- Strict rules for type conversions prevent direct access to the internals of built-in types like strings, maps, slices, and channels.
Dynamic checks ensure that the program immediately terminates with an informative error whenever a forbidden operation, such as out-of-bounds array accesses or nil pointer dereferences, occurs.
Automatic memory management (garbage collection) eliminates “use after free” bugs, as well as most memory leaks.
Many implementation details are inaccessible to Go programs.
- There is no way to discover the memory layout of an aggregate type like a struct, or the machine code for a function, or the identity of the operating system thread on which the current goroutine is running.
  - The Go scheduler freely moves goroutines from one thread to another.
- A pointer identifies a variable without revealing the variable’s numeric address.
  - Addresses may change as the garbage collector moves variables; pointers are transparently updated.

These features make Go programs more predictable and highly portable by hiding the underlying details.

The language semantics are largely independent of any particular compiler, operating system, or CPU architecture.
- Some details leak through, such as the word size of the processor, the order of evaluation of certain expressions, and the set of implementation restrictions imposed by the compiler.
  - A word is 4 bytes on a 32-bit platform and 8 bytes on a 64-bit platform.

Occasionally, we may choose to forfeit some of these helpful guarantees to achieve the highest possible performance, to interoperate with libraries written in other languages, or to implement a function that cannot be expressed in pure Go.

The unsafe package is implemented by the compiler. It provides access to a number of built-in language features that are not ordinarily available because they expose details of Go’s memory layout.

`unsafe.Sizeof`, `unsafe.Alignof`, `unsafe.Offsetof`

func Sizeof(x ArbitraryType) uintptr takes an expression x of any type and returns the size in bytes of a hypothetical variable v as if v was declared via var v = x.

The expression is not evaluated.
Sizeof reports only the size of the fixed part of each data structure, like the pointer and length of a string, but not indirect parts like the contents of the string.
- The size does not include any memory possibly referenced by x. For instance, if x is a slice, Sizeof returns the size of the slice descriptor, not the size of the memory referenced by the slice.
- For a struct, the size includes any padding introduced by field alignment.
The return value of Sizeof is a Go constant if the type of the argument x does not have variable size.
- A type has variable size if it is a type parameter or if it is an array or struct type with elements of variable size.
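A minimal sketch of the "fixed part only" rule, assuming the standard gc toolchain. The helper name `fixedSizes` is made up for this note; the sizes in the last comment are what a typical 64-bit platform reports.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// fixedSizes returns what unsafe.Sizeof reports for a string and a slice.
// Only the headers are measured, never the data they refer to.
func fixedSizes(s string, xs []int) (strSize, sliceSize uintptr) {
	return unsafe.Sizeof(s), unsafe.Sizeof(xs)
}

func main() {
	shortStr, emptySlice := fixedSizes("hi", nil)
	longStr, bigSlice := fixedSizes(strings.Repeat("x", 1<<20), make([]int, 1<<20))

	fmt.Println(shortStr == longStr)    // true: the string contents are not counted
	fmt.Println(emptySlice == bigSlice) // true: the backing array is not counted
	fmt.Println(shortStr, bigSlice)     // 16 24 on a typical 64-bit platform
}
```

A one-megabyte string and a two-byte string report the same size because Sizeof depends only on the static type of its operand.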

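Most computer architectures place restrictions on instructions that access memory. On a 32-bit platform, an instruction that accesses 4 bytes should start at an address that is a multiple of 4, and an instruction that accesses 2 bytes should start at an address that is a multiple of 2. This is called alignment. If the address an instruction accesses is not properly aligned, some platforms cannot perform the access and raise an exception instead. x86 can still access the memory, but an unaligned access is slower than an aligned one, so the compiler takes alignment into account when it assigns addresses to variables.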

Computers load and store values from memory most efficiently when those values are properly aligned. For example, the address of a value of a two-byte type such as int16 should be an even number, the address of a four-byte value such as a rune should be a multiple of four, and the address of an eight-byte value such as a float64, uint64, or 64-bit pointer should be a multiple of eight. Alignment requirements of higher multiples are unusual, even for larger data types such as complex128.

For this reason, the size of a value of an aggregate type (a struct or array) is at least the sum of the sizes of its fields or elements but may be greater due to the presence of “holes”. Holes are unused spaces added by the compiler to ensure that the following field or element is properly aligned relative to the start of the struct or array.

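For example, suppose 8-byte values are being laid out and addresses 8-9 are occupied by a 2-byte value. The next 8-byte value is not stored starting at address 10. Instead, the compiler fills addresses 10-15 with a hole, and the next 8-byte value starts at address 16 and occupies addresses 16-23.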

An example of how the order in which a struct lists its fields affects its size:

// Byte diagrams show the 64-bit layout, one word per row:
// x = hole, b = bool, f = float64, i = int16
                                    // 64-bit platform     32-bit platform
struct { bool; float64; int16 }     // 1+1+1 = 3 words     1+2+1 = 4 words
// bxxxxxxx
// ffffffff
// iixxxxxx
struct { float64; int16; bool }     // 1+1+0 = 2 words     2+1+0 = 3 words
// ffffffff
// iibxxxxx
struct { bool; int16; float64 }     // 1+0+1 = 2 words     1+0+2 = 3 words
// bxiixxxx
// ffffffff
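The effect of field order can be checked with a short program. This is a sketch assuming the standard gc toolchain; the type names are made up for this note.

```go
package main

import (
	"fmt"
	"unsafe"
)

// The same three fields in three different orders.
type boolFloatInt struct {
	b bool
	f float64
	i int16
}

type floatIntBool struct {
	f float64
	i int16
	b bool
}

type boolIntFloat struct {
	b bool
	i int16
	f float64
}

// layoutSizes returns the size in bytes of each ordering.
func layoutSizes() (uintptr, uintptr, uintptr) {
	return unsafe.Sizeof(boolFloatInt{}),
		unsafe.Sizeof(floatIntBool{}),
		unsafe.Sizeof(boolIntFloat{})
}

func main() {
	a, b, c := layoutSizes()
	// Typically prints "24 16 16" on a 64-bit platform
	// and "16 12 12" on a 32-bit platform.
	fmt.Println(a, b, c)
}
```

Placing the two small fields next to each other lets them share one word, which is why the second and third orderings are smaller.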

func Alignof(x ArbitraryType) uintptr takes an expression x of any type and returns the required alignment of a hypothetical variable v as if v was declared via var v = x.

It is the largest value m such that the address of v is always zero mod m.
Typically, boolean and numeric types are aligned to their size (up to a maximum of 8 bytes) and all other types are word-aligned.
If a variable s is of struct type and f is a field within that struct, then Alignof(s.f) will return the required alignment of a field of that type within a struct.
The address of x in memory must be a multiple of Alignof(x), which in practice is at most 8.
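A small sketch of the alignment rule, assuming the standard gc toolchain. The helpers `alignedInt16` and `alignedFloat64` are names invented for this note. The uintptr values are used only for a modulus check and are never converted back to pointers.

```go
package main

import (
	"fmt"
	"unsafe"
)

// alignedInt16 reports whether the address of *p is a multiple of
// the alignment required for an int16.
func alignedInt16(p *int16) bool {
	return uintptr(unsafe.Pointer(p))%unsafe.Alignof(*p) == 0
}

// alignedFloat64 does the same check for a float64.
func alignedFloat64(p *float64) bool {
	return uintptr(unsafe.Pointer(p))%unsafe.Alignof(*p) == 0
}

func main() {
	var s struct {
		a bool
		b int16
		c float64
	}
	// Typically prints "1 2 8" on a 64-bit platform.
	fmt.Println(unsafe.Alignof(s.a), unsafe.Alignof(s.b), unsafe.Alignof(s.c))
	fmt.Println(alignedInt16(&s.b), alignedFloat64(&s.c)) // true true
}
```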

func Offsetof(x ArbitraryType) uintptr, whose operand must be a field selector x.f, computes the offset of field f relative to the start of its enclosing struct x, accounting for holes.

var x struct {
    a bool
    b int16
    c []int
}
// Typical 32-bit platform:
Sizeof(x)   = 16    Alignof(x)   = 4
Sizeof(x.a) = 1     Alignof(x.a) = 1    Offsetof(x.a) = 0
Sizeof(x.b) = 2     Alignof(x.b) = 2    Offsetof(x.b) = 2
Sizeof(x.c) = 12    Alignof(x.c) = 4    Offsetof(x.c) = 4
// Typical 64-bit platform:
Sizeof(x)   = 32    Alignof(x)   = 8
Sizeof(x.a) = 1     Alignof(x.a) = 1    Offsetof(x.a) = 0
Sizeof(x.b) = 2     Alignof(x.b) = 2    Offsetof(x.b) = 2
Sizeof(x.c) = 24    Alignof(x.c) = 8    Offsetof(x.c) = 8

// https://github.com/golang/go/blob/go1.16.6/src/runtime/slice.go#L13
type slice struct {
	array unsafe.Pointer
	len   int
	cap   int
}
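The figures above can be reproduced on the current platform with a sketch like the following. The names `record` and `layout` are made up for this note; the slice field is three words (pointer, len, cap), which is where the 12 and 24 come from.

```go
package main

import (
	"fmt"
	"unsafe"
)

type record struct {
	a bool
	b int16
	c []int
}

// x is package-level so that it is only ever inspected, never modified.
var x record

// layout returns the total size of record and the offset of each field.
func layout() (size, offA, offB, offC uintptr) {
	return unsafe.Sizeof(x),
		unsafe.Offsetof(x.a),
		unsafe.Offsetof(x.b),
		unsafe.Offsetof(x.c)
}

func main() {
	size, offA, offB, offC := layout()
	fmt.Println("Sizeof(x)     =", size)
	fmt.Println("Offsetof(x.a) =", offA)
	fmt.Println("Offsetof(x.b) =", offB)
	fmt.Println("Offsetof(x.c) =", offC)
	fmt.Println("Sizeof(x.c)   =", unsafe.Sizeof(x.c))
}
```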

`unsafe.Pointer`

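unsafe.Pointer is a special kind of pointer that can hold the address of any variable.

An unsafe.Pointer cannot be dereferenced with *, because the type of the variable at that address is unknown.

The zero value of unsafe.Pointer is nil.

Like ordinary pointers, unsafe.Pointer values can be compared with each other and with nil.

An ordinary *T pointer can be converted to an unsafe.Pointer, and an unsafe.Pointer can be converted back to an ordinary pointer type (either *T or some other pointer type).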

Convert a *float64 pointer to a *uint64 to inspect the bit pattern of a floating-point variable.

package math
func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) }
fmt.Printf("%#016x\n", Float64bits(1.0)) // "0x3ff0000000000000"
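A self-contained version of the snippet above, written as a sketch. The local function is named `float64bits` to avoid confusion with the standard library's `math.Float64bits`, which it is compared against.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// float64bits reinterprets the bits of f as a uint64 without
// changing them. Both types are 8 bytes wide, so the read is in bounds.
func float64bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func main() {
	fmt.Printf("%#016x\n", float64bits(1.0))                 // 0x3ff0000000000000
	fmt.Println(float64bits(-2.5) == math.Float64bits(-2.5)) // true
}
```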

unsafe.Pointer conversions let us write arbitrary values to memory and thus subvert the type system.

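An unsafe.Pointer can be converted to a uintptr, which holds the numeric value of the pointer's address and supports arithmetic. The conversion is reversible, but because not every integer value is a valid address, converting a uintptr to an unsafe.Pointer may subvert the type system.

uintptr is an unsigned integer type large enough to hold any address.

unsafe.Pointer often serves as the intermediary between ordinary pointers and the numeric values of their addresses.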

var x struct {
    a bool
    b int16
    c []int
}
// equivalent to pb := &x.b
pb := (*int16)(unsafe.Pointer(
    uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)))
*pb = 42
fmt.Println(x.b) // "42"
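The same field-address computation can be written as a runnable sketch. It uses `unsafe.Add`, which is available from Go 1.17 onward and keeps the arithmetic in pointer form, so no uintptr value holding an address ever exists. The names `record` and `setB` are made up for this note.

```go
package main

import (
	"fmt"
	"unsafe"
)

type record struct {
	a bool
	b int16
	c []int
}

// setB stores v in r.b by computing the field's address by hand.
// It is equivalent to r.b = v and exists only to show the technique.
func setB(r *record, v int16) {
	pb := (*int16)(unsafe.Add(unsafe.Pointer(r), unsafe.Offsetof(record{}.b)))
	*pb = v
}

func main() {
	var x record
	setB(&x, 42)
	fmt.Println(x.b) // 42
}
```

On older Go versions, the single-expression form shown above is the way to do this.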

Do not hold the intermediate result in a temporary variable of type uintptr.

// NOTE: subtly incorrect!
tmp := uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)
// By the time this statement executes, x may already have moved to a different address
pb := (*int16)(unsafe.Pointer(tmp))
*pb = 42

// A similar problem: no pointer refers to the variable created by new(T), so the garbage collector may reclaim it
pT := uintptr(unsafe.Pointer(new(T))) // NOTE: wrong!

moving GCs: Some garbage collectors move variables around in memory to reduce fragmentation or bookkeeping.

Although the current Go implementation does not use a moving GC, Go does relocate variables in memory.

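Goroutine stacks grow as needed. When a stack grows, the variables on the old stack are moved to a new, larger stack, so the numeric values of their addresses change.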

Recommendations:

Treat all uintptr values as if they contain the former address of a variable, and minimize the number of operations between converting an unsafe.Pointer to a uintptr and using that uintptr.
When calling a library function that returns a uintptr, such as those from the reflect package, the result should be immediately converted to an unsafe.Pointer to ensure that it continues to point to the same variable.
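A sketch of the second recommendation, using `reflect.Value.Pointer`, which returns a uintptr. The function name `incrementViaReflect` is made up for this note. The conversion to unsafe.Pointer happens in the same expression as the call, so the uintptr is never stored in a variable.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// incrementViaReflect adds one to *p, reaching the variable through
// the address that the reflect package reports for p.
func incrementViaReflect(p *int) {
	// Convert the uintptr result immediately; do not keep it in a variable.
	q := (*int)(unsafe.Pointer(reflect.ValueOf(p).Pointer()))
	*q++
}

func main() {
	n := 41
	incrementViaReflect(&n)
	fmt.Println(n) // 42
}
```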

References

https://pkg.go.dev/unsafe
