Notes on HD Wallets
Blockchain wallets fall into two main categories:
- Nondeterministic wallets
  - Each private key is generated from an independent random number, so the keys are unrelated to one another. This type is also called a JBOK wallet (Just a Bunch Of Keys).
  - The best practice of using a new address for every transaction (for better privacy) makes nondeterministic wallets cumbersome, because every new private key must be backed up.
  - Many Ethereum clients (including `geth`) use a keystore file, which is a JSON-encoded file that contains a single (randomly generated) private key, encrypted by a passphrase for extra security.
    - The keystore format uses a key derivation function (KDF), also known as a password stretching algorithm, which protects against brute-force, dictionary, and rainbow table attacks.
    - The private key is not encrypted by the passphrase directly. Instead, the passphrase is stretched by repeatedly hashing it. The hashing function is repeated for 262144 rounds (`crypto.kdfparams.n`).
    - An attacker trying to brute-force the passphrase would have to apply 262144 rounds of hashing for every attempted passphrase, which slows the attack enough to make it infeasible for passphrases of sufficient complexity and length.
- Deterministic wallets
  - All private keys are derived from a single master key (called the seed) using a one-way hash function. The seed is enough to regenerate every derived private key, so only the seed needs to be backed up.
    - To make it easier to write down and remember, the seed is usually encoded as English words, called mnemonic code words.
    - The most common derivation method uses a tree structure; such a wallet is called an HD wallet (Hierarchical Deterministic wallet).
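Properties of HD wallets:
- The tree structure can mirror an organizational structure;
- Child public keys can be derived from a parent public key without access to the parent private key.

This works because elliptic-curve scalar multiplication is distributive over addition. With parent private key $$p$$, hash-derived offset $$h$$, and generator point $$G$$, the child private key is $$c = p + h$$ and the child public key is $$C = h*G + p*G = (h + p)*G = c*G$$.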
BIP-32
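BIP-32 defines the standard for HD wallets.
An HD wallet is created from a single root seed, which can be 128, 256, or 512 bits long. The root seed is usually generated from a mnemonic.
The root seed is fed into HMAC-SHA512. The left 256 bits of the resulting hash are the master private key and the right 256 bits are the master chain code. The master public key is computed from the master private key.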
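As a minimal sketch of the split described above, the snippet below derives a master private key and chain code from a seed. The HMAC key `Bitcoin seed` comes from BIP-32; the function name `master_key_from_seed` and the example seed are illustrative choices, not part of these notes.

```python
import hashlib
import hmac
from typing import Tuple

# Order of the secp256k1 group; valid private keys lie in [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def master_key_from_seed(seed: bytes) -> Tuple[int, bytes]:
    """Split HMAC-SHA512(key="Bitcoin seed", msg=seed) into (private key, chain code)."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    master_private_key = int.from_bytes(digest[:32], "big")  # left 256 bits
    master_chain_code = digest[32:]                          # right 256 bits
    if not 0 < master_private_key < N:
        raise ValueError("seed yields an invalid master key; use another seed")
    return master_private_key, master_chain_code


# Hypothetical 128-bit seed, used only for demonstration.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
private_key, chain_code = master_key_from_seed(seed)
print(hex(private_key))
print(chain_code.hex())
```

The range check mirrors the requirement that a private key be a nonzero integer below the curve order; the chance of hitting the error branch is negligible.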
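An HD wallet uses a child key derivation (CKD) function to derive child keys from parent keys. The CKD function is based on a one-way hash function that combines three inputs:
- A parent private or public key (an ECDSA key; BIP-32 serializes public keys in compressed form);
- A chain code (256 bits), which acts as a seed;
- An index number (32 bits).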
HMAC-SHA512 (Hash-based Message Authentication Code using SHA-512) is a special cryptographic hash function that, besides the normal single input, also takes a key.
The CKD function is built on HMAC-SHA512.
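An extended key consists of a key and a chain code. There are two kinds: the extended private key (xprv) and the extended public key (xpub).
- An xprv can be parsed into a private key and a chain code;
- An xpub can be parsed into a public key and a chain code;
- A matching xprv and xpub share the same chain code, and the private key in the xprv yields the public key in the xpub.

Extended keys are encoded with `Base58Check`, with the prefixes `xprv` and `xpub` respectively.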
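Deriving a child private key from an xprv:
- Compute `HMAC-SHA512(parent public key + parent chain code + index number)` to get 512 bits. (In BIP-32 the chain code is the HMAC key, and the message is the compressed parent public key followed by the index.)
- Add the left 256 bits ($$h$$) to the parent private key ($$p$$) to get the child private key;
  - This is ordinary integer addition, reduced modulo the order $$n$$ of the secp256k1 curve so that the result is a valid private key;
  - The matching child public key is $$(h+p)*G$$.
- The right 256 bits are the child chain code.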
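Deriving a child public key from an xpub:
- Compute `HMAC-SHA512(parent public key + parent chain code + index number)` to get 512 bits;
- Treat the left 256 bits ($$h$$) as a private key, compute its public key ($$h*G$$), and "add" it to the parent public key ($$p*G$$) to get the child public key;
  - The addition here is elliptic-curve point addition. The result $$h*G + p*G$$ equals $$(h+p)*G$$.
- The right 256 bits are the child chain code.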
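The derivation steps show that, for a given key pair and index number, the input to `HMAC-SHA512` is the same whether a child private key or a child public key is being derived. The resulting chain codes are therefore identical, and so is $$h$$.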
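The two derivations can be checked against each other with a small pure-Python sketch. The secp256k1 constants are the standard curve parameters; the helper names (`point_add`, `point_mul`, `ckd_priv`, `ckd_pub`) and the demo key values are assumptions made for illustration. The sketch skips the rare invalid-key checks that BIP-32 requires and is not constant-time, so it is for study only.

```python
import hashlib
import hmac

P = 2**256 - 2**32 - 977  # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a and b are inverses of each other
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def ser_pub(point):
    """Compressed encoding: 02 (even y) or 03 (odd y) followed by the 32-byte x."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def hmac_split(chain_code, data):
    """Return (h, child chain code): the left and right halves of HMAC-SHA512."""
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def ckd_priv(parent_priv, chain_code, index):
    """Normal (non-hardened) child private key: (h + p) mod n."""
    parent_pub = point_mul(parent_priv, G)
    h, child_cc = hmac_split(chain_code, ser_pub(parent_pub) + index.to_bytes(4, "big"))
    return (h + parent_priv) % N, child_cc


def ckd_pub(parent_pub, chain_code, index):
    """Normal child public key: h*G + parent public key."""
    h, child_cc = hmac_split(chain_code, ser_pub(parent_pub) + index.to_bytes(4, "big"))
    return point_add(point_mul(h, G), parent_pub), child_cc


# Hypothetical parent key and chain code, for demonstration only.
parent_priv = 0x1F2E3D4C5B6A79880123456789ABCDEF
chain_code = hashlib.sha256(b"demo chain code").digest()

child_priv, cc_from_priv = ckd_priv(parent_priv, chain_code, 0)
child_pub, cc_from_pub = ckd_pub(point_mul(parent_priv, G), chain_code, 0)
print(point_mul(child_priv, G) == child_pub, cc_from_priv == cc_from_pub)
```

Both comparisons hold because the two paths feed identical input to HMAC-SHA512 and differ only in whether $$h$$ is added to a scalar or $$h*G$$ is added to a point.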
By convention, we use `M` to denote an xpub path and `m` to denote an xprv path. `M/1` and `m/1` have the same chain code, but `M/1` doesn’t have the private key, only the public key.
The chain code provides entropy when deriving child keys.
The master private key and master chain code are derived from the seed with the same HMAC-SHA512 split. No parent chain code is involved at this step, because the seed already contains the required entropy. (BIP-32 uses the constant string `Bitcoin seed` as the HMAC key.)
Differences between xprv and xpub child derivation:
- To calculate the child private key, the parent private key is added (normal addition) to the left half of the hash, and the sum (modulo the curve order $$n$$, which keeps the result a valid 256-bit private key) becomes the child private key. Adding the parent private key to the left 256 bits makes it impossible for someone with only the xpub to generate child private keys.
- To calculate the child public key, you treat the left 256 bits as if they were a private key and derive a public key from them. This public key is then added to the parent public key using the special public key addition operation. The result is the child public key.
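Why hardened child key derivation is needed:
- An attacker obtains the xpub `M/1` and one child private key `m/1/1` that matches a public key derived from it.
  - The attacker can parse the chain code `cc_m_1` out of `M/1`; `m/1` shares this chain code.
  - The attacker can parse the public key `pk_m_1` out of `M/1`.
- The attacker holds `m/1/1` but does not yet know which index it was derived with.
  - The attacker computes the public key `M/1/1` from `m/1/1`.
  - To derive candidates for `M/1/1`, compute `hash = HMAC-SHA512(pk_m_1 + cc_m_1 + index_number)`.
  - Treat the left 256 bits of `hash` as a private key, compute its public key, add it to `pk_m_1` with elliptic-curve addition, and compare the result with `M/1/1`.
  - If they match, that `index_number` is the `target_index_number` and `hash` is the `target_hash`. Otherwise, keep trying the next `index_number`.
    - With the right 256 bits of `target_hash` (the child chain code) and `m/1/1`, the attacker can derive every child private key below `m/1/1`.
- `m/1 + left half of target_hash = m/1/1`
  - This is ordinary addition (modulo the curve order), so `m/1 = m/1/1 - left half of target_hash`.
- `m/1` is compromised.
  - The root cause is that the child private key and the child public key are both computed from the same `HMAC-SHA512` result.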
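The attack can be reproduced end to end in a few lines. This is a sketch under stated assumptions: the curve constants are standard secp256k1 parameters, while `left_half`, `recover_parent`, the demo keys, and the `max_index` search bound are hypothetical names and values chosen for illustration.

```python
import hashlib
import hmac

P = 2**256 - 2**32 - 977  # secp256k1 field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def left_half(parent_pub, chain_code, index):
    """h: the left 256 bits of HMAC-SHA512 over the compressed parent public key and index."""
    x, y = parent_pub
    data = bytes([2 + (y & 1)]) + x.to_bytes(32, "big") + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big")


def recover_parent(parent_pub, chain_code, leaked_child_priv, max_index=100):
    """Find the index that produced the leaked child key, then undo the addition."""
    leaked_child_pub = point_mul(leaked_child_priv, G)
    for index in range(max_index):
        h = left_half(parent_pub, chain_code, index)
        if point_add(point_mul(h, G), parent_pub) == leaked_child_pub:
            return index, (leaked_child_priv - h) % N  # parent = child - h (mod n)
    return None


# Victim side (hypothetical values): parent key pair, chain code, and one leaked child key.
parent_priv = 0xC0FFEE
parent_pub = point_mul(parent_priv, G)
chain_code = hashlib.sha256(b"demo chain code").digest()
leaked = (left_half(parent_pub, chain_code, 7) + parent_priv) % N

# Attacker side: only the xpub contents (parent_pub, chain_code) and the leaked child key.
print(recover_parent(parent_pub, chain_code, leaked))
```

The attacker never sees `parent_priv`, yet the returned tuple contains it, which is exactly the weakness that hardened derivation closes.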
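Hardened derivation instead computes `HMAC-SHA512(parent private key + parent chain code + index number)` to derive the hardened child xprv, from which the matching hardened child xpub is computed.
- Consequently, an xpub cannot derive hardened child xpubs, because the parent private key is required.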
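A hardened child private key needs no curve arithmetic at all, only HMAC and modular addition. The sketch below assumes the BIP-32 serialization (a `0x00` byte, the 32-byte parent private key, then the 4-byte index, with the chain code as HMAC key); the function name `ckd_priv_hardened` and the demo values are illustrative.

```python
import hashlib
import hmac

# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED_OFFSET = 2**31


def ckd_priv_hardened(parent_priv: int, chain_code: bytes, index: int):
    """Hardened child private key: the HMAC input contains the parent PRIVATE key."""
    if not HARDENED_OFFSET <= index < 2**32:
        raise ValueError("hardened indexes lie in [2^31, 2^32 - 1]")
    data = b"\x00" + parent_priv.to_bytes(32, "big") + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    h = int.from_bytes(digest[:32], "big")
    return (h + parent_priv) % N, digest[32:]


# Hypothetical parent key and chain code, for demonstration only.
parent_priv = 0xC0FFEE
chain_code = hashlib.sha256(b"demo chain code").digest()
child_priv, child_chain_code = ckd_priv_hardened(parent_priv, chain_code, HARDENED_OFFSET + 0)
print(hex(child_priv), child_chain_code.hex())
```

Because the HMAC input includes the parent private key, someone holding only the xpub cannot compute `h`, so the index search used against normal derivation has nothing to work with.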
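The best practice is to derive the children of the master key with hardened derivation to protect the master key. Keys further down the tree can use normal derivation to take advantage of the extended public key feature of HD wallets.
Normal derivation uses indexes in the range [0, 2^31 - 1]; hardened derivation uses indexes in the range [2^31, 2^32 - 1].
For readability, hardened indexes are also displayed starting from 0, with a prime symbol `'` appended to tell them apart: `i'` stands for the actual index `i + 2^31`.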
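The prime notation maps to raw 32-bit indexes mechanically. A small sketch, with `parse_index` and `parse_path` as hypothetical helper names:

```python
HARDENED_OFFSET = 2**31


def parse_index(component: str) -> int:
    """Convert one path component such as "5" or "44'" to its raw 32-bit index."""
    hardened = component.endswith("'")
    value = int(component[:-1] if hardened else component)
    if not 0 <= value < HARDENED_OFFSET:
        raise ValueError("index out of range: " + component)
    return value + HARDENED_OFFSET if hardened else value


def parse_path(path: str) -> list:
    """Convert a path such as "m/44'/0'/0'/0/5" to a list of raw indexes."""
    root, *components = path.split("/")
    if root not in ("m", "M"):
        raise ValueError("path must start with m or M")
    return [parse_index(c) for c in components]


print(parse_path("m/44'/0'/0'/0/5"))
# [2147483692, 2147483648, 2147483648, 0, 5]
```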
BIP32 Deterministic Key Generator
Base58 and Base58Check Encoding
In order to represent long numbers in a compact way, using fewer symbols, many computer systems use mixed-alphanumeric representations with a base (or radix) higher than 10.
Base64 representation uses 26 lowercase letters, 26 capital letters, 10 numerals, and 2 more characters such as “+” and “/”.
Base58 is a subset of Base64, using upper- and lowercase letters and numbers, but omitting some characters that are frequently mistaken for one another and can appear identical when displayed in certain fonts. Specifically, Base58 is Base64 without the 0 (number zero), O (capital o), l (lower L), I (capital i), and the symbols “+” and “/”.
Base58Check is a Base58 encoding format with a checksum of an additional four bytes added to the end of the data that is being encoded. The checksum is derived from the hash of the encoded data.
Encoding:
- Add a prefix to the data (called the “version byte”);
  - It serves to easily identify the type of data that is encoded.

Type | Version prefix (hex) | Base58 result prefix |
---|---|---|
Bitcoin Address | 0x00 | 1 |
Pay-to-Script-Hash Address | 0x05 | 3 |
Bitcoin Testnet Address | 0x6F | m or n |
Private Key WIF | 0x80 | 5, K, or L |
BIP-38 Encrypted Private Key | 0x0142 | 6P |
BIP-32 Extended Public Key | 0x0488B21E | xpub |
BIP-32 Extended Private Key | 0x0488ADE4 | xprv |

- Compute the “double-SHA” checksum: `checksum = SHA256(SHA256(prefix+data))`;
- Take the first four bytes of the checksum and append them to the end of the prefixed data;
- Base58-encode the result (prefix, data, and checksum).
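The encoding steps translate almost line for line into code. This is a minimal sketch using only the standard library; the function names are illustrative and not taken from any particular wallet library.

```python
import hashlib

# Base58 alphabet: digits and letters without 0, O, I, and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    # Each leading zero byte is represented by a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + encoded


def base58_decode(text: str) -> bytes:
    num = 0
    for ch in text:
        num = num * 58 + ALPHABET.index(ch)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def checksum(payload: bytes) -> bytes:
    """First four bytes of SHA256(SHA256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def base58check_encode(version: bytes, data: bytes) -> str:
    payload = version + data
    return base58_encode(payload + checksum(payload))


def base58check_decode(text: str) -> bytes:
    """Return version + data, raising ValueError if the checksum does not match."""
    raw = base58_decode(text)
    payload, check = raw[:-4], raw[-4:]
    if checksum(payload) != check:
        raise ValueError("checksum mismatch")
    return payload


# Hypothetical 20-byte public key hash, encoded as a Bitcoin address (version 0x00).
address = base58check_encode(b"\x00", bytes(range(20)))
print(address)
print(base58check_decode(address).hex())
```

The checksum lets a wallet reject a mistyped address before any funds are sent to it.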
# an example of an extended private key
xprv9tyUQV64JT5qs3RSTJkXCWKMyUgoQp7F3hA1xzG6ZGu6u6Q9VMNjGr67Lctvy5P8oyaYAL9CAWrUE9i6GoNMKUga5biW6Hx4tws2six3b9c
# xprv decode result (164 hex characters, 82 bytes)
0488ade4 # 4-byte version prefix
010a2683ed00000000 # depth (1 byte), parent fingerprint (4 bytes), child number (4 bytes)
d970c5e49aa52f3e074c2dc1f8eb4f08fd12c594cbde161dffc722ae0b7bafcf # chain code (32 bytes)
00 # padding byte in front of the private key
894ca3c5afb5ae1552c55e14c3b73ca2ac8a092408b0e92d31af1b6538035313 # private key (32 bytes)
b6e41f1b # 4-byte checksum
# corresponding extended public key
xpub67xpozcx8pe95XVuZLHXZeG6XWXHpGq6Qv5cmNfi7cS5mtjJ2tgypeQbBs2UAR6KECeeMVKZBPLrtJunSDMstweyLXhRgPxdp14sk9tJPW9
# xpub decode result (164 hex characters, 82 bytes)
0488b21e # 4-byte version prefix
010a2683ed00000000 # depth (1 byte), parent fingerprint (4 bytes), child number (4 bytes)
d970c5e49aa52f3e074c2dc1f8eb4f08fd12c594cbde161dffc722ae0b7bafcf # chain code (32 bytes)
02 # prefix of compressed public key
12b55b9431515c7185355f15b48c5e1a1bbfa31af61429fa2bb8709de722f420 # x coordinate of the corresponding public key (32 bytes)
767a344a # 4-byte checksum
# https://github.com/SlackBuffer/secp256k1demo/blob/main/main.go
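The field layout of the decoded bytes can be made explicit with a short parser. The sketch below works on the 78-byte payload (the 82 decoded bytes minus the 4-byte checksum) and reuses the hex fields of the xprv example; `parse_extended_key` is a hypothetical helper name.

```python
def parse_extended_key(payload: bytes) -> dict:
    """Split a 78-byte extended key payload (checksum already removed) into its fields."""
    if len(payload) != 78:
        raise ValueError("an extended key payload is 78 bytes long")
    return {
        "version": payload[0:4].hex(),                     # 0488ade4 = xprv, 0488b21e = xpub
        "depth": payload[4],                               # 0 for the master key
        "parent_fingerprint": payload[5:9].hex(),
        "child_number": int.from_bytes(payload[9:13], "big"),
        "chain_code": payload[13:45].hex(),
        "key": payload[45:78].hex(),                       # 00 + private key, or 02/03 + x
    }


# Hex fields copied from the decoded xprv example, without the trailing checksum.
payload = bytes.fromhex(
    "0488ade4"
    "010a2683ed00000000"
    "d970c5e49aa52f3e074c2dc1f8eb4f08fd12c594cbde161dffc722ae0b7bafcf"
    "00"
    "894ca3c5afb5ae1552c55e14c3b73ca2ac8a092408b0e92d31af1b6538035313"
)
for name, value in parse_extended_key(payload).items():
    print(name, value)
```

A depth of 1 with child number 0 means the example key is the first normal child of a master key.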
The private key can be represented in a number of different formats, all of which correspond to the same 256-bit number.
- Hexadecimal and raw binary formats are used internally in software and rarely shown to users;
- The WIF is used for import/export of keys between wallets and often used in QR code (barcode) representations of private keys.
Type | Prefix | Description |
---|---|---|
Raw | None | 32 bytes |
Hex | None | 64 hexadecimal digits |
WIF | 5 | Base58Check encoding: Base58 with a version prefix of 0x80 (128) and a 32-bit checksum |
WIF-compressed | K or L | As above, with the suffix 0x01 appended before encoding |
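Whereas uncompressed public keys have a prefix of `04`, compressed public keys start with either a `02` or a `03` prefix.
- A compressed key stores only the $$x$$ coordinate. In the curve equation $$y^2 = x^3 + 7$$ the left side is $$y^2$$, so solving for $$y$$ means taking a square root, which has two solutions (positive or negative). Over the finite field these correspond to an even and an odd $$y$$, which is why there are two possible prefixes: `02` for even $$y$$ and `03` for odd $$y$$.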
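Recovering $$y$$ from a compressed key is a modular square root followed by a parity check. A sketch assuming the standard secp256k1 field prime and generator coordinates; `decompress` is an illustrative name.

```python
P = 2**256 - 2**32 - 977  # secp256k1 field prime

# Coordinates of the secp256k1 generator point, used here as a known point on the curve.
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def decompress(compressed: bytes):
    """Recover (x, y) from a 33-byte compressed public key (02/03 prefix + x)."""
    if len(compressed) != 33 or compressed[0] not in (2, 3):
        raise ValueError("expected 33 bytes starting with 02 or 03")
    x = int.from_bytes(compressed[1:], "big")
    y_squared = (pow(x, 3, P) + 7) % P
    # Because P % 4 == 3, a square root is y_squared^((P + 1) / 4) mod P.
    y = pow(y_squared, (P + 1) // 4, P)
    if y * y % P != y_squared:
        raise ValueError("x is not the coordinate of a curve point")
    # The prefix selects the root: 02 means even y, 03 means odd y.
    if y & 1 != compressed[0] & 1:
        y = P - y
    return x, y


compressed_g = b"\x02" + GX.to_bytes(32, "big")
print(decompress(compressed_g) == (GX, GY))
```

The two roots $$y$$ and $$P - y$$ always have opposite parity because $$P$$ is odd, so a single prefix bit is enough to pick the right one.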
BIP-39
Mnemonic code words are word sequences that encode a random number used as a seed to derive a deterministic wallet.
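BIP-39 is the most widely adopted mnemonic standard.
Different mnemonic implementations differ in the word lists they use.
BIP-39 uses 12 to 24 words to encode the entropy from which an HD wallet's seed is derived.
Mnemonic words are not the same as a brainwallet: the words of a brainwallet are chosen by the user, whereas mnemonic words are generated randomly by the wallet.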
A password’s strength is measured in entropy. The higher the entropy, the harder it is to guess the password.
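Generating a mnemonic:
- Use a random number generator to create a random sequence (entropy) of 128 to 256 bits; call its length in bits `l`;
- Compute the SHA256 hash of the entropy and take its first `l/32` bits as the checksum;
- Append the checksum to the end of the entropy;
- Split the combined sequence into groups of 11 bits;

Entropy (bits) | Checksum (bits) | Entropy + checksum (bits) | Mnemonic length (words) |
---|---|---|---|
128 | 4 | 132 | 12 |
160 | 5 | 165 | 15 |
192 | 6 | 198 | 18 |
224 | 7 | 231 | 21 |
256 | 8 | 264 | 24 |

- Map each 11-bit value to a word from the predefined dictionary of 2048 (2^11) words, preserving the order;
- The resulting sequence of words is the mnemonic.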
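The bit manipulation in these steps can be sketched without the word list by stopping at the 11-bit indexes. The function name `mnemonic_indices` is an assumption; turning each index into a word only requires a lookup in the 2048-word BIP-39 list, which is omitted here.

```python
import hashlib


def mnemonic_indices(entropy: bytes) -> list:
    """Return the 11-bit word indexes (0..2047) that encode the given entropy."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224, or 256 bits")
    cs_bits = ent_bits // 32
    # Checksum: the first ent_bits / 32 bits of SHA256(entropy); at most 8 bits.
    first_byte = hashlib.sha256(entropy).digest()[0]
    checksum = first_byte >> (8 - cs_bits)
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    total_bits = ent_bits + cs_bits
    word_count = total_bits // 11
    # Read the combined bit string from left to right, 11 bits at a time.
    return [(combined >> (total_bits - 11 * (i + 1))) & 0x7FF for i in range(word_count)]


# All-zero entropy is a poor key but a convenient demonstration input.
print(mnemonic_indices(bytes(16)))
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]
```

Only the last index is affected by the checksum for this input, because the checksum occupies the final bits of the combined sequence.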
The mnemonic words represent entropy with a length of 128 to 256 bits. The mnemonic words are then used to derive a longer (512-bit) seed through the use of the key-stretching function `PBKDF2`. The seed produced is used to build a deterministic wallet and derive its keys.
`PBKDF2` takes two parameters:
- The mnemonic;
- A salt.
  - The purpose of a salt in a key-stretching function is to make it difficult to build a lookup table enabling a brute-force attack.
  - In the BIP-39 standard, the salt also allows the introduction of a passphrase that serves as an additional security factor protecting the seed. (If the mnemonic leaks, the seed still cannot be derived without the passphrase.)
  - The salt is composed of the string constant `mnemonic` concatenated with an optional user-supplied passphrase.

`PBKDF2` stretches the mnemonic and salt parameters using 2048 rounds of hashing with the HMAC-SHA512 algorithm, producing a 512-bit value as its final output. That 512-bit value is the seed.
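The stretching step maps directly onto `hashlib.pbkdf2_hmac` from the Python standard library. In this sketch the function name `mnemonic_to_seed` and the sample passphrase are illustrative; the NFKD normalization follows BIP-39.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic sentence into a 512-bit seed with PBKDF2-HMAC-SHA512."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    # Salt: the constant string "mnemonic" followed by the optional passphrase.
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


words = "abandon " * 11 + "about"  # the mnemonic for 128 bits of all-zero entropy
seed_plain = mnemonic_to_seed(words)
seed_protected = mnemonic_to_seed(words, passphrase="hypothetical passphrase")
print(seed_plain.hex())
print(seed_plain != seed_protected)
```

Every passphrase yields a valid but different seed, so a wrong passphrase silently opens a different, empty wallet rather than raising an error.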
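BIP-43, BIP-44
Keys in an HD wallet are identified by a path naming convention. Levels are separated by `/`; private keys start with `m` and public keys start with `M`.
Identifier `m/x/y/z` describes the key that is the `z`-th child of key `m/x/y`, which is the `y`-th child of key `m/x`, which is the `x`-th child of `m`.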
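BIP-43 and BIP-44 are specifications for the tree structure of HD wallets.
BIP-43 proposes using the index of the first-level hardened child key to identify the purpose of the corresponding subtree.
BIP-44 extends BIP-43 and proposes a multi-account structure, `m / purpose' / coin_type' / account' / change / address_index`:
- Level 1, `purpose'`, is always `44'`;
- Level 2, `coin_type'`, specifies the type of cryptocurrency; `m/44'/0'` is Bitcoin;
- Level 3, `account'`, divides the wallet into sub-accounts;
- Level 4, `change` (normal derivation), has two branches: 0 for externally visible receiving addresses and 1 for change addresses;
- Level 5, `address_index`, is the index of the address.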
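A small helper makes the five levels concrete. This is a sketch: `bip44_path` is a hypothetical function name, and the coin type 0 used in the example is the Bitcoin value mentioned in the list.

```python
def bip44_path(coin_type: int, account: int, change: int, address_index: int) -> str:
    """Build a BIP-44 path: m / 44' / coin_type' / account' / change / address_index."""
    if change not in (0, 1):
        raise ValueError("change must be 0 (receiving) or 1 (change)")
    for name, value in (("coin_type", coin_type), ("account", account),
                        ("address_index", address_index)):
        if not 0 <= value < 2**31:
            raise ValueError(name + " must be in [0, 2^31 - 1]")
    # The first three levels are hardened (marked with '), the last two are normal.
    return f"m/44'/{coin_type}'/{account}'/{change}/{address_index}"


# First three receiving addresses and the first change address of Bitcoin account 0.
for index in range(3):
    print(bip44_path(coin_type=0, account=0, change=0, address_index=index))
print(bip44_path(coin_type=0, account=0, change=1, address_index=0))
```

Keeping the last two levels non-hardened is what allows an account-level xpub to generate receiving and change addresses without the private key.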
References
- Grokking Bitcoin by Kalle Rosenbaum. 9781617294648
- Mastering Bitcoin by Andreas M. Antonopoulos (O’Reilly). Copyright 2017 Andreas M. Antonopoulos, 978-1-491-95438-6
- Mastering Ethereum by Andreas M. Antonopoulos and Dr. Gavin Wood (O’Reilly). Copyright 2019 The Ethereum Book LLC and Gavin Wood, 978-1-491-97194-9