zk-SNARKs 一探

zk-SNARK stands for “Zero-Knowledge Succinct Non-Interactive Argument of Knowledge,” and refers to a proof construction where one can prove possession of certain information, e.g. a secret key, without revealing that information, and without any interaction between the prover and verifier.

“Zero-knowledge” proofs allow one party (the prover) to prove to another (the verifier) that a statement is true, without revealing any information beyond the validity of the statement itself.
- For example, given the hash of a random number, the prover could convince the verifier that there indeed exists a number with this hash value, without revealing what it is.
In a zero-knowledge “Proof of Knowledge” the prover can convince the verifier not only that the number exists, but that they in fact know such a number – again, without revealing any information about the number.
“Succinct” zero-knowledge proofs can be verified within a few milliseconds, with a proof length of only a few hundred bytes even for statements about programs that are very large.
In the first zero-knowledge protocols, the prover and verifier had to communicate back and forth for multiple rounds, but in “non-interactive” constructions, the proof consists of a single message sent from prover to verifier.

Polynomials

The degree of a polynomial, $$f(x) = c_nx^n + … + c_1x^1 + c_0x^0$$, is determined by its greatest exponent of $$x$$, which is $$n$$.

The result of a polynomial equation consists of two polynomials for arbitrary degree $$d$$ polynomials is always another polynomial equation of degree at most $$d$$, since there is no multiplication to produce higher degrees. And the Fundamental Theorem of Algebra states that a degree $$d$$ polynomial equation can have at most $$d$$ solutions, so these two polynomials can have at most $$d$$ shared points.

Example: $$5x³ + 7x² – x + 2 = 3x³ – x² + 2x – 5$$ simplifies to $$2x³ + 8x² – 3x + 7 = 0$$.

Because polynomials have this advantageous property (if we have two non-equal polynomials of degree at most $$d$$, they can intersect at no more than $$d$$ points), a prover claims to know some polynomial (no matter how large its degree is) that the verifier also knows can follow a simple protocol to prove it:

Verifier chooses a random $$x$$ and evaluates the polynomial locally;
Verifier gives $$x$$ to the prover and asks to evaluate the polynomial in question;
Prover evaluates his polynomial at $$x$$ and gives the result to the verifier;
Verifier checks if the local result is equal to the prover’s result, and if so then the statement is proven with a high confidence.
- Consider an integer range of $$x$$ from $$1$$ to $$10⁷⁷$$, the number of points where evaluations are different is $$10⁷⁷ – d$$. Henceforth the probability that $$x$$ accidentally “hits” any of the $$d$$ shared points is equal to (which is considered negligible) $$\frac{d}{10^{77}}$$. This protocol requires only one round and gives overwhelming confidence (almost 100% assuming $$d$$ is sufficiently smaller than the upper bound of the range). That is why polynomials are at the very core of zk-SNARK.
  - 假设 prover 知晓的多项式和 verifier 的不同，则 $$x$$ 的取值只有 $$d$$ 中可能，概率很低。

这里没有零知识的部分，prover 和 verifier 都知道同一个多项式。

Fundamental Theorem of Algebra also states that any polynomial equation can be factored into linear polynomials (i.e., a degree 1 polynomials representing a line), as long it is solvable. Consequently, we can represent any valid polynomial equation as a product of its factors $$(x-a_0)(x-a_1)…(x-a_n) = 0$$.

The factorized form has all the solutions (also called roots) on the surface.
最高次项的系数不用考虑，左右两边同时除掉。

If a prover claims to know a polynomial equation of degree 3 with the roots $$1$$ and $$2$$, this means that his polynomial has the form $$(x-1)(x-2)\cdot(x-a_3)$$. Hence if the prover wants to prove that indeed his polynomial equation has those roots without disclosing the polynomial itself, he needs to prove that his polynomial $$p(x)$$ is the multiplication of those cofactor $$t(x) = (x-1)(x-2)$$, called target polynomial, and some arbitrary polynomial.

$$p(x) = t(x) ⋅ h(x)$$

A natural way to find $$h(x)$$ is through the division $$h(x) = \frac{p(x)}{t(x)}$$. If the prover cannot find such $$h(x)$$ that means that $$p(x)$$ does not have the necessary cofactor $$t(x)$$, in which case the polynomials division will have a remainder.

prover 手中的多项式是 $$p(x) = x^3 - 3x^2 + 2x = (x-1)(x-2)x$$；verifier 只知晓 $$t(x) = (x-1)(x-2) = x^2 - 3x + 2$$，随机选择一个数 $$23$$ 计算得到 $$t = t(23) = 462$$ 并将 $$23$$ 发给 prover；prover 计算 $$p = p(23) = 10626$$ 和 $$h = h(23) = 23$$，将 $$p$$ 和 $$h$$ 发给 verifier；verifier 计算得到 $$p = t ⋅ h$$，验证通过。
如果 prover 手中的的多项式是 $$p(x) = 2x^3 - 3x^2 + 2x$$，此时 $$p(x) = t(x)×(2x + 3) + 7x – 6$$，此时 $$h(x) = 2x + 3 + \frac{7x-6}{t(x)}$$；尽管 $$x$$ 的选择是随机的，$$\frac{7x-6}{t(x)}$$ 恰好除尽的概率仍不可忽略。
- That is the reason to introduce cryptographic primitives which make such division impossible, even if the raw evaluations happen to be divisible.
- 不能除尽的情况下，由于浮点数存储的限制，势必会引入误差，造成 $$p(x)$$ 和 $$t(x) ⋅ h(x)$$ 不相等。

prover 随机选 $$x$$ 显然不能是 roots，否则 $$t(x) = 0$$，起不到验证效果；这种机制下 verifier 不知道完整的多项式，只知道 $$t(x)$$，开始向零知识过渡。

There are multiple issues with this verification procedure:

Prover may not know the claimed polynomial $$p(x)$$ at all, but he also happens to know how to calculate $$t(x)$$. He can calculate evaluation $$t = t(r)$$, select a random number $$h$$ and set $$p = t⋅h$$, which will be accepted by the verifier as valid, since equation holds.
- verifier 只知道 $$t(x)$$；
Because prover knows the random point $$x = r$$, he can construct any polynomial which has one shared point at $$r$$ with $$t(r) ⋅ h(r)$$.
- prover 知道正确的多项式 $$p(x)$$，他可以构造另一个和 $$p(x)$$ 在 $$x = r$$ 相交的多项式；
There is no enforcement of degree. Prover can cheat by using a polynomial of higher degree which also satisfies the cofactor check.

前两个问题可以通过对 $$r$$ 模糊处理而不是明文传输来解决。

Obscure Evaluation

Homomorphic encryption allows to encrypt a value and be able to apply arithmetic operations on such encryption.

假设需要加密的数是 3，选择一个底数（底数公开），比如 5，加密结果是 $$5^3 = 125$$。

加密结果乘以 2：$$125^2 = 5^{3 \times 2}$$；
加密结果加 2：$$5^{3+2} = 5^3 \cdot 5^2$$；
都是操作指数位；

由于底数 5 是公开的，可以由加密结果轻易反推出加密的数。引入模运算可以让反推难度剧增，反推的结果也不再是对应一个数而是很多数，同时保留了算数运算的能力。

加密结果：$$5^3 = 6 \ (mod \ 7)$$；
加密结果乘以 2：$$125^2 = 5^{3 \times 2} = 1 \ (mod \ 7)$$；
加密结果加 2：$$5^{3+2} = 5^3 \cdot 5^2 = 3 \ (mod \ 7)$$；

$$5^? = 125$$：乘方的逆运算称为对数，即便数字很大，求对数也并不困难；

$$7^? mod \ 13 = 8$$：模运算中的对数称为离散对数，当模数），求离散对数非常困难，也非常耗时，尚未有快速求出离散对数的算法；

There are limitations to this homomorphic encryption scheme while we can multiply an encrypted value by an unencrypted value, we cannot multiply (and divide) two encrypted values, as well as we cannot exponentiate an encrypted value.

$$E(v) = g^v(mod \ n)$$ is the encryption function, where $$v$$ is the value to be encrypted.

Evaluate the polynomial $$p(x) = x³ – 3x² + 2x$$ with an encrypted random value of $$x$$. To know a polynomial is to know its coefficients, in this case those are: 1, –3, 2. Because homomorphic encryption does not allows to exponentiate an encrypted value, we’ve must been given encrypted values of powers of $$x$$ from 1 to 3: $$E(x)$$, $$E(x²)$$, $$E(x³)$$.

无法通过 $$(E(x))^2$$ 计算 $$E(x^2)$$；
多项式的系数与加密结果的乘法发生在指数位；
模运算可以每次计算都执行，也可以只对最后的结果执行，不影响最终结果；
$$E(x^3)^1 \cdot E(x^2)^{-3} \cdot E(x)^2 = (g^{x^3})^1 \cdot (g^{x^2})^{-3} \cdot (g^{x})^2 = g^{x^3 - 3x^2 + 2x} = E(p(x))$$

知晓 $$d$$ 阶多项式的证明、验证流程：

验证者选出随机值 $$s$$，计算 $$E(s^0)$$ 到 $$E(s^d)$$ 并把计算结果给 prover，同时计算 $$t(s)$$；
prover 计算 $$h(x) = \frac{p(x)}{t(x)}$$；$$c_0$$ 到 $$c_d$$ 是多项式的系数，代入加密函数，得到 $$E(p(s))$$ 和 $$E(h(s))$$，分别记为 $$g^p$$ 和 $$g^h$$，发给 verifier；
verifier 计算 $$g^p = (g^h)^{t(s)}$$ 是否成立；

The prover still can use any other means to forge a proof without actually using the provided encryptions of powers of $$s$$. It is not possible to verify this in the current protocol. One could use any possible means to find some arbitrary values $$zₚ$$ and $$zₕ$$ which satisfy equation $$zₚ = zₕ^{t(s)}$$. That is why verifier needs the proof that only encryptions of powers of $$s$$ were used and nothing else.

Shifted Polynomial

Consider an elementary example of a degree 1 polynomial with one variable and one coefficient $$f(x) = c⋅x$$ and correspondingly the encryption of the $$s$$ is provided $$E(s) = gˢ$$. What we are looking for is to make sure that only encryption of $$s$$, i.e., $$gˢ$$, was homomorphically “multiplied” by some arbitrary coefficient $$c$$ and nothing else. So the result must always be of the form (for some arbitrary $$c$$) $$(gˢ)^c$$. A way to do this is to require to perform the same operation on another shifted encrypted value alongside with the original one, acting as an arithmetic analog of “checksum”, ensuring that the result is exponentiation of the original value. This is achieved through the Knowledge-of-Exponent Assumption (or KEA):

Alice has a value $$a$$ ($$a$$ is a generator of a finite field group used) that she wants Bob to exponentiate to any power, the single requirement is that only this a can be exponentiated and nothing else, to ensure this she chooses a random $$α$$ (alpha), calculates $$a’ = a^α (mod \ n)$$, provides the tuple $$(a, a’)$$ to Bob and asks him to perform same arbitrary exponentiation of each value and reply with the resulting tuple $$(b, b’)$$ where the exponent “α-shift” remains the same, i.e., $$b^α = b’ (mod \ n)$$;
Bob cannot extract $$α$$ from the tuple $$(a,a′)$$ other than through a brute-force, which is infeasible. Bob chooses a value $$c$$, calculates $$b = a^c (mod \ n)$$, $$b’=(a’)^c (mod \ n)$$, and replies with $$(b, b’)$$;
Alice checks the equality:
- $$b^α = (a^c)^α=a^{c \cdot α}$$
- $$b’ = (a’)^c = (a^α)^c = a^{α \cdot c}$$
- $$b^α = b’$$

Conclusions:
- Bob has applied the same exponent $$c$$ to both values of Alice’s tuple;
- Bob could only use the original Alice’s tuple to maintain the $$α$$ relationship;
- Bob knows the applied exponent $$c$$, because the only way to produce valid $$(b,b’)$$ is to use the same exponent;
- Alice has not learned $$c$$ for the same reason Bob cannot learn $$α$$;
  - Although $$c$$ is encrypted its range of possible values might not be sufficient to preserve zero-knowledge property;
Such protocol provides a proof to Alice that Bob indeed exponentiated $$a$$ by some value known to him, and he could not do any other operation, e.g., multiplication, addition, since this would erase the $$α$$-shift relationship.

In the homomorphic encryption context, exponentiation is the multiplication of the encrypted value. Apply the same construction with the simple one-coefficient polynomial $$f(x) = c ⋅ x$$:

Verifier chooses random $$s$$, $$α$$ and provides evaluation for $$x = s$$ for power 1 and its “shift”: $$(g^s, g^{α \cdot s})$$;
Prover applies the coefficient $$c$$: $$((g^s)^c, (g^{α \cdot s})^c) = (g^{s \cdot c}, g^{α \cdot s \cdot c})$$;
Verifier checks $$(g^{s \cdot c})^α = g^{α \cdot s \cdot c}$$;

Such construction restricts the prover to use only the encrypted $$s$$ provided, therefore prover could have assigned coefficient $$c$$ only to the polynomial provided by the verifier;

We can now scale such one-term polynomial (monomial) approach to a multi-term polynomial because the coefficient assignment of each term is calculated separately and then homomorphically “added” together.

Verifier provides encrypted powers $$g^{s^0}$$, $$g^{s^1}$$, …, $$g^{s^d}$$ and their shifts $$g^{αs^0}$$, $$g^{αs^1}$$, …, $$g^{αs^d}$$;
Prover evaluates encrypted polynomial $$g^{p(s)} = (g^{s^0})^{c_0} \cdot g^{s^1})^{c_1} \cdot … \cdot g^{s^d})^{c_d} = g^{c_0αs^0 + c_1αs^1 +… + c_dαs^d}$$; evaluates shifted polynomial $$g^{αp(s)} = (g^{αs^0})^{c_0} \cdot g^{αs^1})^{c_1} \cdot … \cdot g^{αs^d})^{c_d} = g^{α(c_0αs^0 + c_1αs^1 +… + c_dαs^d)}$$; provides the results as $$g^p$$, $$g^{p^’}$$ to verifier;
Verifier checks $$(g^p)^α = g^{p^’}$$;

Now we can be sure that the prover did not use anything else other than what verifier provided, since there is no other way to preserve the $$α$$-shift.

根据 proof 提供的数据，verifier 可以暴力枚举多项式系数的所有组合，获得多项式的信息。

While theoretically polynomial coefficients $$cᵢ$$ can have a vast range of values, in reality, it might be quite limited. For instance if we consider the range of 100 values for each coefficient, the degree 2 polynomial would total to 1 million of distinct combinations, which considering brute-force would require less than a million iterations. Moreover, the secure protocol should be secure even in cases where there is only one coefficient, and its value is 1.

Zero-Knowledge

The proofs, $$g^p$$, $$g^{p^’}$$, $$g^h$$ provided by the prover participate in the following checks:

Polynomial $$p(x)$$ has roots of $$t(x)$$: $$g^p = (g^h)^{t(s)}$$;
Polynomial of a correct form is used: $$(g^p)^α = g^{p^’}$$;

The question is how do we alter the proof such that the checks still hold, but no knowledge can be extracted?

We can “shift” those values by some random number $$δ$$ (delta), e.g., $$(gᵖ)ᵟ$$ . Now, in order to extract the knowledge, one first needs to find $$δ$$, which is considered infeasible. Moreover, such randomization is statistically indistinguishable from random.

Prover samples a random $$δ$$ and exponentiates his proof values with it; provides $$(g^p)^δ$$, $$(g^{p^’})^δ$$, $$(g^h)^δ$$ to verifier;
Verifier verifies:
1. $$g^{δ \cdot p} = g^{δ \cdot t(s)h}$$
2. $$g^{δ \cdot αp} = g^{δ \cdot p^’}$$

Non-Interactivity

References

What are zk-SNARKs

Why and How zk-SNARK Works

https://secbit.io/blog/2019/12/25/learn-zk-snark-from-zero-part-one/

Contents

Polynomials

Obscure Evaluation

Shifted Polynomial

Zero-Knowledge

Non-Interactivity