Thursday, August 08, 2013

[Paper Review] Efficient Cryptosystems From $2^k$-th Power Residue Symbols

Today i want to present a paper from the 2013 Eurocrypt conference. The paper is written by Marc Joye and Benoît Libert with the name Efficient Cryptosystems From $2^k$-th Power Residue Symbols [1]. It describes a cryptosystem that is a generalization of the Goldwasser-Micali cryptosystems [2] from quadratic residues to $2^k$-th power residues.

The scheme provides
  • indistinguishable encryption under the $k$-QR assumption 
  • a small ciphertext expansion rate 
  • a fast decryption algorithm
  • additive homomorphic operations
  • a (more) efficient Lossy Trapdoor Function
Just as the Goldwasser-Micali cryptosystem the Joye-Libert cryptosystems works in the ring $\mathbb{Z}_N$ for a RSA-modulus $N$. The factorization of $N = pq$ is chosen to be $p \equiv q  \equiv 1\pmod{2^k}$. Then they make use of the $n$-th power residue symbol:

Definition. [$n$-th power residue symbol]. Let $p$ be an odd prime and let $n \geq 2$ such that $n | (p-1)$. Then the symbol $$ \binom{a}{p}_n = a^{(p-1)/n}\pmod{p}$$ is called the $n$-th power residue symbol modulo $p$. $\blacktriangleleft$.


As usual, it is $$\mathbb{J}_N = \left\{a \in \mathbb{Z}^*_N\left| \binom{a}{N}_2 = 1\right.\right\}$$ Those are all elements from $\mathbb{Z}^*_N$ which have a positive Jacobi symbol. Then there is a subset of $\mathbb{J}_N$ $$\mathbb{QR}_N = \left\{a \in \mathbb{Z}^*_N \left| \binom{a}{p}_2 = \binom{a}{q}_2 = 1\right.\right\}$$, which is the set of all integers that are "real" quadratic residues modulo $N$.

The Quadratic Residuosity assumption (QR) is then the task to decide, given an integer $a \in \mathbb{J}_N$, if $a$ is from $\mathbb{QR}_N$ or $\mathbb{J}_N/\mathbb{QR}_N$. The QR assumption is believed to be a computational hard task if the factors of $N$ are unknown. 

Because the primes in the Joye-Libert cryptosystem have the property $p \equiv q \equiv 1\pmod{2^k}$, they renamed the QR assumption for their setup the $k$-QR assumption.

The encryption scheme is than the following:

1. KeyGen($1^k$): Pick two random primes $p$ and $q$ with $p \equiv q \equiv 1\pmod{2^k}$ for a message space of $[0,2^k-1]$. Choose $y \in \mathbb{J}_N/\mathbb{QR}_N$. The public key is then $pk = \{N,y,k\}$ and the private key is $sk = \{p\}$.

2. Encrypt($pk,m$): To encrypt a message $m \in \{0,1\}^k$, pick a random $x \in \mathbb{Z}^*_N$ and return the ciphertext $c \equiv y^mx^{2^k}\pmod{N}$.

3. Decrypt($sk,c$): Given the ciphertext $c \in \mathbb{Z}^*_N$ and the private key $p$, the algorithm computes $z = \binom{c}{p}_{2^k} \equiv c^{(p-1)/2^k}\pmod{p}$ and then searches (efficiently as shown later) the integer $m$ such that $$\left[ \binom{y}{p}_{2^k} \right]^m\equiv z \pmod{p}$$

It is left to show that A) the decryption process returns a unique (and the correct) $m$ and B) this could be carried out efficiently.

to A) Let $\alpha$ be the integer $\alpha \equiv y^{(p-1)/2^k}\pmod{p}$. From that it follows that $\alpha^{2^k} \equiv 1 \pmod{p}$, hence the multiplicative order of $\alpha$, say $\mathsf{ord}_p(\alpha) = t$, must divide $2^k$.

Assume that $t = 2^{k'}$ for $k' < k$. Hence $\alpha^{2^{k'}} \equiv y^{(p-1)/2^{k'}} \equiv 1 \pmod{p}$.
It follows $$\left(\alpha^{2^{k'}}\right)^{2^{k'-1}} \equiv \left(y^{(p-1)/2^{k'}}\right)^{2^{k'-1}} \equiv y^{(p-1)/2} \equiv 1 \pmod{p}$$
But $y$ is a quadratic non-residue in $\mathbb{Z}_p$ (it is from  $\mathbb{J}_N/\mathbb{QR}_N$), i.e., it is $y^{(p-1)/2} \equiv -1\pmod{p}$ which is a contradiction. So $\alpha$ generates a group of size $2^k$, which allows to uniquely determine $m$.

to B) The decryption process follows directly from the fact, that if $p \equiv 1 \pmod{2^k}$, then the discrete logarithm in the group of quadratic non-residues in $\mathbb{Z}^*_p$, leaks $k$ bits. I.e. for an arbitrary prime the Legrende symbol is enough to get information if the exponent is even or odd. And here, these $k$ bits cover the whole message. The decryption process, which is actually the task to compute the discrete logarithm in a small subgroup, works because of the smoothness of the subgroup's order (here a power of two). So the decryption process is comparable to the Pohlig-Hellman algorithm.

To prove its semantic security, they actually have to show that one could not distinguish $2^k$-th residues from an element from $\mathbb{J}_N / \mathbb{QR}_N$.

To make this visible, assume that one encrypts two messages $m_1$ and $m_2$: $c_1 \equiv y^{m_1}x_1^{2^k} \pmod{N}$ and $c_2 \equiv y^{m_2}x_2^{2^k} \pmod{N}$. An adversary, that could actually distinguish $2^k$-th residues from an element from $\mathbb{J}_N / \mathbb{QR}_N$ would do $$\frac{c_1}{c_2} \equiv y^{m_1-m_2}\left(\frac{x_1}{x_2}\right)^{2^k}\pmod{N}$$
So, if $m_1 = m_2$, it is $$\frac{c_1}{c_2} \equiv \left(\frac{x_1}{x_2}\right)^{2^k}\pmod{N}$$
and $c_1/c_2$ is a $2^k$-th residues, in particular it is a quadratic residues, hence $\in \mathbb{QR}_N$. If not, the non cancelling factor of $y^{m_1-m_2}$, with $y \in \mathbb{J}_N / \mathbb{QR}_N$ moves the whole product to $\mathbb{J}_N / \mathbb{QR}_N$.
For the proof, they define the Gap $2^k$-Residuosity Assumption.

Definition [Gap $2^k$-Residuosity Assumption]. Let $N = pq$ be the product of two large primes $p$ and $q$ with $p,q \equiv 1\pmod{2^k}$. The Gap $2^k$-Residuosity problem in $\mathbb{Z}^*_N$ is to distinguish the distribution of the following two sets given only $N = pq$: $$V_0 = \{x \in \mathbb{J}_N / \mathbb{QR}_N\}\;\text{and}\;V_1 = \{y^{2^k}\pmod{N} \left| y \in \mathbb{Z}^*_N\right.\}$$ The Gap 2^k-Residuosity assumption posits that the advantage $\mathsf{Adv}_\mathcal{D}^{\mathsf{Gap-2^k-Res}}(1^\kappa)$ of any PPT distinguisher $\mathcal{D}_t$, defines as the distance $$\left|\mathsf{Pr}[\mathcal{D}(x,k,N)=1|x \stackrel{R} {\leftarrow}V_0] - \mathsf{Pr}[\mathcal{D}(x,k,N)=1|x \stackrel{R} {\leftarrow}V_1]\right|$$ where probabilities are taken over all coin tosses, is negligible.$\blacktriangleleft$.

Finally, they prove that the $k$-QR implies the Gap $2^k$ Residuosity assumption and hence their scheme earns the property of being semantic secure.

Lossy Trapdoor function. The concept of a Lossy Trapdoor function (LTDF) is a relatively new concept and was introduced 2008 by Peikert and Waters [3]. The meaning of a LTDF is simply, that the function $f$ is not injective, i.e., it looses information, since a value $f(x)$ has many inverse elements $f^{-1}(x)$. LTDFs allow to proof some security properties more easily, by showing that an adversary can not distinguish a injective trapdoor function from a lossy one. Joye and Libert construct a LTDF from the cryptosystem as follows: It is $p = 2^kp'+1$ and $q = 2^kq'+1$ with $p,q,p',q'$ primes. And $h_i = g_i^{2^k}$ for some random $g_i$ from $\mathbb{Z}^*_N$ and random numbers $r_i$ from the intervall $[0,p'q'-1]$. Then they build:

$$ \mathsf{InjGen:} \begin{bmatrix}
yh_1^{r_1}  & h_2^{r_1}  & ... & h_m^{r_1}\\
h_1^{r_2}  & yh_2^{r_1}  & ... & h_m^{r_1}  \\
... & ... & ... & ... \\
h_1^{r_m}  & h_2^{r_m}  & ... & yh_m^{r_m} 
\end{bmatrix}_{N}$$

$$ \mathsf{LossyGen:} \begin{bmatrix}
h_1^{r_1}  & h_2^{r_1}  & ... & h_m^{r_1}\\
h_2^{r_2}  & h_2^{r_1}  & ... & h_m^{r_1}  \\
... & ... & ... & ... \\
h_2^{r_m}  & h_2^{r_m}  & ... & h_m^{r_m} 
\end{bmatrix}_{N}$$
The evaluation of an input vector $\mathsf{x} = \{0,1\}^m = \{0,1\}^{km}$ is then, after spitting the vector $\mathsf{x}$ in $m$ blocks $x_i$ of length $k$, the product of each row raised to the power of $x_i$:
$$  \mathsf{Eval.InjGen:} \begin{bmatrix}
y^{x_1}h_1^{x_1r_1}  & h_2^{x_1r_1}  & ... & h_m^{x_1r_1}\\
h_1^{x_2r_2}  & y^{x_2}h_2^{x_2r_1}  & ... & h_m^{x_2r_1}  \\
... & ... & ... & ... \\
h_1^{x_mr_m}  & h_2^{x_mr_m}  & ... & y^{x_m}h_m^{x_mr_m} 
\end{bmatrix}_{N}$$
and then building the product over each column $$\mathsf{y}_I = \left(y^{x_1}h_1^{\sum^m_{i=1}x_ir_i},y^{x_2}h_2^{\sum^m_{i=1}x_ir_i},...,
y^{x_m}h_m^{\sum^m_{i=1}x_ir_i}\right)$$
And equally the same for the lossy version
$$  \mathsf{Eval.LossyGen:}\begin{bmatrix}
h_1^{x_1r_1}  & h_2^{x_1r_1}  & ... & h_m^{x_1r_1}\\
h_1^{x_2r_2}  & h_2^{x_2r_1}  & ... & h_m^{x_2r_1}  \\
... & ... & ... & ... \\
h_1^{x_mr_m}  & h_2^{x_mr_m}  & ... & h_m^{x_mr_m} 
\end{bmatrix}_{N}$$
and then building the product over each column $$\mathsf{y}_L = \left(h_1^{\sum^m_{i=1}x_ir_i},h_2^{\sum^m_{i=1}x_ir_i},...,
h_m^{\sum^m_{i=1}x_ir_i}\right)$$
Clearly, one could invert $\mathsf{y}_I$ by executing the normal decryption process (note that the $h_i$'s are $2^k$-th residues) to get $\mathsf{x}$. But one could not decrypt $\mathsf{y}_L$, since there actually is nothing to decrypt. And since those elements of $\mathsf{y}_L$ are all elements from the subgroup of order $p'q'$, the size of the output domain is decreased, which is equal to the loss of information.

Known least significant bits. The circumstances that allows decryption is, besides the knowledge of the factorization of $N$, also the special form of its factors $$ p = 2^kp'+1,\;\;q = 2^kq'+1 $$ Hence, a factor of $p-1$ and $q-1$ is known, in particular $k$ least significant bits, which could probably lead to a weakening of the system's security. But the authors describe in what range $k$ has to be, such that no known attack can succeed. I.e. $$ k < \frac{1}{4}\log_2 N - \kappa $$ whereof $\kappa$ is the security parameter.

At least i feel always a little bit uncomfortable, if some properties of factors of $N$ are revealed. Normally, one could say: Ok, i revealed $k$-bit of the factors, but thats not enough to get anything senstive from it, since $\log_n(N) - k$ bit are missing.

But here, already $k$ bits are enough to get the entire message, thats is what should be kept in mind.

Nevertheless, a nice work.

[1] Marc Joye, Benoît Libert: Efficient Cryptosystems from $2^k$-th Power Residue Symbols. EUROCRYPT 2013: 76-92
[2] Shafi Goldwasser, Silvio Micali: Probabilistic Encryption. J. Comput. Syst. Sci. 28(2): 270-299 (1984)
[3] Chris Peikert, Brent Waters: Lossy trapdoor functions and their applications. STOC 2008: 187-196

No comments:

Post a Comment