Tuesday, March 15, 2016

Using the ABC-Conjecture in Cryptography

The ABC-Conjecture is a very famous conjecture in Number Theory which is perhaps not a conjecture anymore if it the proof of Shinichi Mochizuki will turn out to be correct. I can not say anything useful about proving this conjecture, but i thought about its application for a while. Let me state the stronger version of the conjecture due to Baker [1];

ABC-Conjecture (strong, explicit version)
Given co-prime integers $a,b,c$ with $a + b = c$ then $$ c < (\text{rad}(abc))^{1.75}$$

The operator $\text{rad}$ (=radical) means the product of the distinct prime numbers dividing $n$. The version was derived from the (following) explicit ABC-Conjecture, also due to Baker:


ABC-Conjecture (explicit version)
Given co-prime integers $a,b,c$ with $a + b = c$ and let $\omega(n)$ be the number of distinct prime factors of $n$ then $$c < \frac{6}{5}\text{rad}(abc)\frac{(\log(\text{rad}(abc)))^{\omega(abc)}}{\omega(abc)!}$$

Assume you have co-prime integers $a,b,c$, wlog $a < b < c$, with $\text{rad}(a) < a^{1/6}$,  $\text{rad}(b) < b^{1/6}$ and $\text{rad}(c) < c^{1/6}$, then it is
\begin{align*}
 c &\leq \text{rad}(abc)^{1.75} = (\text{rad}(a)\text{rad}(b)\text{rad}(c))^{1.75}\\
 &< (a^{1/6}b^{1/6}c^{1/6})^{1.75} < (c^{1/2})^{1.75} = c^{15/16}\\
& < c
\end{align*} which is a contradiction and hence such integers can not exists. That's also the reason, why the even more famous Fermat Last Theorem follows from the ABC-Conjecture, which shows how deep the ABC-Conjecture is.

Observe the following simple Heuristic:
Heuristic. For random integers $X,x_i,\ldots,x_m$ it is:
\begin{equation}
\text{rad}(X)  \leq \text{rad}(x_1x_2...x_m),\;\text{if}\;X \approx \prod x_i
\end{equation}
Justification: The probability that $X$ is divisible by the prime power $p^2$ is $1/p^2$. The probability that the product $x_1x_2\ldots x_m$ is divisible by $p^2$ is:
\begin{align*}
&1 - \left( \left(\frac{p-1}{p}\right)^m + m \frac{1}{p}\left(\frac{p-1}{p}\right)^m\right) \\
& = 1 - \left(\frac{p-1}{p}\right)^m\left(1+\frac{m}{p}\right)
\end{align*}
which is a lot bigger. So in the case of many small factors, the radical reduces with a chance of $1 - \left(\frac{p-1}{p}\right)^m(1+\frac{m}{p})$ by the factor $p$, compared to a chance of $1/p^2$ in the case that you only have one big integer $X$. To give you a numerical example: Take the prime $p=5$ and an integer that is the product of $m=3$ integers $x_1, x_2, x_3$. Then the chance that the product $x_1x_2x_3$ is divisible by $5^2$ is $1 - (4/5)^3(1+3/5) = 0.1808$. In the contrast, the chance that a random integer $X$ of the same size as $x_1x_2x_3$ is divisible by $5^2$ is only $0.04$.

This heuristic is important if you consider the generalized version of the ABC-Conjecture, called the "n-Conjecture".

n-Conjecture
Given $n \geq 3$ and $a_1,a_2,\ldots,a_n \in \mathbb{Z}$ that satisfy three conditions:
$\;$(1) $\gcd(a_1,a_2,\ldots,a_n) = 1$
$\;$(2) $\sum^{n}_{i=1}a_i = 0$
$\;$(3) no proper subsum of $a_i, 1 \leq i \leq n$ equals $0$
then
\begin{equation}
\max(|a_1|,|a_2|,\ldots,|a_n|) < C_{n,\epsilon}\text{rad}\left(\prod^n_{i=1}(|a_i|)\right)^{2n-5+\epsilon}
\end{equation}

There are some quick calculations you could do using the $n$-Conjecture. For example, if you pick a base-$b$ representation of an integer $c$ and let $\#p(m)$ be the $m$-th primorial, e.g., the product of all the primes less or equal than $m$, then for large $n$ and small $b$ you get (for $\epsilon = 0$ and $C_{n,\epsilon} = 1$):
\begin{equation}
c = \sum^{n-2}_{i=0} k_ib^i \Rightarrow c < \text{rad}(c\cdot \#p(b))^{2n-5}
\end{equation}
There is also a strong version of the n-Conjecture, but which requires the strong property of pairwise co-primeness for the involved integers. Consequently, the above described heuristic in case of the strong n-Conjecture is not that powerful.

n-Conjecture (strong)
Given $n \geq 3$ and $a_1,a_2,\ldots,a_n \in \mathbb{Z}$ that satisfy three conditions:
$\;$(1) $a_1,a_2,\ldots,a_n1$ are $\textbf{pairwise}$ coprime
$\;$(2) $\sum^{n}_{i=1}a_i = 0$
$\;$(3) no proper subsum of $a_i, 1 \leq i \leq n$ equals $0$
then
\begin{equation}
\max(|a_1|,|a_2|,\ldots,|a_n|) < C_{n,\epsilon}\text{rad}\left(\prod^n_{i=1}(|a_i|)\right)^{1+\epsilon}
\end{equation}

Since the ABC-Conjecture makes a statement about the prime factorization of integers, it is always worthwhile to think about what happens if you apply it to cryptographic integers that can not be factorized efficiently. So the following two problems came into my mind:

Cryptographic Problem 1Given an integer $c$, decide if $c$ is of the form $c=p^sq$ or $c=pq$ (one of them is true) with $p,q$ being prime numbers and $p \approx q$. 

Cryptographic Problem 2Given an oracle that generates integers $p^sq$ with prime numbers $p,q$ and $p \approx q$. Additionally, the oracle should output a proof that the exponent $s$ is not too large.

There are factorization algorithms that take advantage of the unusual form $p^sq$ and are more efficiently in that case. But if the bitsizes are large enough, also those algorithms fail to factorize efficiently.

We start with the equation $a + b = c$. First, lets look at the radical of $p^sq$. We write $a = p^sq$ and because of $p \approx q$ it is $\text{rad}(p^sq) \approx a^{2/(s+1)}$. If one could find integers $b,c$ with $a+b=c$ and $$ c >  (\text{rad}(abc))^{1,75} \approx (\text{rad}(bc))^{1.75}a^{3.5/(s+1)}$$ then the assumption that $a = p^sq$ is false and consequently $a$ must be equal to $pq$. Similar, you could assume that $c$ is the integer in question. Then you are faced with the inequality $$c > \text{rad}(ab)^{1.75}c^{3.5/(s+1)}$$. If, e.g., $a=x^4$ and $b=y^4$, you get $$c^{(s-2.5)/(s+1)} > c^{7/8}$$ which is true for large enough $s$, hence $c$ has no such small radical.

But how to find $b$ and $c$ in the first case, or $a$ and $b$ in the later? And if $a$ is indeed of the form $p^sq$, then those integers do not even exists (assuming the ABC-Conjecture is correct), so when do we stop searching?

Approach (for the Cryptographic problem 1). To be more flexible, we are trying to use the $n$-Conjecture. We start straight forward and try to find a suitable base representation that might work. We pick an integer $B$ of the form $B = b^m < c$, such that $m \gg n-2$:
\begin{equation}
c = B^{n-2}k_{n-2} + B^{n-3}k_{n-3} + B^{1}k_{0}
\end{equation}
Then we get $$c < b^{2n-5}\text{rad}(\prod^{n-2}_{i=0}k_i)^{2n-5}c^{(4n-10)/(s+1)}$$ Thus a necessary requirement is $s > 4n-11$. For the coefficient for a random base $B$ representation, it is $k_i \approx b/2$, and $c^{1/(n-1)} < b < c^{1/(n-2)}$, hence
$$c < \frac{1}{2^{n-1}}\prod^{n-2}_{i=0} k_i < c^{1 + 1/(n-2)}$$ The heuristic reduces the radical, say, about a factor $K$. But this reduction is negligible compared to the size explosion caused by the exponent $2n-5$. So for a given integer $p^sq$, using a random base $B$, to construct integers that solve the problem, seems not feasible at all.

LLL Approach.  Lets try lattice reduction, which is a well working tool if you are looking for small coefficients for certain type of diophantine equations. We start with the equation $a+by=cz$ and then apply the LLL algorithm to find suitable short vectors. To do this, we set $b=p^{e_1}$ (to get a small radical for $b$) and initialize $c$ with the value from the cryptographic problem (i.e. $pq$ or $p^sq$). The lattice is spanned by the vectors $$\begin{pmatrix}1 \\ b \end{pmatrix}, \begin{pmatrix}0 \\ c \end{pmatrix}$$ Any linear combination leads to a new vector in the lattice, for example the vector
\begin{equation}
x\begin{pmatrix}1 \\ b \end{pmatrix} + y\begin{pmatrix}0 \\ c \end{pmatrix} \Leftrightarrow
\begin{pmatrix}1 & 0\\ b & c \end{pmatrix}\begin{pmatrix}-y\\ z \end{pmatrix} =
\begin{pmatrix}-y\\-by + cz \end{pmatrix} = \begin{pmatrix}y\\a \end{pmatrix}
\end{equation} LLL will find most of the time an integer vector with $y$ and $a$ of size around $c^{1/2}$, hence if you initialize $p^{e_1} \approx c^{1/2}$, then $z$ is small and you will find so called "ABC-Hits". That are triples of integers $(a,b,c)$ such that $c > \text{rad}(abc)$. ABC-Hits are no counterexamples to the ABC-Conjecture. They realize the case $\epsilon = 0$ and $C_{n,\epsilon} = 1$, for which it was proven, that infinite many solution exists in that case. For example, if indeed $a$ and $y$ are of size around $c^{1/2}$ then $a + p^{e_1}y = cz$. Assume that $z=1$ (this is not unusual), then $$c > \text{rad}(ap^{e_1}yc) > pc^{1/2-\delta_1}c^{1/2-\delta_2}c^{2/(s+1)} = pc^{1-\delta_1-\delta_2+2/(s+1)}$$ often holds. The $\delta_i$ values indicate that the radical function applied to $a$ and $y$ will further reduce their size. My proof-of-concept implementation shows, that you will find ABC-Hits very fast using this approach. But sadly, as often regardless of whether $c=pq$ or $c=p^sq$. Trying to enhance this approach to work even with the strong, explicit ABC-Conjecture that uses the exponent 1.75 seems to be out of scope. Using the lattice reduction in combination with the $n$-Conjecture may increase the performance.

In [2] you find several further methods how to find ABC-Hits, but none of them seems promising in this case. So, the cryptographic problem 1 seems to be not that trivial hard.

Approach (for the Cryptographic problem 2).

For the second problem i can give a solution using the $n$-Conjecture. I couldn't find a strong, explicit version of the $n$-Conjecture, similar to Bakers version for the ABC-Conjecture, for which he uses $C_{n,1} = 1$ and $\epsilon = 3/4$. So i used the $n$-Conjecture with the bold setting $\epsilon = 1$ and $C_{n,\epsilon} = 1$. The approach is then as follows.

The oracle $\mathcal{O}$ from the problem definition executes the following six steps:
1) Choose an integer $b \in \mathbb{N}$
2) Set $B = b^m$
3) Generate a prime number $p = \sum^{l_p-1}_{i=0}B^ih_i$ with $h_i < X$
4) Compute prime $p = \sum^{l_q-1}_{i=0}B^il_i$ with $l_i < X$ and $p \approx q$
5) Compute $c = p^sq$ for $s \in \mathbb{N}$
6) The oracle $\mathcal{O}$ outputs $c$ together with the base-$B$ representation of $c = (a_0,\ldots,a_{n-2})$ and an upper bound $\mathcal{U}$ for $s$. For this upper bound it holds that, if $c$ would be of the form $p^{\mathcal{U}}q$, then
$$ c  > \text{rad}(c\prod^{n-2}_{i=0}a_i)^{2n-5+\epsilon} $$ So the receiver of the output from the oracle could at least verify, that the exponent $s$ from $c = p^sq$ is at most $\mathcal{U}-1$, otherwise the $n$-Conjecture would be false.

Correctness. Let us show, that the approach above really works and how $X$ and $\mathcal{U}$ are defined. It is $\log_B p = l_p$ and $\log_B q = l_q$, hence for the product $c = p^sq$ we get:
\begin{equation}
\log_B c = \log_B p^sq = s\log_Bp + \log_Bq = sl_p + l_q \approx (s+1)l_p
\end{equation}
The base-$B$ representation of $c$ is then:
\begin{equation}
c = \sum^{(s+1)l_p-1}_{i=0} B^ik_i,\;\;\;\text{and}\;\;k_i < X^{s+1}
\end{equation} The integer $n$ in the $n$-Conjecture is then $n = (s+1)l_p+1$. The approximation $\text{rad}(c) \approx c^{2/(s+1)}$ is the tool for the verifier to check size restriction of $s$ that was used to construct $c$. However, the verifier does not know $s$, but he can use the given bound $\mathcal{U}$ to compute $\text{rad}(c) \approx c^{2/(\mathcal{U}+1)}$ to check the following inequality:
\begin{align*}
c &= p^sq > \text{rad}(B\prod^{n-2}_{i=0}k_i c)^{2n-4} \\
   &> b^{2n-4}\text{rad}(\prod^{n-2}_{i=0}k_i)^{2n-4}c^{(4n-8)/(\mathcal{U}+1)} \\
   &> b^{2n-4}X^{(n-1)(s+1)(2n-4)}c^{(4n-8)/(\mathcal{U}+1)} \\
\end{align*}
As a rough estimation we write $b \approx c^{1/(m(s+1)l_p)}$ and we bind $X$ by $c^{1/d}$, then
\begin{align*}
c &>b^{2n-4}X^{(n-1)(s+1)(2n-4)}c^{(4n-8)/(\mathcal{U}+1)} \\
   &\approx c^{(2n-4)/(m(s+1)l_p)}c^{(n-1)(s+1)(2n-4)/d}c^{(4n-8)/(\mathcal{U}+1)} \\
   &= c^{(2n-4)/(m(s+1)l_p)+(n-1)(s+1)(2n-4)/d+(4n-8)/(\mathcal{U}+1)} \\
\end{align*}
So the question is, for which parameters is 
$$1 > \frac{2n-4}{m(s+1)l_p}+\frac{(n-1)(s+1)(2n-4)}{d}+\frac{4n-8}{\mathcal{U}+1}$$
We assume that $1/d$ is close to $w/(m(s+1)l_p)$, i.e., the bound of the coefficients is comparable to the size of $b$. This yields:
$$1 > \frac{2n-4}{m(s+1)l_p}+\frac{(n-1)(s+1)(2n-4)w}{m(s+1)l_p}+\frac{(2n-4)2}{\mathcal{U}+1}$$
and further:
$$1 > (2n-4)\left(\frac{1}{m(s+1)l_p}+\frac{(n-1)w}{ml_p}+\frac{2}{\mathcal{U}+1}\right)$$
Plugging in $n = (s+1)l_p+1$ yields:
$$1 > (2(s+1)l_p-2)\left(\frac{1}{m(s+1)l_p}+\frac{(s+1)w}{m}+\frac{2}{\mathcal{U}+1}\right)$$ and finally solving for $\mathcal{U}$:
$$\mathcal{U} > 4((s+1)l_p-1)\left(1 -  \frac{2(s+1)l_p-2}{m(s+1)l_p} - \frac{(2(s+1)l_p-2)(s+1)w}{m}\right)^{-1} - 1$$

So the oracle $\mathcal{O}$ can use the last inequality to generate the bound $\mathcal{U}$. Actually, this bound is very weak. For example, if, in order to generate $c$, the oracle chose $s=4$, $l_p = 2$ and $m=1024$, $w=2$ (so the coefficients will be within the square root distance of $b$), then the bound will be $\mathcal{U} \geq 44$. So, the receiver who only gets $p^sq$ and the base $B$ representation could at least verify that the exponent $s$ in $c$ is not larger than $43$, since otherwise the received integers would contradict the $n$-Conjecture!

I coded a proof of concept and i works fine, but there is probably a lot room for improvement, but nevertheless, it seems that cryptographic problem 2 is solvable.


[1] Baker, Alan (2004). Experiments on the abc-conjecture. Publ. Math. Debrecen 65: 253–260.
[2] Martin, Greg and Miao, Winnie (2014), abc TRIPLES, http://arxiv.org/abs/1409.2974v1

No comments:

Post a Comment