The approach is analogous to the approach of ToughSAT. For a high level overview, the algorithm works like this:
Input: $N = pq$, with unknown primes $p$ and $q$.
- $n \leftarrow$ bitlength(N)
- Take $2n$ variables for the binary representation of $p$ and $q$ and write:
\begin{align*}
x_{n-1}x_{n-2}...x_1x_0 \cdot y_{n-1}y_{n-2}...y_1y_0
\end{align*} Then apply the school method for multiplying two integers. - This multiplication creates a list of equations that heavily depend on each other.
- Replace each equation with an equivalent formula that only uses the boolean operators "and" and "or".
- Transform the boolean formula to a 3CNF
Assume we have $N = 15$, then the first step of the school method yields the following $16$ values:
x3 x2 x1 x0 * y3 y2 y1 y0 =
-------------------------------------------------------------
x3ANDy3 x2ANDy3 x1ANDy3 x0ANDy3
x3ANDy2 x2ANDy2 x1ANDy2 x0ANDy2
x3ANDy1 x2ANDy1 x1ANDy1 x0ANDy1
x3ANDy0 x2ANDy0 x1ANDy0 x0ANDy0
-------------------------------------------------------------
pp3,3 pp2,3 pp1,3 pp0,3
pp3,2 pp2,2 pp1,2 pp0,2
pp3,1 pp2,1 pp1,1 pp0,1
pp3,0 pp2,0 pp1,0 pp0,0
In particular, if you trace the output of the ToughSAT implementation, you will find exactly the variables as shown here.
So the $2\cdot 4$ variables from the two factors $p$ and $q$ are increased by the $16$ variables $ppi,j$ that represent the "and" of the binary positions $i$ of $p$ and $j$ of $q$. Next, the school method for multiplication says that we have to add the results column wise (e.g. pp0,1 + pp1,0 = ts1,0 carry: cr1,0):
Total variables: 24
pp3,3 pp2,3 pp1,3 pp0,3 cr0,1
pp3,2 pp2,2 pp1,2 pp0,2
pp3,1 pp2,1 pp1,1
pp3,0 pp2,0
ts0,1
------------------------------------------------------------- +2 vars
cr0,2
pp3,3 pp2,3 pp1,3 pp0,3
pp3,2 pp2,2 pp1,2
pp3,1 pp2,1 ts0,2
pp3,0 pp2,0 ts0,1
------------------------------------------------------------- +2 vars
cr0,3
pp3,3 pp2,3 pp1,3
pp3,2 pp2,2
pp3,1 ts0,3
pp2,1 ts0,2
pp3,0 pp2,0 ts0,1 pp0,0
------------------------------------------------------------- +2 vars
cr0,4
pp3,3 pp2,3
pp3,2 ts0,4
pp2,2 ts0,3
pp3,1 pp2,1 ts0,2
pp3,0 pp2,0 ts0,1 pp0,0
------------------------------------------------------------- +2 vars
cr0,4
pp3,3 pp2,3 cr1,1
pp3,2 ts0,4
pp2,2 ts0,3
pp3,1
ts1,1
------------------------------------------------------------- +2 vars
cr1,2
cr0,4
pp3,3 pp2,3
pp3,2 ts0,4
ts1,2 pp2,0 ts0,1 pp0,0
------------------------------------------------------------- +2 vars
cr1,3 cr0,4
pp3,3
ts1,3 ts1,2 ts0,3
ts1,1 ts0,2
pp2,0 ts0,1 pp0,0
------------------------------------------------------------- +2 vars
cr1,4
ts1,4 ts0,4
ts1,2 ts0,3
ts1,1 ts0,2
pp2,0 ts0,1 pp0,0
------------------------------------------------------------- +2 vars
cr1,4 ts1,4 cr0,4
ts1,3
ts0,4 cr2,2
ts1,2 ts0,3
ts1,1
ts2,2
------------------------------------------------------------- +2 vars
cr1,4 ts1,4 cr0,4
ts1,3 cr2,3
ts0,4
ts1,2
ts2,3
------------------------------------------------------------- +2 vars
cr2,4
cr1,4 ts1,4 cr0,4
ts1,3
ts2,4
------------------------------------------------------------- +2 vars
cr2,5
cr1,4 ts1,4
ts2,5
------------------------------------------------------------- +2 vars
cr2,6
cr1,4
ts2,6
------------------------------------------------------------- +2 vars
ts2,7
<-cr2,7
------------------------------------------------------------- +2 vars
Final:
cr2,7 ts2.7 ts2,6 ts2,5 ts2,4 ts2,3 ts2,2 ts0,1 pp0,0
Total variables: 52
The final $9$ variables must be set equal to the product $N=15$ in question, that is \begin{array}{}
cr2,2 & ts2,7 & ts2,7 & ts2,5 & ts2,4 & ts2,3 & ts2,2 & ts0,1 & pp0,0\\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1\\
\text{false} & \text{false} & \text{false} & \text{false} & \text{false} & \text{true} & \text{true} & \text{true} & \text{true}
\end{array}
Hence, for those $9$ variables we already know the truth values.
The equations that were created during the process are of one of the following forms:
- $x_1 = x_2\;\text{and}\;x_3$, e.g., pp0,0 = x0 and y0
- $x_1 = x_2 + x_3, \stackrel{\text{carry}}{\rightarrow} x_4$, e.g., ts2,6 = cr2,5 + ts1,4 carry:cr2,6
- $x_1 = x_2 + x_3 + x_4, \stackrel{\text{carry}}{\rightarrow} x_5$, e.g., ts2,5 = cr2,4 + cr0,4 + ts1,3 carry:cr2,5
We have to remove the "=" and "+" operators and replace them with only "and" and "or".
For type 1. equations we use:
\begin{align*}
x_1 = x_2\;\text{and}\;x_3 \Leftrightarrow & (x_1\;\text{or}\;-x_2\;\text{or}\;-x_3)\;\text{and}\;(-x_1\;\text{or}\;x_2)\;\text{and}\;(-x_1\;\text{or}\;x_3)\\
\hat{=} & (x_1\vee -x_2\vee -x_3)\wedge (-x_1\vee x_2)\wedge (-x_1\vee x_3)
\end{align*} So we get $3$ clauses from one type 1 equation.
For type 2. equations we use:
\begin{align*}
x_1 = x_2 + x_3, \stackrel{\text{carry}}{\rightarrow} x_4 \hat{=} & (-x_1\vee x_2\vee x_3)\wedge(-x_1\vee -x_2\vee -x_3)\wedge \\
& (x_1\vee x_2\vee -x_3)\wedge(x_1\vee -x_2\vee x_3)\wedge \\
& (-x_1\vee -x_2\vee x_3)\wedge(x_1\vee x_2\vee -x_3)\wedge \\
& (-x_2\vee -x_3\vee x_4)\wedge(x_2\vee x_3\vee -x_4)\wedge \\
& (x_2\vee -x_4)\wedge(x_3\vee -x_4) \\
\end{align*} So we get $10$ clauses from one type 2 equation.
For type 3. equations we use:
\begin{align*}
x_1 = x_2 + x_3 + x_4, \stackrel{\text{carry}}{\rightarrow} x_5 \hat{=} &
( -x_1 \vee x_2 \vee x_3 \vee x_4 ) \wedge
( x_1 \vee -x_2 \vee -x_3 \vee x_4 ) \wedge \\
&( -x_1 \vee -x_2 \vee x_3 \vee -x_4 ) \wedge
( -x_1 \vee x_2 \vee -x_3 \vee -x_4 ) \wedge \\
&( x_1 \vee -x_2 \vee -x_3 \vee -x_4 ) \wedge
( x_1 \vee x_2 \vee x_3 \vee -x_4 ) \wedge \\
&( x_1 \vee x_2 \vee -x_3 \vee x_4 ) \wedge
( x_1 \vee -x_2 \vee x_3 \vee x_4 ) \wedge \\
&( -x_2 \vee -x_3 \vee x_5 ) \wedge
( -x_2 \vee -x_4 \vee x_5 ) \wedge \\
&( -x_3 \vee -x_4 \vee x_5 ) \wedge
( x_2 \vee x_3 \vee -x_5 ) \wedge \\
&( x_2 \vee x_4 \vee -x_5 ) \wedge
( x_3 \vee x_4 \vee -x_5 )
\end{align*} So we get $14$ clauses from one type 3 equation.
The number of variables and clauses could be roughly estimated by: We have $2n$ variables for $x_{n-1}$ to $x_0$ and $y_{n-1}$ to $y_0$. This yields $n^2$ variables for each combination. Further, an addition yields two new variables and reduces the amount summands (mostly) by $-1$. The $n^2$ are reduced until $\approx 2n$ remain. Therefore we need $n^2-2n = n(n-2)$ additions, hence we get $2n(n-2)$ variables this way. So in total we have $$\#\text{vars} = 2n + n^2 + 2n(n-2) = 3n^2 - 2n \in \mathcal{O}(n^2)$$
The number of clauses could be roughly estimated by: From the $n^2$ variables that result from the "and" equations we get $3n^2$ clauses. We have $\epsilon$ variables that come from the $+$ equations with $2$ summands and $(1-\epsilon)$ variables that come from the $+$ equations with $3$ summands. In total this are $n(n-2)$. Thus we get
\begin{align*}
\#\text{clauses}\;&3n^2 + 10\epsilon n(n-2) + 14(1-\epsilon) n(n-2) \\
= & 3n^2 + 10\epsilon n^2 - 20\epsilon n + 14(1-\epsilon) n^2 - 28(1-\epsilon) n \\
= & 3n^2 + 10\epsilon n^2 - 20\epsilon n + 14n^2 - 14\epsilon n^2 - 28n + 28\epsilon n \\
= & 17n^2 - 4\epsilon n^2 + 8\epsilon n - 28n \\
\end{align*}
For $\epsilon = 0$ we have $\#\text{clauses} = 17n^2 - 28n$ and for $\epsilon = 1$ we have $\#\text{clauses} = 13n^2 - 20n$.
Phase Transition. The phase transition for SAT tells how the ratio of clauses and variables of a formula is related to the satisfiability of the formula. For a ratio between $$ 3.5 < \frac{\#\text{clauses}}{\#\text{variables}} < 4.5 $$ SAT instances are mostly hard. For larger ratios they are nearly always non satisfiable and a smaller ratio gives formulas that can be easily satisfied.
In the case of the factoring approach shown above, we have $\#\text{vars} = 3n^2 - 2n$ and $\#\text{clauses} = 13n^2 - 20n$ or $\#\text{clauses} = 17n^2 - 28n$ respectively. If we neglect the small linear terms we get the ratio:
\begin{equation}
4.3 \leq \frac{13n^2 }{3n^2} \leq \frac{17n^2}{3n^2 } \leq 5.6
\end{equation}
which shows, that the resulting SAT instances will probably be hard. Note that such a created instance is always satisfiable.
No comments:
Post a Comment