Tuesday, August 13, 2013

The Collatz Conjecture

The Collatz Conjecture is another famous conjecture that concerns the natural numbers. There has been written a lot about the collatz conjecture, just see the beautiful blog post from Terence Tao or the overview from Lagrias [1]. The conjecture is also known as the $3n+1$ conjecture, the Ulam conjecture or the Syracuse conjecture. The handsomeness of this conjecture is based on the fact that it is one of these conjectures that are easy to state and even understandable by non-mathematicians.

The collatz conjecture in its original form is this:

Definition [Collatz-Function]. The Collatz function for an integer $n \in \mathbb{N}$ is defined as $$ C(n) := \begin{cases} 3n+1 & \text{if}\;n\;\text{is odd} \\ n/2 & \text{if}\;n\;\text{is even}\end{cases} $$ $\blacktriangleleft$

Then the conjecture is:
Collatz-Conjecture. For each \(n \in \mathbb{N}\) there exists an \(k \in \mathbb{N}\) such that \(C^k(n) = 1\).

The \(C^k(n)\) means that \(C\) is applied \(k\)-times in a row using the output from the previous round as the input for the next round. 

If there is a counterexample \(n\) to the Collatz conjecture, then either
  1. \(n\) is part of an infinite non-repeating trajectory
  2. \(n\) is part of an non-trivial cycle
Regarding point 2. it is proven by Lagarias [1] that the minimum length of such a cyclic could not be less than 275.000.

First i want to give a heuristic argument about the correctness of the CC.

# An Heuristic Argument #

Sometimes it is convient to describe the collatz function a little bit different.

Definition [Collatz-Function (2)]. The Collatz function for an integer $n \in \mathbb{N}$ is defined as $$ C(n) = \begin{cases} (3n+1)/2 & \text{if}\;n\;\text{is odd} \\ n/2 & \text{if}\;n\;\text{is even}\end{cases} $$  $\blacktriangleleft$

Since if \(n\) is odd, \(3n+1\) is even, hence one could immediatelly divide by \(2\).

We neglect the \(+1\) in the odd case and assume the average behaviour. That means, case 1 and case 2 occur both with probability 1/2. So \(C^k(n)\) is with probability one-half \(C^{k-1}(n)/2\) and with probability one-half \(C^{k-1}(n)3/2\). The denominator gets in each round an additional factor of \(2\), whereof the numerator on the average only increases in every second round by a factor of \(3\). Hence, after \(2k\) round we have $$C^{2k}(n) \approx \frac{3^{k}}{2^{2k}}n = \left(\frac{3}{4}\right)^{k}n$$ which reaches \(1\) after \(\log_{3/4}(n^{-1})\) iterations, that is \(\mathcal{O}(\log n)\).
So, we expect the collatz funktion to decrease very rapidly, which is a heuristic justification of the conjecture.

# The 0 to 1 ratio #

To prove the CC, one has to show that there are no infinite trajectories and that no trajectory passes into a non-trivial cycle.

Definition. Consider the function $f$ for some integer $n \in \mathbb{N}$ and some bit $b \in \{0,1\}$ $$f(b,n) \stackrel{\mathrm{def}}{=} \left( \frac{5b+1}{2} \right)n + b$$ then the function $F: \{0,1\}^* \times \mathbb{N} \rightarrow \mathbb{Q}$ is defined as $$F(\mathsf{V},n) = F((b_0,...,b_{m-1}),n) \stackrel{\mathrm{def}}{=} f(b_{m-1},f(b_{m-2},...,f(b_1,f(b_0,n))...)$$ $\blacktriangleleft$

The function $f$ behaves like the collatz function, whereof the bit $b$ decides if case 1 or case 2 is executed. Note that $F$ will not be an integer for all input vectors. To be a valid "collatz vector" the distribution of the $1$s and $0$s in the vectors must follow some characteristics. At least of a $1$ there must follow a $0$, which is a called the $10$-rule.
From a given collatz trajectory (i.e., the integers $C^k(n)$) one could build the corresponding parity vectory by simply reducing each $C^k(n)$ modulo $2$.

Definition.  If $\mathsf{V}$ is a binary vector $(b_0,...,b_{m-1})$ of length $m$, then we denote as $N_{\mathsf{V},1}$ the number of $1$s in $\mathsf{V}$, i.e, the Hamming-weight of $\mathsf{V}$. And as $N_{\mathsf{V},0}$ the number $0$s to be found in $\mathsf{V}$. $\blacktriangleleft$

Andrei [2] proved that for all integer that are no counterexample to the CC it holds:

Theorem [Andrei et al.] Let $n$ be an integer from the Collatz tree and let $\mathcal{P}(n)$ the corresponding parity vector. Then the two integers $N_{\mathcal{P}(n),0}$ and $N_{\mathcal{P}(n),1}$ satisfy the inequality \begin{align*} \frac{N_{\mathcal{P}(n),1}}{N_{\mathcal{P}(n),0}} < \frac{\log 2}{\log 3} \end{align*}

What i want to show next, is that this ratio also holds for the $1$s and $0$s in the non-trivial cycle. And then this ratio is used to show that there are only very few pairs of $N_{\mathsf{V}_0}$ and $N_{\mathsf{V}_1}$ that are possible.

The processing of several collatz steps, that can be captured by the function $F$, leads to an output value that depends entirely on the order and the ratio of the involved operations, i.e., the $0$ and $1$. In order to proceed, we construct an explicit formula for the output of the function $F$, when it is called with a binary vector $\mathsf{V}$ on an integer $n$.

Definition. Let $\mathsf{V} = (b_0,...,b_{m-1})$ be a binary vector and let $\mathrm{pos}_1(i)$ denote the position of the $i$-th $1$ in $\mathsf{V}$. The function $v_\mathsf{V}(i)$ is then defined as $$v_\mathsf{V}(i) \stackrel{\mathrm{def}}{=} |\{b_j \in \mathsf{V}; j < \mathrm{pos}_1(i)\;\mathsf{and}\; b_j = 0\}| $$

 The function $v$ counts the number of $0$s in $\mathsf{V}$ up to the $i$-th  $1$-bit, in particular $v_\mathsf{V}(i) \leq \mathsf{pos}_1(i)$.

Lemma [Böhm and Sontacchi 1978 - Modified]. Given an integer $n$ and a binary vector $\mathsf{V}=(b_0,...,b_{m-1})$ then the function $F(\mathsf{V},n)$ evalutes to $$\label{appl} F(\mathsf{V},n) = \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n + \frac{1}{2^{N_{\mathsf{V},0}}}\sum^{N_{\mathsf{V},1}-1}_{i=0}2^{v_\mathsf{V}(i+1)}3^{N_{\mathsf{V},1}-1-i} $$

The Lemma above is only modified in respect to the notation and the left side of the equation. Actually, Böhm and Sontacchi proved that an integer $n$ that is part of a non trivial cycle, must be of the form
$$(\text{Eq. 1}):\;\;\;n = \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n + \frac{1}{2^{N_{\mathsf{V},0}}}\sum^{N_{\mathsf{V},1}-1}_{i=0}2^{v_\mathsf{V}(i+1)}3^{N_{\mathsf{V},1}-1-i}$$
 We simply, replaced $n$ with the $F$ function, which generates $n$ whenever the correct $\mathsf{V}$ vector is used.

Next, i show that from this it follows also the same ratio, as Andrei proved for an integer from the collatz tree.

Just take Eq. 1 and rearrange it: $$\left(2^{N_{\mathsf{V},0}}- 3^{N_{\mathsf{V},1}}\right)n = \sum^{N_{\mathsf{V},1}-1}_{i=0}2^{v_\mathsf{V}(i+1)}3^{N_{\mathsf{V},1}-1-i}$$ The right-hand-side is positiv, hence the left-hand-side has to be, so $$ 2^{N_{\mathsf{V},0}}- 3^{N_{\mathsf{V},1}} > 0$$ Taking the $\log_2$ $$N_{\mathsf{V},0}- \frac{\log 3}{\log 2}N_{\mathsf{V},1} > 0$$
which is $$ (*)\;\;\;\frac{\log 2}{\log 3} > \frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0} } $$

Lemma. Given an integer $n \in \mathbb{N}$, $n > 1$, then the $m$-length binary vector that does not violate the $10$-rules and is of the form $(0^{D}|(10)^{N_{\mathsf{V},1}-1}|1)$ maximizes $F(\mathsf{V},n)$, with $N_{\mathsf{V},0}-N_{\mathsf{V},1} + 1 = D$.

Before we prove this lemma, we comment on the notation of $(0^{D}|(10)^{N_{\mathsf{V},1}-1}|1)$. First of all, the notation leaves out the commata, and presents a binary string. However, there is no problem in converting the binary string into a vector. The symbol $"|"$ stands for the usual string concatenation. $0^m$ is the short-version to symbolise a string that consists of entirely $m$ zeros. And $(10)^m$ is the shortcut for the $m$ times repetition of the binary string $10$. In this case, we get vectors of the form $(0,0,...,0,1,0,1,0,1,0,...,1,0,1)$.

Proof.  Let us first take a look at the $v$ function and how it behaves for such a vector.
If one calls the function $F$ with a vector $\mathsf{V}$ of the form $\mathsf{V}^* = (0^{D}|(10)^{N_{\mathsf{V},1}-1}|1)$, then the result can be written as $$F(\mathsf{V}^*,n) = \frac{3^{N_{\mathsf{V}^*,1}}}{2^{N_{\mathsf{V}^*,0}}}n + \frac{1}{2^{N_{\mathsf{V}^*,0}}}\sum^{N_{\mathsf{V}^*,1}-1}_{i=0}2^{v_\mathsf{V}(i+1)}3^{N_{\mathsf{V}^*,1}-1-i} $$ The $v$-function for the increasing input values for the vector $\mathsf{V}^*$ is equal to
$$ v_{\mathsf{V}^*}(i+1) = D+i $$
We now show, that $F(\mathsf{V},n) < F(\mathsf{V}^*,n)$ for $\mathsf{V}$ being an arbitrary valid permutation of $\mathsf{V}^*$ that is not the identity (so $\mathsf{V}^* \neq \mathsf{V}$). The term $\frac{3^{N_{\mathsf{V}^*,1}}}{2^{N_{\mathsf{V}^*,0}}}n$ does not change, since only contain the total sums, which do not change when for a permutation. The only terms that do concern the distribution of the $0$ and $1$ in the vector are those that are connected to the $v$-function. To maximize these terms, we show that $D$ zeros must be at the beginning. Since all other values of $0$ and $1$ must occur alternating in order not to invalidate the $10$-law. But this is easy to show, because if you move one $0$ from the beginning to somewhere behind the  $j > 1$ and before the $j+1$-th $1$. Then all terms $$2^{v_\mathsf{V}(i+1)},\text{for}\;\;i \leq 1$$ decrease, whereof all $$2^{v_\mathsf{V}(i+1)},\text{for}\;\;i > j$$ stay the same, since the number of preceeding $0$ does not change. Hence you only can decrease the total value when moving $0$ from the left to the right.

Next we apply this maximizing vector  to a circle, and investigate what could happen, in particular, we prove the following Theorem.

Theorem. Given the parity vector $\mathsf{V}$ of an non-trivial cycle from the point of view of the smallest integer $n_c$ in that cycle. Then $N_{\mathsf{V},0}$ and $N_{\mathsf{V},1}$ satisfy the inequality $$\frac{\log 2}{\log 3}\left(1-\frac{1}{N_{\mathsf{V},0}}\right) < \frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0}} < \frac{\log 2}{\log 3}$$

Proof. Suppose the trajectory of an integer $n$ passes into a $m$-length cycle after $k$ steps. Then within the $m$ integers that build the cycle, there must be a smallest integer, say $n_{c}$. W.l.o.g. we can assume that the cycle starts with $n_{c}$ and define everthing before as introduction. The output of $F(\mathsf{V},n_c) = n_c$ can be written as \begin{align*} n_c = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + \frac{1}{2^{N_{\mathsf{V},0}}}\sum^{N_{\mathsf{V},1}-1}_{i=0}2^{D+i}3^{N_{\mathsf{V},1}-1-i} \end{align*} The length of the cycle does not play a role. The maximizing structure is reflected in the term $2^{D+i}$ rather than $2^{v_\mathsf{V}(i+1)}$.

\begin{align*} n_c = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\sum^{N_{\mathsf{V},1}-1}_{i=0}\frac{3^{N_{\mathsf{V},1}-1-i}}{2^{N_{\mathsf{V},0}-i}} = \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}-1}}{2^{N_{\mathsf{V},0}}}\sum^{N_{\mathsf{V},1}-1}_{i=0}\frac{3^{-i}}{2^{-i}}\\ = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}-1}}{2^{N_{\mathsf{V},0}}}\sum^{N_{\mathsf{V},1}-1}_{i=0}\frac{2^{i}}{3^{i}} = \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}-1}}{2^{N_{\mathsf{V},0}}} \left( \frac{1-(2/3)^{N_{\mathsf{V},1}}}{1-2/3} \right) \\ % = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^j\frac{3^{N_{\mathsf{V},1}-1}}{2^{N_{\mathsf{V},0}}} \left( 3-3(\frac{2}{3})^{N_{\mathsf{V},1}} \right) \\ = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} - 2^j\frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} \left(\frac{2}{3}\right)^{N_{\mathsf{V},1}} = \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} - 2^j\frac{2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}\\ = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}}-2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} \end{align*} The term $2^D$ symbolises the leading $D$ zeros. By the defintion of the Collatz function $2^D$ must be less than $n_c$, since otherwise one would have reached $1$ (and thus stopped), when making $D$ steps of division by $2$. This is equivalent to the known fact, that the shortest possible trajectory is made by the descent of perfect powers of two. \begin{align*} n_c = &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + 2^D\frac{3^{N_{\mathsf{V},1}}-2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} \\ < &\; \frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c + n_c\frac{3^{N_{\mathsf{V},1}}-2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}} \\ = &\; \frac{2\cdot 3^{N_{\mathsf{V},1}}-2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c\\ = &\; 2\frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c - \frac{2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c \end{align*} Furthermore \begin{align*} n_c < &\; 2\frac{3^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c - \frac{2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c\\ = &\; \frac{2^{\log_2(3)\cdot N_{\mathsf{V},1}+1}}{2^{N_{\mathsf{V},0}}}n_c - \frac{2^{N_{\mathsf{V},1}}}{2^{N_{\mathsf{V},0}}}n_c\\  = &\; \frac{2^{\log_2(3)\cdot N_{\mathsf{V},1}+1} - 2^{N_{\mathsf{V},1} }}{2^{N_{\mathsf{V},0}}}n_c\\  = &\; \left(2^{\log_2(3)\cdot N_{\mathsf{V},1}+1-N_{\mathsf{V},0}} - 2^{N_{\mathsf{V},1}-N_{\mathsf{V},0}}\right)n_c \end{align*}
Clearly, the term $2^{\log_2(3)\cdot N_{\mathsf{V},1}+1-N_{\mathsf{V},0}} - 2^{N_{\mathsf{V},1}-N_{\mathsf{V},0}}$ is positive. Since the left exponent is always larger than the right. However, when is this term $<1$?

So when is $\log_2(3)\cdot N_{\mathsf{V},1}+1-N_{\mathsf{V},0} < 0$. We have
0 > &\log_2(3)\cdot N_{\mathsf{V},1}+1-N_{\mathsf{V},0} \\
N_{\mathsf{V},0} > &\log_2(3)\cdot N_{\mathsf{V},1}+1 \\
1 > &\log_2(3)\cdot\frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0}}+\frac{1}{N_{\mathsf{V},0}} \\
Since we know that $\frac{1}{N_{\mathsf{V},0}} \leq \frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0}} < \frac{\log 2}{\log 3}$ let us assume that $\frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0}} = \frac{\log 2}{\log 3} - \epsilon$ for some $\epsilon > 0$. Hence
0 > &\log_2(3)\cdot N_{\mathsf{V},1}+1-N_{\mathsf{V},0} \\
N_{\mathsf{V},0} > &\log_2(3)\cdot N_{\mathsf{V},1}+1 \\
1 > & 1-\log_2(3)\epsilon+\frac{1}{N_{\mathsf{V},0}} \\
\log_2(3)\epsilon > & \frac{1}{N_{\mathsf{V},0}} \\
\epsilon > & \frac{\log 2}{\log 3 N_{\mathsf{V},0}} \\ 
Hence we finally get
\frac{\log 2}{\log 3}\left(1-\frac{1}{N_{\mathsf{V},0}}\right) <  \frac{N_{\mathsf{V},1}}{N_{\mathsf{V},0}} < \frac{\log 2}{\log 3}\\ 

So the corridor for this $0$ to $1$ ratio is very small, i.e., for an value $N_{\mathsf{V},0}$ there is only one or two fitting value for $N_{\mathsf{V},1}$.

(I did not carefully proofread all this yet, so if someone finds a mistake please give me a note.)

[1] Lagarias, J. C. The 3x+1 problem: An annotated bibliography II (2000-2009). http://arxiv.org/pdf/math/0608208v5.
[2] Stefan Andrei, Manfred Kudlek, R. S. N. Some results on the Collatz problem. Acta Informatica 37, 2 (2000), 145.
[3] C. Böhm and G. Sontacchi: On the Existence of Cycles of given Length in Integer Sequences like x_(n+1) = x_n/2 if x_n even, and x_(n+1) = 3x_n + 1 otherwise. Atti della Accademia Nazionale dei Lincei. Rendiconti. Classe di Scienze Fisiche, Matematiche e Naturali. Serie VIII 64 (1978), 260-264.

No comments:

Post a Comment