Many Ones Deduction, Part II

Prerequisites: Rings (e.g. Chapter 7 of Dummit and Foote), multiplicative functions and a bit of calculus. Most things discussed here can easily be adapted (with increased verbosity) to fit the sort of introduction to number theory that does not mention groups or rings. If we accept the concavity of the function $\log_2$, then the calculus in the proof of Lemma 3 can be omitted.

In a previous blog post, I wrote about some fun problems and concluded with an unanswered question that stemmed from my setup. With little confidence in number theory, I figured that I would certainly not be able to answer this question myself, so I posted it on StackExchange. Within an hour, I got some wonderful pointers and heuristic observations from Greg Martin. This guided me towards one of those fantastic days, when things you already know coalesce into a solution to a problem that had previously stumped you. As such, I think I am now able to answer the question on which I confusedly ended my last post.  The goal of this blog post is just to provide proofs for a few results, not to outline a list of recreational problems, but I will leave the show/hide format for anyone who would like to attempt the proofs themselves.

Let's begin with a recap, using slightly altered notation. For integers $b,n>0$, we define $$c_n(b)=11\dots 11_b=\sum_{i=0}^{n-1}b^i=\frac{b^n-1}{b-1}$$ This is the number whose $b$-ary expansion consists of $n$ ones (note that the last formula only makes sense if $b\neq 1$). We can also describe the sequence $n\mapsto c_n(b)$ recursively, by defining $c_1(b)=1$ and $c_{n+1}(b)=bc_n(b)+1.$
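
For anyone who likes to experiment alongside the proofs, here is a tiny Python sketch (not part of any argument below; the helper names are my own) confirming that the closed form and the recursion agree for small $b$ and $n$.

```python
def c_closed(n, b):
    """c_n(b) = (b^n - 1)/(b - 1): the base-b repunit with n ones (b != 1)."""
    return (b**n - 1) // (b - 1)

def c_recursive(n, b):
    """c_1(b) = 1 and c_{k+1}(b) = b*c_k(b) + 1."""
    c = 1
    for _ in range(n - 1):
        c = b * c + 1
    return c

assert all(c_closed(n, b) == c_recursive(n, b)
           for b in range(2, 12) for n in range(1, 10))
```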

For any integers $a,b>0$, we will write $P(a,b)$ for the number of residue classes (mod $a$) assumed by the set $\{c_n(b):n>0\}$. 
Each $c_n(b)$ is a polynomial in $b$ with integer coefficients, so its value modulo $a$ depends only on the residue class of $b$ (mod $a$); hence $P(a,b)$ depends only on that residue class as well. We are interested in finding those $b$ for which the set $\{c_n(b):n>0\}$ covers all residues (mod $a$), so we define $$\mathscr R(a)=\{b\in \mathbb Z/a\mathbb Z:P(a,b)=a\}$$ and $R(a)=\#\mathscr R(a)$. In what follows, we will describe $\mathscr R$ in full, which will characterize $R$ and at last settle the question that ended the previous post.
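
Everything in this post can be explored by brute force. The following sketch (the helpers `P` and `script_R` are my own names, not notation from the post) computes $P(a,b)$ and $\mathscr R(a)$ straight from the recursion; it reproduces the mod $4$ example in the next paragraph.

```python
def P(a, b):
    """Number of distinct residues mod a assumed by c_n(b), n > 0."""
    seen = set()
    c = 1 % a                    # c_1(b) = 1
    while c not in seen:
        seen.add(c)
        c = (b * c + 1) % a      # c_{n+1}(b) = b*c_n(b) + 1
    return len(seen)

def script_R(a):
    """The set script-R(a) = {b in Z/aZ : P(a,b) = a}, by brute force."""
    return {b for b in range(a) if P(a, b) == a}

print([P(4, b) for b in (1, 2, 3, 4)])   # [4, 2, 2, 1], matching the example below
print(script_R(4))                        # {1}, so R(4) = 1
```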

Since $c_n(1)=n$ for all $n>0$, we have $P(a,1)=a$ and therefore $1\in \mathscr R(a)$ for any $a>0$. For a slightly more interesting example, let's look at $c_n(b)$ mod $4$ for $b=1,2,3,$ and $4$: $$c_n(1):\ 1,2,3,0,1,2,3,\dots\quad \Longrightarrow \quad P(4,1)=4$$ $$c_n(2):\ 1,3,3,3,3,3,3,\dots\quad \Longrightarrow \quad P(4,2)=2$$ $$c_n(3):\ 1,0,1,0,1,0,1,\dots\quad \Longrightarrow \quad P(4,3)=2$$ $$c_n(4):\ 1,1,1,1,1,1,1,\dots\quad \Longrightarrow \quad P(4,4)=1$$ Hence, we see that $R(4)=1$. Next, we recall/expand upon some facts from the last post (both are checked numerically in a short sketch after the list):
  • The sequence $n\mapsto c_n(b)$ is periodic modulo $a$ if and only if there exists some $n>1$ with $c_n(b)\equiv 1$ (mod $a$). In this case, $P(a,b)$ is the smallest period and given any $d\mid a$, we have $P(d,b)\mid P(a,b).$
    Proof:
    The sequence $n\mapsto c_n(b)$ is defined recursively, with each term depending only on its immediate predecessor. Hence, the sequence will begin to repeat as soon as it repeats a single term (and there are only finitely many possible values modulo $a$), so $n\mapsto c_n(b)$ is eventually periodic. The sequence is periodic if and only if it repeats its first term, meaning that there is some $n>1$ with $c_n(b)\equiv 1$ (mod $a$).
        Now assume that $n\mapsto c_n(b)$ is periodic (mod $a$). Since the sequence begins to repeat as soon as a single term repeats, the shortest period of this sequence equals the number of values assumed (mod $a$), namely $P(a,b)$. If we descend along the map $\mathbb Z/a\mathbb Z\rightarrow \mathbb Z/d\mathbb Z$, the sequence $n\mapsto c_n(b)$ remains periodic (mod $d$) with $P(a,b)$ as a period. This may not be the smallest period, but the smallest period $P(d,b)$ will divide this period $P(a,b)$. $\square$
  • We have $\gcd(a,b)=1$ if and only if there exists some $n>0$ with $c_n(b)\equiv 0$ (mod $a$), in which case $P(a,b)$ is the smallest such $n$. Hence $P(a,b)=a\,\Longrightarrow\,\gcd(a,b)=1$. Additionally, $\gcd(a,b)=1$ implies that the sequence $n\mapsto c_n(b)$ is periodic.
    Proof:
    The case of $a=1$ is straightforward, so we assume that $a>1.$ If there is some $n>0$ with $c_n(b)\equiv 0$ (mod $a$), then we must actually have $n>1$, since $c_1(b)=1$. Then $$bc_{n-1}(b)+1=c_n(b)\equiv 0\text{ (mod }a),$$ so $b$ is invertible (mod $a$) and thus $\gcd(a,b)=1$. Conversely, suppose that $\gcd(a,b)=1$. Then there exists some $k>0$ with $b^k\equiv 1$ (mod $a$). We will prove by induction that $$c_{jk}(b)\equiv jc_k(b)\text{ (mod }a)$$ for all integers $j>0$. This is obvious when $j=1$ and for the inductive step, we calculate $$c_{(j+1)k}(b)=b^kc_{jk}(b)+c_k(b)\equiv c_{jk}(b)+c_k(b)\equiv jc_k(b)+c_k(b)=(j+1)c_k(b)\text{ (mod }a).$$ Now $c_{ak}(b)\equiv ac_k(b)\equiv 0$ (mod $a$), proving the first assertion.
        If $P(a,b)=a$, then the set $\{c_n(b):n>0\}$ covers all residues (mod $a$), so there exists some $n>0$ with $c_n(b)\equiv 0$ (mod $a$). By the first assertion, this implies that $\gcd(a,b)=1$.
        Lastly, if $\gcd(a,b)=1$, then there is some $n>0$ with $c_n(b)\equiv 0$ (mod $a$), so $n+1>1$ and $c_{n+1}(b)\equiv 1$ (mod $a$). This implies that $n\mapsto c_n(b)$ is periodic. Since $\gcd(a,b)=1$, there is some $d\in \mathbb Z$ with $db\equiv 1$ (mod $a$). Thus if $k>0$ and $c_{k+1}(b)\equiv 1$ (mod $a$), then $$c_{k}(b)\equiv dbc_{k}(b)=d\big(c_{k+1}(b)-1\big)\equiv 0\text{ (mod }a).$$ Hence, we can calculate the period of $n\mapsto c_n(b)$ mod $a$ as follows: $$\min\big\{n>0:c_n(b)\equiv 0\text{ (mod }a)\big\}=\min\big\{n>0:c_{n+1}(b)\equiv 1\text{ (mod }a)\big\}=P(a,b).\ \square$$
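
Here is the promised numerical check of both facts above for small moduli; the helper `residues` is mine and simply walks the recursion until a value repeats.

```python
from math import gcd

def residues(a, b):
    """Residues mod a assumed by c_n(b), in order of first appearance, together
    with the value at which the sequence first repeats."""
    seen, order = set(), []
    c = 1 % a
    while c not in seen:
        seen.add(c)
        order.append(c)
        c = (b * c + 1) % a
    return order, c

for a in range(2, 40):
    for b in range(1, 40):
        order, first_repeat = residues(a, b)
        periodic = (first_repeat == 1)   # periodic iff some later term returns to c_1 = 1
        if periodic:
            # first fact: P(d,b) divides P(a,b) for every divisor d of a
            assert all(len(order) % len(residues(d, b)[0]) == 0
                       for d in range(1, a) if a % d == 0)
        if gcd(a, b) == 1:
            # second fact: the sequence is periodic, it assumes 0, and the first
            # index where it does so equals P(a,b) = len(order)
            assert periodic and 0 in order and order.index(0) + 1 == len(order)
        else:
            assert 0 not in order
```
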
We now give a lemma that stems from a comment by Greg Martin.

Lemma 1: If $\gcd(a,b)=\gcd(a,b-1)=1$, then $P(a,b)\mid \varphi(a)$. If $a>1$ and $b\in \mathscr R(a)$, then it follows that $\gcd(a,b-1)>1$. For any prime $p$, we can use this to show that $$P(p,b)=p\ \Longleftrightarrow\ b\equiv 1\ \text{(mod }p).$$

Proof:
Suppose that $\gcd(a,b)=\gcd(a,b-1)=1$. Then $b-1$ has an inverse $k$ (mod $a$), so $$c_n(b)=\frac{b^n-1}{b-1}\equiv k(b^n-1)\,(\text{mod }a).$$ Since $k\in (\mathbb Z/a\mathbb Z)^\times\!,$ the following four sets assume the same number of residues (mod $a$): $$\{c_n(b):n>0\},\qquad \{k(b^n-1):n>0\},\qquad\{b^n-1:n>0\},\quad\text{and}\quad \{b^n:n>0\}.$$ The number of residues assumed by the first set is $P(a,b)$, while the number of residues assumed by the last set is the order of $b\in (\mathbb Z/a\mathbb Z)^\times\!.$ The order of $b$ divides the order $\varphi(a)$ of the group, so we have $P(a,b)\mid \varphi(a)$, proving the first assertion.
    Next, we suppose that $a>1$ and $b\in \mathscr R(a)$. Then $P(a,b)=a$, which implies $\gcd(a,b)=1$. Since $a>1$, we have $\varphi(a)<a$. If we had $\gcd(a,b-1)=1$, this would give $a=P(a,b)\mid \varphi(a)$ by the first assertion, which would contradict $\varphi(a)<a$. This proves the second assertion.
    Finally, we already know that $b\equiv 1$ (mod $p$) $\Longrightarrow$ $P(p,b)=p$. Conversely, if $P(p,b)=p$, then $b\in \mathscr R(p)$ and therefore $\gcd(p,b-1)>1$ by the second assertion. But $p$ is prime, so this implies that $p\mid b-1$, or equivalently $b\equiv 1$ (mod $p$). $\square$
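
Lemma 1 is also easy to test numerically, reusing the function `P` from the first sketch (the naive `phi` below is my own stand-in for Euler's totient).

```python
from math import gcd

def phi(a):
    """Euler's totient, by a naive count."""
    return sum(1 for x in range(1, a + 1) if gcd(a, x) == 1)

# First assertion: P(a,b) divides phi(a) whenever gcd(a,b) = gcd(a,b-1) = 1.
for a in range(2, 60):
    for b in range(1, 60):
        if gcd(a, b) == 1 and gcd(a, b - 1) == 1:
            assert phi(a) % P(a, b) == 0

# Final assertion: for a prime p, P(p,b) = p exactly when b is congruent to 1 mod p.
for p in (2, 3, 5, 7, 11, 13):
    for b in range(1, 2 * p):
        assert (P(p, b) == p) == (b % p == 1)
```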


We make one more definition. For any prime $p$, let $\alpha_p(a)$ denote the exponent of $p$ in the prime factorization of $a$. This function $\alpha_p$ is totally additive, i.e. $\alpha_p(ab)=\alpha_p(a)+\alpha_p(b)$ for any $a,b>0$. Given any $a\mid b$, it follows that $\alpha_p(b/a)=\alpha_p(b)-\alpha_p(a)$. Notice also that $\alpha_p(a)\leq \log_p(a)$, with equality if and only if $a$ is a power of $p$.

Lemma 2: If $p$ is prime, $1\leq l<p$ and $0<i\leq lp^j$, then we have $\alpha_p\displaystyle\binom{lp^{\,j}}{i}\textstyle=j-\alpha_p(i)$.

Proof:
First, we use total additivity in the following equation: $$\alpha_p\binom{lp^{\,j}}{i}=\alpha_p\left(\frac{lp^{\,j}(lp^{\,j}-1)\dots(lp^{\,j}-i+1)}{1\cdot 2\cdot \dots \cdot i}\right)=j-\alpha_p(i)+\sum_{k=1}^{i-1}\Big(\alpha_p(lp^{\,j}-k)-\alpha_p(k)\Big).$$ But $0<k<lp^{\,j}<p^{\,j+1}\Longrightarrow\alpha_p(lp^{\,j}-k)=\alpha_p(k)$, so all summands after $j-\alpha_p(i)$ vanish. $\square$
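
Lemma 2 can be checked directly for small parameters; the helper `alpha` below is my own implementation of $\alpha_p$.

```python
from math import comb

def alpha(p, a):
    """alpha_p(a): the exponent of p in the prime factorization of a (a > 0)."""
    e = 0
    while a % p == 0:
        a //= p
        e += 1
    return e

# Lemma 2: alpha_p(binom(l*p^j, i)) = j - alpha_p(i) for 1 <= l < p, 0 < i <= l*p^j.
for p in (2, 3, 5):
    for l in range(1, p):
        for j in range(4):
            n = l * p**j
            for i in range(1, n + 1):
                assert alpha(p, comb(n, i)) == j - alpha(p, i)
```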

Lemma 3: Suppose that $p,i\geq 2$ and $p+i\neq 4$ (where $p$ is prime). Then $i-\alpha_p(i)\geq 2$.

Proof:
For $b\geq 2$ and $t>1$, recall that $\log_bt$ is strictly decreasing in $b$. Thus we have $$t-\log_bt\geq t-\log_2t\qquad\text{and}\qquad \tfrac{d}{dt}(t-\log_2t)=1-\frac{1}{t\ln 2}>1-\frac{2}{t},$$ because $\ln 2>1/2$. Thus $t-\log_2t$ is strictly increasing on $(2,\infty),$ so $t\geq 4$ gives $$t-\log_bt\geq t-\log_2t\geq 4-\log_24=2.$$ Thus if $i\geq 4$, then $i-\alpha_p(i)\geq i-\log_p(i)\geq 2$. For $i=3$, we have $\alpha_p(3)\leq 1$ and thus $$3-\alpha_p(3)\geq 3-1=2.$$ For $i=2$, we must have $p>2$. But $p\neq 2$ implies that $\alpha_p(2)=0$ and thus $2-\alpha_p(2)=2$. $\square$

Lemma 4: Consider a prime $p$ and define the number $$e=\begin{cases}2, & p=2\\ 1, & p\text{ is odd.}\end{cases}$$ For any integer $k\geq e$, we then have $P(p^k,b)=p^k$ if and only if $b\equiv 1\,(\text{mod }ep)$.
(Note: while the cases are similar enough to mostly combine their proofs, the interested reader is still encouraged to tackle the slightly trickier case of $p=2$ separately, at least on a first try.)

Proof: 
Suppose that $P(p^k,b)=p^k$. Then Lemma 1 gives $\gcd(p^k,b-1)>1$, so $b\equiv 1\,(\text{mod }p)$. But when $p=2$, we can say more: For $k\geq 2$, we will prove that $b\equiv 1$ (mod $4$) by induction. We already addressed the base case of $k=2$ as an example above, so we assume that $k>2$. Then if $P(2^k,b)=2^k\!,$ the set $\{c_n(b):n>0\}$ assumes all residues (mod $2^k$), so it also assumes all residues (mod $2^{k-1}$). This gives $P(2^{k-1},b)=2^{k-1}$ and thus $b\equiv 1\,(\text{mod }4)$ by induction. Hence, we have proven that $b\equiv 1$ (mod $ep$) is necessary to have $P(p^k,b)=p^k$.

To show that it is also sufficient, suppose that $b=jep+1$ for some $j\in \mathbb Z$. Because we only care about the residue of $b$ (mod $p^k$), we may assume that $j>0$ and thus $b>1$. The base case of $k=e$ was discussed above (the mod $4$ example for $p=2$, and Lemma 1 for odd $p$), so we assume that $k>e$. By induction, $P(p^{k-1},b)=p^{k-1}$; moreover $\gcd(p^k,b)=1$, so the sequence $n\mapsto c_n(b)$ is periodic (mod $p^k$) and the first fact above gives $p^{k-1}=P(p^{k-1},b)\mid P(p^k,b)$. Now consider $n=lp^{k-1}$ with $1\leq l<p$. Then the binomial theorem gives $$c_n(b)=\frac{b^n-1}{b-1}=\frac{(jep+1)^n-1}{jep}=\frac{1}{jep}\sum_{i=1}^n\binom{n}{i}(jep)^i=\sum_{i=1}^n\binom{lp^{k-1}}{i}(jep)^{i-1}\qquad (*)$$ By the total additivity of $\alpha_p$, as well as Lemma 2, we have the following for all $i\geq 2$: $$\alpha_p\!\left(\binom{lp^{k-1}}{i}(jep)^{i-1}\right)=k-1-\alpha_p(i)+i-1+\alpha_p(e^{i-1})+\alpha_p(j^{\,i-1}).$$ But notice that $\alpha_p(j^{\,i-1})\geq 0$ and $\alpha_p(e^{i-1})=(i-1)(e-1)$ for any $i\geq 2$. Moreover, $i-\alpha_p(i)\geq 3-e$ for any $i\geq 2$: when $e=1$ the prime $p$ is odd, so $p+i\neq 4$ and Lemma 3 applies, while when $e=2$ the only case not covered by Lemma 3 is $i=2$, where $2-\alpha_2(2)=1=3-e$. Applying these various inequalities, we can now calculate $$\begin{aligned}\alpha_p\!\left(\binom{lp^{k-1}}{i}(jep)^{i-1}\right)&=k-1-\alpha_p(i)+i-1+\alpha_p(e^{i-1})+\alpha_p(j^{\,i-1})\\&\geq k+i-\alpha_p(i)-2+(i-1)(e-1)\\&\geq k+1-e+(i-1)(e-1)\\&=k+(i-2)(e-1)\geq k.\end{aligned}$$ Hence, all summands in $(*)$ with $i\geq 2$ vanish modulo $p^k$. Therefore, we can see that $$c_n(b)\equiv \binom{lp^{k-1}}{1}(jep)^0=lp^{k-1}\not\equiv 0\,(\text{mod }p^k).$$ Thus $P(p^k,b)\neq n$ when $n\in \{lp^{k-1}:1\leq l<p\}$. But since $p^{k-1}\mid P(p^k,b)$ and $P(p^k,b)\leq p^k\!,$ the only possible value that remains is $P(p^k,b)=p^k\!$, as desired. $\square$
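
As a sanity check, Lemma 4 can be confirmed for small prime powers, again reusing `P` from the first sketch.

```python
# Lemma 4 for small prime powers: P(p^k, b) = p^k iff b = 1 (mod e*p).
for p in (2, 3, 5):
    e = 2 if p == 2 else 1
    for k in range(e, 5 if p == 2 else 4):
        a = p**k
        for b in range(1, a + 1):
            assert (P(a, b) == a) == (b % (e * p) == 1)
```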


Lemma 5: Suppose that $\gcd(a,b)=1$. Then the isomorphism $\mathbb Z/ab\mathbb Z\cong \mathbb Z/a\mathbb Z\times \mathbb Z/b\mathbb Z$ from the Chinese Remainder Theorem descends to a bijection $\mathscr R(ab)\longleftrightarrow \mathscr R(a)\times \mathscr R(b).$ 

Proof:
Each $c_n$ is a polynomial with integer coefficients, so it can be evaluated in any ring, and every ring homomorphism commutes with it (the polynomial ring $\mathbb Z[t]$ is a free object with one generator, and polynomials are evaluated component-wise in a product ring). In particular, let $\psi:\mathbb Z/ab\mathbb Z\rightarrow \mathbb Z/a\mathbb Z\times \mathbb Z/b\mathbb Z$ be the isomorphism from the Chinese Remainder Theorem and write $\psi(x)=(x_a,x_b)$. Then $\psi\big(c_n(x)\big)=\big(c_n(x_a),c_n(x_b)\big)$ for every $n$, so the set $\{c_n(x):n>0\}$ covers all of $\mathbb Z/ab\mathbb Z$ if and only if the set $\{(c_n(x_a),c_n(x_b)):n>0\}$ covers all of $\mathbb Z/a\mathbb Z\times \mathbb Z/b\mathbb Z$.
    If the latter holds, then projecting onto each factor shows that $\{c_n(x_a):n>0\}$ covers $\mathbb Z/a\mathbb Z$ and $\{c_n(x_b):n>0\}$ covers $\mathbb Z/b\mathbb Z$, i.e. $x_a\in \mathscr R(a)$ and $x_b\in \mathscr R(b)$. Conversely, suppose that $x_a\in \mathscr R(a)$ and $x_b\in \mathscr R(b)$. By the facts above, $\gcd(a,x_a)=\gcd(b,x_b)=1$, so the sequence $n\mapsto c_n(x_a)$ is periodic (mod $a$) with smallest period $P(a,x_a)=a$, and within one period it assumes each residue (mod $a$) exactly once. In other words, $c_n(x_a)$ (mod $a$) depends only on $n$ (mod $a$), and the induced map $\mathbb Z/a\mathbb Z\rightarrow \mathbb Z/a\mathbb Z$ is a bijection; the same holds with $b$ in place of $a$. Since $\gcd(a,b)=1$, as $n$ runs over $1,\dots,ab$ the pair $(n\text{ mod }a,\ n\text{ mod }b)$ assumes every value in $\mathbb Z/a\mathbb Z\times \mathbb Z/b\mathbb Z$, and therefore so does $\big(c_n(x_a),c_n(x_b)\big)$. Hence $x\in \mathscr R(ab)$ if and only if $(x_a,x_b)\in \mathscr R(a)\times \mathscr R(b)$, i.e. $\psi$ restricts to a bijection $\mathscr R(ab)\longleftrightarrow \mathscr R(a)\times \mathscr R(b)$. $\square$
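
Numerically, Lemma 5 says that membership in $\mathscr R(ab)$ can be tested one coprime factor at a time. The sketch below (reusing `script_R` from the earlier sketch) verifies this for small coprime moduli.

```python
from math import gcd

# x lies in script-R(a*b) exactly when x mod a lies in script-R(a) and
# x mod b lies in script-R(b), for coprime a and b.
for a in range(2, 13):
    for b in range(2, 13):
        if gcd(a, b) != 1:
            continue
        Ra, Rb = script_R(a), script_R(b)
        assert script_R(a * b) == {x for x in range(a * b)
                                   if x % a in Ra and x % b in Rb}
```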

With these results in hand, we now have a full description of $\mathscr R$. Lemma 4 very simply describes $\mathscr R(p^k)$ for $p$ prime. To write this explicitly, let $p$ be an odd prime. Then we have $$\mathscr R(2^k)=1+4\mathbb Z/2^k\mathbb Z\qquad\text{and}\qquad \mathscr R(p^k)=1+p\mathbb Z/p^k\mathbb Z.$$ The calculation of any $\mathscr R(a)$ now follows from Lemma 5 and the prime factorization of $a$. Recall that there is a simple formula for the inverse map $\mathbb Z/a\mathbb Z\times \mathbb Z/b\mathbb Z\longrightarrow \mathbb Z/ab\mathbb Z$ in the Chinese Remainder Theorem. First, we can write $ak+bl=1$ via the Euclidean algorithm. For any $i,j\in \mathbb Z$, the unique residue class $x$ (mod $ab$) with $x\equiv i\,(\text{mod }a)$ and $x\equiv j\,(\text{mod }b)$ is then given by $x=akj+bli$. This gives a constructive procedure to calculate any $\mathscr R(a)$.
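
Here is one way to put this into code. Rather than assembling elements with the explicit formula $x=akj+bli$, the sketch below (the helpers `factorize` and `R_from_lemmas` are my own) simply tests the congruences supplied by Lemmas 4 and 5, which by the Chinese Remainder Theorem carves out the same set; it is cross-checked against the brute-force `script_R` from earlier.

```python
def factorize(a):
    """Prime factorization of a as a dict {p: k}, by trial division (a >= 1)."""
    factors, p = {}, 2
    while p * p <= a:
        while a % p == 0:
            factors[p] = factors.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:
        factors[a] = factors.get(a, 0) + 1
    return factors

def R_from_lemmas(a):
    """script-R(a) via Lemmas 4 and 5: x belongs to it exactly when x = 1 (mod p)
    for every odd prime p dividing a, x is odd if 2 divides a, and x = 1 (mod 4)
    if 4 divides a."""
    moduli = [(2 if p == 2 and k >= 2 else 1) * p for p, k in factorize(a).items()]
    return {x for x in range(a) if all(x % m == 1 for m in moduli)}

# Cross-check against brute force.
for a in range(2, 150):
    assert R_from_lemmas(a) == script_R(a)
```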

Now, let us return to the question of counting. Lemma 5 immediately implies that the function $R$ is multiplicative. Thus $R$ is determined by its values on prime powers. If $p$ is an odd prime and $k>0$, then Lemma 4 shows that $R(p^k)=p^{k-1}$. But for any $k>1$, Lemma 4 also shows that $R(2^k)=2^{k-2}$ (and we clearly have $R(2)=1$). This completely describes $R$, which has the exact values that Greg Martin predicted! In particular, we see that $R(a)=1$ if and only if $\alpha_2(a)\leq 2$ and $\alpha_p(a)\leq 1$ for $p$ odd. Writing $S\subset \mathbb N$ for the set of odd square-free numbers, we have $R^{-1}(1)=S\sqcup 2S\sqcup 4S$.
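
The resulting formula for $R$ is easy to check against the brute-force count, reusing `factorize` and `script_R` from the earlier sketches (the helper `R_formula` is my own).

```python
def R_formula(a):
    """R(a) computed multiplicatively: R(p^k) = p^(k-1) for odd p,
    R(2) = 1, and R(2^k) = 2^(k-2) for k >= 2."""
    out = 1
    for p, k in factorize(a).items():
        out *= p ** (k - 2) if p == 2 and k >= 2 else p ** (k - 1)
    return out

for a in range(1, 150):
    assert R_formula(a) == len(script_R(a))
```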

Lastly, we return to the calculation of the density of $R^{-1}(1)$, again following Greg Martin.  First, what do we mean by (asymptotic) density? For any $A\subset \mathbb N$, we consider the limit $$\delta(A)=\lim_{n\rightarrow \infty}\frac{\#\big(A\cap \{1,\dots,n\}\big)}{n}$$ If this limit exists, it gives a notion of "how likely'' it is that a natural number belongs to $A$. Given disjoint sets $A,B\subset \mathbb N$ with well-defined densities, we may notice that $$\delta(A\sqcup B)=\delta(A)+\delta(B)\qquad\text{and}\qquad \delta(nA)=\tfrac{1}{n}\delta(A).$$ Note that the set of square-free natural numbers is just $S\sqcup 2S$. Apparently, it is a well-known fact (though not to me) that the square-free integers have density $6/\pi^2$. I will not prove this (there is a proof-sketch on Wikipedia), but I also don't want this mysterious number to appear out of nowhere. If the reader has heard of the Basel problem before, they should recall that $$\sum_{n=1}^\infty \frac{1}{n^2}=\frac{\pi^2}{6}$$ This provides some link between inverses of squares and $\pi^2/6$, so it should not be shocking that squares are somehow related to the inverse value $(\pi^2/6)^{-1}=6/\pi^2$. Now, notice that $$\tfrac{3}{2}\delta(S)=\delta(S)+\tfrac{1}{2}\delta(S)=\delta(S)+\delta(2S)=\delta(S\sqcup 2S)=\frac{6}{\pi^2}$$ Thus $\delta(S)=4/\pi^2$ and we can finally calculate the desired result: $$\delta\big(R^{-1}(1)\big)=\delta(S\sqcup 2S\sqcup 4S)=\delta(S\sqcup 2S)+\tfrac{1}{4}\delta(S)=\frac{6}{\pi^2}+\frac{1}{\pi^2}=\frac{7}{\pi^2}$$ Since $\pi^2$ is only slightly less than $10$, we end up with $\delta\big(R^{-1}(1)\big)\approx 0.709,$ which is quite close to my previous estimate of $71\%$. This resolves my confusion, providing a lovely answer and another fun list of interesting, exploratory problems. Thanks, Greg!
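
Finally, a rough empirical check of the density, reusing `R_formula` from the sketch above: counting up to $N=100{,}000$, the observed proportion should land near $7/\pi^2\approx 0.709$.

```python
from math import pi

N = 100_000
count = sum(1 for a in range(1, N + 1) if R_formula(a) == 1)
print(count / N, 7 / pi**2)   # both come out around 0.709
```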
