Many Ones Deduction, Part I

Prerequisites: Modular arithmetic and geometric series.

In 2018, I participated in the Putnam Competition for the first and only time. In general, I think that math competitions tend to foster toxic and exclusive environments, of which I would like no part. But I still felt the impulse to participate in this toxicity and affirm myself through competition, so I signed up anyways, trying to tell myself that it was just for fun… Of course, that didn't really work, so I went scrounging for practice problems to "prepare" myself. I found that Alexander Givental had publicly posted homeworks for his Putnam Workshop, and a thread of related problems caught my eye. While the competition was ultimately as unrewarding as I should have expected, I did have fun thinking about these problems and I'd like to share those thoughts here.

For any positive integer $n$, we write $c_n$ for the natural number which consists of $n$ ones in its decimal expansion (e.g. $c_7=1111111$). We give three problems about the numbers $c_n$ below, each of which generalizes to any base $b$. The given solutions stick to base 10, but we will also state the more general hypotheses in terms of the base $b$. The interested reader is encouraged to tackle the general case. Please comment with any questions or observations regarding these problems and the subsequent discussion.


Problem 1: For integers $m,n>0$, show that $\gcd(c_m,c_n)=c_{\gcd(m,n)}$.


Solution: We will prove the desired formula by induction on $m+n$. The result is obvious when $m=n$, $m=1$ or $n=1$, so we assume without loss of generality that $m>n>1$. The greatest common divisor is a multiplicative function, meaning that $\gcd(ab,c)=\gcd(a,c)\cdot \gcd(b,c)$ when $\gcd(a,b)=1$. We also have $\gcd(x,y)=\gcd(x-y,y)$ for any positive integers $x$ and $y$. If $k=m-n>0$, then we have the equality $$c_m-c_n=\sum_{i=0}^{m-1}10^i-\sum_{j=0}^{n-1}10^j=\sum_{i=n}^{m-1}10^i=10^n\sum_{i=0}^{k-1}10^i=10^nc_k.$$ Note that $\gcd(10^i,c_j)=1$ for any positive integers $i$ and $j$. By the multiplicativity of $\gcd$, we have $$\gcd(c_m,c_n)=\gcd(c_m-c_n,c_n)=\gcd(10^n c_k,c_n)=\gcd(10^n,c_n)\cdot \gcd(c_k,c_n)=\gcd(c_k,c_n).$$ But $\gcd(k,n)=\gcd(m-n,n)=\gcd(m,n)$ and $k+n=m<m+n$. It follows by induction that $$\gcd(c_m,c_n)=\gcd(c_k,c_n)=c_{\gcd(k,n)}=c_{\gcd(m,n)}.$$


Problem 2: Let $a$ be a positive integer such that $\gcd(10,a)=1$ (or $\gcd (b,a)=1$ in the general case). Show that there exists a positive integer $n$ with $a\mid c_n$.

Solution: Since $\gcd(10,a)=1$, there exists a positive integer $k$ with $10^k\equiv 1(\text{mod }a)$. We will prove by induction that $c_{jk}\equiv jc_k(\text{mod }a)$ for all positive integers $j$. This is trivial for $j=1$. For greater $j$, we have $$c_{(j+1)k}=10^kc_{jk}+c_k\equiv c_{jk}+c_k\equiv jc_k+c_k=(j+1)c_k\, (\text{mod }a),$$ by induction. This proves that $c_{jk}\equiv jc_k(\text{mod }a)$ for all $j$. Setting $n=ak$, we have $c_n\equiv ac_k\equiv 0\,(\text{mod }a)$.


Problem 3: If $p$ is prime, show that $p\mid c_p$ if and only if $p=3$ (or $p\mid b-1$ in the general case).

Solution:  Since $111=3\cdot 37$, we have $3\mid c_3$. Thus it suffices to assume that $p\neq 3$ and prove that $p\nmid c_p$. Since $p\neq 3$ and thus $p\nmid 9$, there exists an integer $k$ with $9k\equiv 1(\text{mod }p)$. Hence we have the following:
$$c_p=\sum_{i=0}^{p-1}10^i=\frac{10^p-1}{10-1}\equiv k(10^p-1)\,(\text{mod }p).$$ Since $k$ has a multiplicative inverse modulo $p$, we have $p\mid c_p\Longleftrightarrow p\mid 10^p-1$. Now assume for the sake of contradiction that $p\mid 10^p-1$. Fermat's little theorem implies that $10\equiv 10^p\equiv 1(\text{mod }p)$, which contradicts the assumption that $p\nmid 9=10-1$. Therefore, we have $p\nmid 10^p-1$ and thus $p\nmid c_p$, as originally desired.


Question: Problems 2 and 3 can be seen as facets of the following, more open-ended question, for which it is necessary to consider an arbitrary base $b$. We use the same symbol $c_n$, but it is now dependent on $b$. 

Suppose that $a$ is a positive integer with $\gcd(b,a)=1$. Problem 2 shows that the sequence $(c_n\textbf{ mod } a)$ contains $0$. But $(c_n)$ is given by the recursive equation $c_{n+1}=bc_n+1$, and it follows that the sequence is periodic when considered modulo $a$. What is the period $P(a,b)$? This function satisfies the following:
  • $P(a,b)$ is the smallest positive integer $n$ such that $a\mid c_n$.
  • For any positive integer $n$, we have $a\mid c_n\Longleftrightarrow P(a,b)\mid n$. 
  • $P(a,b)$ is the number of residues mod $a$ occuring in the sequence $(c_n)$.
  • $P(a,b)$ depends only on $a$ and $b\textbf{ mod }a$, not on the specific value of $b$.
  • $1\leq P(a,b)\leq a$ for all $a$ and $b$. 
  • If $b\equiv 1(\text{mod }a)$, then $P(a,b)=a$.
For which values of $a$ and $b$ do we have $P(a,b)=a$? This is precisely the case in which $\{c_1,\dots,c_a\}$ forms a complete residue system modulo $a$. Define $R(a)$ to be the number of residues $b\textbf{ mod }a$ such that $\gcd(a,b)=1$ and $P(a,b)=a$. Equivalently, we can define $R(a)$ as the number of residues $b\textbf{ mod }a$ for which $c_1,\dots,c_a$ is a complete residue system modulo $a$ (we only defined $P(a,b)$ for $\gcd(a,b)=1$, but this condition is always satisfied when $c_1,\dots,c_a$ is a complete residue system mod $a$). Then we have $1\leq R(a)\leq \varphi(a)$, where $\varphi$ is the totient function. Starting with $a=1$, the first forty values are $$1,1,1,1,1,1,1,2,3,1,1,1,1,1,1,4,1,3,1,1,1,1,1,2,5,1,9,1,1,1,1,8,1,1,1,3,1,1,1,2.\quad$$One may show that the solution to Problem 3 implies that $R(p)=1$ for all primes $p$. In particular, there are infinitely many inputs $a$ with $R(a)=1$. We don't know if there are infinitely many $a$ with $R(a)>1$, but an automated calculation of the first 5000 values in MATLAB seems to indicate that  $R(a)=1$ about $71\%$ of the time. This heuristic approach is illustrated by the following graph. For an input $x$, the function graphed represents the fraction of the set $\{1,2,\dots,x\}$ for which $R(a)=1$ (we omit small values of $x$).

Does this pattern continue? If so, why? What is special about the value it seems to be converging to???

Update: There is now a Part II blog post, which completely describes $R$ and shows why this occurs.

Comments