Lagrange's four-square theorem

Lagrange's four-square theorem, also known as Bachet's conjecture, states that every nonnegative integer can be represented as a sum of four non-negative integer squares.^[1] That is, the squares form an additive basis of order four: every nonnegative integer $p$ can be written as $p=a^{2}+b^{2}+c^{2}+d^{2}$ where the four numbers $a,b,c,d$ are integers. For illustration, 3, 31, and 310 can be represented as the sum of four squares as follows: ${\begin{aligned}3&=1^{2}+1^{2}+1^{2}+0^{2}\\[3pt]31&=5^{2}+2^{2}+1^{2}+1^{2}\\[3pt]310&=17^{2}+4^{2}+2^{2}+1^{2}\\[3pt]&=16^{2}+7^{2}+2^{2}+1^{2}\\[3pt]&=15^{2}+9^{2}+2^{2}+0^{2}\\[3pt]&=12^{2}+11^{2}+6^{2}+3^{2}.\end{aligned}}$
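The statement can be explored numerically with a brute-force search. The following is a minimal sketch, not an efficient algorithm; the function name four_squares is chosen here for illustration and does not come from any library.

```python
from math import isqrt

def four_squares(n):
    """List all (a, b, c, d) with a >= b >= c >= d >= 0 and a^2 + b^2 + c^2 + d^2 == n."""
    reps = []
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            rest_b = n - a * a - b * b
            for c in range(min(b, isqrt(rest_b)), -1, -1):
                rest = rest_b - c * c
                d = isqrt(rest)
                # Accept only exact squares, and keep the tuple non-increasing.
                if d * d == rest and d <= c:
                    reps.append((a, b, c, d))
    return reps

print(four_squares(3))
print(four_squares(310))
```

Each representation is reported once, with its entries in non-increasing order, so the four decompositions of 310 shown above appear among the results.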

This theorem was proven by Joseph Louis Lagrange in 1770. It is a special case of the Fermat polygonal number theorem.

Historical development

From examples given in the Arithmetica, it is clear that Diophantus was aware of the theorem. This book was translated in 1621 into Latin by Bachet (Claude Gaspard Bachet de Méziriac), who stated the theorem in the notes of his translation. But the theorem was not proved until 1770 by Lagrange.^[2]

Adrien-Marie Legendre extended the theorem in 1797–8 with his three-square theorem, by proving that a positive integer can be expressed as the sum of three squares if and only if it is not of the form $4^{k}(8m+7)$ for integers $k$ and $m$ . Later, in 1834, Carl Gustav Jakob Jacobi discovered a simple formula for the number of representations of an integer as the sum of four squares with his own four-square theorem.
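Legendre's condition is easy to test by machine. The sketch below compares the excluded form $4^{k}(8m+7)$ with a brute-force search for three squares; both function names are illustrative, and the comparison is only a finite check, not a proof.

```python
from math import isqrt

def is_excluded_form(n):
    """True if the positive integer n equals 4**k * (8*m + 7) for some k, m >= 0."""
    while n > 0 and n % 4 == 0:
        n //= 4
    return n % 8 == 7

def is_sum_of_three_squares(n):
    """Brute-force search for a^2 + b^2 + c^2 == n with 0 <= a <= b."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            rest = n - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return True
    return False

# The two predicates should be exact opposites on every positive integer tested.
print(all(is_sum_of_three_squares(n) != is_excluded_form(n) for n in range(1, 500)))
```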

The formula is also linked to Descartes' theorem of four "kissing circles", which involves the sum of the squares of the curvatures of four circles. This is also linked to Apollonian gaskets, which were more recently related to the Ramanujan–Petersson conjecture.^[3]

Proofs

The classical proof

Several very similar modern versions^[4]^[5]^[6] of Lagrange's proof exist. The proof below is a slightly simplified version, in which the cases for which m is even or odd do not require separate arguments.

It is sufficient to prove the theorem for every odd prime number p. This immediately follows from Euler's four-square identity (and from the fact that the theorem is true for the numbers 1 and 2).
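Euler's four-square identity says that a product of two sums of four squares is again a sum of four squares. The sketch below uses the sign convention of the z_i that appear later in this proof; the function name euler_product is an illustrative choice.

```python
def euler_product(x, y):
    """Given 4-tuples x and y, return z with sum(z_i^2) == sum(x_i^2) * sum(y_i^2)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x2 * y4 - x3 * y1 + x4 * y2,
        x1 * y4 + x2 * y3 - x3 * y2 - x4 * y1,
    )

def square_sum(v):
    return sum(c * c for c in v)

# 3 = 1^2 + 1^2 + 1^2 + 0^2 and 31 = 5^2 + 2^2 + 1^2 + 1^2 combine into 93.
z = euler_product((1, 1, 1, 0), (5, 2, 1, 1))
print(z, square_sum(z) == 3 * 31)
```

Applying the function repeatedly to representations of the prime factors of a number yields a representation of the number itself, which is the reduction used here.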

The residues of a² modulo p are distinct for every a between 0 and $(p-1)/2$ (inclusive). To see this, take some a and define c as a² mod p. Then a is a root of the polynomial $x^{2}-c$ over the field $\mathbb {Z} /p\mathbb {Z}$, and so is $p-a$ (which is different from a unless a = 0). In a field K, any polynomial of degree n has at most n distinct roots (Lagrange's theorem (number theory)), so there are no other a with this property, in particular not among 0 to $(p-1)/2$.

Similarly, for b taking integral values between 0 and $(p-1)/2$ (inclusive), the values $-b^{2}-1$ are distinct modulo p. By the pigeonhole principle, there are a and b in this range for which a² and $-b^{2}-1$ are congruent modulo p, that is, for which $a^{2}+b^{2}+1^{2}+0^{2}=np$ for some positive integer n.
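The pigeonhole argument only guarantees that a and b exist, but for a concrete prime they can be found by a direct search. This is a minimal sketch assuming p is an odd prime; the name find_a_b is illustrative.

```python
def find_a_b(p):
    """For an odd prime p, return (a, b, n) with 0 <= a, b <= (p-1)/2 and a^2 + b^2 + 1 == n * p."""
    half = (p - 1) // 2
    # Map each residue a^2 mod p to the a that produces it (the residues are distinct).
    squares = {a * a % p: a for a in range(half + 1)}
    for b in range(half + 1):
        target = (-b * b - 1) % p
        if target in squares:
            a = squares[target]
            return a, b, (a * a + b * b + 1) // p
    raise ValueError("no solution found; p is assumed to be an odd prime")

a, b, n = find_a_b(31)
print(a, b, n, a * a + b * b + 1 == n * 31)
```

Because a and b are at most $(p-1)/2$, the multiplier n returned here is always smaller than p.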

Now let m be the smallest positive integer such that mp is the sum of four squares, $x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}$ (we have just shown that there is some m (namely n) with this property, so there is a least such m, and it is smaller than p). We show by contradiction that m equals 1: supposing it is not the case, we prove the existence of a positive integer r less than m, for which rp is also the sum of four squares (this is in the spirit of the infinite descent^[7] method of Fermat).

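For this purpose, we consider for each x_i the y_i which is in the same residue class modulo m and between $(-m+1)/2$ and m/2 (possibly included). Since $y_{1}^{2}+y_{2}^{2}+y_{3}^{2}+y_{4}^{2}\equiv x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}=mp\equiv 0{\pmod {m}}$ and each $|y_{i}|\leq m/2$, it follows that $y_{1}^{2}+y_{2}^{2}+y_{3}^{2}+y_{4}^{2}=mr$ for some strictly positive integer r less than m.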

Finally, another appeal to Euler's four-square identity shows that $(mp)(mr)=z_{1}^{2}+z_{2}^{2}+z_{3}^{2}+z_{4}^{2}$. But the fact that each x_i is congruent to its corresponding y_i implies that all of the z_i are divisible by m. Indeed, ${\begin{cases}z_{1}&=x_{1}y_{1}+x_{2}y_{2}+x_{3}y_{3}+x_{4}y_{4}&\equiv x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}&=mp\equiv 0&{\pmod {m}},\\z_{2}&=x_{1}y_{2}-x_{2}y_{1}+x_{3}y_{4}-x_{4}y_{3}&\equiv x_{1}x_{2}-x_{2}x_{1}+x_{3}x_{4}-x_{4}x_{3}&=0&{\pmod {m}},\\z_{3}&=x_{1}y_{3}-x_{2}y_{4}-x_{3}y_{1}+x_{4}y_{2}&\equiv x_{1}x_{3}-x_{2}x_{4}-x_{3}x_{1}+x_{4}x_{2}&=0&{\pmod {m}},\\z_{4}&=x_{1}y_{4}+x_{2}y_{3}-x_{3}y_{2}-x_{4}y_{1}&\equiv x_{1}x_{4}+x_{2}x_{3}-x_{3}x_{2}-x_{4}x_{1}&=0&{\pmod {m}}.\end{cases}}$

It follows that, for $w_{i}=z_{i}/m$, $w_{1}^{2}+w_{2}^{2}+w_{3}^{2}+w_{4}^{2}=rp$, and this is in contradiction with the minimality of m.

In the descent above, we must rule out both the case $y_{1}=y_{2}=y_{3}=y_{4}=m/2$ (which would give $r=m$ and no descent), and also the case $y_{1}=y_{2}=y_{3}=y_{4}=0$ (which would give $r=0$ rather than strictly positive). For both of those cases, one can check that $mp=x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}$ would be a multiple of m², contradicting the fact that p is a prime greater than m.
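Although the proof argues by contradiction, the same reduction can be run forward to compute a representation of an odd prime. The following is a sketch under that reading: it starts from some multiple mp found by search and repeats the descent step until m = 1. All function names are illustrative, and the input is assumed to be an odd prime.

```python
def euler_product(x, y):
    """Euler's four-square identity, with the sign convention of the z_i above."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x2 * y4 - x3 * y1 + x4 * y2,
        x1 * y4 + x2 * y3 - x3 * y2 - x4 * y1,
    )

def centered_residue(x, m):
    """Representative y of x modulo m with (-m + 1)/2 <= y <= m/2."""
    y = x % m
    return y - m if 2 * y > m else y

def initial_multiple(p):
    """Search for (a, b, 1, 0) with a^2 + b^2 + 1 == m * p; returns the tuple and m."""
    half = (p - 1) // 2
    for a in range(half + 1):
        for b in range(half + 1):
            if (a * a + b * b + 1) % p == 0:
                return (a, b, 1, 0), (a * a + b * b + 1) // p
    raise ValueError("p is assumed to be an odd prime")

def descent_step(x, m):
    """From sum(x_i^2) == m * p with 1 < m < p, produce w with sum(w_i^2) == r * p, r < m."""
    y = tuple(centered_residue(xi, m) for xi in x)
    r = sum(yi * yi for yi in y) // m
    z = euler_product(x, y)
    # Every z_i is divisible by m, so the integer division is exact.
    return tuple(zi // m for zi in z), r

def four_squares_of_prime(p):
    x, m = initial_multiple(p)
    while m > 1:
        x, m = descent_step(x, m)
    return tuple(sorted((abs(v) for v in x), reverse=True))

print(four_squares_of_prime(7))
print(four_squares_of_prime(997))
```

The loop terminates because each step replaces m by a strictly smaller positive r, as the argument above shows for a prime p greater than m.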

Proof using the Hurwitz integers

Another way to prove the theorem relies on Hurwitz quaternions, which are the analog of integers for quaternions.^[8]

The Hurwitz quaternions consist of all quaternions with integer components and all quaternions with half-integer components. These two sets can be combined into a single formula $\alpha ={\frac {1}{2}}E_{0}(1+\mathbf {i} +\mathbf {j} +\mathbf {k} )+E_{1}\mathbf {i} +E_{2}\mathbf {j} +E_{3}\mathbf {k} =a_{0}+a_{1}\mathbf {i} +a_{2}\mathbf {j} +a_{3}\mathbf {k}$ where $E_{0},E_{1},E_{2},E_{3}$ are integers. Thus, the quaternion components $a_{0},a_{1},a_{2},a_{3}$ are either all integers or all half-integers, depending on whether $E_{0}$ is even or odd, respectively. The set of Hurwitz quaternions forms a ring; that is to say, the sum or product of any two Hurwitz quaternions is likewise a Hurwitz quaternion.

The (arithmetic, or field) norm $\mathrm {N} (\alpha )$ of a rational quaternion $\alpha$ is the nonnegative rational number $\mathrm {N} (\alpha )=\alpha {\bar {\alpha }}=a_{0}^{2}+a_{1}^{2}+a_{2}^{2}+a_{3}^{2}$ where ${\bar {\alpha }}=a_{0}-a_{1}\mathbf {i} -a_{2}\mathbf {j} -a_{3}\mathbf {k}$ is the conjugate of $\alpha$ . Note that the norm of a Hurwitz quaternion is always an integer. (If the coefficients are half-integers, then their squares are of the form ${\tfrac {1}{4}}+n:n\in \mathbb {Z}$ , and the sum of four such numbers is an integer.)

Since quaternion multiplication is associative, and real numbers commute with other quaternions, the norm of a product of quaternions equals the product of the norms: $\mathrm {N} (\alpha \beta )=\alpha \beta ({\overline {\alpha \beta }})=\alpha \beta {\bar {\beta }}{\bar {\alpha }}=\alpha \mathrm {N} (\beta ){\bar {\alpha }}=\alpha {\bar {\alpha }}\mathrm {N} (\beta )=\mathrm {N} (\alpha )\mathrm {N} (\beta ).$
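These properties can be checked on examples with exact rational arithmetic. The sketch below represents a quaternion as a 4-tuple of Fraction components; the helper names (hurwitz, qmul, norm, is_hurwitz) are illustrative and not taken from a library.

```python
from fractions import Fraction

def hurwitz(e0, e1, e2, e3):
    """Build (1/2) e0 (1 + i + j + k) + e1 i + e2 j + e3 k from integers e0..e3."""
    h = Fraction(e0, 2)
    return (h, h + e1, h + e2, h + e3)

def qmul(a, b):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def norm(a):
    return sum(c * c for c in a)

def is_hurwitz(a):
    """All components are integers, or all are half-integers."""
    doubled = [2 * Fraction(c) for c in a]
    return all(d.denominator == 1 for d in doubled) and len({d % 2 for d in doubled}) == 1

alpha = hurwitz(1, 0, 1, 2)    # half-integer components
beta = hurwitz(3, -1, 0, 2)    # half-integer components
product = qmul(alpha, beta)
print(is_hurwitz(product), norm(product) == norm(alpha) * norm(beta))
```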

For any $\alpha \neq 0$ , $\alpha ^{-1}={\bar {\alpha }}\mathrm {N} (\alpha )^{-1}$ . It follows easily that $\alpha$ is a unit in the ring of Hurwitz quaternions if and only if $\mathrm {N} (\alpha )=1$ .

The proof of the main theorem begins by reduction to the case of prime numbers. Euler's four-square identity implies that if Lagrange's four-square theorem holds for two numbers, it holds for the product of the two numbers. Since any natural number can be factored into powers of primes, it suffices to prove the theorem for prime numbers. It is true for $2=1^{2}+1^{2}+0^{2}+0^{2}$ . To show this for an odd prime integer $p$ , represent it as a quaternion $(p,0,0,0)$ and assume for now (as we shall show later) that it is not a Hurwitz irreducible; that is, it can be factored into two non-unit Hurwitz quaternions $p=\alpha \beta .$

The norms of $p,\alpha ,\beta$ are integers such that $\mathrm {N} (p)=p^{2}=\mathrm {N} (\alpha \beta )=\mathrm {N} (\alpha )\mathrm {N} (\beta )$ and $\mathrm {N} (\alpha ),\mathrm {N} (\beta )>1$ . This shows that both $\mathrm {N} (\alpha )$ and $\mathrm {N} (\beta )$ are equal to $p$ (since they are integers), and $p$ is the sum of four squares $p=\mathrm {N} (\alpha )=a_{0}^{2}+a_{1}^{2}+a_{2}^{2}+a_{3}^{2}.$

If it happens that the $\alpha$ chosen has half-integer coefficients, it can be replaced by another Hurwitz quaternion. Choose $\omega =(\pm 1\pm \mathbf {i} \pm \mathbf {j} \pm \mathbf {k} )/2$ in such a way that $\gamma \equiv \omega +\alpha$ has even integer coefficients. Then $p=({\bar {\gamma }}-{\bar {\omega }})\omega {\bar {\omega }}(\gamma -\omega )=({\bar {\gamma }}\omega -1)({\bar {\omega }}\gamma -1).$

Since $\gamma$ has even integer coefficients, $({\bar {\omega }}\gamma -1)$ will have integer coefficients and can be used instead of the original $\alpha$ to give a representation of $p$ as the sum of four squares.
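The replacement step can be made concrete. Since $\mathrm {N} (\omega )=1$, the quaternion ${\bar {\omega }}\gamma -1$ equals ${\bar {\omega }}\alpha$, so the sketch below simply computes that product. It assumes the input has half-integer components; the names integerize and qmul are illustrative.

```python
from fractions import Fraction

def qmul(a, b):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def integerize(alpha):
    """For alpha with half-integer components, return conj(omega) * alpha, where
    omega = (+-1 +-i +-j +-k)/2 is chosen so that omega + alpha has even components."""
    half = Fraction(1, 2)
    alpha = tuple(Fraction(c) for c in alpha)
    # Pick the sign of each omega component so that the sum is an even integer.
    omega = tuple(-half if (c - half) % 2 == 0 else half for c in alpha)
    conj_omega = (omega[0], -omega[1], -omega[2], -omega[3])
    return qmul(conj_omega, alpha)

alpha = (Fraction(1, 2), Fraction(1, 2), Fraction(3, 2), Fraction(5, 2))
result = integerize(alpha)
print(result, sum(c * c for c in result) == sum(c * c for c in alpha))
```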

As for showing that $p$ is not a Hurwitz irreducible, Lagrange proved that any odd prime $p$ divides at least one number of the form $u=1+l^{2}+m^{2}$ , where $l$ and $m$ are integers.^[8] This can be seen as follows: since $p$ is prime, $a^{2}\equiv b^{2}{\pmod {p}}$ can hold for integers $a,b$ , only when $a\equiv \pm b{\pmod {p}}$ . Thus, the set $X=\{0^{2},1^{2},\dots ,((p-1)/2)^{2}\}$ of squares contains $(p+1)/2$ distinct residues modulo $p$ . Likewise, $Y=\{-(1+x):x\in X\}$ contains $(p+1)/2$ residues. Since there are only $p$ residues in total, and $|X|+|Y|=p+1>p$ , the sets $X$ and $Y$ must intersect.

The number $u$ can be factored in Hurwitz quaternions: $1+l^{2}+m^{2}=(1+l\;\mathbf {i} +m\;\mathbf {j} )(1-l\;\mathbf {i} -m\;\mathbf {j} ).$

The norm on Hurwitz quaternions satisfies a form of the Euclidean property: for any quaternion $\alpha =a_{0}+a_{1}\mathbf {i} +a_{2}\mathbf {j} +a_{3}\mathbf {k}$ with rational coefficients we can choose a Hurwitz quaternion $\beta =b_{0}+b_{1}\mathbf {i} +b_{2}\mathbf {j} +b_{3}\mathbf {k}$ so that $\mathrm {N} (\alpha -\beta )<1$ by first choosing $b_{0}$ so that $|a_{0}-b_{0}|\leq 1/4$ and then $b_{1},b_{2},b_{3}$ so that $|a_{i}-b_{i}|\leq 1/2$ for $i=1,2,3$ . Then we obtain ${\begin{aligned}\mathrm {N} (\alpha -\beta )&=(a_{0}-b_{0})^{2}+(a_{1}-b_{1})^{2}+(a_{2}-b_{2})^{2}+(a_{3}-b_{3})^{2}\\&\leq \left({\frac {1}{4}}\right)^{2}+\left({\frac {1}{2}}\right)^{2}+\left({\frac {1}{2}}\right)^{2}+\left({\frac {1}{2}}\right)^{2}={\frac {13}{16}}<1.\end{aligned}}$
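The rounding procedure just described can be written out directly. This is a minimal sketch with exact rationals: it first rounds the real part to the nearest multiple of 1/2, then rounds the remaining components to the nearest numbers of the same kind (integers or half-integers). The function names are illustrative.

```python
from fractions import Fraction
from math import floor

def nearest_hurwitz(a):
    """Return a Hurwitz quaternion b with N(a - b) <= 13/16 for a rational quaternion a."""
    a = [Fraction(c) for c in a]
    half = Fraction(1, 2)
    # b0 is the nearest multiple of 1/2, so |a0 - b0| <= 1/4.
    b0 = Fraction(floor(2 * a[0] + half), 2)
    # shift is 0 if b0 is an integer and 1/2 if b0 is a half-integer.
    shift = b0 - floor(b0)
    # The other components are rounded within the same coset, so |ai - bi| <= 1/2.
    rest = [floor(c - shift + half) + shift for c in a[1:]]
    return (b0, *rest)

def norm_of_difference(a, b):
    return sum((Fraction(x) - Fraction(y)) ** 2 for x, y in zip(a, b))

def is_hurwitz(b):
    doubled = [2 * Fraction(c) for c in b]
    return all(d.denominator == 1 for d in doubled) and len({d % 2 for d in doubled}) == 1

a = (Fraction(1, 3), Fraction(7, 5), Fraction(-9, 4), Fraction(5, 2))
b = nearest_hurwitz(a)
print(b, is_hurwitz(b), norm_of_difference(a, b) <= Fraction(13, 16))
```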

It follows that for any Hurwitz quaternions $\alpha ,\beta$ with $\alpha \neq 0$ , there exists a Hurwitz quaternion $\gamma$ such that $\mathrm {N} (\beta -\alpha \gamma )<\mathrm {N} (\alpha ).$

The ring $H$ of Hurwitz quaternions is not commutative, hence it is not an actual Euclidean domain, and it does not have unique factorization in the usual sense. Nevertheless, the property above implies that every right ideal is principal. Thus, there is a Hurwitz quaternion $\alpha$ such that $\alpha H=pH+(1-l\;\mathbf {i} -m\;\mathbf {j} )H.$

In particular, $p=\alpha \beta$ for some Hurwitz quaternion $\beta$. If $\beta$ were a unit, $1-l\;\mathbf {i} -m\;\mathbf {j}$ would be a multiple of $p$; this is impossible, however, as $1/p-l/p\;\mathbf {i} -m/p\;\mathbf {j}$ is not a Hurwitz quaternion for $p>2$. Similarly, if $\alpha$ were a unit, we would have $(1+l\;\mathbf {i} +m\;\mathbf {j} )H=(1+l\;\mathbf {i} +m\;\mathbf {j} )pH+(1+l\;\mathbf {i} +m\;\mathbf {j} )(1-l\;\mathbf {i} -m\;\mathbf {j} )H\subseteq pH$ so $p$ divides $1+l\;\mathbf {i} +m\;\mathbf {j}$, which is again impossible, since $1/p+l/p\;\mathbf {i} +m/p\;\mathbf {j}$ is not a Hurwitz quaternion for $p>2$. Thus, $p$ is not Hurwitz irreducible, as claimed.

Generalizations

Lagrange's four-square theorem is a special case of the Fermat polygonal number theorem and Waring's problem. Another possible generalization is the following problem: Given natural numbers $a,b,c,d$ , can we solve

$n=ax_{1}^{2}+bx_{2}^{2}+cx_{3}^{2}+dx_{4}^{2}$

for all positive integers $n$ in integers $x_{1},x_{2},x_{3},x_{4}$ ? The case $a=b=c=d=1$ is answered in the positive by Lagrange's four-square theorem. The general solution was given by Ramanujan.^[9] He proved that if we assume, without loss of generality, that $a\leq b\leq c\leq d$ then there are exactly 54 possible choices for $a,b,c,d$ such that the problem is solvable in integers $x_{1},x_{2},x_{3},x_{4}$ for all $n$ . (Ramanujan listed a 55th possibility $a=1,b=2,c=5,d=5$ , but in this case the problem is not solvable if $n=15$ .^[10])
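The failure of the 55th candidate at n = 15 can be confirmed by enumeration. The sketch below collects every value up to a limit that a given form $ax_{1}^{2}+bx_{2}^{2}+cx_{3}^{2}+dx_{4}^{2}$ takes; the function name and the limit of 200 are arbitrary choices for illustration, and a finite search of this kind cannot establish universality.

```python
from math import isqrt

def represented_values(coeffs, limit):
    """Set of all values a*x1^2 + b*x2^2 + c*x3^2 + d*x4^2 that are <= limit."""
    values = {0}
    for coeff in coeffs:
        bound = isqrt(limit // coeff)
        values = {
            v + coeff * x * x
            for v in values
            for x in range(bound + 1)
            if v + coeff * x * x <= limit
        }
    return values

limit = 200
print(15 in represented_values((1, 2, 5, 5), limit))
print(all(n in represented_values((1, 1, 1, 1), limit) for n in range(limit + 1)))
```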

Algorithms

In 1986, Michael O. Rabin and Jeffrey Shallit^[11] proposed randomized polynomial-time algorithms for computing a single representation $n=x_{1}^{2}+x_{2}^{2}+x_{3}^{2}+x_{4}^{2}$ for a given integer $n$ , in expected running time $\mathrm {O} (\log(n)^{2})$ . It was further improved to $\mathrm {O} (\log(n)^{2}\log(\log(n))^{-1})$ by Paul Pollack and Enrique Treviño in 2018.^[12]

Number of representations

The number of representations of a natural number n as the sum of four squares of integers is denoted by r₄(n). Jacobi's four-square theorem states that this is eight times the sum of the divisors of n if n is odd and 24 times the sum of the odd divisors of n if n is even (see divisor function), i.e.

$r_{4}(n)={\begin{cases}8\sum \limits _{m\mid n}m&{\text{if }}n{\text{ is odd}}\\[12pt]24\sum \limits _{\begin{smallmatrix}m|n\\m{\text{ odd}}\end{smallmatrix}}m&{\text{if }}n{\text{ is even}}.\end{cases}}$

Equivalently, it is eight times the sum of all its divisors which are not divisible by 4, i.e.

$r_{4}(n)=8\sum _{m\,:\,4\nmid m\mid n}m.$

We may also write this as $r_{4}(n)=8\sigma (n)-32\sigma (n/4)\ ,$ where the second term is to be taken as zero if n is not divisible by 4. In particular, for a prime number p we have the explicit formula $r_{4}(p)=8(p+1)$.^[13]

Some values of r₄(n) occur infinitely often, as $r_{4}(n)=r_{4}(2^{m}n)$ whenever n is even. The values of r₄(n)/n can be arbitrarily large: indeed, r₄(n)/n is infinitely often larger than $8{\sqrt {\log n}}$.^[13]
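Jacobi's formula can be compared with a direct count for small n. In the sketch below, r4_bruteforce counts ordered representations by integers of either sign (building the count from representations by one and then two squares), and r4_jacobi evaluates the divisor sum; both names are illustrative.

```python
from math import isqrt

def r4_bruteforce(n):
    """Count integer 4-tuples (x1, x2, x3, x4), signs and order distinguished, with squares summing to n."""
    r1 = [0] * (n + 1)
    for x in range(-isqrt(n), isqrt(n) + 1):
        r1[x * x] += 1
    # r2[k] counts pairs; combining two pair counts gives the count of 4-tuples.
    r2 = [sum(r1[i] * r1[k - i] for i in range(k + 1)) for k in range(n + 1)]
    return sum(r2[i] * r2[n - i] for i in range(n + 1))

def r4_jacobi(n):
    """Eight times the sum of the divisors of n that are not divisible by 4."""
    return 8 * sum(m for m in range(1, n + 1) if n % m == 0 and m % 4 != 0)

print(all(r4_bruteforce(n) == r4_jacobi(n) for n in range(1, 80)))
print(r4_jacobi(6), r4_jacobi(12), r4_jacobi(24))
```

The last line illustrates the repetition of values along n, 2n, 4n for an even n.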

Uniqueness

The sequence of positive integers which have only one representation as a sum of four squares of non-negative integers (up to order) is:

1, 2, 3, 5, 6, 7, 8, 11, 14, 15, 23, 24, 32, 56, 96, 128, 224, 384, 512, 896 ... (sequence A006431 in the OEIS).

These integers consist of the seven odd numbers 1, 3, 5, 7, 11, 15, 23 and all numbers of the form $2(4^{k}),6(4^{k})$ or $14(4^{k})$ .

The sequence of positive integers which cannot be represented as a sum of four non-zero squares is:

1, 2, 3, 5, 6, 8, 9, 11, 14, 17, 24, 29, 32, 41, 56, 96, 128, 224, 384, 512, 896 ... (sequence A000534 in the OEIS).

These integers consist of the eight odd numbers 1, 3, 5, 9, 11, 17, 29, 41 and all numbers of the form $2(4^{k}),6(4^{k})$ or $14(4^{k})$ .
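Both sequences in this section can be regenerated by counting representations up to order. The following is a brute-force sketch; the function count_partitions and its parameters are illustrative, with smallest=0 allowing zero squares and smallest=1 requiring four non-zero squares.

```python
from math import isqrt

def count_partitions(n, parts=4, largest=None, smallest=0):
    """Number of ways to write n as a1^2 + ... with `parts` terms and
    largest >= a1 >= a2 >= ... >= smallest."""
    if parts == 0:
        return 1 if n == 0 else 0
    if largest is None:
        largest = isqrt(n)
    total = 0
    for a in range(min(largest, isqrt(n)), smallest - 1, -1):
        total += count_partitions(n - a * a, parts - 1, a, smallest)
    return total

limit = 900
unique = [n for n in range(1, limit + 1) if count_partitions(n) == 1]
no_nonzero = [n for n in range(1, limit + 1) if count_partitions(n, smallest=1) == 0]
print(unique)
print(no_nonzero)
```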

Further refinements

Lagrange's four-square theorem can be refined in various ways. For example, Zhi-Wei Sun^[14] proved that each natural number can be written as a sum of four squares with some requirements on the choice of these four numbers.

One may also wonder whether it is necessary to use the entire set of square integers to write each natural as the sum of four squares. Eduard Wirsing proved that there exists a set of squares $S$ with $|S|=O(n^{1/4}\log ^{1/4}n)$ such that every positive integer smaller than or equal to $n$ can be written as a sum of at most 4 elements of $S$ .^[15]

Notes

[1] Andrews, George E. (1994), Number Theory, Dover Publications, p. 144, ISBN 0-486-68252-8
[2] Ireland & Rosen 1990.
[3] Sarnak 2013.
[4] Landau 1958, Theorems 166 to 169.
[5] Hardy & Wright 2008, Theorem 369.
[6] Niven & Zuckerman 1960, paragraph 5.7.
[7] Here the argument is a direct proof by contradiction. With the initial assumption that m > 2, m < p, is some integer such that mp is the sum of four squares (not necessarily the smallest), the argument could be modified to become an infinite descent argument in the spirit of Fermat.
[8] Stillwell 2003, pp. 138–157.
[9] Ramanujan 1917.
[10] Oh 2000.
[11] Rabin & Shallit 1986.
[12] Pollack & Treviño 2018.
[13] Williams 2011, p. 119.
[14] Sun 2017.
[15] Spencer 1996.

References

Hardy, G. H.; Wright, E. M. (2008) [1938]. Heath-Brown, D. R.; Silverman, J. H.; Wiles, Andrew (eds.). An Introduction to the Theory of Numbers (6th ed.). Oxford University Press. ISBN 978-0-19-921985-8.
Ireland, Kenneth; Rosen, Michael (1990). A Classical Introduction to Modern Number Theory (2nd ed.). Springer. doi:10.1007/978-1-4757-2103-4. ISBN 978-1-4419-3094-1.
Landau, Edmund (1958) [1927]. Elementary Number Theory. Vol. 125. Translated by Goodman, Jacob E. (2nd ed.). AMS Chelsea Publishing.
Niven, Ivan; Zuckerman, Herbert S. (1960). An introduction to the theory of numbers. Wiley.
Oh, Byeong-Kweon (2000). "Representations of Binary Forms by Quinary Quadratic Forms" (PDF). Trends in Mathematics. 3 (1): 102–107.
Rabin, M. O.; Shallit, J. O. (1986). "Randomized Algorithms in Number Theory". Communications on Pure and Applied Mathematics. 39 (S1): S239–S256. doi:10.1002/cpa.3160390713.
Ramanujan, S. (1917). "On the expression of a number in the form ax² + by² + cz² + dw²". Proc. Camb. Phil. Soc. 19: 11–21.
Sarnak, Peter (2013). "The Ramanujan Conjecture and some Diophantine Equations". YouTube (Lecture at Tata Institute of Fundamental Research). ICTS Lecture Series. Bangalore, India.
Stillwell, John (2003). Elements of Number Theory. Undergraduate Texts in Mathematics. Springer. doi:10.1007/978-0-387-21735-2. ISBN 978-0-387-95587-2. Zbl 1112.11002.
Sun, Z.-W. (2017). "Refining Lagrange's four-square theorem". J. Number Theory. 175: 167–190. arXiv:1604.06723. doi:10.1016/j.jnt.2016.11.008. S2CID 119597024.
Williams, Kenneth S. (2011). Number theory in the spirit of Liouville. London Mathematical Society Student Texts. Vol. 76. Cambridge University Press. ISBN 978-0-521-17562-3. Zbl 1227.11002.
Spencer, Joel (1996). "Four Squares with Few Squares". Number Theory: New York Seminar 1991–1995. Springer US. pp. 295–297. doi:10.1007/978-1-4612-2418-1_22. ISBN 9780387948263.
Pollack, P.; Treviño, E. (2018). "Finding the four squares in Lagrange's theorem" (PDF). Integers. 18A: A15.
