### Viggo Brun (1885–1978)

Brun's method is perhaps our most powerful elementary tool in number theory. (1965)

—Paul Erdős (1913–1996)

En av de mest egenartede og dypt originale begavelser vårt land har fostret. (One of the most distinctively gifted and profoundly original people our country has fostered.) (Eulogy, 1979)

—Atle Selberg (1917–2007)

Brun's works started the true life of sieve theory. I would compare this moment of the history of prime numbers to that of the publication of the memoir on the zeta function by Bernhard Riemann (1826–1866). (1996)

—Henryk Iwaniec (1947–)

At the 1912 ICM in Cambridge, Edmund Landau (1887–1938) listed four basic problems about prime numbers which he characterized in his speech as “unattackable at the present state of mathematics.” The two first ones were

1. Goldbach's Conjecture: Can every even integer greater than 2 be written as the sum of two primes?

2. Twin Prime Conjecture: Are there infinitely many primes $$p$$ such that $$p + 2$$ is prime?

Landau, who was initially skeptical about the power of Brun's sieve, would change his opinion and called Brun “Ein Bahnbrecher.”

David Hilbert (1862–1943) in his famous speech at the ICM meeting in Paris in 1900, where he proposed 23 problems for the mathematicians of the 20th century, listed Landau's two problems as part of his 8th problem, which now is better known as the Riemann Hypothesis.

In a speech given in Copenhagen in 1921 at Dansk Matematisk Forening, G.H. Hardy (1877–1947) pronounced that Goldbach's Conjecture is “probably as difficult as any of the unsolved problems in mathematics.”

Theorem 1 (Brun 1919)
$$\pi_2(x) = O\left(\dfrac{x}{\log^2 x}\right)$$, i.e. $$\pi_2(x) \leq c \dfrac{x}{\log^2 x}$$ for some $$0 < c < \infty$$.

Corollary
$$\dfrac13 + \dfrac15 + \dfrac15 + \dfrac17 + \dfrac1{11} + \dfrac1{13} + \dfrac1{17} + \dfrac1{19} + \cdots < \infty$$
(The sum of the inverses of prime twins converges.)

Theorem 2 (Brun 1919)
Every large enough even number is of the form $$P_9 + P_9$$.

Jean Merlin (d. 1914)

According to Selberg, Brun's sieve is one of the most original innovations of 20th century mathematics.

Brun's sieve has been extended and modified in many directions by several mathematicians, one of them being Selberg, who in 1946 introduced another sieve method which leads to more precise results than Brun's method.

Furthermore, Selberg's upper bound method is surprisingly simple and has a certain air of finality to it.

These more advanced sieve methods are essential ingredients in Chen Jingrun's (1933–1996) proof from 1966 that every sufficiently large even number is of the form $$P + P_2$$.

Sieve methods are also important in Yitang Zhang's (1955–) proof from 2013 that there exist infinitely many pairs of primes whose differences are the same.

A curious feature of the literature on sieves is that while there is frequent use of Brun's method, there are only a few attempts to formulate a general Brun theorem.

(The nearest candidate would be $$\pi_2(x) = O\left(\frac{x}{\log^2 x}\right)$$.)

However, there are surprisingly many papers which repeat in considerable detail the steps of Brun's argument.

### Abel's partial summation

This is a basic tool in expressing a sum of products $$\sum\limits_{k = 1}^n a_kb_k$$ in terms of partial sums of the $$a_k$$'s and differences of the $$b_k$$'s.

Let $$u_k = a_kb_k$$, $$\, \sigma_k = u_1 + u_2 + \cdots + u_k$$, $$\, s_k = a_1 + a_2 + \cdots + a_k$$. Then

\begin{align} u_k = a_kb_k &= (s_k - s_{k - 1})b_k = s_kb_k - s_{k - 1}b_k\\ &= s_k(b_k - b_{k + 1}) + s_k b_{k + 1} - s_{k - 1} b_k. \end{align}

We get

$\sigma_n = \sum_{k = 1}^n u_k = \sum_{k = 1}^n a_kb_k = \sum_{k = 1}^n s_k(b_k - b_{k + 1}) + s_n b_{n + 1}.$

Let $$T$$ denote the set of twin primes. Then $$\sum\limits_{p \in T} \frac1p < \infty$$ is a consequence of $$\pi_2(x) = |\{n \leq x \colon n\in T\}| \leq c \frac{x}{\log^2 x}$$.

In fact, by Abel's partial summation we get

\begin{align} \sum_{\{ p \in T \colon p \leq n\}} \frac1p &= \sum_{k = 2}^n \frac1k \chi_T(k) = \frac{\pi_2(n)}n + \sum_{k = 2}^n \left(\frac1k - \frac1{k + 1}\right) \pi_2(k)\\ &\leq \frac{c}{\log^2 n} + c \sum_{k = 2}^n \frac1{(k + 1)\log^2 k} \leq \frac{c}{\log^2 n} + c \sum_{k = 2}^\infty \frac1{k\log^2 k} \leq K < \infty. \end{align}

Hence, $$\sum\limits_{p\in T} \frac1p < \infty$$.

Comment
Similary we can prove that $$\sum\limits_{p\in T} \frac1p < \infty$$ if we use the slightly weaker estimate $$\pi_2(x) = O\left(\frac{x (\log\log x)^2}{\log^2 x}\right)$$.

Many number-theoretic problems can be reduced to a problem of the following type:

A finite set $$A$$ of natural numbers and a finite set $$P$$ of prime numbers are given. Estimate the number of members of $$A$$ that are not divisible by any primes belonging to $$P$$.

In other words, we “sift out” the multiples of the prime numbers belonging to $$P$$ from $$A$$, leaving the residual set $$S(A,P) = \{ n\in A \colon (n,p) = 1$$ for all $$p\in P\}$$, whose cardinality $$|S(A,P)|$$ we wish to estimate.

### Eratosthenes' (276–194 BC) sieve in Legendre's (1752–1833) formulation:

A natural number $$n$$ such that $$\sqrt x < n \leq x$$ is prime if and only if there is no prime $$\tilde{p}\in P = \{p \colon p \leq \sqrt x\}$$ such that $$\tilde{p} \mid n$$.

Hence, if $$A = \{n \colon n \leq x\}$$, we get \begin{align} |S(A,P)| &= |\{ n\in A \colon (n,p) = 1 \text{ for all } p\in P\}|\\ &= |\{n\in A \colon (n, P(\sqrt x)) = 1\}|\\ &= |\{1\} \cup \{p \colon \sqrt x < p \leq x\}|. \end{align}

(Here $$P(z)$$ denotes $$\prod\limits_{p\leq z} p$$.)

The inclusion-exclusion principle yields: $|S(A,P)| = |A| + \sum\limits_{k = 1}^{\pi(\sqrt x)} (-1)^k \sum\limits_{p_1 < \cdots < p_k \leq \sqrt x} |A(p_1\cdots p_k)| = \sum\limits_{d \mid P(\sqrt x)} \mu(d) |A(d)|$ where $$A(d) = \{n\in A \colon d \mid n\}$$ ($$= \left[ \frac{x}n \right]$$).

### The Möbius function $$\mu \colon \mathbb{N} \to \{-1,0,1\}$$

$\mu(d) = \begin{cases} 1 & d = 1\\ 0 & d \text{ not square free}\\ (-1)^k & d \text{ square free and } d = p_1\cdots p_k. \end{cases}$

### Eratosthenes' (276–194 BC) sieve in Legendre's (1752–1833) formulation:

A natural number $$n$$ such that $$\sqrt x < n \leq x$$ is prime if and only if there is no prime $$\tilde{p}\in P = \{p \colon p \leq \sqrt x\}$$ such that $$\tilde{p} \mid n$$.

Hence, if $$A = \{n \colon n \leq x\}$$, we get \begin{align} |S(A,P)| &= |\{ n\in A \colon (n,p) = 1 \text{ for all } p\in P\}|\\ &= |\{n\in A \colon (n, P(\sqrt x)) = 1\}|\\ &= |\{1\} \cup \{p \colon \sqrt x < p \leq x\}|. \end{align}

(Here $$P(z)$$ denotes $$\prod\limits_{p\leq z} p$$.)

The inclusion-exclusion principle yields: $|S(A,P)| = |A| + \sum\limits_{k = 1}^{\pi(\sqrt x)} (-1)^k \sum\limits_{p_1 < \cdots < p_k \leq \sqrt x} |A(p_1\cdots p_k)| = \sum\limits_{d \mid P(\sqrt x)} \mu(d) |A(d)|$ where $$A(d) = \{n\in A \colon d \mid n\}$$ ($$= \left[ \frac{x}n \right]$$).

### Eratosthenes' sieve

We get

$\pi(x) \approx \sum_{d \mid P(\sqrt x)} \mu(d) \left[ \frac{x}d\right] = x \sum_{d \mid P(\sqrt x)} \frac{\mu(d)}d - \sum_{d \mid P(\sqrt x)} \mu(d) \left\{\frac{x}d\right\}$

where $$\frac{x}d = \left[\frac{x}d\right] + \left\{\frac{x}d\right\}$$, $$0 \leq \left\{\frac{x}d\right\} < 1$$.

The “error” term $$\sum\limits_{d \mid P(\sqrt x)} \mu(d) \left\{\frac{x}d\right\}$$ has $$\tau(P(\sqrt x)) = 2^{\pi(\sqrt x)}$$ summands.
(Here $$\tau(n)$$ denotes the number of divisors of $$n$$.)

So potentially the error term could be $$2^{\pi(\sqrt x)}$$, which is much bigger than $$\pi(x)$$. If we ignore the error term altogether, we get

$\pi(x) \approx x \sum_{d \mid P(\sqrt x)} \frac{\mu(d)}d = x \prod_{p \leq \sqrt x} \left(1 - \frac1p\right).$

By an elementary result in number theory (Chebyshev (1821–1894), Mertens (1840–1927)) $$x \prod\limits_{p \leq \sqrt x} \left(1 - \frac1p\right)$$ tends to $$c \frac{x}{\log x}$$, where $$c = 1.123{\ldots}$$ as $$x\to \infty$$.

The previous example illustrates the main flaw of the Eratosthenes–Legendre sieve:

The error term $$\sum\limits_{d \mid P(\sqrt x)} \mu(d) \left\{\frac{x}d\right\}$$ has a huge number of summands which yields an error beyond our control.

The first successful sieve, Brun's (pure) sieve, was achieved by restricting the Möbius function $$\mu(d)$$ to numbers $$d$$ having a limited number of prime divisors.

That Brun succeeded was against all odds — the prevailing opinion among the experts were that his approach was doomed to fail.

\begin{aligned} \zeta(s) &= \sum_{n = 1}^\infty \frac1{n^s}\\ &= (1 + 2^{-s} + 2^{-2s} + 2^{-3s} + \cdots) (1 + 3^{-s} + 3^{-2s} + 3^{-3s} + \cdots) \cdots\\ &= \frac1{1 - 2^{-s}} \cdot \frac1{1 - 3^{-s}} \cdot \frac1{1 - 5^{-s}} \cdot \frac1{1 - 7^{-s}} \cdots \qquad \qquad \text{(Re(\(s) $$> 1$$)}\0.5cm] \frac1{\zeta(s)} &= \prod_{p\in P} (1 - p^{-s}) = (1 - 2^{-s})(1 - 3^{-s})(1 - 5^{-s}) \cdots = \sum_{n = 1}^\infty \frac{\mu(n)}{n^s}\\[0.5cm] \hline M(x) &= \sum_{n \leq x} \mu(n) \qquad \text{(Merten's function)}\\ M(x) &= O(x^{1/2 + \epsilon}) \text{ for all } \epsilon > 0 \iff \text{Riemann Hypothesis is true}\\ M(x) &= o(x) \iff \text{Prime Number Theorem } \left( \pi(x) \sim \frac{x}{\log x}\right).\end{aligned}\) \begin{aligned} \pi_2(x) &= |\{ p \colon p, p + 2 \text{ primes}, p \leq x - 2\}|\\ A &= \{n(n + 2) \colon n \leq x - 2\}, \quad P(\sqrt x) = \prod_{p \leq \sqrt x} p, \quad P = \{p \colon p \leq \sqrt x\}\\ |A(d)| &= |\{ n(n + 2) \colon n \leq x - 2, d \mid n(n + 2)\}|\\[0.5cm] \hline |S(A,P)| &= |S(A,P(\sqrt x))| = |\{ n(n + 2) \colon n \leq x - 2, (n(n + 2), P(\sqrt x)) = 1\}|\\ &= |\{p \colon p, p + 2 \text{ primes}, \sqrt x < p \leq x - 2\}| = \pi_2(x) - \pi_2(\sqrt x)\\ &= |A| + \sum_{k = 1}^{\pi (\sqrt x)} (-1)^k \sum_{p_1 < \cdots < p_k \leq \sqrt x} |A(p_1p_2\cdots p_k)|\\ &= \sum_{d \mid P(\sqrt x)} \mu(d) |A(d)| \end{aligned} Brun's first idea is to reduce the number of terms in the sum $$\sum\limits_{d \mid P(\sqrt x)} \mu(d) |A(d)|$$ by reducing the number of sifting primes. The simplest way to do this is to replace the condition $$p \leq \sqrt x$$ by $$p \leq z$$, where $$z\ll \sqrt x$$. So let $$P = \{p \colon p \leq z\}$$, $$P(z) = \prod\limits_{p \leq z} p$$. Let $$A\subseteq \mathbb{N}$$ be a finite set. Then \[\begin{align} |S(A,P)| &= |\{n\in A \colon (n,p) = 1 \text{ for all } p\in P\}|\\ &= |\{n\in A \colon (n,P(z)) = 1\}| = |S(A,P(z))|.\end{align}

By the inclusion-exclusion principle

$\begin{equation} |S(A,P)| = \sum_{d \mid P(z)} \mu(d) |A(d)| \tag{$$i$$} \end{equation}$

where $$A(d) = \{n \in A \colon d \mid n\}$$.

The inclusion-exclusion formula $$(i)$$ is based on the identity $$\sum\limits_{k = 0}^n (-1)^k {n \choose k} = (1 - 1)^n = 0$$.

This identity is a special case of the more general identity

$\begin{equation} \sum_{k = 0}^m (-1)^k {n \choose k} = (-1)^m {n - 1 \choose m}\tag{$$ii$$}\end{equation}$

for all $$0 \leq m \leq n$$.

This identity implies that the sum on the left-hand side of $$(ii)$$ is $$\geq 0$$ for even $$m$$ and $$\leq 0$$ for odd $$m$$.

The second idea of Brun is to utilize the alternation of sign for the sum $$(ii)$$.

In fact, it implies the following extension of $$(i)$$:

\begin{equation} \begin{aligned} &|S(A,P)| \leq \sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t}} \mu(d) |A(d)|\\ &|S(A,P)| \geq \sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t - 1}} \mu(d) |A(d)|. \end{aligned}\tag{$$iii$$} \end{equation}

(Here $$\nu(d) =$$ number of prime factors of $$d$$, and $$t$$ is any natural number.)

Let $$A = \{ n(n + 2) \colon n \leq x - 2\}$$, $$|A(d)| = |\{ n(n + 2) \colon n \leq x - 2, d \mid n(n + 2)\}|$$. Let $$\omega(d) = |\{ n \colon 0 \leq n < d, d \mid n(n + 2)\}|$$.

Then for $$\mu(d) \neq 0$$, i.e. $$d$$ is square-free, we get

$\begin{equation} \left|A(d) - \omega (d) \frac{x}d \right| \leq \omega(d) \leq 2^{\nu(d)}. \tag{$$iv$$} \end{equation}$

Clearly, $$\omega(2) = 1$$ and $$\omega(p) = 2$$ for $$p \neq 2$$.

By the Chinese Remainder Theorem $$\omega$$ is multiplicative, i.e. $$\omega(mn) = \omega(m) \omega(n)$$ if $$(m,n) = 1$$.

\begin{aligned} A &= \{n (n + 2) \colon n \leq x - 2\} \qquad P = \{p \leq z\}, \qquad P(z) = \prod\limits_{p\leq z} p\\ A(d) &= \{ n(n + 2) \colon n \leq x - 2, d \mid n (n + 2)\}\\ |S(A,P)| &= |\{n (n + 2) \in A \colon (n(n + 2), P(z)) = 1\}|\\ &= |\{n \colon n \leq x - 2, (n,p) = (n + 2,p) = 1 \text{ for all } p \leq z\}|\0.5cm] \hline \end{aligned} By $$(iii)$$ and $$(iv)$$ we get \[ |S(A,P)| = x \sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t}} \mu(d) \frac{\omega(d)}d + O\bigl(\sum_{\substack{d \mid P(z)\\ \nu(d) = 2t}} |A(d)|\bigl)

for any choice of $$t$$. This implies the following:

$\begin{equation} |S(A,P)| = x\sum_{d \mid P(z)} \mu(d) \frac{\omega(d)}d + O \bigl(\overbrace{x \sum_{\substack{d \mid P(z)\\ \nu(d) \geq 2t}} \frac{\omega(d)}d}^{E_1}\bigl) + O\bigl(\overbrace{\sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t}} 2^{\nu(d)}}^{E_2}\bigl). \tag{$$v$$} \end{equation}$

Theorem
The number $$N = |S(A,P)|$$ of natural numbers $$n$$ such that $$n \leq x - 2$$ and such that both $$n$$ and $$n + 2$$ are free of prime factors up to $$z$$ satisfies

$N \sim \alpha \frac{x}{\log^2 z}$

for some constant $$0 < \alpha < \infty$$, and where the asymptotic relation holds as $$x,z \to \infty$$ in the region $$z\leq x^{1/(20 \log\log x)}$$.

Corollary
$$\pi_2(x) = O \bigl( \frac{x(\log\log x)^2}{\log^2 x}\bigl)$$. In particuar, the sum $$\sum \frac1p$$ for primes $$p$$, with $$p + 2$$ also prime, is finite or converges.

Proof
Set $$z = x^{1/(20\log\log x)}$$ in the theorem.

### Sketch of proof of Theorem

The main term $$x \sum\limits_{d \mid P(z)} \mu(d) \frac{\omega(d)}d$$ of $$(v)$$ can be computed, using the fact that $$d \to \frac{\omega(d)}d$$ is multiplicative:

\begin{align} x \sum_{d \mid P(z)} \mu(d) \frac{\omega(d)}d &= x \prod_{p \leq z} \left(1 - \frac{\omega(p)}p\right) = \frac{x}2 \prod_{2 < p \leq z} \left(1 - \frac2p\right)\\ &\leq \frac{x}2 \prod_{p \leq z} \left(1 - \frac1p\right)^2 = \frac{x}2 \biggl(\prod_{p \leq z} \left(1 - \frac1p\right)\biggl)^2\\ &\to \beta\frac{x}{\log^2 z}\end{align}

for some $$\beta > 0$$, as $$z \to \infty$$. To finish the proof it is sufficient to show that $$E_1 + E_2 = o\bigl(\frac{x}{\log^2 z}\bigl)$$.

To estimate the error terms $$E_1$$ and $$E_2$$ in $$(v)$$, we choose first $$t = [5 \log\log z]$$. By using $$\omega(d) \leq 2^{\nu(d)}$$ and the multinomial theorem, we get for $$E_1$$:

\begin{align} E_1 &= x \sum_{\substack{d \mid P(z)\\ \nu(d) \geq 2t}} \frac{\omega(d)}d \leq x \sum_{l \geq 2t} 2^l \sum_{\substack{d \mid P(z)\\ \nu(d) = l}} \frac1d\\ &\leq x \sum_{l \geq 2t} \frac{2^l}{l!} \biggl(\sum_{p \leq z} \frac1p\biggl)^l \leq x\sum_{l \geq 2t} \frac1{l!} (2\log\log z + 2c)^l.\end{align}

(In the last inequality we use a weak form of Mertens theorem: $$\sum\limits_{p \leq z} \frac1p \leq \log\log z + c$$.)

The terms in this last sum are decaying at least geometrically with a common ratio bounded below $$1$$, so that the sum is majorized by its first term.

Thus

\begin{align} E_1 &= O \left(\frac{x}{[2t]!} [2\log\log z + 2c]^{2t}\right) = O\left(x \left[\frac{2e\log\log z + 2ce}{2t}\right]^{2t}\right)\\ &= O\left(x \left(\frac15e\right)^{10\log\log z}\right) = O\left(\frac{x}{\log^6 z}\right).\end{align}

The majorization of $$E_2$$ is easier. We have

$E_2 = \sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t}} 2^{\nu(d)} \leq 2^{2t} \sum_{\substack{d \mid P(z)\\ \nu(d) \leq 2t}} 1 = 2^{2t} \sum_{l = 0}^{2t} {\pi(z) \choose l} \leq 2^{2t} \pi(z)^{2t}$

For large $$z$$, we have $$\pi(z) \leq \frac12 z$$ (in fact, $$z \geq 8$$ will do), so that

$E_2 \leq z^{2t} = O(x^{1/2}).$

This finishes the proof of the theorem.

### Final remark

It it conjectured that

$\pi_2(x) \sim c\frac{x}{\log^2 x}$

where

$c = 2\prod_{p > 2} (1 - (p - 1)^{-2}) \approx 1.320323$

is called the twin prime constant.

### References

V. Brun, Über das Goldbachsche Gesetz und die Anzahl der Primzahlpaare, Archiv for Matematik og Naturvidenskab, Bind 34 (1915), 3–19.

V. Brun, Om fordelingen av primtallene i forskjellige tallklasser. En øvre begrænsning, Nyt Tidsskrift for Matematik 27B (1916), 45–58.

V. Brun, La série $$\frac15 + \frac17 + \frac1{11} + \frac1{13} + \frac1{17} + \frac1{19} + \frac1{29} + \frac1{31} + \frac1{41} + \frac1{43} + \frac1{59} + \frac1{61} + \cdots$$ où les dénominateurs sont “nombres premiers jumeaux” est convergente ou finie, Bull. Sci. Math. 43 (1919), 100–104, 124–128.

V. Brun, Le crible d'Ératosthène et le théorème de Goldbach, C.R. Math. Acad. Sci. Paris 168 (1919), 544–546.

V. Brun, Le crible d'Ératosthène et le théorème de Goldbach, Christiania Videnskap. Selsk. Skr. (1920), 36 pp.

J. Merlin, Sur quelques théorèmes d'Arithmétique et un énoncé qui les contient, C.R. Math. Acad. Sci. Paris 152 (1911), 516–518.

J. Hadamard, Un travail de Jean Merlin sur les nombres premiers, Bull. Sci. Math. 39 (1915), 121–136.

H. Halberstam, H.-E. Richert, Sieve Methods, LMS Monograph, Academic Press (1974).

C. Pomerance, A. Sárközy, Combinatorial Number Theory, Chapter 20, Handbook of Combinatorics, Elsevier (1995) (edited by Graham, Grötschel and Lovász).

G. Greaves, Sieves in Number Theory, Vol. 43, Ergebnisse der Mathematik, Springer (2001).

P. Pollack, Not Always Buried Deep, AMS (2009).

J. Friedlander, H. Iwaniec, Opera de Cribo, AMS Colloquium Publ., Vol. 57 (2010).