Zim + Teemo for Quanta Magazine

91

Two mathematicians have uncovered a simple, previously unnoticed property of prime numbers — those numbers that are divisible only by 1 and themselves. Prime numbers, it seems, have decided preferences about the final digits of the primes that immediately follow them.

Among the first billion prime numbers, for instance, a prime ending in 9 is almost 65 percent more likely to be followed by a prime ending in 1 than another prime ending in 9. In a paper posted online today, Kannan Soundararajan and Robert Lemke Oliver of Stanford University present both numerical and theoretical evidence that prime numbers repel other would-be primes that end in the same digit, and have varied predilections for being followed by primes ending in the other possible final digits.

“We’ve been studying primes for a long time, and no one spotted this before,” said Andrew Granville, a number theorist at the University of Montreal and University College London. “It’s crazy.”

The discovery is the exact opposite of what most mathematicians would have predicted, said Ken Ono, a number theorist at Emory University in Atlanta.  When he first heard the news, he said, “I was floored. I thought, ‘For sure, your program’s not working.’”

This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers. Most mathematicians would have assumed, Granville and Ono agreed, that a prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9 (the four possible endings for all prime numbers except 2 and 5).

“I can’t believe anyone in the world would have guessed this,” Granville said. Even after having seen Lemke Oliver and Soundararajan’s analysis of their phenomenon, he said, “it still seems like a strange thing.”

Yet the pair’s work doesn’t upend the notion that primes behave randomly so much as point to how subtle their particular mix of randomness and order is. “Can we redefine what ‘random’ means in this context so that once again, [this phenomenon] looks like it might be random?” Soundararajan said. “That’s what we think we’ve done.”

Prime Preferences

Soundararajan was drawn to study consecutive primes after hearing a lecture at Stanford by the mathematician Tadashi Tokieda, of the University of Cambridge, in which he mentioned a counterintuitive property of coin-tossing: If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.

Waheeda Khalfan

Kannan Soundararajan, left, and Robert Lemke Oliver of Stanford University in February.

Soundararajan wondered if similarly strange phenomena appear in other contexts. Since he has studied the primes for decades, he turned to them — and found something even stranger than he had bargained for. Looking at prime numbers written in base 3 — in which roughly half the primes end in 1 and half end in 2 — he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1. Likewise, a prime ending in 2 prefers to be followed a prime ending in 1.

Soundararajan showed his findings to postdoctoral researcher Lemke Oliver, who was shocked. He immediately wrote a program that searched much farther out along the number line — through the first 400 billion primes. Lemke Oliver again found that primes seem to avoid being followed by another prime with the same final digit. The primes “really hate to repeat themselves,” Lemke Oliver said.

Lemke Oliver and Soundararajan discovered that this sort of bias in the final digits of consecutive primes holds not just in base 3, but also in base 10 and several other bases; they conjecture that it’s true in every base. The biases that they found appear to even out, little by little, as you go farther along the number line — but they do so at a snail’s pace. “It’s the rate at which they even out which is surprising to me,” said James Maynard, a number theorist at the University of Oxford. When Soundararajan first told Maynard what the pair had discovered, “I only half believed him,” Maynard said. “As soon as I went back to my office, I ran a numerical experiment to check this myself.”

Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.

But the pair of mathematicians soon realized that this potential explanation couldn’t account for the magnitude of the biases they found. Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7. To explain these and other preferences, Lemke Oliver and Soundararajan had to delve into the deepest model mathematicians have for random behavior in the primes.

Random Primes

Prime numbers, of course, are not really random at all — they are completely determined. Yet in many respects, they seem to behave like a list of random numbers, governed by just one overarching rule: The approximate density of primes near any number is inversely proportional to how many digits the number has.

In 1936, Swedish mathematician Harald Cramér explored this idea using an elementary model for generating random prime-like numbers: At every whole number, flip a weighted coin — weighted by the prime density near that number — to decide whether to include that number in your list of random “primes.” Cramér showed that this coin-tossing model does an excellent job of predicting certain features of the real primes, such as how many to expect between two consecutive perfect squares.

Despite its predictive power, Cramér’s model is a vast oversimplification. For instance, even numbers have as good a chance of being chosen as odd numbers, whereas real primes are never even, apart from the number 2. Over the years, mathematicians have developed refinements of Cramér’s model that, for instance, bar even numbers and numbers divisible by 3, 5, and other small primes.

These simple coin-tossing models tend to be very useful rules of thumb about how prime numbers behave. They accurately predict, among other things, that prime numbers shouldn’t care what their final digit is — and indeed, primes ending in 1, 3, 7 and 9 occur with roughly equal frequency.

Yet similar logic seems to suggest that primes shouldn’t care what digit the prime after them ends in. It was probably mathematicians’ overreliance on the simple coin-tossing heuristics that made them miss the biases in consecutive primes for so long, Granville said. “It’s easy to take too much for granted — to assume that your first guess is true.”

The primes’ preferences about the final digits of the primes that follow them can be explained, Soundararajan and Lemke Oliver found, using a much more refined model of randomness in primes, something called the prime k-tuples conjecture. Originally stated by mathematicians G. H. Hardy and J. E. Littlewood in 1923, the conjecture provides precise estimates of how often every possible constellation of primes with a given spacing pattern will appear. A wealth of numerical evidence supports the conjecture, but so far a proof has eluded mathematicians.

The prime k-tuples conjecture subsumes many of the most central open problems in prime numbers, such as the twin primes conjecture, which posits that there are infinitely many pairs of primes — such as 17 and 19 — that are only two apart. Most mathematicians believe the twin primes conjecture not so much because they keep finding more twin primes, Maynard said, but because the number of twin primes they’ve found fits so neatly with what the prime k-tuples conjecture predicts.

In a similar way, Soundararajan and Lemke Oliver have found that the biases they uncovered in consecutive primes come very close to what the prime k-tuples conjecture predicts. In other words, the most sophisticated conjecture mathematicians have about randomness in primes forces the primes to display strong biases. “I have to rethink how I teach my class in analytic number theory now,” Ono said.

At this early stage, mathematicians say, it’s hard to know whether these biases are isolated peculiarities, or whether they have deep connections to other mathematical structures in the primes or elsewhere. Ono predicts, however, that mathematicians will immediately start looking for similar biases in related contexts, such as prime polynomials — fundamental objects in number theory that can’t be factored into simpler polynomials.

And the finding will make mathematicians look at the primes themselves with fresh eyes, Granville said. “You could wonder, what else have we missed about the primes?”

• Noel Gardeli says:

This is amazing and very curious! Prime numbers are really fascinating.

Thank you for this beautiful expository article. I wish I could write more in this comment, but I'm in a hurry because I'm now going to try to read the arXiv prepint!

• jose says:

this was not obvious before? even with small primes, you can notice some sort of weird pattern between the prime numbers themselves. one would have thought that mathematicians looked at these weird patterns once and once again…

• Lol says:

They used the decimal system, right? How anthropomorphic. I conducted the same research using the binary system and discovered that primes larger than 10 have the probability of 110 0100% to have their last digit equal to 1. Surely this can't be random.

• Will Prescott says:

Does this have any impact on the algorithms used for encryption?

• Richard says:

If one is seeking a subtle mix of randomness and order, one could do far worse than to have recourse to the concept of a * spin glass * ….

• Odyssios Redux says:

As I read the first paragraph, I wondered if the effect was an artifact of using 10 as the number representation base. Apparently not, as I soon found out.

Each prime, in isolation, is an object which seems to have no remarkable properties other than impoverished divisibility. But taken as a set, there seem to be strange patterns, or hints of patterns, or … some kinds of properties that only become apparent the more of them one has to play with. Of course, being a countably infinite set, we'll never have the whole lot at once, and so may never discover everything there is to be known about them. Or know whether we have or not!

So I wonder, if there are within the primes, links analogous to 'entanglement' between particles in quantum mechanics? This is an analogy, neither assertion or denial of any property of quantum mechanical systems! The Riemann hypothesis (if proved) would be an indication of some such link. I'll stay tuned.

• Xinke Guo-Xue says:

This was noted for mod 4 quite a bit earlier https://www2.bc.edu/~ashav/Papers/ABGS-PrimePairsFinal.pdf. I'm surprised no one checked mod 10.

• Richard says:

While we're on the subject of coy conspiracies, one cannot help but observe that — had he lived — today is the day on which Albert Einstein would have turned * 137 * — a number whose siren-song appeal for physicists* is a matter of long (and strange) standing, indeed.

* In addition to the speculated significance of its relationship to the fine-structure constant — of which line of "reasoning", I know but little (and understand less) — it also happens to be the number of the hospital room in which Wolfgang Pauli died!

• Andrej Lesko says:

If we accept the fact of "being prime number" as "probability event" in strict mathematical sense, then the above mentioned result is not surprising, because in the first case -A-( both consecutive primes ending with same digit ) is unconditional event and in the second case -B- ( prime ending with some digit (say 9)is followed by prime ending with same digit) we are dealing with conditional event and consequently with conditional probability for such event. Both probabilities can not be the same, unless both events are independent, and they are evidently not. I think that the catch is in proposition, that being prime number is " well defined " probabilistic event in some probability space. But on the other hand we can from such proposition ( because probabilities are in this case computable ) backward "compute" how much our probabilistic model correspond to real situation.

• wally says:

"But taken as a set, there seem to be strange patterns"
Of course. As the article says – primes are not random, they are fully determined.
However, the patterns for predicting that determination grow tremendously complex as the size of each prime increases, so the practicality, not the possibility, of calculating them by pattern becomes the obstacle.

• Doug Smith says:

How irritating! After spending months writing code and identifying this pattern (primes following other primes 9 after 3 etc.) I tried sharing it with a mathematician. He convinced me last summer that many people "smart than I am" have looked at this and I would never get anywhere. Because I didn't go to college and don't share his P.H.D. in math. I hope he hears about this news wherever he is now.

• Joshua Williams says:

When will math people stop using the word "random"? There is no such thing. We don't have to redefine random. We have to understand that just because we can't tell or predict an outcome (due to our own limitations) doesn't make it random.

• Freecloud says:

I'm 32, and I discovered this in middle school. It's too obvious!

• John W says:

Actually, Doug Smith, you'll find (tongue in cheek) that the Mathematician you spoke to got the idea from you, and published his study shortly thereafter. He had to use a pseudonym so you wouldn't realize that he got the idea from you.

• Joe D says:

I don't understand this paragraph:
—————
he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1. Likewise, a prime ending in 2 prefers to be followed a prime ending in 1.
———-

"A prime ending in 2" ??? The only prime number that ends in 2 is 2. Every other number that ends in 2 is even and greater than 2 and therefore not prime… right?

• Buczo says:

Doug Smith : I was taking part in a 20 people IT project on my Technical University that lasted for three years. Our professor was acting as a leader in this project. During the first month of this 3 year journey I tried to raise how absurd our data-to-rdbms mapping is and how to make it better. Nobody was listening – as I was the student with lower grades than other that were taking part in this project.

After two years my professor announced that together with top-tier students he figured out a way to make our code/rdbms mapping better. It was EXACTLY what I was proposing 2 years before.

• Lewis Robinson M. D. says:

Great article — it's why I love this site

• Ken B says:

Holy FSM! ALL but one primes end with '1' in base 2!

• Quiet Desperation says:

Not sure why they expect randomness.

Look at it from the other direction. The non-primes are 2, 4, 6, 8… and 3, 6, 9, 12… and so on with very simple numeric series. Since the primes are the spots where these regular and orderly patterns miss, why would yo expect randomness? You can define the boundaries, so to speak, of the prime realm by using simple series of numbers. Have I just gotten too abstract?

Makes me think of fractals. The math is very simple, but the results infinitely complex. However, they are in no way random.

• NOEMI DIVINAGRACIA says:

IT is a great discovery and research.Thanks to the patience and dedication of today's young mathematicians.

• Scott V says:

@Joe D: Earlier in the paragraph they explain that they did the test initially in base 3, that is, using only the digits 0, 1, and 2. Thus 3 and 4 in our decimal system would be written 10 and 11 and so forth.

• dodo says:

I have even greater discovery – in binary number system, all primes (except one) end with … 1 !

• brian dash says:

Joe D: the passage you mention was describing the pattern under base 3.

• Dr Leota Tucker says:

As a few others have noted….how can a prime end in 2?????? No prime ends in 2 but 2.

• Joe Thomas says:

Joe D wrote:
"A prime ending in 2" ??? The only prime number that ends in 2 is 2. Every other number that ends in 2 is even and greater than 2 and therefore not prime… right?

That's true in base 10, but not in base 3.
10 base 3 is 3 decimal (prime)
12 base 3 is 5 decimal (prime)

The reason you can tell by looking at the last digit whether a number written in base 10 is divisible by 2 or 5 is because those are the factors of the base (10). Since 3 doesn't have any factors, the last digit of a number written in base 3 doesn't tell you anything about any factors it might have (other than 3).

• Howard Brown says:

to: Joe D
I think prime final digits of 1 or 2 study you are referring to is for base 3. Yes, in base 3, a prime can have a final digit of 2. Example: 11 (base 10) equals 102 (base 3)

• Newcomb says:

For Joe D –
When written in ternary (base 3), every prime, except 3,
ends in 1 or 2.
Ending in 1: 7, 13, 19, 31, etc
Ending in 2: 2, 5, 11, 17, 23, 29, etc

• jerodast says:

Joe D: He was talking about those numbers represented in base-3 (trinary). For instance, 12 (trinary) = 1*3 + 2*1 = 5 (decimal). Since all digit places have odd values (1, 3, 9, 27, etc), every increment to ANY digit flips the parity (evenness) of the number, so half the trinary numbers ending in 2 are even, and half are odd.

• AY says:

Joe D:
It's using base 3, so 11 in base 10 for instance becomes 102 in base 3. All numbers are going to end in 0,1, or 2 and anything ending in 0 would be divisible by 3, just how in base 10 ending in 0 would be divisible by 10. This would leave just 1 and 2 as potential endings for primes.

• M Kent says:

Joe D:
"A prime ending in 2" ??? The only prime number that ends in 2 is 2. Every other number that ends in 2 is even and greater than 2 and therefore not prime… right?

You've missed right above in the paragraph, where it says they're looking at primes written in BASE 3. Where 5 would be "12", 17 "32" etc.

• Stephen Haggett says:

Benfords Law. 65% sounds right for 4 possibilities. Correct me if I'm wrong.

• Martin Richardt says:

I can not reproduce the claim about "counterintuitive property of coin-tossing" (try this at home). As I understand it you stated that Alice needed in average 4 coin tosses to see the sequence of Head, Tail and Bob needed 6 tosses to see Head, Head. This is really counterintuive to my beliefs, so I wrote a little python script to reproduce a similar result. The calculated relative probalities are approximately 1/4 which are not counterintuitive. So what did I wrong?

# begin script

import random

alice, bob = 0, 0

head = lambda x: x < 0.5
tail = lambda x: not head(x)

c1 = random.random()
N = 1000000
for _ in xrange(N):
c2 = random.random()
# head = < 0.5, tail >= 0.5
alice += 1

bob += 1

c1 = c2

print(bob, alice)
print("relative probability bob=%f alice=%f" %(bob*1.0/N, alice*1.0/N))

# end script

It outputs:

(248930, 250123)
relative probability bob=0.248930 alice=0.250123

• Shayne M Konar says:

Wasn't this already on an episode of Big Bang Theory? Sheldon figured it out!

• Phil Freedom says:

This conspiracy is a function of the self-organization of numbers based on high compositability. For every 12 numbers there exists a possibility for 4 numbers to be prime. Imagine the 12 positions of the clock. Primes appear in pairs on both sides of 6 and 12. (5/7), (11/13), (17/19), and the next pair of where the twin primes would be is (23/25), but 5 (the first of the normal primes, considering the unique properties of 2 And 3) "knocks out" the 1st twin as its square. Multiples of 5, 7, 11, 13 begin to intersect with the twin prime fields and begin reducing the amount of twins. 35/49/55/65 etc.
Check this visual graphic I made to demonstrate how numbers are organized…

• jim flee says:

I believe the example for expected time to Head-Head and Head-Tail is reversed. Consider 4 tosses of a coin. The number of Tails can be from 0 to 4. The numbers of possible ways for each of these outcomes are: 0 -> 1, 1 -> 4, 2 -> 6, 3-> 4, 4 -> 1. (ie, C(4, 0), C(4, 1), C(4, 2), etc., for 16 total possibilities (ie, 2^4 (ie, the number of subsets of a 4-element set))). Of the 1 way to get 4 Heads, HH occurs in it. Of the 4 ways to get 3 Heads, HH occurs in all 4. Of the 6 ways to get 2 Heads, HH occurs in 3 of them. (Of the 4 ways to get 1 Heads, HH occurs in none of them and likewise for the one way to get zero Heads.) 1 + 4 + 3 = 8. 8/16 = 50%.

• D Brutlag says:

Some folks here are missing what the essence of being a prime means. Prime has to do with the geometry of the number. A number is prime if and only if there is no rectangular array for the number except 1 by something. Therefore the whether a number is prime or not is independent of the base in which it is written. For example, eleven is prime no matter which base it is written in.
The author simply used base 3 as a tool to explore this new, interesting property of primes.
Thanks for this well written article.
Been a long time since I've seen anything new about primes.

• MathGuy says:

Joe D: They were talking about the base 3 expansion of the number. In base 3, a number ending in 2 is not necessarily even. For example, 5 written in base 3 is 12.

• marcela says:

For example, there are, in some sense, "much more" "zeros" than "ones" of prime rank in the Thue-Morse sequence
http://arxiv.org/abs/0803.0852

• Bill says:

Joe D: That part is referring to primes expressed in base 3 – where every number ends in 0, 1, or 2.

• JS says:

Martin: your program calculates, for a long sequence of coin tosses, how many occurrences of HH there are versus occurrences of HT – which will indeed be about 1/4 each. However, the 'counterintuitive' phenomenon is that the expected number of tosses until the first occurrence of HH is more than until the first occurrence of HT. I think it's actually easier to compare HH and TH; then in any sequence of tosses, the only way HH can occur before TH is if the first two tosses are HH — this is the source of the bias towards seeing TH first. If you're not convinced you can theoretically calculate the probabilities!

• Clint Y says:

The "pattern" is due to the fact we use BASE10.

Change the base and the pattern changes.

• Anon says:

Martin: The experiment is to measure how many flips are necessary to reach a sequence of heads-heads. Your script is measuring how many heads-heads sequences there are in N flips. These are different things. Try writing a script to follow the experiment more literally and you will see the counterintuitive effect.

The explanation is because tails is completely useless to Bob, whereas tails is sometimes useful to Alice. For both people's first flip, tails is useless and heads is useful. But for the second flip, Bob is penalized harder for flipping the wrong result, because tails is completely useless to him and he needs another heads to begin the sequence again. However, Alice loses less for flipping a heads second, because that heads can begin a new sequence.

• Super Hans says:

Several times the phrase "a prime ending in 2" is used. Maybe I'm tired, but isn't the only prime ending in 2, 2 itself?

• Mike says:

@ Will Prescott "Does this have any impact on the algorithms used for encryption?"

No. Encryption using prime numbers relies on how difficult it is to factorize the product of two prime numbers. The product of the two prime numbers is a public key, while the primes are the private key. Finding the private key primes with only the product (a.k.a. factorization) is non-trivial and the bigger the primes (1024, 2048 ect) the more painful and energy intensive the decryption.

There are theoretically infinite prime numbers, hard to prove infinity and all that. Over a billion of them are availiable online so predicting how many end in a certain number, while interesting doesn't solve factorization, or make it any quicker.

• Daniel Fernandes says:

Anon: Your explanation makes it very clear. Thanks!

• Erica Klarreich says:

Xinke Guo-Xue — Thanks for pointing out that paper!

In the discussions that have ensued today, I've heard about two papers that each presented some of the numerical findings that the two Stanford researchers have just analyzed: one by Ko in the May, 2002 issue of Chaos, Solitons and Fractals, and the one you cited by Ash, Beltis, Gross and Sinnott in volume 20 (2011) of Experimental Mathematics, which also made a preliminary mathematical analysis of the numerical data (although as I understand it, their analysis doesn't go as far as that in the new paper).

It's interesting that these earlier papers didn't percolate through the mathematics community to attract the notice of number theorists such as Granville and Maynard, who have studied prime patterns very closely. Such gaps in communication were common in mathematics hundreds of years ago, when news traveled slowly, and even 50 years ago, when mathematicians in the Soviet Union and the United States were often unaware of each others' contributions. Today, it probably stems mainly from the explosion in the number of published papers.

• MIke S says:

I'd like to see analyses of primes in several different bases, not just base 10, I think we'd learn a lot that way.

• Tom Purcell says:

For the coin tossing experiment, this is how I reason it. Correct me if my reasoning is wrong.
If A stands for the expected number of coin tosses before Alice gets HT , then the scenarios are:
Alice gets T on her first throw with probability 1/2 which adds 1 to the number of tosses, since she effectually has to start again.
Alice throws HH on her first two throws with probability 1/4. However, this makes no further impact on the average A as she is still in with a shout on the third toss.
Alice throws HT in her first two throws with probability 1/4
Then
A = (A + 1)/2 + A/4 + 2*1/4
= 3A/4 + 1
A/4 = 1
A = 4

Let B stand's for Bob's expected number of throws. The scenarios are:
Bob throws T on the first throw with probability 1/2 which adds 1 to the number of tosses
Bob throws HT with probability 1/4 which adds 2 to the number of tosses
Bob throws HH with probability 1/4 in the first two throws.
Then
B = (B + 1)/2 + (B + 2)/4 + 2 * 1/4
= 3B/4 + 3/2
B/4 = 3/2
B = 6

I realise that this is off the main topic, but the exhortation to "try it at home" persuaded me put in my two-penn'orth as it were. As for the observation of prime numbers ending in a particular digit having a preponderance to be followed by a prime ending in another particular digit, might this be at least partially explained by looking at the relative densities of composite integers ending in the digits 1, 3, 7 and 9, say up to the billionth prime number? Note that this suggestion is just off the top of my head; for all I know the densities may well be roughly equal and thus bear no correlation at all.

• Robert Shriner says:

As I read this article and sense some enthusiasm towards finding anything that will help to understand the problem of the Prime. I believe that there are many people out here, like myself, who have worked hard, for many years, on trying to under stand or find a direct formula pattern to give an up or down answer as to being a prime and all have failed but once in a while one idea shakes the tree and something falls out. [“We’ve been studying primes for a long time, and no one spotted this before,” said Andrew Granville, a number theorist at the University of Montreal and University College London. “It’s crazy.”

The discovery is the exact opposite of what most mathematicians would have predicted, said Ken Ono, a number theorist at Emory University in Atlanta. When he first heard the news, he said, “I was floored. I thought, ‘For sure, your program’s not working.’”

This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers. Most mathematicians would have assumed, Granville and Ono agreed, that a prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9 (the four possible endings for all prime numbers except 2 and 5).]. Isn't this inspiring.

• Hal Heinrich says:

I fooled around with this for primes under a million. The data shows the number of matching final digits of consecutive primes versus the number expected, if the final digits were random.
In radix 31, we see less than 1% of the expected number of matches. I would have posted a graph of that was permitted.

Here's the data:
3 0.836257086
5 0.614118456
7 0.460638799
9 0.445896877
11 0.308713545
13 0.193412754
15 0.205292866
17 0.11620647
19 0.077294563
21 0.073703123
23 0.043733673
25 0.027525391
27 0.032113775
29 0.019268756
31 0.006117221
33 0.012234754
35 0.006729286

• Tim Russell says:

So is Terry Tao crying or hard at work?

• Martin Richardt says:

@Anon:
Thanks for the explanation. Now I get it.

• vonjd says:

Concerning encryption I posted a question on MO: http://mathoverflow.net/questions/233633/could-this-unexpected-bias-in-the-distribution-of-consecutive-primes-have-any-im

• tz says:

Coins – different state machines with 2 steps
HH: 0u<->1->2 the first head gets to 1, but a tail resets to 0
HT: 0u->1u->2 the first head gets to 1, heads stays so the first tail after any number of heads wins. T[n]HH vs T[n]H[n>0]T

• Alexey Mints says:

Twin primes (https://en.wikipedia.org/wiki/Twin_prime) are well known, so the fact that prime ending 9 and prime ending 1 are common is not surprising at all.
Broadly speaking, it is not surprising that they find patterns in prime numbers in general – stochastic != random

• peter says:

this is not that big news Stanislaw Ulam showed in 1963 that there are patterns

https://en.m.wikipedia.org/wiki/Ulam_spiral

• Timon Krebs says:

I could not believe it too, so i wrote this c# programm. Here you can see that it really is as mentioned in this article.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace CoinToss
{
class Program
{
static void Main(string[] args)
{
Random rnd = new Random();
List<int> a = new List<int>();
List<int> b = new List<int>();

bool remember = false;
int count = 0;

//head = 1, tail = 2
for (int i = 0; i < 100000; i++)
{
count = 0;
remember = false;
while (true)
{
count++;
if (rnd.Next(1, 3) == 1)
{
remember = true;
}else if (remember)
{
break;
}
}
}

remember = false;

for (int i = 0; i < 100000; i++)
{
count = 0;
remember = false;
while (true)
{
count++;
if (rnd.Next(1, 3) == 1)
{
if (remember)
{
break;
}
remember = true;

}else { remember = false; }
}
}

Console.WriteLine("a = " + a.Average());
Console.WriteLine("b = " + b.Average());
}
}
}

• Susanne says:

Hm… Each of the four endings is equally likely, right?
That would imply that the average distance between two primes ending in 1, the average distance between two primes ending in 3, etc, are equal.
But if the average distance between two primes ending in the same digit is n, once you're "at" the location of a certain prime you are distance n away from the next prime ending in the same digit. But, often, your location is already "along the route" (i.e. less than n) towards a prime ending in another digit.
Attempt at visual:
—1———–1—-1——–1———–
———3—-3———–3——–3—–
So then, it wouldn't be surprising that the percentage of consecutive primes ending in the same digit is relatively low. Help? 🙂

• David Andrews says:

GH Hardy-a mathematician's mathematician

If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.

• Michel says:

Say it takes Alice and Bob equally long to throw the first Head. If Alice fails, (i.e. throws a Head), she can succeed at the next toss. If Bob fails (i.e. throws a Tail), he will need at least 2 more tosses. Do you really want a formal proof?

• Tim Russell says:

This article begs a question. Has anyone ever done an intensive study on the frequency and distribution of primes in different bases? For example, bases 1-19, multiples of 10 (20, 30…90) and in exponential bases of 10 (100, 1000, 10,000), then look for patterns, expected or not, within the bases, and finally compare the results between the different bases. Obviously, this is a enormous computational exercise, and how far out on the number line you go, your collective computing power and time would determine the would determine the parameters. But it could yield some very interesting results and insight.

• William Oliver says:

Among the first billion prime numbers, for instance, a prime ending in 9 is almost 65 percent more likely to be followed by a prime ending in 1 than another prime ending in 9.

First, consider the other properties of prime numbers. There are infinite prime numbers among the natural numbers. So by comparing the first billion primes numbers you have only sampled the smallest of fractions. if 0.999~ = 1 then 0.000~ = 0

Any perceived patterns are an attribute of True Randomness. For something to be random it needs to have every possible size pattern in infinite amounts, at infinite random intervals, to an infinite size number.

Changing the definition of randomness does not effect the definition of primes in any way. They showed that for specific parameters there is an uneven distribution of the 4 digits that every prime number ends in.

Now, that being said this is not entirely useless. How long this 'pattern' continues for may pertain to modern cryptography. Perhaps something that should be looked into further.

• Matt says:

Someone asked why randomness would be expected. Someone else said we should stop trying to define randomness.
We expect randomness because we see it. We look at the first trillion primes (or the first trillion digits in pi) and note things we would expect from randomness. In pi we see equal representation of all ten digits. In primes we see equal representation of 1, 3, 7, and 9 as a final digit. If we select digits a random, make a very long list, then look at it, we expect certain things. We expect to sometimes see consecutive 9s for example. In fact, if we could randomly select a pair of adjacent digits in our list we'd expect the first one to be a 9 10% of the time and then we'd expect the next digit to be a 9 10% of those times. Thus, a pair of consecutive 9s should appear 1% of the time when we look at all pairs of consecutive digits in pi. This is exactly what happens. It is one of the many ways in which pi acts randomly. We cannot define random but we can describe properties randomness should reflect over the long haul.

• Aleksandar Razmovski says:

I tried at home
10k tries on pair: hh,tt,ht,th
oracle plsql dbms_random, got toad opened atm 🙂
<table border=0 width="100%">
<tr>
<th align="left" bgcolor="#C0C0C0" bordercolor="#FFFFFF">AVG</th>
<th align="left" bgcolor="#C0C0C0" bordercolor="#FFFFFF">ID</th>
</tr>
<tr> <td nowrap><div align=right>3,9951</div></td>
<td nowrap>ht</td>
</tr>
<tr> <td nowrap><div align=right>3,9908</div></td>
<td nowrap>th</td>
</tr>
<tr> <td nowrap><div align=right>5,889</div></td>
<td nowrap>hh</td>
</tr>
<tr> <td nowrap><div align=right>5,9804</div></td>
<td nowrap>tt</td>
</tr>
</table>

No comment….

How do Twin Primes fit in here ?

• David Scott Marley says:

I'm not getting this.

Once you get past 5, the possible primes end in 1, 3, 7, and 9.

So we take a largish prime, more or less at random; let's say it ends in 1. We check the next four possible primes, which end in 3, 7, 9, and 1 in that order. The primes are randomly distributed (well, distributed in a way that is indistinguishable from randomness), so the probability that each of these four is prime is p. (p is a decimal fraction between zero and one, like 0.1 or 0.35 or 0.879. Notice that this means that one minus p is also between zero and one; that's important, too

So the probability that the next higher prime is the one that ends in 3 is p.

But the probability that the next higher prime is the one that ends in 7 is NOT p, it’s p times the probability that 3 is NOT prime, or (p)(1 – p).

And the probability that the next higher prime is the one that ends in 9 is also not p, it’s p times the probability that 3 and 7 are BOTH not prime, or (p)(1 – p)(1 – p).

And the probability that the next higher prime is the next one that ends in 1 is (p)(1 – p)(1 – p)(1 – p).

If none of the next four possible primes is in fact prime, then we look at the next four, then the next four, and so on until we find the next higher prime.

On each level, whatever the probability is that the next higher prime ends in 3, the probability that it ends in 7 is a little bit less (because it’s multiplied by a factor of one minus p, which is between zero and one), the probability that it ends in 9 is still less, and the probability that it ends in 1 is still less.

(This is all a slight simplification, because primes gradually get more thinly distributed as the numbers get higher, so p is actually slightly smaller for each higher number. But this doesn't counter the decreases described in the previous paragraph, it amplifies them slightly.)

Since this is true on every level, it’s also true that when we add together all the probabilities for each final digit, the total probability that it’s 3 is higher than the total probability that it’s 7, which is higher than the total probability that it’s 9, which is higher than the total probability that it’s 1.

The argument is symmetrical for any of the four digits. So if we start with the assumption that the primes are andomly distributed, then we can deduce that the total probability that two consecutive primes will end in the same digit MUST be less than one-fourth.

Ergo, the fact that this turns out to be true of the actual primes is NOT in any way evidence that primes are not in fact randomly distributed (er, distributed in a way that is indistinguishable from randomness). Rather, it’s exactly what we would expect to find.

• Martin Richardt says:

@JS, @et al: Thanks for the explanation

• Greg Springer says:

The coin flipping example can be reproduced in R:
rm(list=ls())
n1 <- 0
n2 <- 0
N <- 10001
counter <- 1
outcomes <- NULL
outcomesLHH <- NULL
outcomesLHT <- NULL
brake <- F
coin <- c("H","T")

# HH
while(n1 < N){
counter <- 1
brake <- F
outcomes <- sample(coin,1)
repeat{
counter = counter + 1
outcomes = append(outcomes, sample(coin,1))
if(outcomes[length(outcomes)] == "H" && outcomes[length(outcomes)-1] == "H"){
outcomesLHH <- append(outcomesLHH, counter)
n1 = n1 + 1
brake = T
}
if(brake){break}
}
}

# HT
while(n2 < N){
counter <- 1
brake <- F
outcomes <- sample(coin,1)
repeat{
counter = counter + 1
outcomes = append(outcomes, sample(coin,1))
if(outcomes[length(outcomes)] == "T" && outcomes[length(outcomes)-1] == "H"){
outcomesLHT <- append(outcomesLHT, counter)
n2 = n2 + 1
brake = T
}
if(brake){break}
}
}

paste0("HH mean number of flips = ", mean(outcomesLHH))
paste0("HT mean number of flips = ", mean(outcomesLHT))

• Michael D. Moffitt says:

Heh… no wonder I thought these statistics looked familiar:

I do remember getting an A+ on that project, though. 🙂

• Michel says:

@David Scott Marley. I agree, but for p very close to zero: (1-p) p is approximately p. I think that is in the article, where it says: "Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime. "

• Confuzzled says:

I don't understand why this isn't expected.

The end digits of every integer with multiples of 2, 5 and 3 removed looks like
1, 7, 1, 3, 7, 9, 3, 9, 1, 7, 1…..

I assume numbers earlier in the sequence are more likely to belong to the prime 'next' to the prime ending in 1, as they are encountered first.
3 and 7 are the two numbers next to a 1, and overall earlier in the sequence in comparison to the 1, and so are going to be more likely to appear as the next prime.
9 is further away.
And 1, being on the inevitable end of each cycle of 8 from the previous 1 beginning the cycle, is the furthest away.

Giving points to each number in the sequence, starting from 8 points for a number immediately after a 1, down to 1 point for a number 8 places away from a 1, and granting places for both 1s in the sequence gives:

For 7: 8 + 5 + 7 + 2 points which is 30.5% of the total points
For 3: 6 + 3 + 8 + 5 points which is 30.5% of the total points
For 9: 4 + 2 + 6 + 4 points which is 22.2% of the total points
For 1: 7 + 1 + 3 + 1 points which is 16.6% of the total points

So although this is a very crude model, and there's probably a better way to assign a weight to the ordering of the possible primes, it is still a good approximation of the results that were observed.

I'd expect if the experiment were repeated but for another digit, you'd see the same contribution of the above sequence to the results of that experiment too.

So yeah, the next number that is not ruled out of being a prime by 2, 5 and 3 is indeed more likely to be the next prime than a number further down the number line.

• Jay Robertson says:

I quote you "Prime numbers, of course, are not really random at all… completely determined". Since when? So far, the only formulaic approach to calculate primes I have seen is at CodeProject. As others pointed out, I see little groundbreaking find that this article purports.
http://www.codeproject.com/Articles/875175/Prime-Number-Distribution-Sequence
Disclaimer – I have no affiliation to CodeProject or the author.

• David Jankoski says:

Thanks for the great article! I am curious though, about the counter-intuitive coin tossing property – can someone please point me to some material on this? Thanks and keep up the good work.

• Robert Shaver says:

Here's my line of thought:
1. A number represents a quantity … a value, right?
2. A digit is part of the notation system we use to write and/or think about these number, right?
3. So the last digit of a number written down is dependent on the base.
4. Why is the significant in any way?
5. What if the number base is not an integer?

• Jon Richfield says:

In maths it all depends on where you look and how hard.
A lot of the comments foregoing (NOT all!) were made without looking very hard at all.
For example, this bit about primes not ending with a two. Anyone saying that wasn't even bothering to read the article, and having failed to do so was failing to think of the implications. Accordingly any such a person was wasting his time and everyone else's even discussing the matter.

A prime number P to any prime base B smaller than itself (such as indeed 3 or 37) can end in any number from 1 to B-1, including in particular ending in 2 in ANY odd base B. This is easy to understand and, in case you do not understand, to determine by inspection. consider say, 13 (to base 10), which is 12 to base 11 and 16 to base 7. (Try it!)

To any compound base C, P can end only in digits that are mutually prime with C.

For example and for obvious reasons, primes base 10 end with 1,3,7, or 9. (Again: try it!)

It simplifies the problem to ignore primes smaller than B, which are irrelevant to the current subject anyway.

A bit more thoughtfully some folks ask about non-integer bases. I think if they examine some examples, they might be surprised how little difference it often makes to change to fractional bases. Try for example using base Phi (the golden section) or any integer power of Phi. Try using square root of 10 as a base, or 1/13. Then try using a negative base, such as -12.
Of course, base e or pi make things tricky, but we are really dealing with integer scalar number theory here, so they are not necessarily relevant if you don't go getting tricky with e to the i*pi or the like. (Sooner you than me!)

What I consider a really disconcerting question is whether there is any interesting variable base positional notation in which primes show any unexpected behaviour, say for example where the units position is a prime radix and the next position is the next prime up and so on.

I was inspired to scribble a bit of code, and what surprised me most was how fast the frequency of repetitions dropped unexpectedly low as the radix increased. I do not intend wasting more time on that, but for radix 79 I could find NO repetition in the first couple of million primes and the largest radix up to 255 in which I found any repetition at all was 210 (what a coincidence) and 210 only gave ONE repetition up to 2 million-odd!

Less surprising (in the light of the article) was that ALL the radixes I tried, presented drastically smaller frequencies of units-digit repetition than one might naively expect.
Obviously the chaps doing that research carried their investigations to many orders of magnitude larger than I did, and good luck to them.
Some of the most intriguing aspects were the near-patterns one could see in representing the effects graphically.

Just to be irritating I present the following as a baseless conjecture:
There exist finite positive integer radixes to which no two successive primes whatever end in the same digit, no matter infinitely far one seeks.

• Robert Shriner says:

HMMM…Using 6N+-1 for N=0 step 1 to remove all products of 2 and 3 gives you a repeating sequence of List A=1,5 ,7,11 ,13, 17,19 ,23 ,25 ,29 ,… step +30. Each value A of List A has a products sequence of A (+4A,+2A) and 30A+1 and 30NA+A. Sequentially each product set of A= 1 ,5 ,7 ,11 ,13 ,17, 19, 23, 25, 29, …is finite and repeats in sets of 10 products with no new primes observed in these sequences except for the first value A and the value A=1. At such time that the harmonics of the prime product sets are inclusive to all sets <30 then P+2,P+4,P+6 will no longer have values A as primes in sequences less than 30N+1 and when this occurs each Prime ending in 1 will only have primes ending in 1. And this will be a very large number, much muchlarger than a billion count. IMHO

• Chandra Nepali says:

I do NOT see why it is shocking discovery. Number threorists have see this first time but people in other fields have already seen this kind of properties with random number (more appropriately numbers supposed to be like random number) in different contests. But they did not throught it was a surpring news.

• Jon Richfield says:

Just to be irritating I presented the following as a baseless conjecture about a month ago, in reaction to the graph produced by a program I scribbled:

"There exist finite positive integer radixes to which no two successive primes whatever end in the same digit, no matter infinitely far one seeks."

I conjectured this thoughtlessly on the basis that the frequency of actually recurring low-order digits dropped quite dramatically with the size of the base.

Anyone who thinks out the implications might find several leaks in the conjecture, but one that occurred to me was in the light of the *Prime number theorem* (non-mathematicians can find it in Wikipedia) that implies that the probability that any positive integer chosen at random will be prime, is roughly the reciprocal of the natural log of that integer. The bigger the number the lower the frequency, but not by much, because the log increases slowly as the numbers increase in size. Another way of looking at it is that the nth prime number p is likely to be roughly equal to n*log(n).

So, as a rule, the differences between successive primes are relatively small; in particular a very important point to note is that the gaps become progressively smaller relative to the successive primes; in my most conveniently available file of primes, containing the first 2345678 primes, extending up to about 38 million, nearly half the gaps are 10 or less; roughly 73% of all the gaps are 20 or less; and about 99% are 60 or less. The biggest gap I could find was 210 (the gap between 20831323 and 20831533) and that was an outlier.

Now:

What the discovery amounts to is that, taken to any integer radix, there is a deficiency in the frequency of least significant digits in successive primes being the same. (The first three decimal examples are between 139 – 149, 181 – 191 & 241 – 251 by which time we have had multiple gaps of twos, fours, sixes and an eight, not to mention an outlier of 14 after 113.)

And where such an example occurs, it necessarily implies that, as we see, the difference between the primes is a multiple of the radix. For example 887 – 907 and 4297 – 4327.

But, the gaps above 2 – 3 are all even numbers. This means that in an odd radix such as 3 or 23 the gaps have to be at least twice as large as the radix. And for that, gaps of the sizes of interest generally occur much less than half as often as gaps about half the size. And in particular, if a radix R is a prime, say 11, there are R-1 possible least-significant digits of primes. For example, to radix 3 it could be 1 or 2, and in radix 11 it could be 1 to 10, while for radix 10 only 1, 3, 7 & 9 are possible. A greater number of possible digits increases the scope for the next prime to end with a different digit.

Looked at that way, this effect, though it forced some interesting thinking along unexpected lines, becomes less surprising, though I do not believe that the foregoing material is the whole story; and anyway it certainly does not dispel other, more mysterious, attributes of primes.

• Giovanni M.Mazza says:

The discovery of Soundararajan and Lemke is consistent with the central limit theorem of Lindeberg and Turing. So Riemann's hypothesis appears non-correct:
this would be correct if the distribution of primes was that of tosses of a fair coin.

• Barry Kendrick says:

The coin flipping example is not only counter-intuitive it's also a flipping fantasy!

Jim Flee-continuing your 4 coin flip example where you found 8 HH of 16 possibilities; you will note there are 11 HTs. And, while we find one of Bob's HH to be a loser[i.e,HT comes first], we find 4 of Alice's losers. Which is to say each has 7 ways to win after 4 coin flips.

The nature of the fallacy becomes clear if we look at a mere 3 coin flips. Counting in the same erroneous manner, we find 4 HT & 3 HH but HHT is a loser for Alice, so we have 3 ways for each to win.

Clearly HHH has 2 HH in it: yes, the 2nd "doesn't count"…but in the same fashion that the HT in HHT doesn't count. So, in effect, we've been counting the ones that don't count for Alice, but not all for Bob.

• Barry Kendrick says:

JS said "I think it's easier to compare HH with TH." Well, in any case, this is a totally different problem.

With Bob HH & Alice HT, each has a 50% chance of winning; with HH, TH ; Alice wins 75 % of the time.

In this HH, TH case, consider the sitch after the first 2 flips: either one or the other has won it, or we have HT or TT ; & now-exactly as JS reasoned-we must have TH before HH in these 2 cases.

• Barry Kendrick says:

While a proof would be tedious, the reason for the observed bias is actually obvious. Let's see if I can make it clear.

To make things convenient, assume I have a relatively large # divisible by 100, & this #+1 is prime. I'm only going to write down the last 2 digits of my prime #: 01.

So the written #s are only the last 2 digits.

Note that the next 2 #s ending in 1, namely 11 & 21, each has a 50% chance of being divisible by 3. On the other hand increasing my prime # 01 by 6 gives me a # not divisible by 3. And thus 07, 13, 19(we discard 25) are twice as likely to be prime than 11 or 21! And 07 even comes before 11. 31 is the next # ending in 1 that's definitely not divisible by 3. And that's it! Do you see how things are skewed?!

Now it is true that there is a 1/6 chance that 11 is divisible by 7; a 1/10 chance, by 11, & so on, but the influence of higher primes quickly decreases.

The reason I used 3 is that it's the smallest prime not in my base 10=2*5.

• Barry Kendrick says:

To clarify,my comment of yesterday, compare the probabilities[based only on divisibility by 3] of the next prime ending in a 1 or 7, after 01. Going along the # line, 07, p; 11, .5p; 17, .5p; 21, .5p; 27, .5p; 31, p. Bet on the 7!

Note how, in the text, they find it inexplicable that 3 is so more often followed by a prime ending in a 9 than a 7 or 1. But 09 is 6 after our prime 03[In my notation 03 represents the last two digits of a large prime #]& is therefore not divisible by 3; 07is 4 after 03 & 1 is 8 after 03, so each has a 50% chance of being divisible by 3. It's dead obvious, isn't it !

• Robert Lipp says:

All this is only relevant in base 10. It seems silly to infer anything from that. For example in base 5 and base 15, 12 is a prime number. These results are simple artifacts of base 10. Get more interesting; how about base 2 or really interesting, base n?

• Robert Lipp says:

• Ed says:

In base 2, the probability that p(n+1) ends in 1 is 100%. I'm curious how the property in the article varies through the bases. Base 2 could be considered "attractive" of the final digit being the same, base 10, "repulsive". Base 16 (and all multiples and fractions thereof) is easy to check… I wonder if there are other "attractive" bases, or a base that is as repulsive as base 2 is attractive.

• Doug Dorland says:

Things are not as they seem. The primes are behaving as one would expect. You just need to take a closer look at their order.

• Pete Jones says:

…"Most mathematicians would have assumed, Granville and Ono agreed, that a prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9 (the four possible endings for all prime numbers except 2 and 5). "

Would they really have assumed this? I find this surprising. Even without the analysis in this article it seems safe to assume that for the following prime to have the same ending would be the least likely possibility by a decent margin.

• Van Calck, Guy says:

Can someone indicate a hint on the following :
Let d be an even divisor of p-1 or of p+1, where p is a prime number ; when is
gcd(d!,p+d) equal to 1? Not all prime numbers p have such a number d, but I am interested for an obvious reason, namely constructing a 'next' prime number p+d
Any clue?
Thanks

• G. L. Honaker, Jr. says:

How about 2-digit endings? I recently noticed that "57" occurs most often among the first 1000 primes, and wonder if all 40 permissible ones level off as n approaches infinity? (www.PrimeCurios.com)