A Long-Sought Proof, Found and Almost Lost

When a German retiree proved a famous long-standing mathematical conjecture, the response was underwhelming.
Thomas Royen at his home in Schwalbach am Taunus, Germany.

Rüdiger Nehmzow for Quanta Magazine

As he was brushing his teeth on the morning of July 17, 2014, Thomas Royen, a little-known retired German statistician, suddenly lit upon the proof of a famous conjecture at the intersection of geometry, probability theory and statistics that had eluded top experts for decades.

Known as the Gaussian correlation inequality (GCI), the conjecture originated in the 1950s, was posed in its most elegant form in 1972 and has held mathematicians in its thrall ever since. “I know of people who worked on it for 40 years,” said Donald Richards, a statistician at Pennsylvania State University. “I myself worked on it for 30 years.”

Royen hadn’t given the Gaussian correlation inequality much thought before the “raw idea” for how to prove it came to him over the bathroom sink. Formerly an employee of a pharmaceutical company, he had moved on to a small technical university in Bingen, Germany, in 1985 in order to have more time to improve the statistical formulas that he and other industry statisticians used to make sense of drug-trial data. In July 2014, still at work on his formulas as a 67-year-old retiree, Royen found that the GCI could be extended into a statement about statistical distributions he had long specialized in. On the morning of the 17th, he saw how to calculate a key derivative for this extended GCI that unlocked the proof. “The evening of this day, my first draft of the proof was written,” he said.

Not knowing LaTeX, the typesetting system of choice in mathematics, he typed up his calculations in Microsoft Word, and the following month he posted his paper to the academic preprint site arxiv.org. He also sent it to Richards, who had briefly circulated his own failed attempt at a proof of the GCI a year and a half earlier. “I got this article by email from him,” Richards said. “And when I looked at it I knew instantly that it was solved.”

Upon seeing the proof, “I really kicked myself,” Richards said. Over the decades, he and other experts had been attacking the GCI with increasingly sophisticated mathematical methods, certain that bold new ideas in convex geometry, probability theory or analysis would be needed to prove it. Some mathematicians, after years of toiling in vain, had come to suspect the inequality was actually false. In the end, though, Royen’s proof was short and simple, filling just a few pages and using only classic techniques. Richards was shocked that he and everyone else had missed it. “But on the other hand I have to also tell you that when I saw it, it was with relief,” he said. “I remember thinking to myself that I was glad to have seen it before I died.” He laughed. “Really, I was so glad I saw it.”

Richards notified a few colleagues and even helped Royen retype his paper in LaTeX to make it appear more professional. But other experts whom Richards and Royen contacted seemed dismissive of his dramatic claim. False proofs of the GCI had been floated repeatedly over the decades, including two that had appeared on arxiv.org since 2010. Bo’az Klartag of the Weizmann Institute of Science and Tel Aviv University recalls receiving the batch of three purported proofs, including Royen’s, in an email from a colleague in 2015. When he checked one of them and found a mistake, he set the others aside for lack of time. For this reason and others, Royen’s achievement went unrecognized.

Proofs of obscure provenance are sometimes overlooked at first, but usually not for long: A major paper like Royen’s would normally get submitted and published somewhere like the Annals of Statistics, experts said, and then everybody would hear about it. But Royen, not having a career to advance, chose to skip the slow and often demanding peer-review process typical of top journals. He opted instead for quick publication in the Far East Journal of Theoretical Statistics, a periodical based in Allahabad, India, that was largely unknown to experts and which, on its website, rather suspiciously listed Royen as an editor. (He had agreed to join the editorial board the year before.)

With this red flag emblazoned on it, the proof continued to be ignored. Finally, in December 2015, the Polish mathematician Rafał Latała and his student Dariusz Matlak put out a paper advertising Royen’s proof, reorganizing it in a way some people found easier to follow. Word is now getting around. Tilmann Gneiting, a statistician at the Heidelberg Institute for Theoretical Studies, just 65 miles from Bingen, said he was shocked to learn in July 2016, two years after the fact, that the GCI had been proved. The statistician Alan Izenman, of Temple University in Philadelphia, still hadn’t heard about the proof when asked for comment last month.

No one is quite sure how, in the 21st century, news of Royen’s proof managed to travel so slowly. “It was clearly a lack of communication in an age where it’s very easy to communicate,” Klartag said.

“But anyway, at least we found it,” he added — and “it’s beautiful.”

In its most famous form, formulated in 1972, the GCI links probability and geometry: It places a lower bound on a player’s odds in a game of darts, including hypothetical dart games in higher dimensions.

Lucy Reading-Ikkanda/Quanta Magazine

Imagine two convex shapes, such as a rectangle and a circle, centered on a point that serves as the target. Darts thrown at the target will land in a bell curve or “Gaussian distribution” of positions around the center point. The Gaussian correlation inequality says that the probability that a dart will land inside both the rectangle and the circle is always as high as or higher than the individual probability of its landing inside the rectangle multiplied by the individual probability of its landing inside the circle. In plainer terms, because the two shapes overlap, striking one increases your chances of also striking the other. The same inequality was thought to hold for any two convex symmetrical shapes with any number of dimensions centered on a point.
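The dart game can be tried numerically. The sketch below is only an illustration, not part of any proof: it assumes a standard two-dimensional Gaussian (independent coordinates, unit variance), and the function name, the shape sizes and the seed are all made up for the example. It throws simulated darts and compares the estimated chance of hitting both shapes with the product of the two separate chances.

```python
import random


def estimate_dart_probabilities(n_throws=200_000, half_width=1.0,
                                half_height=0.5, radius=1.0, seed=0):
    """Monte Carlo estimate of P(rectangle), P(disc) and P(both).

    Assumption for this sketch: darts land according to a standard 2-D
    Gaussian centered on the target, which is also the center of both shapes.
    """
    rng = random.Random(seed)
    in_rect = in_disc = in_both = 0
    for _ in range(n_throws):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        hit_rect = abs(x) < half_width and abs(y) < half_height
        hit_disc = x * x + y * y < radius * radius
        in_rect += hit_rect          # booleans count as 0 or 1
        in_disc += hit_disc
        in_both += hit_rect and hit_disc
    return in_rect / n_throws, in_disc / n_throws, in_both / n_throws


p_rect, p_disc, p_both = estimate_dart_probabilities()
print(f"P(rectangle)           ~ {p_rect:.3f}")
print(f"P(disc)                ~ {p_disc:.3f}")
print(f"P(both)                ~ {p_both:.3f}")
print(f"P(rectangle) * P(disc) ~ {p_rect * p_disc:.3f}")
```

With these sizes the rectangle lies almost entirely inside the disc, so the estimated chance of hitting both should come out well above the product, as the inequality requires. A simulation like this can only illustrate the statement for particular shapes; it says nothing about the general case.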

Special cases of the GCI have been proved — in 1977, for instance, Loren Pitt of the University of Virginia established it as true for two-dimensional convex shapes — but the general case eluded all mathematicians who tried to prove it. Pitt had been trying since 1973, when he first heard about the inequality over lunch with colleagues at a meeting in Albuquerque, New Mexico. “Being an arrogant young mathematician … I was shocked that grown men who were putting themselves off as respectable math and science people didn’t know the answer to this,” he said. He locked himself in his motel room and was sure he would prove or disprove the conjecture before coming out. “Fifty years or so later I still didn’t know the answer,” he said.

Despite hundreds of pages of calculations leading nowhere, Pitt and other mathematicians felt certain — and took his 2-D proof as evidence — that the convex geometry framing of the GCI would lead to the general proof. “I had developed a conceptual way of thinking about this that perhaps I was overly wedded to,” Pitt said. “And what Royen did was kind of diametrically opposed to what I had in mind.”

Royen’s proof harkened back to his roots in the pharmaceutical industry, and to the obscure origin of the Gaussian correlation inequality itself. Before it was a statement about convex symmetrical shapes, the GCI was conjectured in 1959 by the American statistician Olive Dunn as a formula for calculating “simultaneous confidence intervals,” or ranges that multiple variables are all estimated to fall in.

Suppose you want to estimate the weight and height ranges that 95 percent of a given population fall in, based on a sample of measurements. If you plot people’s weights and heights on an xy plot, the weights will form a Gaussian bell-curve distribution along the x-axis, and heights will form a bell curve along the y-axis. Together, the weights and heights follow a two-dimensional bell curve. Measuring each quantity as a deviation from its average, so that the bell curve is centered on the origin, you can then ask: What are the weight and height ranges (call them –w < x < w and –h < y < h) such that 95 percent of the population will fall inside the rectangle formed by these ranges?

If weight and height were independent, you could just calculate the individual odds of a given weight falling inside –w < x < w and a given height falling inside –h < y < h, then multiply them to get the odds that both conditions are satisfied. But weight and height are correlated. As with darts and overlapping shapes, if someone’s weight lands in the normal range, that person is more likely to have a normal height. Dunn, generalizing an inequality posed three years earlier, conjectured the following: The probability that both Gaussian random variables will simultaneously fall inside the rectangular region is always greater than or equal to the product of the individual probabilities of each variable falling in its own specified range. (This can be generalized to any number of variables.) If the variables are independent, then the joint probability equals the product of the individual probabilities. But any correlation between the variables causes the joint probability to increase.
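Dunn's rectangular version can be checked the same way. The sketch below assumes standardized variables (mean 0, variance 1) with a chosen correlation rho; the helper names, the ranges w and h and the sample sizes are hypothetical choices for illustration. The individual probabilities are computed exactly from the Gaussian error function, and the joint probability is estimated by simulation.

```python
import math
import random


def interval_probability(half_width):
    """Exact P(-half_width < Z < half_width) for a standard Gaussian Z."""
    return math.erf(half_width / math.sqrt(2.0))


def joint_probability(rho, w, h, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(|X| < w and |Y| < h).

    X and Y are standard Gaussians with correlation rho, built from two
    independent standard Gaussians z1 and z2.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + scale * z2
        if abs(x) < w and abs(y) < h:
            hits += 1
    return hits / n_samples


w, h = 1.0, 1.5
product = interval_probability(w) * interval_probability(h)
print(f"product of individual probabilities: {product:.4f}")
for rho in (0.0, 0.4, 0.8):
    print(f"rho = {rho:.1f}: joint probability ~ {joint_probability(rho, w, h):.4f}")
```

At rho = 0 the estimate should agree with the product up to sampling noise, and it should rise above the product as the correlation grows.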

Royen found that he could generalize the GCI to apply not just to Gaussian distributions of random variables but to more general statistical spreads related to the squares of Gaussian distributions, called gamma distributions, which are used in certain statistical tests. “In mathematics, it occurs frequently that a seemingly difficult special problem can be solved by answering a more general question,” he said.

Royen represented the amount of correlation between variables in his generalized GCI by a factor we might call C, and he defined a new function whose value depends on C. When C = 0 (corresponding to independent variables like weight and eye color), the function equals the product of the separate probabilities. When you crank up the correlation to the maximum, C = 1, the function equals the joint probability. To prove that the latter is bigger than the former and the GCI is true, Royen needed to show that his function always increases as C increases. And it does so if its derivative, or rate of change, with respect to C is always positive.
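The shape of this argument can be written out in symbols. The notation below is assumed for illustration and is not quoted from Royen's paper; it is shown for the Gaussian case, whereas Royen worked with the more general gamma-type distributions. Split the variables into two groups, let the blocks \Sigma_{11} and \Sigma_{22} describe the correlations within each group and \Sigma_{12} the correlations between them, and scale only the between-group block by C:

```latex
\Sigma(C) =
\begin{pmatrix}
  \Sigma_{11}     & C\,\Sigma_{12} \\
  C\,\Sigma_{21}  & \Sigma_{22}
\end{pmatrix},
\qquad 0 \le C \le 1,
\qquad
f(C) = \Pr\nolimits_{\Sigma(C)}\bigl(|X_1| \le s_1, \dots, |X_n| \le s_n\bigr).

% C = 0: the two groups are independent, so f is the product of the
% separate probabilities. C = 1: f is the joint probability.
f(0) = \Pr\bigl(|X_i| \le s_i,\ i \le k\bigr)\,
       \Pr\bigl(|X_i| \le s_i,\ i > k\bigr),
\qquad
f(1) = \Pr\bigl(|X_i| \le s_i,\ 1 \le i \le n\bigr).

% If the derivative never goes negative, the joint probability wins.
f(1) - f(0) = \int_0^1 f'(C)\,dC \;\ge\; 0
\quad\text{whenever } f'(C) \ge 0 \text{ for all } C \in [0,1].
```

In the two-variable weight-and-height picture, \Sigma(C) is simply the 2-by-2 matrix with ones on the diagonal and C times the original correlation off the diagonal. Note that a derivative that is never negative is already enough for the conclusion.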

His familiarity with gamma distributions sparked his bathroom-sink epiphany. He knew he could apply a classic trick to transform his function into a simpler function. Suddenly, he recognized that the derivative of this transformed function was equivalent to the transform of the derivative of the original function. He could easily show that the latter derivative was always positive, proving the GCI. “He had formulas that enabled him to pull off his magic,” Pitt said. “And I didn’t have the formulas.”

Any graduate student in statistics could follow the arguments, experts say. Royen said he hopes the “surprisingly simple proof … might encourage young students to use their own creativity to find new mathematical theorems,” since “a very high theoretical level is not always required.”

Some researchers, however, still want a geometric proof of the GCI, which would help explain strange new facts in convex geometry that are only de facto implied by Royen’s analytic proof. In particular, Pitt said, the GCI defines an interesting relationship between vectors on the surfaces of overlapping convex shapes, which could blossom into a new subdomain of convex geometry. “At least now we know it’s true,” he said of the vector relationship. But “if someone could see their way through this geometry we’d understand a class of problems in a way that we just don’t today.”

Beyond the GCI’s geometric implications, Richards said a variation on the inequality could help statisticians better predict the ranges in which variables like stock prices fluctuate over time. In probability theory, the GCI proof now permits exact calculations of rates that arise in “small-ball” probabilities, which are related to the random paths of particles moving in a fluid. Richards says he has conjectured a few inequalities that extend the GCI, and which he might now try to prove using Royen’s approach.

Royen’s main interest is in improving the practical computation of the formulas used in many statistical tests — for instance, for determining whether a drug causes fatigue based on measurements of several variables, such as patients’ reaction time and body sway. He said that his extended GCI does indeed sharpen these tools of his old trade, and that some of his other recent work related to the GCI has offered further improvements. As for the proof’s muted reception, Royen wasn’t particularly disappointed or surprised. “I am used to being frequently ignored by scientists from [top-tier] German universities,” he wrote in an email. “I am not so talented for ‘networking’ and many contacts. I do not need these things for the quality of my life.”

The “feeling of deep joy and gratitude” that comes from finding an important proof has been reward enough. “It is like a kind of grace,” he said. “We can work for a long time on a problem and suddenly an angel — [which] stands here poetically for the mysteries of our neurons — brings a good idea.”

This article was reprinted on Wired.com.
