Last summer, three researchers took a small step toward answering one of the most important questions in theoretical computer science. To paraphrase Avi Wigderson of the Institute for Advanced Study, that question asks something simple but profound: Can we solve all the problems we hope to solve?
More precisely, computer scientists want to know whether all the problems we hope to solve can be solved efficiently, in a reasonable amount of time — before the end of the universe, say. If not, they are simply far too difficult.
Many problems seem to be this hard, but we won’t know for certain until we can mathematically prove their difficulty. And in a paper from last year, a trio of computer scientists showed that a broad category of problems are indeed too difficult to be solved efficiently, thereby providing one of the best examples yet of what the field has been seeking.
“We’re looking for stronger footholds as we’re climbing this mountain,” said Paul Beame of the University of Washington. “And this is another foothold on this route.”
The problems the researchers studied require only addition and multiplication. When these problems are restricted to being solved in specific ways, with alternating patterns of addition and multiplication, they become extremely difficult to solve.
“It just changed our knowledge dramatically,” said Amir Shpilka of Tel Aviv University in Israel. “We didn’t know how to do anything close to that.”
Surprisingly, the findings required no new frameworks or tools. Instead, the authors figured out how to bypass a mathematical roadblock that had emerged in decades-old work by Wigderson in collaboration with Noam Nisan of the Hebrew University of Jerusalem.
“We realized there was a very silly way of getting around it,” said Srikanth Srinivasan of Aarhus University in Denmark, who authored the new work along with Nutan Limaye of the IT University of Copenhagen and Sébastien Tavenas of Savoy Mont Blanc University in Chambéry, France. “And it feels like if there’s such an easy way to do something we didn’t think possible, surely there should be a way to go further.”
In the 1960s, after computers had been around for a couple decades, a troubling trend emerged. Scientists had successfully created algorithms to get their computers to solve various problems, but sometimes these algorithms took too long — longer than was practical.
They began to suspect that some problems were just fundamentally hard, regardless of the problem’s scale. For example, consider a collection of points with edges connecting them, called a graph. An important problem is to determine whether there exists a path, called a Hamiltonian path, that travels along the edges to visit each point just once. It stands to reason that if you increase the number of points (and edges), it will take longer to determine whether such a path exists. But the best algorithms took exponentially longer with increasing scale, making them impractical.
Many other problems seemed to take exponential time as well. Computer scientists would go on to show that any algorithm that could somehow solve one of these hard problems efficiently could be transformed to do the same for others of similar difficulty. They called this category of problems NP. When Wigderson referred to “all the problems we hope to solve,” he meant the problems in NP, which come up in many contexts.
Of course, there were also many problems that did not seem to be hard, and that did not take exponential time to solve. Computer scientists showed that many of these problems were also equivalent in a certain sense, and they called this category P. They suspected that NP problems were truly harder than P problems, and that problems in NP could never be solved efficiently. But without proof, there was always a chance that they were wrong.
Computer scientists began to search for ways to prove that NP problems really were harder, a question formally known among researchers as P versus NP. To answer it, it would suffice to prove that a hard problem really required exponential time, but they had no idea how to do that. So they sought out other avenues where hardness might not be so hard to pin down.
How Hard Is Hard?
One particular set of problems seemed just right. This was the set that depended only on addition and multiplication. For example, given a set of points, it is possible to count all the possible Hamiltonian paths (if any exist) merely by adding and multiplying data about the points. (The same count can also be performed if other operations are allowed, such as comparing values, but then the problem and solution become more complicated.)
Further, this set of simpler “arithmetic” problems mirrored the larger landscape of more complicated tasks. That is, some arithmetic problems (like counting Hamiltonian paths) seemed to take exponentially more time as the scale of the problem increased. In 1979, Leslie Valiant of Harvard University showed that many arithmetic problems were equivalent in their difficulty, and that others were equivalent in their lack of difficulty. Computer scientists would later name these categories after him — VNP and VP, respectively.
As with the P versus NP question, the hardness of VNP problems could not be proved. Computer scientists had rediscovered the P versus NP problem in a new form — VP versus VNP — only now they had a key advantage. VNP problems are even more difficult than NP problems, because they build on them: Counting paths requires you to be able to determine if a path exists. And if you want to prove something’s hard, you want as much hardness as possible.
“It’s harder than NP,” said Shpilka. “And therefore proving that it’s hard should be easier.”
In the ensuing decades, computer scientists made much more progress on the VP versus VNP question than they had on P versus NP, but most of it was limited to the subfield that Valiant had created known as algebraic complexity. Until the recent work by Limaye, Srinivasan and Tavenas, they still had trouble telling whether there were any arithmetic problems that were hard in a general sense.
Sizing Up Polynomials
To understand the new work, it helps to learn how computer scientists think about addition and multiplication problems. Mathematically, these problems are completely captured by expressions called polynomials — like x2 + 5y + 6 — which consist of variables that are added and multiplied together.
For any particular problem, like counting Hamiltonian paths, you can build a polynomial that represents it. You can represent each point and edge with a variable, for example, so that as you add more points and edges, you also add more variables to the polynomial.
To prove an arithmetic problem like counting Hamiltonian paths is hard, you need to show that when you add more points and edges, the corresponding polynomial takes exponentially more operations to solve. For example, x2 requires one operation (multiplying x by x), whereas x2 + y takes two operations (multiplying x by x and then adding y). The number of operations is called a polynomial’s size.
But a polynomial’s size is a difficult thing to be certain about. Take the polynomial x2 + 2x + 1. It appears to have a size of 4 (two multiplications and two additions). But the polynomial can be rewritten as the product of two sums, (x + 1)(x + 1), which has fewer operations — still two additions, but now only one multiplication. Generally, as a problem scales and more variables are added to its polynomial, you can keep simplifying and shrinking its size.
A few years after Valiant’s work, computer scientists found a way to make the size of a problem easier to analyze. To do so, they came up with a property known as depth, which specifies the number of times the polynomial switches, or alternates, between sums and products. The polynomial x2 + 2x + 1 has a depth of two, for example, because it is a sum of products (like x2 and 2x). In contrast, the expression (x + 1)(x + 1) has a depth of three, because it is technically the same as 0 + (x + 1)(x + 1), so it is a sum of products, one of which is made up of sums — making the whole expression a sum of products of sums.
To simplify polynomials, computer scientists restricted them to a form in which they had a property called constant depth, where the pattern of sums and products doesn’t change as the problem grows. This makes their size more static, since the size of a polynomial tends to decrease as its depth increases. Representations of a certain constant depth (such as depth three) are known as formulas.
By studying polynomials of constant depth, as well as graphs that represent them (called arithmetic circuits), computer scientists were able to make more progress. Gradually, they uncovered a sequence of findings that eventually culminated with the new work.
The first step toward the new result came in a 1996 paper by Nisan and Wigderson. The pair focused on a frequently occurring problem that involved the multiplication of tables of numbers called matrices. They simplified this problem in two ways. First, they represented it with formulas of a constant depth — depth three. Second, they only considered formulas with a certain simple structure, where each variable has a maximum exponent of 1 — a property that makes them “multilinear.” (Actually, they only considered formulas with a slight variation of this property, known as set-multilinear formulas.) Computer scientists already knew that certain problems could be converted to this relatively simple set-multilinear structure, at the cost of a sub-exponential increase in their size — a rate of growth comparable to exponential growth.
Nisan and Wigderson then showed that the matrix multiplication problem took exponential time to solve as the matrices scaled up. In other words, they proved that an important problem was hard, a notable victory in the broader enterprise of proving hardness. However, their result was limited because it only applied to formulas with a simplistic, set-multilinear structure.
“If you were working outside of algebraic complexity, that might not have meant very much for you,” said Beame.
But over the next 20 years, computer scientists came to better understand the properties of depth and structure, building on what we’ve already seen: Increasing the depth of a polynomial tends to cause a decrease in its size, and giving it a simple structure increases it. Depth and structure therefore play a kind of tug of war with hardness.
Over time, computer scientists made the balancing act between these two properties precise. They showed that adding two levels of depth to a depth-three, set-multilinear polynomial balanced out the gain in size from its set-multilinear structure. They could then expand on this to show that if a structured formula of depth five took exponential time, so would a depth-three formula of a general, unstructured nature.
The authors of the new work then showed that depth-five set-multilinear formulas for the matrix multiplication problem do indeed grow at a rate comparable to exponential. By the earlier relation, that means that general depth-three formulas also take exponential time. They then showed that a similar balancing act held for all depths — not just three and five. With that relationship in hand, they proved for the same problem that the size of a general formula of any depth grows at exponential rates as the problem scales.
In so doing, they proved that matrix multiplication is hard whenever it is represented by a formula of a constant depth, regardless of what that depth may be. Although the depth three formulas had been intensively studied before, we still didn’t know if they were hard — and we knew nothing about the hardness (or easiness) of formulas of greater depths. “It’s a humongous breakthrough,” said Shubhangi Saraf of the University of Toronto. “It’s a coming together of a lot of beautiful results from the last 30 years.”
The result provides the first general understanding of when an arithmetic problem is hard — at least, when it is restricted to being represented by formulas of constant depth. The specific problem of matrix multiplication the researchers focused on was already known to be a VP problem. And since VP problems are known to be relatively easy when not restricted to a constant depth, the result isolates the constant-depth restriction as the source of hardness.
“The model is so restricted that even things that should be easy in an unrestricted world become hard,” said Shpilka.
The ultimate question in the field of algebraic complexity is whether VNP problems are hard compared to problems from VP. The new result does not speak to this directly, since it only shows that constant-depth formulas are hard. Nevertheless, researchers are working to build on the result in ways that might allow them to reach an answer.
“That might still be a long shot. It likely is,” said Saraf. “But it is still a big milestone on the way to [showing] VP is not equal to VNP.”
And for the greater P versus NP question, we can now be a little more optimistic about the prospects of one day finding an answer. After all, in order to solve the problems we hope to solve, we first need to know which ones are hopeless.
Correction: May 12, 2022
An earlier version of the “What Is Size?” figure had a formula with the wrong coefficients. It has been replaced.