
The Game Theory of How Algorithms Can Drive Up Prices

Recent findings reveal that even simple pricing algorithms can make things more expensive.
[Illustration: Nash Weerasekera for Quanta Magazine]

Introduction

Imagine a town with two widget merchants. Customers prefer cheaper widgets, so the merchants must compete to set the lowest price. Unhappy with their meager profits, they meet one night in a smoke-filled tavern to discuss a secret plan: If they raise prices together instead of competing, they can both make more money. But that kind of intentional price-fixing, called collusion, has long been illegal. The widget merchants decide not to risk it, and everyone else gets to enjoy cheap widgets.

For well over a century, U.S. law has followed this basic template: Ban those backroom deals, and fair prices should be maintained. These days, it’s not so simple. Across broad swaths of the economy, sellers increasingly rely on computer programs called learning algorithms, which repeatedly adjust prices in response to new data about the state of the market. These are often much simpler than the “deep learning” algorithms that power modern artificial intelligence, but they can still be prone to unexpected behavior.

So how can regulators ensure that algorithms set fair prices? Their traditional approach won’t work, as it relies on finding explicit collusion. “The algorithms definitely are not having drinks with each other,” said Aaron Roth, a computer scientist at the University of Pennsylvania.

Yet a widely cited 2019 paper showed that algorithms could learn to collude tacitly, even when they weren’t programmed to do so. A team of researchers pitted two copies of a simple learning algorithm against each other in a simulated market, then let them explore different strategies for increasing their profits. Over time, each algorithm learned through trial and error to retaliate when the other cut prices — dropping its own price by some huge, disproportionate amount. The end result was high prices, backed up by mutual threat of a price war.
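The study’s actual model was richer, but a stripped-down version conveys the mechanics. Here is an illustrative Python sketch, with invented prices, demand and learning parameters rather than the study’s setup, that pits two textbook Q-learners against each other in a winner-take-all pricing game:

```python
import itertools, random

# Toy version of the 2019 experiment (invented prices and learning
# parameters, not the study's actual model): two Q-learning agents
# repeatedly pick prices; the cheaper seller captures the market,
# and ties split it.

PRICES = [1, 2, 3, 4, 5]                 # price grid (arbitrary units)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.05      # learning rate, discount, exploration

def profits(p1, p2):
    """Winner-take-all demand for one unit of the good."""
    if p1 < p2:
        return p1, 0.0
    if p2 < p1:
        return 0.0, p2
    return p1 / 2, p2 / 2

# Each agent's state is the pair of prices posted last round.
states = list(itertools.product(PRICES, repeat=2))
Q = [{s: {p: 0.0 for p in PRICES} for s in states} for _ in range(2)]

def choose(q_row):
    if random.random() < EPS:            # occasionally explore
        return random.choice(PRICES)
    return max(q_row, key=q_row.get)     # otherwise exploit

state = (random.choice(PRICES), random.choice(PRICES))
for _ in range(200_000):
    actions = (choose(Q[0][state]), choose(Q[1][state]))
    rewards = profits(*actions)
    for i in range(2):
        best_next = max(Q[i][actions].values())
        Q[i][state][actions[i]] += ALPHA * (
            rewards[i] + GAMMA * best_next - Q[i][state][actions[i]])
    state = actions

# Depending on parameters, the prices the agents settle on often sit
# above the competitive floor of the grid, sustained by learned
# punishment: cutting your price triggers a price war you both lose.
print("greedy prices in state", state, "->",
      max(Q[0][state], key=Q[0][state].get),
      max(Q[1][state], key=Q[1][state].get))
```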

Implicit threats like this also underpin many cases of human collusion. So if you want to guarantee fair prices, why not just require sellers to use algorithms that are inherently incapable of expressing threats?

In a recent paper, Roth and four other computer scientists showed why this may not be enough. They proved that even seemingly benign algorithms that optimize for their own profit can sometimes yield bad outcomes for buyers. “You can still get high prices in ways that kind of look reasonable from the outside,” said Natalie Collina, a graduate student working with Roth who co-authored the new study.

Researchers don’t all agree on the implications of the finding — a lot hinges on how you define “reasonable.” But it reveals how subtle the questions around algorithmic pricing can get, and how hard the practice may be to regulate.

“Without some notion of a threat or an agreement, it’s very hard for a regulator to come in and say, ‘These prices feel wrong,’” said Mallesh Pai, an economist at Rice University. “That’s one reason why I think this paper is important.”

No Regrets

The recent paper studies algorithmic pricing through the lens of game theory, an interdisciplinary field at the border of economics and computer science that analyzes the mathematics of strategic competitions. It’s one way to explore the failures of pricing algorithms in a controlled setting.

“What we’re trying to do is create collusion in the lab,” said Joseph Harrington, a University of Pennsylvania economist who wrote an influential review paper on regulating algorithmic collusion and was not involved in the new research. “Once we do so, we want to figure out how to destroy collusion.”

To understand the key ideas, it helps to start with the simple game of rock-paper-scissors. A learning algorithm, in this context, can be any strategy that a player uses to choose a move in each round based on data from previous rounds. Players might try out different strategies over the course of the game. But if they’re playing well, they’ll ultimately converge to a state that game theorists call equilibrium. In equilibrium, each player’s strategy is the best possible response to the other’s strategy, so neither player has an incentive to change.

In rock-paper-scissors, the ideal strategy is simple: You should play a random move each round, choosing all three possibilities equally often. Learning algorithms shine if one player takes a different approach. In that case, choosing moves based on previous rounds can help the other player win more often than if they just played randomly.
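It’s easy to check why no deviation from uniform play helps. A minimal Python sketch, assuming the standard scoring of +1 for a win, 0 for a tie and -1 for a loss:

```python
# Why uniform random play is an equilibrium of rock-paper-scissors:
# against an opponent who plays each move with probability 1/3, every
# one of your moves has the same expected payoff (zero), so there is
# no profitable deviation.

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

for mine in BEATS:
    expected = sum(payoff(mine, theirs) for theirs in BEATS) / 3
    print(mine, expected)   # prints 0.0 for every move
```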

Suppose, for instance, that after many rounds you realize that your opponent, a geologist, chose rock more than 50% of the time. If you’d played paper every round, you would have won more often. Game theorists refer to this painful realization as regret.
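In code, regret is just the gap between what you actually earned and what the best fixed move would have earned against the same history. A small illustration with a made-up sequence of moves:

```python
# Regret: realized payoff versus the best single move played every
# round in hindsight. BEATS and payoff are as in the sketch above.

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def regret(my_moves, opp_moves):
    realized = sum(payoff(m, o) for m, o in zip(my_moves, opp_moves))
    best_fixed = max(sum(payoff(m, o) for o in opp_moves) for m in BEATS)
    return best_fixed - realized

# A hypothetical history against a rock-heavy opponent:
opp = ["rock"] * 6 + ["paper", "scissors"] * 2
mine = ["rock", "paper", "scissors"] * 3 + ["rock"]
print(regret(mine, opp))   # 4: always playing paper would have scored 4 higher
```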

Researchers have devised simple learning algorithms that are guaranteed to leave you with virtually no regret over the long run. Slightly more sophisticated learning algorithms called “no-swap-regret” algorithms offer a stronger guarantee: Whatever your opponent did, you couldn’t have done better by swapping all instances of any one move for another (say, by playing paper every time you actually played scissors).

In 2000, game theorists proved that if you pit two no-swap-regret algorithms against each other in any game, they’ll end up in a specific kind of equilibrium, called a correlated equilibrium, that could equally well arise if the game were played for just a single round. That’s an attractive property, because single-round games are much simpler than multi-round ones. In particular, threats don’t work, because players can’t follow through on them.
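The best-known building block here is the multiplicative-weights rule, often called Hedge: keep a weight on every move, and boost the weights of the moves that would have paid off. Its average regret provably shrinks toward zero, and standard constructions layer on top of rules like it to get the stronger no-swap-regret guarantee. A sketch with illustrative parameters:

```python
import math, random

# Multiplicative weights ("Hedge"), a classic no-regret rule: play
# moves in proportion to their weights, then reweight every move by
# how well it would have done that round.

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def hedge(opp_moves, eta=0.1):
    weights = {m: 1.0 for m in MOVES}
    total = 0
    for theirs in opp_moves:
        z = sum(weights.values())
        mine = random.choices(MOVES, [weights[m] / z for m in MOVES])[0]
        total += payoff(mine, theirs)
        for m in MOVES:
            weights[m] *= math.exp(eta * payoff(m, theirs))
        top = max(weights.values())      # renormalize to avoid overflow
        for m in MOVES:
            weights[m] /= top
    return total

# Against a rock-heavy opponent, Hedge piles its weight onto paper,
# so its payoff approaches that of the best fixed move in hindsight.
opp = random.choices(MOVES, [0.6, 0.2, 0.2], k=5000)
print(hedge(opp))
```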

In a 2024 paper, Jason Hartline, a computer scientist at Northwestern University, and two graduate students translated the classic results from the 2000 paper to a model of a competitive market, where players can set new prices every round. In that context, the results implied that dueling no-swap-regret algorithms would always end up with competitive prices when they reached equilibrium. Collusion was impossible.

However, no-swap-regret algorithms aren’t the only strategies at work in the world of online marketplaces. So what happens when a no-swap-regret pricing algorithm faces a different benign-looking opponent?

The Price Is Wrong

According to game theorists, the best strategy to play against a no-swap-regret algorithm is simple: Start with a specific probability for each possible move, and then choose one move at random every round, no matter what your opponent does. The ideal assignment of probabilities for this “nonresponsive” approach depends on the specific game you’re playing.

In the summer of 2024, Collina and her colleague Eshwar Arunachaleswaran set out to find those optimal probabilities for a two-player pricing game. They found that the best strategy assigned strikingly high probabilities to very high prices, along with lower probabilities for a wide range of lower prices. If you’re playing against a no-swap-regret algorithm, this strange strategy will maximize your profit. “To me, it was a complete surprise,” Arunachaleswaran said.

Eshwar Arunachaleswaran and Collina obtained their result while exploring the best responses to well-behaved pricing algorithms. (Photo: Paritosh Verma)

Nonresponsive strategies look superficially innocuous. They can’t convey threats, because they don’t react to their opponents’ moves at all. But they can coax learning algorithms to raise their prices, and then reap profits by occasionally undercutting their competitors.
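A deliberately simplified market shows the mechanism (invented numbers, not the model from the paper): one seller posts prices from a fixed distribution weighted toward a high price, while the other runs the same kind of multiplicative-weights learner over a price grid. The learner’s most profitable price in hindsight sits just below its opponent’s favorite high price, so that’s what it learns to charge, and buyers face high prices from both sellers.

```python
import math, random

# Toy market (illustrative, not the paper's model): the cheaper of
# two sellers sells one unit; ties split the sale. One seller is
# nonresponsive, the other learns via multiplicative weights.

PRICES = list(range(1, 11))              # price grid 1..10

def profit(mine, theirs):
    if mine < theirs:
        return mine
    if mine > theirs:
        return 0.0
    return mine / 2

def nonresponsive():
    """Price 10 with probability 0.7, otherwise a random lower price.
    It never looks at the opponent, so it can't express a threat."""
    return 10 if random.random() < 0.7 else random.randint(1, 9)

weights = {p: 1.0 for p in PRICES}
eta = 0.05
for _ in range(20_000):
    theirs = nonresponsive()
    z = sum(weights.values())
    mine = random.choices(PRICES, [weights[p] / z for p in PRICES])[0]
    for p in PRICES:                     # reweight by counterfactual profit
        weights[p] *= math.exp(eta * profit(p, theirs) / 10)
    top = max(weights.values())          # renormalize to avoid overflow
    for p in PRICES:
        weights[p] /= top

# The learner concentrates on price 9, just under the nonresponsive
# seller's favorite price of 10, so buyers usually pay 9 or 10.
print(max(weights, key=weights.get))
```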

At first, Collina and Arunachaleswaran thought that this artificial scenario wasn’t relevant to the real world. Surely the player using the no-swap-regret algorithm would switch to a different algorithm after realizing that their competitor was profiting at their expense.

But as they studied the problem further and discussed it with Roth and two other colleagues, they realized their intuition was wrong. The two players in their scenario were already in a state of equilibrium: Their profits were nearly equal, and each player’s profit was as high as it could be given the algorithm the other was using. Neither player had an incentive to switch strategies, so buyers would be stuck with high prices. What’s more, the precise probabilities weren’t that important. Many different choices led to high prices when pitted against a no-swap-regret algorithm. It’s an outcome you’d expect from collusion, but without any collusive behavior in sight.

It Pays To Be Dumb

So, what can regulators do? Roth admits he doesn’t have an answer. It wouldn’t make sense to ban no-swap-regret algorithms: If everyone uses one, prices will fall. But a simple nonresponsive strategy might be a natural choice for a seller on an online marketplace like Amazon, even if it carries the risk of regret.

“One way to have regret is just to be kind of dumb,” Roth said. “Historically, that hasn’t been illegal.”

As Hartline sees it, the problem of algorithmic collusion has a simple solution: Ban all pricing algorithms except the no-swap-regret algorithms that game theorists have long favored. There may be practical ways to do this: In their 2024 work, Hartline and his colleagues devised a method for checking whether an algorithm has the no-swap-regret property without looking at its code.
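The 2024 test itself is more involved, but the underlying idea, auditing behavior rather than code, can be sketched. In this toy illustration (the payoff rule and the logged prices are invented for the example), a regulator with a record of a seller’s posted prices measures how much profit the seller left on the table through any consistent substitution of one price for another:

```python
# Toy black-box audit (an illustration of the idea, not the 2024
# paper's procedure): from a log of (my_price, opponent_price)
# rounds, compute the average swap regret -- the profit missed by
# not consistently substituting some price b everywhere price a was
# played. A no-swap-regret seller drives this toward zero.

def profit(mine, theirs):
    if mine < theirs:
        return mine
    if mine > theirs:
        return 0.0
    return mine / 2

def swap_regret(log):
    prices = sorted({mine for mine, _ in log})
    total = 0.0
    for a in prices:
        rounds = [theirs for mine, theirs in log if mine == a]
        earned = sum(profit(a, t) for t in rounds)
        best = max(sum(profit(b, t) for t in rounds) for b in prices)
        total += best - earned
    return total / len(log)

# A made-up log: playing 7 against opponents at 5 and 6 earned
# nothing, while 3 would have sold both times.
log = [(3, 5), (3, 2), (7, 5), (7, 6), (3, 4)]
print(swap_regret(log))   # 1.2 per round of missed profit
```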

Hartline acknowledged that his preferred solution wouldn’t prevent all bad outcomes when no-swap-regret algorithms compete with humans. But he argued that scenarios like the one in Roth’s paper aren’t cases of algorithmic collusion.

“Collusion is a two-way thing,” he said. “It fundamentally must be the case that there are actions a single player can do to not collude.”

Either way, the new work still leaves many open questions about how algorithmic pricing can go wrong in the real world.

“We still don’t understand nearly as much as we want,” Pai said. “It’s an important question for our time.”
