Date: 2022-11-10

Fixing this problem with new all-in-one chips that put memory and computation in the same place seems straightforward. It’s also closer to how our brains likely process information, since many neuroscientists believe that computation happens within populations of neurons, while memories are formed when the synapses between neurons strengthen or weaken their connections. But creating such devices has proved difficult, since current forms of memory are incompatible with the technology in processors.

Computer scientists decades ago developed the materials to create new chips that perform computations where memory is stored — a technology known as compute-in-memory. But with traditional digital computers performing so well, these ideas were overlooked for decades.

“That work, just like most scientific work, was kind of forgotten,” said Wong, a professor at Stanford.

Indeed, the first such device dates back to at least 1964, when electrical engineers at Stanford discovered they could manipulate certain materials, called metal oxides, to turn their ability to conduct electricity on and off. That’s significant because a material’s ability to switch between two states provides the backbone for traditional memory storage. Typically, in digital memory, a state of high voltage corresponds to a 1, and low voltage to a 0.

To get an RRAM device to switch states, you apply a voltage across metal electrodes hooked up to two ends of the metal oxide. Normally, metal oxides are insulators, which means they don’t conduct electricity. But with enough voltage, the current builds up, eventually pushing through the material’s weak spots and forging a path to the electrode on the other side. Once the current has broken through, it can flow freely along that path.

Wong likens this process to lightning: When enough charge builds up inside a cloud, it quickly finds a low-resistance path and lightning strikes. But unlike with lightning, whose path disappears, the path through the metal oxide remains, meaning it stays conductive indefinitely. And it’s possible to erase the conductive path by applying another voltage to the material. So researchers can switch an RRAM between two states and use them to store digital memory.
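The write-and-erase cycle described above can be pictured with a toy two-state model. This is only a sketch, not a physical simulation: the class name, the threshold values and the choice of a negative voltage for erasing are all assumptions made for illustration.

```python
class ToyRRAMCell:
    """Two-state toy model of an RRAM cell (illustrative, not physical)."""

    # Hypothetical thresholds in volts, chosen only for this example.
    SET_VOLTAGE = 2.0     # at or above this, a conductive path forms
    RESET_VOLTAGE = -2.0  # at or below this, the path is erased (assumed polarity)

    def __init__(self):
        # Metal oxides are normally insulators, so the cell starts non-conductive.
        self.conductive = False

    def apply_voltage(self, volts):
        if volts >= self.SET_VOLTAGE:
            self.conductive = True
        elif volts <= self.RESET_VOLTAGE:
            self.conductive = False
        # Any voltage in between leaves the stored state untouched.

    def read_bit(self):
        return 1 if self.conductive else 0


cell = ToyRRAMCell()
cell.apply_voltage(2.5)   # write: forge the conductive path
cell.apply_voltage(0.3)   # a small voltage does not disturb the state
stored = cell.read_bit()
cell.apply_voltage(-2.5)  # erase the path
erased = cell.read_bit()
```

Here `stored` holds the bit after the write and `erased` holds it after the erase; the point is simply that the state persists until a large enough voltage changes it.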

Midcentury researchers didn’t recognize the potential for energy-efficient computing, nor did they need it yet with the smaller algorithms they were working with. It took until the early 2000s, with the discovery of new metal oxides, for researchers to realize the possibilities.

Wong, who was working at IBM at the time, recalls that an award-winning colleague working on RRAM admitted he didn’t fully understand the physics involved. “If he doesn’t understand it,” Wong remembers thinking, “maybe I should not try to understand it.”

But in 2004, researchers at Samsung Electronics announced that they had successfully integrated RRAM memory built on top of a traditional computing chip, suggesting that a compute-in-memory chip might finally be possible. Wong resolved to at least try.

Compute-in-Memory Chips for AI

For more than a decade, researchers like Wong worked to build up RRAM technology to the point where it could reliably handle high-powered computing tasks. Around 2015, computer scientists began to recognize the enormous potential of these energy-efficient devices for large AI algorithms, which were beginning to take off. That year, scientists at the University of California, Santa Barbara showed that RRAM devices could do more than just store memory in a new way. They could execute basic computing tasks themselves — including the vast majority of computations that take place within a neural network’s artificial neurons, which are simple matrix multiplication tasks.

In the NeuRRAM chip, silicon neurons are built into the hardware, and the RRAM memory cells store the weights, the values representing the strength of the connections between neurons. And because the NeuRRAM memory cells are analog, the weights that they store represent the full range of resistance states that occur while the device switches from a low-resistance to a high-resistance state. This enables even higher energy efficiency than digital RRAM memory can achieve because the chip can run many matrix computations in parallel, rather than in lockstep one after another, as in the digital processing versions.
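One common way to picture how a grid of analog memory cells multiplies a matrix by a vector is a crossbar: each weight is stored as a cell's conductance, each input arrives as a voltage on a row wire, and each column wire collects the resulting currents. The sketch below assumes that idealized picture (Ohm's law per cell, currents adding on a shared wire). It is not a description of NeuRRAM's actual circuitry, and the function name and numbers are hypothetical.

```python
def crossbar_output_currents(input_voltages, conductances):
    """Return the current collected on each column wire of an ideal crossbar.

    conductances[i][j] is the conductance of the cell joining row i to column j.
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for volts, row in zip(input_voltages, conductances):
        for j, g in enumerate(row):
            # Ohm's law gives each cell's current; currents on a shared column add up.
            currents[j] += volts * g
    return currents


voltages = [1.0, 0.5]              # hypothetical inputs
conductances = [[2.0, 0.0],        # hypothetical stored weights
                [4.0, 1.0]]
currents = crossbar_output_currents(voltages, conductances)
# Column 0: 1.0*2.0 + 0.5*4.0 = 4.0; column 1: 1.0*0.0 + 0.5*1.0 = 0.5
```

In software the loops run one multiplication at a time. In the physical array every cell conducts at once, which is where the parallelism comes from.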

But since analog processing is still decades behind digital processing, there are still many issues to iron out. One is that analog RRAM chips must be unusually precise since imperfections on the physical chip can introduce variability and noise. (For traditional chips, with only two states, these imperfections don’t matter nearly as much.) That makes it significantly harder for analog RRAM devices to run AI algorithms, given that the accuracy of, say, recognizing an image will suffer if the conductive state of the RRAM device isn’t exactly the same every time.

“When we look at a lightning path, every time it’s different,” said Wong. “So as a result of that, the RRAM exhibits a certain degree of stochasticity: every time you program them is slightly different.” Wong and his colleagues proved that RRAM devices can store continuous AI weights and still be as accurate as digital computers if the algorithms are trained to get used to the noise they encounter on the chip, an advance that enabled them to produce the NeuRRAM chip.
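A generic way to train an algorithm to "get used to" noise is to perturb the weights at random on every training step, so the model only ever sees slightly wrong versions of itself. The sketch below shows that general idea on a one-weight model. It is not the NeuRRAM team's training procedure, and the function name, data, noise level and learning rate are all hypothetical.

```python
import random


def train_with_weight_noise(samples, noise_std, steps=2000, lr=0.01, seed=0):
    """Fit y = w * x by gradient descent while perturbing w on every forward pass."""
    rng = random.Random(seed)
    w = 0.0
    for step in range(steps):
        x, y = samples[step % len(samples)]
        # Stand-in for the variability of programming an analog cell.
        noisy_w = w + rng.gauss(0.0, noise_std)
        error = noisy_w * x - y
        # Gradient of the squared error, evaluated at the noisy weight.
        w -= lr * 2.0 * error * x
    return w


# Hypothetical data generated from a true weight of 2.0.
samples = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
learned_w = train_with_weight_noise(samples, noise_std=0.05)
```

Even though the weight is never read back exactly during training, `learned_w` still settles close to the true value of 2.0, because the random perturbations average out over many steps.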