Boya Sun for Quanta Magazine

Save Save Save Save

49

In the summer of 2015 a team of hackers attempted to take control of an unmanned military helicopter known as Little Bird. The helicopter, which is similar to the piloted version long-favored for U.S. special operations missions, was stationed at a Boeing facility in Arizona. The hackers had a head start: At the time they began the operation, they already had access to one part of the drone’s computer system. From there, all they needed to do was hack into Little Bird’s onboard flight-control computer, and the drone was theirs.

When the project started, a “Red Team” of hackers could have taken over the helicopter almost as easily as it could break into your home Wi-Fi. But in the intervening months, engineers from the Defense Advanced Research Projects Agency (DARPA) had implemented a new kind of security mechanism — a software system that couldn’t be commandeered. Key parts of Little Bird’s computer system were unhackable with existing technology, its code as trustworthy as a mathematical proof. Even though the Red Team was given six weeks with the drone and more access to its computing network than genuine bad actors could ever expect to attain, they failed to crack Little Bird’s defenses.

“They were not able to break out and disrupt the operation in any way,” said Kathleen Fisher, a professor of computer science at Tufts University and the founding program manager of the High-Assurance Cyber Military Systems (HACMS) project. “That result made all of DARPA stand up and say, oh my goodness, we can actually use this technology in systems we care about.”

The technology that repelled the hackers was a style of software programming known as formal verification. Unlike most computer code, which is written informally and evaluated based mainly on whether it works, formally verified software reads like a mathematical proof: Each statement follows logically from the preceding one. An entire program can be tested with the same certainty that mathematicians prove theorems.

“You’re writing down a mathematical formula that describes the program’s behavior and using some sort of proof checker that’s going to check the correctness of that statement,” said Bryan Parno, who does research on formal verification and security at Microsoft Research.

The aspiration to create formally verified software has existed nearly as long as the field of computer science. For a long time it seemed hopelessly out of reach, but advances over the past decade in so-called “formal methods” have inched the approach closer to mainstream practice. Today formal software verification is being explored in well-funded academic collaborations, the U.S. military and technology companies such as Microsoft and Amazon.

The interest occurs as an increasing number of vital social tasks are transacted online. Previously, when computers were isolated in homes and offices, programming bugs were merely inconvenient. Now those same small coding errors open massive security vulnerabilities on networked machines that allow anyone with the know-how free rein inside a computer system.

“Back in the 20th century, if a program had a bug, that was bad, the program might crash, so be it,” said Andrew Appel, professor of computer science at Princeton University and a leader in the program verification field. But in the 21st century, a bug could create “an avenue for hackers to take control of the program and steal all your data. It’s gone from being a bug that’s bad but tolerable to a vulnerability, which is much worse,” he said.

The Dream of Perfect Programs

Courtesy of Kelvin Ma/Tufts University

Kathleen Fisher, a computer scientist at Tufts University, leads the High-Assurance Cyber Military Systems (HACMS) project.

In October 1973 Edsger Dijkstra came up with an idea for creating error-free code. While staying in a hotel at a conference, he found himself seized in the middle of the night by the idea of making programming more mathematical. As he explained in a later reflection, “With my brain burning, I left my bed at 2:30 a.m. and wrote for more than an hour.” That material served as the starting point for his seminal 1976 book, “A Discipline of Programming,” which, together with work by Tony Hoare (who, like Dijkstra, received the Turing Award, computer science’s highest honor), established a vision for incorporating proofs of correctness into how computer programs are written.

It’s not a vision that computer science followed, largely because for many years afterward it seemed impractical — if not impossible — to specify a program’s function using the rules of formal logic.

A formal specification is a way of defining what, exactly, a computer program does. And a formal verification is a way of proving beyond a doubt that a program’s code perfectly achieves that specification. To see how this works, imagine writing a computer program for a robot car that drives you to the grocery store. At the operational level, you’d define the moves the car has at its disposal to achieve the trip — it can turn left or right, brake or accelerate, turn on or off at either end of the trip. Your program, as it were, would be a compilation of those basic operations arranged in the appropriate order so that at the end, you arrived at the grocery store and not the airport.

The traditional, simple way to see if a program works is to test it. Coders submit their programs to a wide range of inputs (or unit tests) to ensure they behave as designed. If your program were an algorithm that routed a robot car, for example, you might test it between many different sets of points. This testing approach produces software that works correctly, most of the time, which is all we really need for most applications. But unit testing can’t guarantee that software will always work correctly because there’s no way to run a program through every conceivable input. Even if your driving algorithm works for every destination you test it against, there’s always the possibility that it will malfunction under some rare conditions — or “corner cases,” as they’re called — and open a security gap. In actual programs, these malfunctions could be as simple as a buffer overflow error, where a program copies a little more data than it should and overwrites a small piece of the computer’s memory. It’s a seemingly innocuous error that’s hard to eliminate and provides an opening for hackers to attack a system — a weak hinge that becomes the gateway to the castle.

“One flaw anywhere in your software, and that’s the security vulnerability. It’s hard to test every possible path of every possible input,” Parno said.

Actual specifications are subtler than a trip to the grocery store. Programmers may want to write a program that notarizes and time-stamps documents in the order in which they’re received (a useful tool in, say, a patent office). In this case the specification would need to explain that the counter always increases (so that a document received later always has a higher number than a document received earlier) and that the program will never leak the key it uses to sign the documents.

This is easy enough to state in plain English. Translating the specification into formal language that a computer can apply is much harder — and accounts for a main challenge when writing any piece of software in this way.

“Coming up with a formal machine-readable specification or goal is conceptually tricky,” Parno said. “It’s easy to say at a high level ‘don’t leak my password,’ but turning that into a mathematical definition takes some thinking.”

To take another example, consider a program for sorting a list of numbers. A programmer trying to formalize a specification for a sort program might come up with something like this:

For every item j in a list, ensure that the element jj+1

Yet this formal specification — ensure that every element in a list is less than or equal to the element that follows it — contains a bug: The programmer assumes that the output will be a permutation of the input. That is, given the list [7, 3, 5], she expects that the program will return [3, 5, 7] and satisfy the definition. Yet the list [1, 2] also satisfies the definition since “it is a sorted list, just not the sorted list we were probably hoping for,” Parno said.

In other words, it’s hard to translate an idea you have for what a program should do into a formal specification that forecloses every possible (but incorrect) interpretation of what you want the program to do. And the example above is for something as simple as a sort program. Now imagine taking something much more abstract than sorting, such as protecting a password. “What does that mean mathematically? Defining it may involve writing down a mathematical description of what it means to keep a secret, or what it means for an encryption algorithm to be secure,” Parno said. “These are all questions we, and many others, have looked at and made progress on, but they can be quite subtle to get right.”

Block-Based Security

Between the lines it takes to write both the specification and the extra annotations needed to help the programming software reason about the code, a program that includes its formal verification information can be five times as long as a traditional program that was written to achieve the same end.

This burden can be alleviated somewhat with the right tools — programming languages and proof-assistant programs designed to help software engineers construct bombproof code. But those didn’t exist in the 1970s. “There were many parts of science and technology that just weren’t mature enough to make that work, and so around 1980, many parts of the computer science field lost interest in it,” said Appel, who is the lead principal investigator of a research group called DeepSpec that’s developing formally verified computer systems.

Even as the tools improved, another hurdle stood in the way of program verification: No one was sure whether it was even necessary. While formal methods enthusiasts talked of small coding errors manifesting as catastrophic bugs, everyone else looked around and saw computer programs that pretty much worked fine. Sure, they crashed sometimes, but losing a little unsaved work or having to restart occasionally seemed like a small price to pay for not having to tediously spell out every little piece of a program in the language of a formal logical system. In time, even program verification’s earliest champions began to doubt its usefulness. In the 1990s Hoare — whose “Hoare logic” was one of the first formal systems for reasoning about the correctness of a computer program — acknowledged that maybe specification was a labor-intensive solution to a problem that didn’t exist. As he wrote in 1995:

Ten years ago, researchers into formal methods (and I was the most mistaken among them) predicted that the programming world would embrace with gratitude every assistance promised by formalization…. It has turned out that the world just does not suffer significantly from the kind of problem that our research was originally intended to solve.

Then came the internet, which did for coding errors what air travel did for the spread of infectious diseases: When every computer is connected to every other one, inconvenient but tolerable software bugs can lead to a cascade of security failures.

“Here’s the thing we didn’t quite fully understand,” Appel said. “It’s that there are certain kinds of software that are outward-facing to all hackers in the internet, so that if there is a bug in that software, it might well be a security vulnerability.”

Courtesy of Jeannette M. Wing

Jeannette Wing, a computer scientist at Microsoft Research, is developing a formally verified protocol for the internet.

By the time researchers began to understand the critical threats to computer security posed by the internet, program verification was ready for a comeback. To start, researchers had made big advances in the technology that undergirds formal methods: improvements in proof-assistant programs like Coq and Isabelle that support formal methods; the development of new logical systems (called dependent-type theories) that provide a framework for computers to reason about code; and improvements in what’s called “operational semantics” — in essence, a language that has the right words to express what a program is supposed to do.

“If you start with an English-language specification, you’re inherently starting with an ambiguous specification,” said Jeannette Wing, corporate vice president at Microsoft Research. “Any natural language is inherently ambiguous. In a formal specification you’re writing down a precise specification based on mathematics to explain what it is you want the program to do.”

In addition, researchers in formal methods also moderated their goals. In the 1970s and early 1980s, they envisioned creating entire fully verified computer systems, from the circuit all the way to the programs. Today most formal methods researchers focus instead on verifying smaller but especially vulnerable or critical pieces of a system, like operating systems or cryptographic protocols.

“We’re not claiming we’re going to prove an entire system is correct, 100 percent reliable in every bit, down to the circuit level,” Wing said. “That’s ridiculous to make those claims. We are much more clear about what we can and cannot do.”

The HACMS project illustrates how it’s possible to generate big security guarantees by specifying one small part of a computer system. The project’s first goal was to create an unhackable recreational quadcopter. The off-the-shelf software that ran the quadcopter was monolithic, meaning that if an attacker broke into one piece of it, he had access to all of it. So, over the next two years, the HACMS team set about dividing the code on the quadcopter’s mission-control computer into partitions.

The team also rewrote the software architecture, using what Fisher, the HACMS founding project manager, calls “high-assurance building blocks” — tools that allow programmers to prove the fidelity of their code. One of those verified building blocks comes with a proof guaranteeing that someone with access inside one partition won’t be able to escalate their privileges and get inside other partitions.

Later the HACMS programmers installed this partitioned software on Little Bird. In the test against the Red Team hackers, they provided the Red Team access inside a partition that controlled aspects of the drone helicopter, like the camera, but not essential functions. The hackers were mathematically guaranteed to get stuck. “They proved in a machine-checked way that the Red Team would not be able to break out of the partition, so it’s not surprising” that they couldn’t, Fisher said. “It’s consistent with the theorem, but it’s good to check.”

In the year since the Little Bird test, DARPA has been applying the tools and techniques from the HACMS project to other areas of military technology, like satellites and self-driving convoy trucks. The new initiatives are consistent with the way formal verification has spread over the last decade: Each successful project emboldens the next. “People can’t really have the excuse anymore that it’s too hard,” Fisher said.

Verifying the Internet

Security and reliability are the two main goals that motivate formal methods. And with each passing day the need for improvements in both is more apparent. In 2014 a small coding error that would have been caught by formal specification opened the way for the Heartbleed bug, which threatened to bring down the internet. A year later a pair of white-hat hackers confirmed perhaps the biggest fears we have about internet-connected cars when they successfully took control of someone else’s Jeep Cherokee.

As the stakes rise, researchers in formal methods are pushing into more ambitious places. In a return to the spirit that animated early verification efforts in the 1970s, the DeepSpec collaboration led by Appel (who also worked on HACMS) is attempting to build a fully verified end-to-end system like a web server. If successful, the effort, which is funded by a $10 million grant from the National Science Foundation, would stitch together many of the smaller-scale verification successes of the last decade. Researchers have built a number of provably secure components, such as the core, or kernel, of an operating system. “What hadn’t been done, and is the challenge DeepSpec is focusing on, is how to connect those components together at specification interfaces,” Appel said. Over at Microsoft Research, software engineers have two ambitious formal verification projects underway. The first, named Everest, is to create a verified version of HTTPS, the protocol that secures web browsers and that Wing refers to as the “Achilles heel of the internet.” The second is to create verified specifications for complex cyber-physical systems such as drones. Here the challenge is considerable. Where typical software follows discrete, unambiguous steps, the programs that tell a drone how to move use machine learning to make probabilistic decisions based on a continuous stream of environmental data. It’s far from obvious how to reason about that kind of uncertainty or pin it down in a formal specification. But formal methods have advanced a lot even in the last decade, and Wing, who oversees this work, is optimistic formal methods researchers are going to figure it out. Correction: This article was revised on September 21 to clarify that in formally verified software, each statement follows logically from the preceding one, not from the next. This article was reprinted on Wired.com. Save View Reader Comments (49) Leave a Comment ## Reader CommentsLeave a Comment • Frank says: Formal verification is good work but I think the main problem is why any bug in any running software allows a hacker to take control. This seems like a general Operating System design flaw. What if an OS run all other software like an interpreter? I think then it could detect any crash and stop execution of that software in a harmless way that cannot be used by hackers to take control. • Adrian Jansen says: Maybe I'm just a dumb programmer, but it seems to me there are two different problems here. One is to formally prove that a program can do what it is intended, the other is to prevent a 'hacker' from access. But if for instance the 'hacker' can insert code which also meets the criterion of formal proof, how does the machine 'know' that that code is malicious. A formal proof on the running system will still show that the code passes as 'true', in the sense of meeting the criteria of formal proof of the logic. Its trivial to find the 'hack' in other ways, eg a real-time check that the existing code generates a valid checksum, for instance. I dont see quite how formal proof gains anything more than that, in the hacker case. • Simba says: "Error Free" is not the same thing as "Hacker Proof". There is no such thing as unhackable. Even the most solid code will develop vulnerabilities in time. Sometimes it is similar to discovering a new dimension. • Austin Hook says: Article fails to convey any real clues as to how program proof is done. Not even a slice of what the augmented code looks like. • bzzz says: paragraph 4 : Each statement follows logically from the next. Well… no … :-)) • Adam says: So, TDD? Aight. • Paul Snively says: Adrian Jansen: good point. Part of what needs to be done is to either disallow untrusted users "inserting code" at all, or to securely confine inserted code to harmless operations. In the drone example given in the article, the hackers were deliberately given a level of access they wouldn't actually be given in practice, in order to test the theory that the "partitioning" of systems was secure. And that test worked (at least so far). But specifying what "limited, secure access" means is definitely tricky. Austin Hook: the article links to the Coq and Isabelle proof assistants, both of which are very good, and both of which support extracting running code in a few different languages (Scheme, Standard ML, OCaml, and Haskell) from proof developments. Both are very well documented. Isabelle's own documentation is quite good. In Coq's case, I think the better documentation comes from outside the Coq project itself. One such example is <https://www.cis.upenn.edu/~bcpierce/sf/current/index.html>. Another is <http://adam.chlipala.net/cpdt/>. • Scott says: OMG – well written code with extensive testing creates applications that are 'hacker-resistant'. Who would have guessed! • Calvin Ratsavong says: Developing formally-verified code in the 21st century is somewhat unpractical in my view. We live in the digital age where code needs to be updated constantly for the sake of being agile and responsive to security breaches. If for example, a developer were to create formally-verified code which takes about twice as long as developing code that is efficient enough to work for it's purpose, how does the developer go about changes? What if they had a security vulnerability? How can they be responsive to a threat when they're spending a large amount of time trying to supposedly mitigate software/hardware failure? Shifting focus to hotfixes is why companies are able to recover from code/hardware failures. Code should never meet the perfectionist. • David Legg says: This all sounds very good, but misses a fundamental issue. If the tools used to do the checking are compromised, then so are the results. Way, way back when Unix was new, there was a simple proof of concept created to insert a back door into the login program. There were two parts, one inserted the backdoor into the login program if login.c was compiled, and the second inserted the insertion code into the C compiler if it was rebuilt. Once created all the malicious code was removed from the source, but continued to exist in both the executables. No amount of source code checking would reveal the vulnerability, but if you knew the proper login and password, you had access to the system. This was a proof of concept, so it was easy to defeat by changing the file names, but that was not the point. The point was that unless you had created the computer system from the silicon up, could you really trust that nothing was going on you didn't know about? While I agree that formal verification for critical systems will go a long way to protect systems from unintentional bugs and serious flaws, I think it's short sighted to think they are hacker proof. Hackers (especially state sponsored ones) are patient and resourceful, and will continue to look for indirect attack routes when direct attacks are stopped. This article talks about formal verification of the code, but how do you prove (in the mathematical sense) that a: the binary created from the source matches the source. b: correctly implements the source. • Aaron Kulkis says: And you STILL haven't eliminated the Garbage In, Garbage Out problem. Per the example given: For every item j in a list, ensure that the element j ≤ j+1 This is WRONG. All this does is enure that Element 1 ≤2, Element 2 ≤ 3, Element 3 ≤ 4, etc. The CORRECT specification for a sort it this: For every item j in a list, ensure that the element j ≤ element (j+1). In most languages, this problem is already solved through the use of libraries — to wit: sort (elements, UP); or for a reverse sort: sort (elements, DOWN); You cannot eliminate bugs caused by programmer incompetence, because THAT requires a computer that is not merely a mindreader, but understand the difference between what the programmer wants, and what the programmer actually specifies. The best that can be done is that a language finds syntax errors, and "lint" checkers can find unused variables, and other things deemed "unusual". But NOTHING in the world can protect against a syntactically correct specification which, at first glance APPEARS to specify the solution to the problem, but in fact, defines the solution to another problem which has a solution of very similar form (as in the example above) • Brian Balke says: Calvin Ratsavong: We spend huge amounts of money developed tools to guard against exploitation of software vulnerabilities. If the behavior of those implementations could be verified formally, many of the holes in open source networking would have been caught before they were released to the wild. We also should not underestimate the value of knowing that what we're building on is rock-solid. Widely-used and distributed libraries would benefit enormously from this kind of analysis, and would become adopted because their users were confident that they would behave as expected. They would become "invisible," much as the power grid and telephone system are. Where I see potential problems – not explored here in the article – is with run-time performance. It's one thing to protect access to firmware from the communications driver – the firmware is still running in the foreground as it always has. It's another thing when the verified software is running a communications network. Some kind of overhead must be involved in making the state boundary checks required to ensure conformance to the formal behavior. If those affect performance, the problem is to determine how and where such checks are required to ensure that high-performance code is protected from unexpected states. • Timothy123456 says: Like most articles, the headline is misleading. Sure, you can verify the code matches the formal definition exactly. And there are a lot of use cases for this. It does not make the application "unhackable" because if the formal definition itself is wrong, the code will be precisely wrong in the same way. It's another tool in the toolbelt. It's not a panacea. • Zack Yezek says: There is a core conceptual land mine this is going to run into, as well as practical problems. Conceptually, the core problem is that the Platonic universe of formal mathematics is but a small subset of the larger conceptual universe algorithms & computation can explore. Formal mathematics doesn't have bugs, or garbage inputs, or really even any concept of state. A very simple example is dividing by zero. Formal math simply throws up its hands & says 'stop!', but real software & algorithms have to do SOMETHING with that NaN, since even crashing is an output state. Practically speaking, all 'formal validation' can do is ensure your program behaves as expected given known inputs. They can maybe give you a kind of 'inductive' unit testing, where like inductive proofs in formal mathematics you can cover a simple but infinite space of legal inputs & outputs with a finite # of logical steps. And if that's all they're really claiming to provide, that's fine- we always can use more good tools. But no system, even in principle, will be totally 'hacker-proof' because humans always come into the loop somewhere, and plenty of hacks of a system don't require modifying or exploiting the software at all. • Jim Balter says: This article is buggy. • GlenO says: This could really have a positive effect, if customers start making formal verification a requirement. It is a bit late though, since I doubt that software generated through deep learning will benefit, and it is a growing proportion of the code needing verification. • Daniel Bilar says: These hacker-proof announcement are overly optimistic. One aspect ignored are transduction attacks (eg RowHammer https://arxiv.org/abs/1507.06955 ). Another is the logical possibility via incompleteness of weird machine instantiations in formally correct verified code (eg in proof carrying code https://t.co/xHTegrSHA9 ) To wit: The L4 formally verified kernel project (seL4) found 160 bugs in 7500 C LoC w 25 man-years of effort (10 steady state). Proof was 200k LoC in Isabelle script. Even they did not claim seL4 was free of bugs (some bug classes remain, like covert/timing channels) that could be exploited. They worded it very carefully "enforces specific security properties, namely integrity and confidentiality, under certain assumptions." https://wiki.sel4.systems/FrequentlyAskedQuestions#Does_seL4_have_zero_bugs.3F • jackolantern says: I am a simple user to whom linux command lines in a terminal are difficult. So after reading the arrticle, I had the impression that there was an unhackable system that would protect all the corporate and MIC 'human-free gotta haves' that are in the offing out there, e.g., driverless Planes, Trains and Automobiles.. And then I come to the comment section and I learn that there really is no such thing as a system that cannot be hacked and even though I don't understand much of what is being discussed, I do understand that it is being discussed. And that brings me to this question: If this comment section comes to the conclusion that unhackable is an impossible dream, how is it that the megaminds at MS and DARPA can't figure this out and how is it that the article doesn't cover this? • Tom Vaughn says: To the extent that formal verification tools can reduce the number of errors in code, or even eliminate certain classes of errors, it is a good thing. Any communication system (which is the milieu in which this technique was discussed) will be engineered with compromises between functionality, security, cost and schedule restraints. Software is only one part of the system, although a very important part. It is all well and good to explore the usefulness of formal verification in an academic setting to determine its benefits and limitations. If the most difficult part of doing formal verification is producing an error-free and unambiguous specification, then that is where the most effort should be placed. The process of creating the specification then needs to be formally verified to ensure that nothing will be included that is not needed, and everything that is needed will be included. But to do that, one needs an error-free and unambiguous specification of how to produce a specification! It remains to be seen whether this approach, however, sexy it appears now, will actually survive the rigors of the real world. • BillH32 says: Back in 1975, my Master's thesis was program proving by inductive assertion, a process by which, given a set of inputs is correct, one can prove the output will be correct. The thesis was the base for a doctoral dissertation about guaranteeing the inputs can be guaranteed to be correct, further up the chain. It was a very laborious process, manually performed. That said, things like buffer overflows can be avoided when programming a module as a simple parameteric check. That is an input that is "not correct". It is the guarantee that the returned result will be correct (or identified as bogus) that is also important here. Hackability implies the ability of the hackers to gain access to the internals of the system and take command. One comment posited that if you can insert malicious code, you could hack the system. If the hacker's tried that and failed (and I would suspect they did this, probably a lot of different ways), this would render that avenue of intrusion moot, thus un-hackable for that type of intrusion. Only one of many ways to attempt entry, after all. This is the problem with all software systems. Unless you design in the protections before you code it, it can be hacked, or will give incorrect results. It would appear that the "Red Team" failed, despite their best efforts. Time to put these practices into general circulation, exposing possible entry points, or showing that, nope, not going to happen. I'd certainly feel safer! • Ethan says: I can formally verify that this article is 100% garbage. 1. Mathematically verified constructions of logic are exactly what software consists of today. 2. If you want to 'break' that helicopter. Hit it with a baseball bat. If you want to then change it's trajectory, duct tape it to a drone that flies with a larger force. Done. 3. Read "Reflections on Trusting Trust" by Ken Thompson. If I build a backdoor into your compiler or another upstream dependency.. 4. Trojan Horses, Ransomware, Software Bombs, other forms of malware. There are no errors with those applications. They do their job quite well. 5. CSRF, Clickjacking, and forms of Phishing attacks. These are methods of exploitation that involve human deceit. 6. Covert Channels. Hackers can jump air gaps, bypass IDSes, Anti-Malware, etc, using side effects of technology to their benefit. For example, creating a creating a long for loop of heavy computation to increase the CPU fan speed to simulate a 1 and then idling for a 0. With any binary representation information can be transmitted and there's no way for a machine to detect them or prevent them. 7. Weak passwords. Exploiting human weaknesses. Etc.. The point being: if it is made by man, it can be broken by man. • Andrew says: Nonsense. The mere suggestion that code is all there is to do with an impervious system is ignorant. • Matthew Talbot says: Enjoyed the irony in the article correction where ambiguous English introduced an error that would have been interpreted correctly by most, and the update ensured it was also logically correct. Surely if the author had used their logic checker together with their spell checker then the error would have been avoided 🙂 • Shane Miller says: Consider this comment from 'Calvin Ratsavong': September 21, 2016 at 10:38 am Developing formally-verified code in the 21st century is somewhat unpractical in my view. We live in the digital age where code needs to be updated constantly […] how does the developer go about changes? What if […]? How can they be responsive to a […]? […] Code should never meet the perfectionist. Programmers are interesting creatures: as a manager, if I tell a programmer I want you to do scope X but break it down into small parts p1, p2, … pN so that we have manageable, well-scoped pieces that can be done over a few days each, programmers will immediately agree. They get it: agree on full scope, but work the scope in parts. But if you apply the same thinking process to identifying, eliminating, and preventing bugs sort of like what's talked about here … forget it … 85% of most programmers will starting protesting, complaining, and playing down expectations like Ratsavong. They say it's impractical, what about this?, you can't find all bugs, what about that? …. all designed to create doubt. See the contradiction? Software — I work in trading and work on very large systems for my day job — is one the least engineered things in science. Software is particularly labor intensive. Getting specifications outside non-formal-work-domains is next to impossible. Software is particularly sensitive to teamwork compatibility. Software solves a lot of bugs on the back of customers who find them. Quick fix has some merit, but it's hardly a feature. Quick rollout is more about customer's managing for lack of progress, and project definition typical in software. Software has feasibility problems with CMM. In the 90s the Standish Group reported something like 40% of all projects at large companies were terminated before delivering any code. Why? Lack of executive support to give project direction; management got sick and tired of programmers lurching left and right seeing no end in sight … cost overruns, can't agree on goals, a … million reasons. Even the most basic things like pre/post contract definition, defining undefined behavior, unit and integration testing is extremely spotty as an industry. In short while manufacturing has definitely learned a lot about quality, software hasn't learned much. That's not to say millions of pages haven't been written about quality in software — because it's a major issue in the industry which pays top dollar for what's essentially a manual, labor intensive job — but that the industry simply doesn't have the same buy-in that's been required in manufacturing. I applaud this article and the work done here … and I think, to the contrary, the industry's prevailing attitude of nay-saying is exactly part of what's wrong. It's worth pointing out not all programmers are engineers. Most are coders (more about execution of task, technical in orientation). Maybe 15% are engineers that are concerned about quality which, by definition, involves customers. Software definitely doesn't have the same emphasis, or organizational focus on quality one sees in manufacturing or in work domains that are *formal* in nature. • CityguyUSA says: Once again profit before quality and in the end it's cost several corporations serious trust issues and billions of people having to deal with stolen identities. What a great idea to rush all this technology onto the main stage only to end up with the Dotcom crash leading the way. I wonder how much would actually been gained if they hadn't been so damn impatient and how much has been lost because of their impatience. • Ian Strange says: Even perfect software is hackable. Look up row hammer. • Computer Scientist says: heh. Darpa is a little behind the times. You could do that with Prolog decades ago… • Yatima says: This article is a bit on the fuzzy side. The example of the self-driving robot car for example is not apposite. The formal verification makes sure that code indeed fulfills are the formally stated functional requirements and does not go off the rails in the middle of the computation. But you would never be able to formally state the functional requirements for the robot car to get to the grocery store. The robot car might have brake control software so verified. But the higher-level neural-network based control would be totally unverifiable and only statistically would one be able to ascertain that it generally does its work correctly. Writing "hack proof" code is a specific aspect that one would indeed like to cover. It is equivalent to making sure that certain conditions regarding permissions will always hold, whatever the program's state. Now, of course, some possibilities to hack might not have occurred to the designer in the first place, and so the iterations will run… Some good references on how Amazon uses formal methods: http://research.microsoft.com/en-us/um/people/lamport/tla/amazon.html ( from: http://research.microsoft.com/en-us/um/people/lamport/tla/tla.html) Of course these things are currently routinely used in Aerospace Industry. Here are short paper on what's up currently: http://www.cis.upenn.edu/~alur/Survey13.pdf http://www.sc.ehu.es/jiwlucap/Beckert&Hahnel-2014.pdf It's a fantastic subject and I regret not having heard much of it at Uni in the late 80s, though having been fiercely exposed to Edgser Dijsktra's proof methods at ETHZ. Unfortunately, that was done using the then-current imperative programming style. That style was very much unsuited for the task at hand and as a result many students got an aversion to formal methods. Instead the appropriate logic programming style or even functional programming style should have been used, but was not even mentioned. • Yatima says: @Ethan <i>3. Read "Reflections on Trusting Trust" by Ken Thompson. If I build a backdoor into your compiler or another upstream dependency..</i> Problem has been solved: https://www.schneier.com/blog/archives/2006/01/countering_trus.html • Michael Fritzius says: "A small base is more easily defensible than a large one" –Sun Tzu, probably. One of the best ways to prevent hackable code is to keep the footprint small. The bigger it gets, the higher the chance of some detail being overlooked. • Chad Meiner says: Formal methods are invincible! • Jon Richfield says: Although many of the critical comments are ill-informed, the article as it stands falls short in several ways that cannot be evaluated without a lot of crucial detail that (understandably perhaps) is omitted. There are whole classes of formally specified, formally proved systems that are hackable or otherwise vulnerable, some from the data end, some from the code end, some from the design end, some from the hardware end, and some from the implementation end. Possibly such vulnerabilities have been taken care of in the system described in this article, but we cannot confirm that from the text as it stands. As a trivial illustration, perhaps too trivial to apply to the design in question, but still an illustration, imagine a structure of information, either code or data, such that no change to its content would leave it incorrect by changing its meaning. That would mean that all the states it could take would have the same meaning. That in turn would mean that its information content would be zero: it would have no control capability. It would be like substituting for a thermometer, a fixed etching showing the desired temperature of a furnace: precise, reliable, and incorruptible, as long as the actual temperature in the furnace happens to be correct. In fact, since it has only one possible state, it has no information content at all, and a hacker need not be concerned with its presence at all. Could such an example occur in real life, even in principle? It already has; some chisellers sold some airports a lot of bomb-detection equipment that amounted to functionless plastic rods attached to impressive consoles. Point it at baggage, and if it doesn't set off the alarm, the luggage is safe. Which turned out to be perfectly true, whenever the luggage really WAS safe. Empirical security is a tricky field, tricky than just formal design. • Steven B Sidman says: If memory serves, formally verified ICs were done in the UK, in the 90s. I believe one of the results of this effort was a floating point chip, done by some of the Transputer people. The design was logically correct, but converting it to physical form took some work, which might have introduced some possibility of error in translation. My recollection of that group was that they were in love with Haskell, and its predecessor, M. Everything old is new again. • Starry Gordon says: Maybe I missed it because of reading too fast, but I did not see consideration of the fact that the cost of formal verification increases exponentially with the complexity of the system to be verified. I have found it difficult, as a systems developer, to get management to pay the monetary and time price of even elementary testing and QA, much less go for formal verification. Moreover, even if the system is correct, it might not have been specified correctly, or, later, used correctly. • David Fredricks says: I THINK that the majority of malware takes advantage of the ability to get to any area of memory. If only there was a language that prevented this! WAIT there is one it was invented by IBM in the decade of 1950 and it was called FORTRAN. How did it prevent buffer overflows and other memory corruption? It didn't have POINTERS. You couldn't get outside of the bounds of arrays, because it had boundary protection. You had to CALL subroutines by name not just start executing code at some location in memory. If the subroutine didn't exist everything stopped. It was called an abend [a term no programmer today understands] But it was easier to write in C because programmers didn't want to worry about memory / bounds protection. And all the "enhancements / bandages" like C++ and derivatives kept trying to fix the problem. Now of course FORTRAN has capitulated and introduced pointers so it is just as flawed as C. • Brian A says: A lot of commenters seem to fault this article for not instead being a detailed scientific paper discussing formal verification and the results of the HACMS helicopter project in comprehensive conceptual detail and using what I'm sure is specialized terminology/jargon that is typically used in the area. But as someone who was only previously aware of what formal verification was about in 2016 at about a 35,000 foot level, I was quite glad to read a pretty accessible and well-written account describing the overall concept outlining these encouraging recent developments. (Even as I realized there was probably a fair bit of technical sophistication going on use the surface that the author was necessarily eliding over in order to indeed make the article accessible and of a reasonable length.) So, my thanks to the author and mag for this. There also seem to be a lot of commenters here that are being critical of the idea of formal verification on the basis that it can't resolve 100% of concerns that touch on the security (or, let's say more exactly, the Confidentiality, Integrity, and Availability) of a system. (Including, bizarrely, guaranteeing a system's hardware against destruction by a physical attacker.) So, yes, formal verification isn't and won't be a cure all. But in security, what is? Yes, you're not going to see Windows/Linux/Android/iOS/etc. replaced by fully-formally-verified operating systems anytime soon, let alone all the applications that run on them. But if you could replace even some of the most security-critical elements of those with new elements written in formally verified code you still might achieve huge importantly security gains. You might be able to build sandboxes that are much harder for corrupted elements to break out of (as with the helicopter project), virtualization technologies that make it much harder–or impossible– for an attack to escape a guest VM and access an underlying host OS, new document parsers that would replace a major source of exploitable vulnerabilities in office document suites, and so on. Nor does the fact that you must ensure that the verification software that does the verification of code is itself trustworthy mean verification is worthless. Someone pointed out the example that very well known security expert Bruce Schneider had made a point about some bad guys introducing vulnerabilities or malware into software products by corrupting the tools used "up the chain" in development environments, so that malicious code is already cooked into the software at release. But I cannot believe Mr.Schneider intended the takeaway point from that to be "Attacks corrupting your development tools themselves are possible; therefore, you don't need to care about whether the code you would write with them for your program can withstand attack." Instead, the only sensible lesson is that you have to look toward assuring the trustworthiness of the tools you use in addition to the code you write. A lesson it sounds like runners of the projects described here are (hopefully) sensible enough to be quite aware of. And assuring the application security of the verification software programs mentioned is a far less imposing challenge to address than trying to write highly secure code for each piece of software you work on using traditional, unverified methods today. The above being said, the questions of how performant verified-code components can be versus traditionally-written ones and how much increased development time & effort (if any) will be required to create something in verified code versus non-verified code strike me as good ones to worry about. Fairly or not, many developers have declined to use already existing managed program languages–which can already today make it impossible or nearly impossible for things like exploitable buffer overflows to exist in a piece of software– in projects, citing such reasons. It would be especially interesting to me to see a follow up article that, say, concretely compared the development time and difficulty involved in creating a simple new program or component using a current unmanaged language (like C), a current managed language (ie. with some built-in protections for memory safety & type safety), and with one of the emerging verified code methods, respectively. And looking at the computing resources consumed in executing each of the software pieces that result. • GMcK says: Everyone who is concerned with "provably secure" computer systems needs to put in their calendar a reminder to reread Ken Thompson's own Turing Award lecture "Reflections on Trusting Trust" at least once a year. Not only do you have to prove the program correct, but you have to prove the language compiler correct (this has been done for a handful of small languages) and you have to prove all the tools used to create the compiler correct, and all the microcode in the CPU that runs the code, and all the tools that are used to create the microcode, and all the IC design tools that specify the chips that run the microcode, and all the compilers that create those design tools. AND you have to have bulletproof processes that ensure that no unauthorized code got included in any of the build packages for all of the tooling. Oh, and your proof assistant tools ought to be proven correct as well. Kathleen Fisher's project marks a milestone, but there's still a long way to go ahead. • Matthias Rosenau says: To proof something (not using "random proofs") means formal deduction in some formal language. So, as stated in the article, you need a language to make a specification, which cannot be plain English, than someone or some program generates code and this code is proofed to exactly fullfill the specs? If the spec is formal, the code generation should be possible by algorithms. If so, the code is only a tautology of the spec, no proof (or at least no necessary proof, if you know, that the code generation algorithms are correct). To specify something like "not hackable" in a formal language sounds strange, because "hackable" tends to mean something beyond the limits (in this cause: of the language) • Jan Hindrik Knot says: The article should at least have mentioned the 1985 book "The Science of Programming" by David Gries. • Robin Alexander says: The lead to the article states "can prove certain programs to be error-free with the same certainty that mathematicians prove theorems." As a long time student of mathematics I should point out that, despite the public's perception, mathematical theorems are not "certain" either. The process of proof is iterative in that someone provides a proof for a theorem, others might discover errors in the proof that are then fixed if possible. Then more errors might be found, and so on. The point being that mathematics is not infallible or even self-consistent. To be fair, the standard of mathematical proof does seem to be much higher than that of software correctness. On a different note, since buffer and stack overflows are so often used in hacks, why in the world can't these be prevented by hardware of compilers? It seems there should be a way to prevent overflows from being possible. • Alex Mason says: Not a single mention of Data61 (formerly NICTA), UNSW or Galois who are the ones who actually built these systems, not DARPA. Really disappointing to see credit taken away from those who deserve it. • Frank says: @David Fredricks: I think you are absolutely right. C/C++ is the main problem. Unfortunately most operating systems and professional application software written in those languages which allow hackers to use any bug to take control of any computer systems. Lesson here is governments and private companies should switch to other more secure languages as soon as possible; not to mention operating systems themselves! • Michael Norrish says: As someone who worked on this project, I can see why people cite Thompson's famous paper on backdoors and the like. However, if you dig a little deeper into the way the formal verification of the underlying microkernel works, you will learn that those attacks (which require a compromised compiler) do not work against our OS. See http://sel4.systems/Info/FAQ/proof.pml for more details on the proof. • Rich Gabaldon says: While the intent is good, and millions of tax dollars are spent on this albeit very needed microkernel, there can be no doubt that a political motivation to the end of gratifying some need for the appearance of cyber security (until proven otherwise) is at the root of this endeavor. I have no doubt the "hacker-proof" buzzword is making the rounds. In the meantime, Microsoft and other well known software companies are rushing to release there newest, greatest, latest version of whatever unfinished software they are pushing this year, bugs be damned. • Amercanegro says: Are “error-free" and ”hacker-proof" really the same thing？ • Andrew Nambudripad says: So a bunch of people here are conflating your standard unit tests with *formal verification*. The former is a set of assertions given by a human with a limited amount of cases. Test T :: Given a specific state S and input I ; assert output O(specified) == application of function F(S,I). A test suite is simply T_1 … T_n. All this ensures is that given your specific (S,I)=>O. You can generalize this to given the input I, such that I belongs to the typeclass T and holds properties P_1 .. P_N, and get a much stronger guarantee of correct code. (E.g. Haskell's typeclasses or ML's modules + combined with property-based checking (see: QuickCheck & its derivatives)). This still, however, is not formal verification. @David Legg – That's a famous paper which won the Turing Award. "Reflections on trusting trust (Ken Thompson, ACM Aug 1984)[1]. (It had 3 parts to it, not two.) When you say " No amount of source code checking would reveal the vulnerability ", I think you mean "no amount of source code checking *of login.c", as if you had access to the compiler's source (and audited every single line), you could identify the malicious code transform ('trojan horse') targetting login.c. RE: How to formally verify in the 'hacker proof' sense: It's definitely possible from the software sense. Formal verification of an operating system has been done (see: "Verve"[2] (a MSR project)). Microsoft Research also has a formally verified (subset?) of C with a focus on strict concurrency[3] guarantees (i.e. code guaranteed to be free from dead-locks, race conditions, blah blah). There's also Frama-C[4] and a Coq-verified frontend to all of ANSI C done by Leroy, et al out of INRIA[5]. Verve formally guarantees security of Intel 8086 derivatives from the bootloader all the way up. RE: Verifying hardware: Take an instruction set that's standard (i.e. SPARC, ARM or POWER8) and formally verify the semantics of it with Coq or Agda. That's fairly trivial. The less trivial part (I gather, at least, I'm not in semiconductors…) is ensuring the silicon is properly designed, so that XOR on data that you pulled from the pre-fetcher and a register aren't stale. From what I hear, even trivial chips (M0+ or STM32 levels of power) take months of aggregated time in massive clusters just to verify with$foo% of certainty that every execution is correct.

@Daniel Bilar : Agreed, but one has to concede the fact that your SPARK verified Ada code is going to be a few orders of magnitude more secure than your-average-idiot-hitting-f5-until-the-build-succeeds. -Wpedantic, what's that?

@jackolantern : Microsoft _has_ done this. In fact, MSR is one of the leaders of formal verification. Search journal archives for "INRIA/MSR" or whatever.

I have no idea why this article is randomly popping up now – formal verification has been around for ages. Berkeley, CMU, INRIA out of France, ETH Zurich, and Microsoft Research have been at this since my old man was doing his post-doctorate grind.

RE: That other guy who brought up Reflections on Trusting Trust, yeah I'm mostly with you but read my comment above — formally verified compilers exist. Though I guess if you don't trust the binaries, you could write an ANSI C compiler yourself pretty trivially targeting just x86_64 or whatever, and use that to bootstrap the formally verified compilers. Practically, you could also compiler-bounce between GCC, Intel CC, clang/llvm, each with different optimization flags and eventually all "'login.c' meta-awareness" (which was the most critical attack vector) will be statistically eliminated (though provably so? hmm, that'd be an interesting topic for a dissertation).

[1] Check ACM's archive; I think the link I have is paywalled behind my .edu IP
[2] https://www.microsoft.com/en-us/research/publication/safe-to-the-last-instruction-automated-verification-of-a-type-safe-operating-system/
[3] https://www.microsoft.com/en-us/research/project/vcc-a-verifier-for-concurrent-c/
[4] http://frama-c.com/training/Hi-Lite-Fraunhofer.pdf
[5] http://pauillac.inria.fr/~xleroy/publi/cfront.pdf

• Stephen B. Gray says:

I doubt that provably correct code will have a practical effect for decades. I'm an experienced use of Word, which (unknown to most users) is horribly complicated and has scores of bugs. (It also needs a thorough improvement in its design specs.) MS seems to have no interest in even ordinary bug removal, let alone doing it formally.

Like all mega-corporations, there are different divisions with different management styles all of which have different priorities. Generalizing a corporation as big as Microsoft is pretty tough to do.

In 1991, an employee recognized core research was important and MSR fell out of it. Today, they employ more than 125 staff which are solely dedicated to not producing products but doing what academics do without the politics or pressure of publish-or-perish and impact factors of journals. They pay the salary of people like Simon Peyton-Jones so he can continue to work at Haskell full-time without having to worry about a paycheck, giving remedial lectures, or how many conferences he presented at last year. Reactive programming (RxJS was their project, the soundness of the temporal aspects of the system was all done by them) can all be attributed directly to MS. (That includes all of that fancy-schmancy JS frameworks ranging from Meteor, to Facebook's React/Redux/etc). Z3, the worlds best SMT solver, was given away to the public. is a direct fall out of MSR/Languages and Bart de Smet (or Meijer's, apologies in advance for any incorrect attribution) silently been funding important research, publishing important papers, and supporting the continued development of research-level computer science in a way that only ATT/Bell Labs historically have done.

The MS office team, in addition to being a completely different division of MS Research, has the unfortunate task of dealing with ensuring the continuity of support. Their customers rely on that .xlsx designed in 1997 when they had 6 employees — 20 years later, that business is still using the same Excel macros and that sole xlsx file is a few hundred MB, has core functionality which relies on bugs from Office '97 such that *if you fix the bugs so it works 'properly', your customers will complain that your fix broke their software*. The fact that WinXP (a ~15-16 year old OS) was still getting active patches (albeit for security, not features) up until about a year ago, and Windows Server 2016 can run ASP.NET classic applications built against SQL Server 2000 just fine boggles my mind.

I'm not a Microsoft apologist in the least, mind you, but as other people have mentioned in this thread, designing a 'well-functioning' specification[0] (i.e. internally consistent as defined by Hoare logic) and then implementing that specification so you ultimately can link and load that object file and have it mathematically correct is only maybe 20% of the problem-space (*proper* requirements gathering, consistent documentation, evolution of your ABI and/or the ABI of software with which you interact, maintenance as business requirements evolve, and other issues involving your customers, their users, your organizations managers, and engineers construct the remaining 80%).

[0] Anecdote – Erlang designer Armstrong does contract work for formal verification with high-assurance applications. A widget-maker for a large European car manufacturer wanted to ensure their widget was fully compliant with all of the specifications as issued by the overseeing administrative authority (ISO, ANSI, whatever). He and his team of engineers sat down and took 1k +/- .2k pages of written specifications, translated them into invariant properties, and let 'er rip. Of the ~foo bugs they found, ~40% of the bugs were internally inconsistent bugs *within the specification*. [An average car of model year 2016 has somewhere between 500-800 MCUs within them. Scary prospect…] So, lets say LDAP's RFC has 3 'bugs' in them (a proposition: p in section one, then a proposition: ~p in section 3). Microsoft wants to implement Active Directory to have LDAP compliance to work with Kerberos, Apple devices, etc etc. By definition, this implementation will be buggy. These bugs are of a class which have no fix!

• Amurru says:

You data and code reside on the same areas these countermeasures are far less effective.
replace von Neumann architecture with a more secure one.