Quantized Academy

The Basic Algebra Behind Secret Codes and Space Communication

Whether you’re passing secret notes in class or downloading images from a space probe, Reed-Solomon codes offer an ingenious way to embed information and correct for errors.
cartoon illustration of a female teacher surrounded by students throwing paper airplanes covered in equations against a bright orange background

Robert Neubecker for Quanta Magazine

Introduction

Space exploration requires tremendous precision. When you’re landing a rover on Mars 70 million miles away from the nearest service station, you need to maximize efficiency and prepare for the unexpected. This applies to everything from spacecraft design to data transmission: Those messages returning to Earth as a steady stream of 0s and 1s are bound to contain some errors, so you need to be able to identify and correct them without wasting precious time and energy.

That’s where math comes in. Mathematicians have invented ingenious ways to transmit and store information. One surprisingly effective method uses Reed-Solomon codes, which are built on the same basic algebra that students learn in school. Let’s drop in on a math class to see how Reed-Solomon codes help transmit and secure information while correcting any costly errors that pop up.

Two students, Art and Zeke, are exchanging secret messages in Ms. Al-Jabr’s math class. Art unfolds Zeke’s latest note to reveal the numbers 57 and 99. He knows he has to supply the x-coordinates 3 and 6 to create the points (3, 57) and (6, 99). Art plugs each point into the linear equation y = Ax + B and produces the following system of equations:

57 = 3A + B

99 = 6A + B

To decode the message, Art has to solve for A and B. He starts by subtracting the first equation from the second:

   99 = 6A + B  (57 = 3A + B) =   42 = 3A

This eliminates B. Dividing both sides of this new equation by 3 tells Art that A = 14, and then substituting this back into the first equation, 57 = 3 × 14 + B, gives B = 15.

Art now knows that the line passing through (3, 57) and (6, 99) is described by the equation y = 14x + 15. But he also knows that in a Reed-Solomon code, the secret message is hiding in the coefficients. He decodes Zeke’s message using their simple agreed-upon alphabet cipher: 14 is “N” and 15 is “O,” which tells Art that, no, Zeke can’t play video games after school today.

The secret to this simple Reed-Solomon code starts with two basic facts of geometry. First, through any two points there is a unique line. Second, for coefficients A and B, every (non-vertical) line can be written in the form y = Ax + B. Together, these two facts guarantee that if you know two points on a line you can find A and B, and if you know A and B, you know all the points on the line. In short, possessing either set of information is equivalent to knowing the line.

Reed-Solomon codes leverage these equivalent sets of information. The secret message is encoded as the coefficients A and B, and the line’s points are broken up into pieces, some of which are transmitted publicly, and some of which are kept private. To decode the message, you just collect the pieces and put them back together. And all that this requires is some simple algebra.

Zeke’s pieces were the numbers 57 and 99, which he sent to Art. These numbers are the public part of the message. Art put those together with his own pieces, 3 and 6, to reconstruct the points (3, 57) and (6, 99). Here, the 3 and the 6 constitute the private part of the message, which Art and Zeke agreed upon beforehand.

The two points lie on a line, and to decode the message, you just need to find that line’s equation. Plugging each point into y = Ax + B gives you a system of two equations about the line that must both be true. Now the strategy is straightforward: Solve the system, determine the line and decode the message.

In algebra class you probably learned many methods of solving systems of equations, such as graphing, guessing and checking, and substitution. Art used elimination, a method where you manipulate the equations algebraically in order to eliminate the variables one at a time. Each time you eliminate a variable, the system becomes a little easier to solve.

As with other encryption schemes, it’s the clever combination of public and private information that keeps messages secure. Zeke could shout his numbers 57 and 99 across the classroom and it wouldn’t jeopardize the security of his message to Art (though it might get him in trouble with Ms. Al-Jabr). That’s because without the corresponding private information — the x-coordinates 3 and 6 — it’s impossible to identify the equation of the line. As long as they keep those values to themselves, they can safely pass their secret messages in public.

The line y = 14x + 15 is fine for transmitting the two-letter word “no,” but what if the students want to share a longer secret? Here’s where the full power of algebra, geometry, and systems of linear equations comes into play.

Suppose Zeke wants to know how Art thinks he’ll do on tomorrow’s English test. Art converts his three-letter answer into the numbers 14, 59 and 82 and passes those to Zeke. The students agreed beforehand that when using Reed-Solomon codes of length 3, their private numbers are 2, 5 and 6, so Zeke puts the pieces together to reconstruct the points (2, 14), (5, 59) and (6, 82).

Now, there is no linear function that passes through these three points. But there is a unique quadratic function that does. And since every quadratic function can be written in the form y = Ax2 + Bx + C, the same general method can be applied. The only difference is the size of the system.

Plugging each point into y = Ax2 + Bx + C yields one equation, so the three points produce the following system of three equations:

(2, 14):            14 = 4A + 2B + C

(5, 59):            59 = 25A + 5B + C

(6, 82):            82 = 36A + 6B + C

A system of three equations with three unknowns requires a little more work to solve than a system of two equations with two unknowns, but the goal is the same: Cleverly combine pairs of equations to eliminate variables.

For example, if you subtract the first equation from the second, you get the new equation 45 = 21A + 3B. Likewise, if you subtract the second equation from the third, you get 23 = 11A + B. These algebraic manipulations produce a new system:

45 = 21A + 3B

23 = 11A + B

Now you have a “two-by-two” system: two equations with two unknowns. To solve it, you can multiply the second equation by −3 and add it to the first equation:

       45 =   21A + 3B +   (-69 = -33A − 3B) =     -24 = -12A

Notice how repeated elimination has turned the original system of three equations into a single equation that can be easily solved: A = 2. Working backward, you can plug A = 2 into one of the equations in the two-by-two system to find B = 1, and then plug both values into one of the original equations to get C = 4. After using the simple alphabet cipher on 2, 1 and 4, Zeke knows that Art is going to do “BAD” on tomorrow’s English test. At least he’s getting lots of algebra practice.

Thanks to a special fact about polynomial functions, you can transmit a message of any length using Reed-Solomon codes. The key is this: Given any n points in the plane with different x-coordinates, there is a unique polynomial of “degree” n − 1 that passes through them. The degree of a polynomial is the highest power of a variable in the expression, so a quadratic function like Ax2 + Bx + C has degree 2, since the highest power of x is 2. And a linear function has degree 1, since in the equation y = Ax + B, the highest power of x is 1.

In the first example we used the fact that two points determine a unique linear, or degree-1, polynomial. In the second, we relied on the fact that three points determine a unique degree-2, or quadratic, polynomial. And if you want to send a message of length 10, just encode the message as the 10 coefficients of a degree-9 polynomial function. Once you have your function, compute the 10 public y-values by evaluating the function at the previously agreed-upon 10 private x-values. Once you do that, you can safely pass those y-coordinates in public for your receiver to decode. In practice, Reed-Solomon codes are a bit more complex than this, utilizing more sophisticated kinds of coefficients and translation schemes, but the fundamental idea is the same.

In addition to keeping your message secure, Reed-Solomon codes also offer simple and efficient ways to catch and even correct errors. This is important anytime data is transmitted or stored, as there’s always a chance that some of the information will be lost or corrupted.

One solution to this problem would be to simply send extra copies of the data. For example, Zeke can send the message [14, 14, 14, 15, 15, 15] instead of [14, 15]. As long as Art knows that every part of the message is sent in triplicate, he can decode the message and check for errors. In fact, if he finds any errors, he has a good chance of correcting them. If Art receives [14, 14, 24, 15, 15, 15], the fact that the first three numbers are different alerts him to an error, and since two of them are 14, he can guess that the 24 should probably be a 14 as well. Instead of asking for the message to be resent, Art can carry on with his decoding. This is an effective but costly strategy. Whatever time, energy and effort are required to send n pieces of information, this requires three times as much.

But the math behind Reed-Solomon codes offers an efficient alternative. Instead of sending multiple copies of every piece of data, you can just send one extra point! If that additional point fits your polynomial, then the information is correct. If it doesn’t, there’s been an error.

To see how this works, suppose you want to check the message “NO” in the first example. Zeke can just send the additional y-coordinate 155. Assuming he and Art agreed on a third private x-coordinate beforehand, say x = 10, Art can check this third point against the line he decoded. When he plugs x = 10 into y = 14x + 15 and sees that the result is 155, he knows there were no errors in transmission.

This doesn’t just work for lines. To enable Zeke to check “BAD” in the second example, Art can send y = 25. If they’ve agreed that 3 is the extra private x-coordinate, Zeke can plug x = 3 into his quadratic function y = 2x2 + x + 4 and verify that the point (3, 25) fits, again confirming an error-free transmission with just one more point.

And that extra point can potentially correct errors as well. If an error is detected and the receiver can’t construct the polynomial function that contains the message, they can instead construct the “best-fit” polynomial using regression techniques. Like a line of best fit in statistics class, this is the polynomial function that is mathematically determined to most closely fit the given points, even if it doesn’t pass through all of them. Depending on the structure of the message and how much extra information you send, this best-fit polynomial can help you reconstruct the correct polynomial — and thus the correct message — even from corrupt information.

This efficiency in transmitting and correcting communications explains why NASA has used Reed-Solomon codes on its missions to the moon and to Mars. And it gives you something to ponder as you solve your next system of equations. As you guess, check or eliminate your way to the solution, think of the power and elegance of Reed-Solomon codes and all the secrets your system might reveal.

Exercises

1. Using the same scheme they used in class, Art posts the public numbers 33 and 57 for Zeke to decode. What’s the message?

2. How can Art and Zeke be sure that the system of equations that results from their private numbers x = 3 and x = 6 will always have a solution?

3. In response to Art’s message of “BAD” about the English test, Zeke sends back [90, 387, 534]. Assuming they use the same scheme they used in class, what’s his message?

4. Lola sends you a two-letter message plus one error-checking number using a Reed-Solomon code and the same simple alphabet cipher used by Art and Zeke. You’ve secretly agreed on the x-coordinates 1, 3 and 10 in advance, and Lola transmits the public numbers [27, 43, 90]. Does the message contain an error?

Click for Answer 1:

Using the same x-coordinates as in the initial example yields the points (3, 33) and (6, 57), and thus the system of equations:

33 = 3A + B

57 = 6A + B

Subtracting the first equation from the second yields 24 = 3A, so A = 8. Plugging A = 8 into the first equation yields 33 = 24 + B, so B = 9. The simple alphabet cipher translates the message as “HI.”

 

Click for Answer 2:

By using two distinct x-coordinates to generate their points (x1, y1) and (x2, y2), Art and Zeke ensure that the system

y1 = Ax1 + B

y2 = Ax2 + B

will always have a unique solution that can be found by subtracting the equations. For example, subtracting the first equation from the second will always eliminate the variable B and result in a solution for A:

y2y1 = Ax2Ax1

y2y1 = A(x2x1)

$latex A = \frac{y_2 – y_1} { x_2 – x_1}$

Once you have A, you can plug it into either of the original equations to find that

$latex B = y_1 – x_1 (\frac{y_2 – y_1} { x_2 – x_1})$

This will always work, as long as you don’t divide by zero, so x1 and x2 must be different numbers. This is a first step in showing that the larger systems of equations will always have a unique solution as well.

 

Click for Answer 3:

The three points lead to the following system of equations:

(2, 90)                          90 = 4A + 2B + C

(5, 387)                        387 = 25A + 5B + C

(6, 534)                        534 = 36A + 6B + C

Solving the system of equations yields A = 12, B = 15, and C = 12, or “LOL” after translation through the simple alphabet cipher.

 

Click for Answer 4:

Yes. The first two points are (1, 27) and (3, 43). The system of equations

27 = A + B

43 = 3A + B

has the solution A = 8 and B = 19, producing the line y = 8x + 19 and the secret message “HN.” But notice that the third point doesn’t fit the line, since 8 × 10 + 19 equals 99, not 90. The additional point has revealed an error.

To correct the error, run a linear regression on the points (1, 27), (3, 43) and (10, 90). This yields a line with a slope very close to 7, which suggests that A = 7. Using this value of A, you can find B to be 20, and the message can be properly decoded as “GO.”

Comment on this article