Monday, October 6, 2008

The two R's



Computer software consists of a lot of instructions that read information from memory and write information to it. The basic idea hasn't changed since World War II. Imagine a gazillion toggle switches, each of which can be flipped up or down. That's the memory. Then there's a processor that runs the instructions, which we call the "code". The instructions are like "go look at switch 3,452,125 and see if it's flipped up. Then go look at switch 35,289 and see if it's flipped up. If they're both flipped up, then go find switch 278,311 and flip it down." And so forth. Some of the switches do things, like turn on a pixel on the computer screen. Others are just there to remember. We have nice tools so that we don't actually have to use numbers for the individual switches when we write the instructions, but under the covers that's exactly what's going on.
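
To make the switch picture concrete, here is a minimal sketch in Python. The memory size and the switch numbers are made up for illustration, and the names A, B and C stand in for the "nice tools" that let us avoid raw numbers.

```python
# Memory as a row of toggle switches: True = up, False = down.
# The size (16) and the switch numbers are invented for this example.
memory = [False] * 16

# Names standing in for raw switch numbers.
A, B, C = 3, 5, 7


def step(mem):
    """If switches A and B are both up, flip switch C down."""
    if mem[A] and mem[B]:   # two reads
        mem[C] = False      # one write


# Set up the switches, then run the one-instruction "program".
memory[A] = True
memory[B] = True
memory[C] = True
step(memory)
print(memory[C])  # prints False
```

The instruction does two reads and, at most, one write. That split is the one the rest of this post is about.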

I work at a company called Terracotta Technology. We make software that connects many computers together so that they can solve problems bigger than one computer could handle. We make a sort of virtual computer that other people's programs can execute on. We fool the programs that run on Terracotta into thinking they're running on a normal computer. A program thinks it's flipping a switch on its own computer, when the switch might actually be on some other computer.

So we care a lot about when the programs try to read from memory and when they try to write to it, because we have to intercept all those operations. If all they want to do is read, we don't have to do as much work. In our world, flipping a switch is real work, because the other computers have to find out about it.
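
Here is a toy sketch of what intercepting reads and writes can look like. It is not how Terracotta actually does it; the class name InterceptedMemory and the "sent" list are hypothetical, and the list just stands in for whatever would be shipped to the other machines.

```python
class InterceptedMemory:
    """Toy memory that intercepts every read and write (hypothetical)."""

    def __init__(self, size):
        self._local = [False] * size
        self.reads = 0
        self.sent = []  # stand-in for changes shipped to other computers

    def __getitem__(self, index):
        # A read is answered from the local copy: cheap.
        self.reads += 1
        return self._local[index]

    def __setitem__(self, index, value):
        # A write changes the local copy and must also be propagated.
        self._local[index] = value
        self.sent.append((index, value))


mem = InterceptedMemory(8)
mem[2] = True        # one write
first = mem[2]       # two reads
second = mem[2]
print(mem.reads, mem.sent)  # prints 2 [(2, True)]
```

The program using mem just sees something that behaves like a list of switches. Only the layer underneath knows that the write cost more than the reads did.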

This idea of replacing what's under the covers without changing how it looks to the software that's using it is not original. It comes up over and over again in software; in fact, it's probably the single biggest idea the industry ever had. Hibernate is another product that does something like this. Hibernate takes reads and writes to memory and backs them with reads and writes to a database, which is more reliable and persistent and searchable. A programmer could just write instructions to talk directly to the database, but Hibernate makes it easier by hiding some of the complexity under the covers.

But the illusion breaks down. The picture up above is what happens when you ask Hibernate how many things are in a list. Asking how many things are in a list shouldn't change it, right? That's common sense. But asking Hibernate how many things are in a list might change memory, because it might have to go fetch the information from the database and then save it in memory.
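
A small sketch of that effect, in Python rather than Java and without using Hibernate's real API: LazyList and fake_database are invented names, and the "database" is just a function that returns three rows.

```python
class LazyList:
    """Toy lazily loaded list (hypothetical, not Hibernate's API)."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that goes to the "database"
        self._items = None    # nothing has been loaded into memory yet

    @property
    def loaded(self):
        return self._items is not None

    def __len__(self):
        if self._items is None:
            # A write to memory hidden inside what looks like a read.
            self._items = list(self._fetch())
        return len(self._items)


queries = []


def fake_database():
    # Stand-in for a real query; records that it was called.
    queries.append("query")
    return ["a", "b", "c"]


names = LazyList(fake_database)
before = names.loaded    # False: nothing in memory yet
count = len(names)       # looks like a pure read
after = names.loaded     # True: asking for the size changed memory
len(names)               # second call is answered from memory
print(before, count, after, len(queries))  # prints False 3 True 1
```

From the caller's side, len(names) is a question. Underneath, the first time it's asked, it is also a write.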

So if you want to figure out whether an operation is a "read" or a "write" or both, you need to know who's responsible for performing it. And that's something that can change on the fly, because we're so good at replacing what's under the covers.

But why do we care about the distinction anyway? Are reading and writing really the only way to compute? Why should a distinction that depends on the implementation matter to a programmer?

Is this the path to functional programming?
