Wednesday, December 3, 2008

Code Reviews

A spare set of eyes is always useful, so most software teams have some practice of code reviews. There is huge variation from team to team in how code reviews are actually done, and in my experience how any given team does it is an axiom of the corporate culture; suggesting a different approach is met with the same sort of response as if you'd suggested, say, using an abacus instead of a personal computer. (I haven't tried this at my current employer, Terracotta, yet.)

The range of practices I've been directly involved with includes:

  • No review at all. I haven't seen this in a long time.

  • Self-initiated. If you want a review, you find a peer and beg some of their time. Surprisingly, this has been my experience working on Eclipse, and on a number of other teams; it may reflect resource constraints, or perhaps just the fact that I've been in the industry for a long time and people foolishly think I'm less likely to write bad code.

  • Recommended review for major work, especially for new developers. This too is pretty common in my experience, and is the approach taken at Terracotta.

  • Mandatory review of all code before it's committed to version control. This approach was taken by a team I worked on in the '90s at Microsoft, and resulted in some of the best code I've ever written or seen. (The project made it through beta but was then axed by management; it conflicted with corporate strategy.)

  • Pair programming, two sets of eyeballs at all times. I've only tried this a little bit; it did seem like we wrote pretty good code, but boy was it exhausting.


Along with that range is the question of how, exactly, the code gets reviewed. Approaches I've tried include:

  • Over my shoulder. The reviewer looks over my shoulder as I walk through the code on my monitor, explaining it.

  • Over the reviewer's shoulder. Same, except that the reviewer is the one driving.

  • Independent review, meet with feedback. Each reviewer is given the code to review, and given some time (a day, perhaps) to review and come up with comments; we then meet to collect together the feedback. The meeting is typically moderated by someone other than the developer, and there may also be a designated scribe to collect the comments.

  • Collective review. Like "over my shoulder", except that there is more than one reviewer. Having multiple reviewers makes it psychologically harder for any one reviewer to change the flow of the review: if someone wants to go back and look at a previous file, they have to interrupt someone else.


And then, there's the question of media and presentation. If code is being given to reviewers, it may be presented as a set of changes to the existing code, or as a finished document; it may be presented as hard copy or as files. The review may be conducted in person, over a video link (like WebEx or Elluminate), by audio conference, or by email.

The costs and benefits vary a lot depending on approach. I've gotten the best reviews out of the "independent review" approach: it creates a social dynamic where there is some shame for any reviewer who fails to find a particular bug that others do find, so there's incentive for thoroughness, but it also lets each reviewer bring their skills to bear in whatever way and at whatever pace works best for them. The collective feedback (whether in a meeting or in group email) is a good way to learn advanced techniques, subtle errors, and so forth.

But this approach is also expensive - I used to budget half a day to participate in a review like that, between the reviewing and the feedback meeting, and if three or four people have to spend half a day for each developer who wants to check something in, it's easy to spend all your time reviewing. Also, this strategy works poorly for small changes or even large refactorings of existing code, because reviewers tend to get distracted by things that really should be fixed, but that have been that way forever and aren't in the scope of the current changes.

The opposite end of the curve is probably the "go borrow someone's eyeballs if you need them" sort of review. This misses a lot of errors, but sometimes you get so deep into a problem that you lose track of the obvious. A second set of eyes can see that you failed to close an open stream, or that you're doing something that a library routine already exists for. A second set of eyes will probably miss subtleties like locking on the wrong object.
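
To make that concrete, here's a small, purely illustrative Java sketch (not code from any project mentioned in this post) of both kinds of bug: the obvious resource leak that a borrowed pair of eyes will spot quickly, and the subtler locking mistake that a quick over-the-shoulder review will probably read right past.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    class ReviewExamples {
        // The easy catch: if readLine() throws, the reader is never closed
        // and the file handle leaks. A second set of eyes sees this fast.
        static String firstLine(String path) throws IOException {
            BufferedReader reader = new BufferedReader(new FileReader(path));
            String line = reader.readLine();
            reader.close(); // skipped entirely when an exception is thrown
            return line;
        }

        private final Object lock = new Object();
        private int counter;

        // The subtle miss: this synchronizes on a brand-new object each call,
        // so the increment isn't actually protected at all. It should be
        // synchronized (lock), but a casual reviewer will likely glide past it.
        void increment() {
            synchronized (new Object()) {
                counter++;
            }
        }
    }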

Personally, I'm afraid that of all the possibilities, I get the very least out of collective review over video conference. The audio and video quality is generally very poor, so it's hard for me to follow what anyone else is saying; if my attention wanders for a moment, it's socially awkward to ask people to back up, so I tend to lose track of crucial details; and since it's the code owner who drives the review, any assumptions they have about what is and is not important become shared. Often bugs lurk precisely within those assumptions.

Finally there's the question of what sort of comments matter. In the most formal review process I was part of, we prioritized comments exactly the same way we'd have prioritized bugs:

  • Pri 1 was a bug that would make the code fail significantly, or a design flaw that would prevent the code from working with the rest of the product.

  • Pri 2 was a more minor bug or design infelicity, or a possible performance improvement.

  • Pri 3 was a bug that wouldn't actually affect the code as it stood, but that might cause problems during maintenance or if someone else tried to use it in a different way.

  • Pri 4 was typographical stuff, like use of whitespace, naming of internal variables, and the like.


Using a structure like that, we went through each module of code, collecting and discussing comments in priority order. Pri 1 bugs had to be fixed before the code could be checked in, and generally another review meeting would be scheduled to review the fix, depending on the developer's track record. Pri 2 bugs were usually fixed, depending on the amount of effort required. Pri 3 bugs might be deferred or ignored. Pri 4 comments were generally relegated to email: the scribe would collect them from the individual reviewers and tack them onto the followup email without them ever being discussed in person, which saved time and avoided religious wars; the developer was expected to make a personal decision about whether to implement the suggested change. Treating review comments exactly like bugs also let us directly compare the relative effectiveness of reviewing versus testing.

I'm against any review process that takes a lot of people's time to do a shallow review. Either review thinly and cheaply, or deeply and expensively, but don't fool yourself into thinking that a lightweight review process is finding all the interesting bugs.
