Thursday, August 29, 2013

What is Quality Engineering?

Back in the 1990s, when waterfall was the development methodology, developers wrote code and checked it in; testers tested code, and found bugs; developers fixed the bugs; and then eventually we released the product to a limited group of customers called beta testers, and finally to the general public.

Nobody works that way any more.  Waterfall failed, and Agile took over.  Agile does not have a clear role for testers, because there is not a separate testing phase.  If you have someone different doing your testing, then you must necessarily be spending some time with code that is (at least partly) checked in but that has not been verified.  That is not Agile.

At the company I work for, teams have resolved this dilemma in many ways, most of them lousy.  Some teams don't have testers; some have automation developers; some have mini-waterfall.

We have a job title of "Quality Engineer"; people with this job are not expected to implement customer-facing features.  The absurd implication is that the people who implement customer-facing features are not quality engineers.  A software engineer who is not a quality engineer should be fired.  Quality is not something that can be applied after the coding is done.

But testers are important.  It's really hard to rigorously test your own code.  If you didn't see a gap the first time, you probably won't see it the second time.  And writing code is a creative act that takes emotional investment.  Asking someone to find the flaws in their own code is like asking a painter to critically assess the artistic relevance of their work before the paint dries on the canvas.

Pair programming is one solution; it's a lot easier to see someone else's error, or challenge someone else's shortcut.  Two sets of eyes during coding can greatly improve quality.  But the skill set of a good manual tester is different from that of a coder.  Watching a good manual tester is like watching a good hacker: the feature you thought was solid gold dissolves into a pile of bugs before your eyes.

So there is still a role for manual testing.  QE can understand the product from the customer's perspective, use it, and find out what doesn't work: essentially, act as a high-bandwidth, low-latency customer proxy.  QE in this role should be most tightly aligned with the product owner.

But manual testing is low leverage, compared to some more interesting possibilities.  There are areas where "Quality Engineering" really becomes a meaningful term.  Regrettably few companies invest in these areas.  The common characteristic of all these possibilities is that the work is internal-facing, decoupled from the product release cycle, and aimed at the development process rather than the product as such.

Predictive Fault Detection
There is a wealth of academic work, and some commercial products, dedicated to the premise that it is possible to predict, before any code has been written, where the bugs will be.  Bugs are not random: certain design patterns, certain APIs and technologies, certain methodological patterns are inherently buggy.  QE should be studying past results to predict future buggy patterns, steering coders away from them where possible, and advising extra attention where necessary.  QE should be like a harbor pilot, who knows where the hidden reefs are better than the ship captains can.

When technologies or patterns that are highly likely to provoke bugs are found, QE should propose eliminating them entirely: for example, if the company has been using a particular messaging framework but coders interacting with the framework tend to use it incorrectly and cause bugs, perhaps it is a sign that it is a bad choice of framework, even if it is otherwise performant and cool.  Or maybe it can be fixed.

Test Curation
Coders should write the majority of their own tests.  But as the codebase grows, so does the body of tests; and the test base becomes redundant and full of low-value tests.  Careful unit testing alleviates this problem because the individual tests continue to run quickly; but unit testing relies on well-modularized code, and in many enterprise situations - including at the company I work for - this is a goal that we can work towards but it is not a point we can start from.

So we have a vast number of slow, highly redundant tests, most of which test features that are not likely to regress.  QE should monitor the overall test base and combine tests that are too redundant, eliminate tests that provide insufficient value, and identify areas of weak coverage.  QE should understand and manage the test base as a whole, where coders tend to interact only with specific tests.

Framework Development
Coders are generally working under time pressure to produce a customer-facing feature.  We tend to do whatever reduces the risk to on-time delivery, even if it results in accumulating technical debt.  It's often hard just to get a coder to take the extra time to refactor the shared code they are building on top of.  Most developers are not in a position where they can tell their boss they're about to spend a few months developing code that will pay off company-wide but that will not directly result in shipping the feature the team is supposed to be working on.  As a dev manager, my personnel funding is apportioned on the basis of feature need, not internal investment.

However, the payoff for having a well-maintained set of test frameworks is huge; all the more so when the maintenance isn't just a series of one-off efforts by coders who need a feature, but a proactive, intentionally designed effort by a dedicated team.  QE can serve as a pool of engineers whose job is to improve the quality and efficiency of the feature-dedicated coders.

In summary:
The term "Quality Engineer" is nothing but a euphemism, when it's used to make a tester feel important in a development methodology that doesn't have a place for testing.  Testing is important, and it doesn't need to be called something other than what it is; but it's entirely different from quality engineering.  Quality engineering should be valuable and high leverage, but it can only be so if we take it seriously, separate it from testing, and select quality engineers on the basis of relevant skill, training, and experience.

Tuesday, August 27, 2013

more

I'm managing people these days so I spend more time opining. One of my coworkers asked me to opine more publicly. So I'll start this thing back up and we'll see how long it takes to make a fool of myself. 10... 9... 8...

Friday, October 22, 2010

time-delayed feedback in the workplace

The job of buildmaster rotates amongst managers. The buildmaster is primarily responsible for haranguing developers when the automated test failure rates are too high; and if they are too high for a while, the buildmaster can "lock the line", meaning that the only permitted checkins are those that ostensibly fix tests. We have some test suites that take several days to complete. Thus a bad checkin may cause test results to plunge days after the fact.

In Peter Senge's classic The Fifth Discipline, he talks about the effect of introducing a time delay into a negative feedback system. Whereas negative feedback usually stabilizes a system, negative feedback plus time delay tends to cause ever-more-violent oscillation.

Consider the following actual data:

Test         Current   EOD 10/21   EOD 10/20   EOD 10/19   TARGET
fast_suite   97.55%    98.77%      99.39%      99.39%      98%
slow_suite   86.43%    94.10%      83.61%      95.29%      97.5%

The fast suite returns feedback in a couple of hours; the slow suite takes a few days to catch up to a changelist.

I am assured by various people that it sucks to be the buildmaster. It will continue to suck to be the buildmaster, I think, until we devise a system that is stable rather than oscillatory. A stable system is characterized by damping rather than nonlinear gain; and by feedback that is at least an order of magnitude faster than the forward phase response of the system. (It's possible to stabilize systems other ways, but this is the most general and reliable.)
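The effect of delay on a corrective loop can be seen in a toy model. The sketch below is only an illustration under stated assumptions: it is not taken from Senge's book and does not model any real build system. An "error" value is corrected each step in proportion to a measurement that is some number of steps old; the class name, the gain of 0.6, and the delay of 3 steps are all made up for the example.

```java
// Toy model (assumed for illustration): a corrective loop that reacts to a
// measurement `delay` steps old. Not a model of any real build system.
final class DelayedFeedback {
    /** Returns the error at each step; the error is 1.0 at and before step 0. */
    static double[] simulate(double gain, int delay, int steps) {
        double[] error = new double[steps + 1];
        error[0] = 1.0;
        for (int t = 0; t < steps; t++) {
            // The controller only sees the error as it was `delay` steps ago.
            double observed = (t - delay >= 0) ? error[t - delay] : 1.0;
            error[t + 1] = error[t] - gain * observed;
        }
        return error;
    }

    /** Largest absolute error over the final `window` steps. */
    static double peakOverTail(double[] error, int window) {
        double peak = 0.0;
        for (int i = error.length - window; i < error.length; i++) {
            peak = Math.max(peak, Math.abs(error[i]));
        }
        return peak;
    }

    public static void main(String[] args) {
        double prompt = peakOverTail(simulate(0.6, 0, 60), 15);
        double late = peakOverTail(simulate(0.6, 3, 60), 15);
        System.out.printf("no delay:     tail peak = %.3g%n", prompt);
        System.out.printf("3-step delay: tail peak = %.3g%n", late);
    }
}
```

With the same gain, the undelayed loop settles toward zero while the delayed one overshoots and swings wider on each cycle. The only difference between the two runs is how old the information is when it is acted on.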

To speed up the feedback loop, we could have fast suites that predict the behavior of the slow suites. Simply choosing a random subset of the tests in the slow suite, running those first, and providing interim results could achieve that.
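A minimal sketch of the random-subset idea, assuming tests can be identified by name and that something can run a single test and report pass or fail. The class `InterimSample`, its method names, and the `runTest` predicate are hypothetical stand-ins, not part of any real test runner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

final class InterimSample {
    /** Picks up to sampleSize tests uniformly at random, without repeats. */
    static List<String> pick(List<String> slowSuite, int sampleSize, long seed) {
        List<String> shuffled = new ArrayList<>(slowSuite);
        Collections.shuffle(shuffled, new Random(seed));
        int n = Math.min(sampleSize, shuffled.size());
        return new ArrayList<>(shuffled.subList(0, n));
    }

    /** Fraction of the given tests that pass; runTest stands in for a real runner. */
    static double passRate(List<String> tests, Predicate<String> runTest) {
        if (tests.isEmpty()) {
            throw new IllegalArgumentException("no tests to run");
        }
        int passed = 0;
        for (String test : tests) {
            if (runTest.test(test)) {
                passed++;
            }
        }
        return (double) passed / tests.size();
    }
}
```

Running `passRate` over the output of `pick` gives an early estimate for the whole suite. The estimate carries sampling error, so a small sample is better at flagging a large drop than at detecting a shift of a fraction of a percent.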

To have damping rather than nonlinear gain, we need to remove or highly restrict the buildmaster's ability to lock the line; and instead, we need to increase the amount of pre-testing that is required in order to do a checkin. For instance, if interim results indicate a high failure rate, then new checkins should be subjected to a higher level of testing in the precheckin queue before they are allowed to actually commit.
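One way to picture graduated pre-checkin testing, as opposed to an all-or-nothing line lock, is a policy that maps the interim pass rate to a required level of testing. This is a sketch only: the level names, the `PrecheckinPolicy` class, and the 5-point threshold are assumptions chosen for illustration, not a description of our actual queue.

```java
// Hypothetical levels of testing a checkin must pass before it may commit.
enum PrecheckinLevel { SMOKE, FAST_SUITE, FAST_SUITE_PLUS_SLOW_SAMPLE }

final class PrecheckinPolicy {
    /**
     * The further interim results fall below target, the more testing a new
     * checkin must pass. Rates are fractions, e.g. 0.975 for 97.5%.
     */
    static PrecheckinLevel requiredLevel(double interimPassRate, double targetPassRate) {
        double shortfall = targetPassRate - interimPassRate;
        if (shortfall <= 0.0) {
            return PrecheckinLevel.SMOKE;           // at or above target
        }
        if (shortfall <= 0.05) {                    // assumed threshold: 5 points
            return PrecheckinLevel.FAST_SUITE;
        }
        return PrecheckinLevel.FAST_SUITE_PLUS_SLOW_SAMPLE;
    }
}
```

Against the slow_suite target of 97.5%, a reading of 95.29% would ask for the fast suite, and a reading of 86.43% would ask for the fast suite plus a sample of the slow one. Checkins keep flowing in both cases; only the cost of entry changes.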

Friday, August 13, 2010

The Compiler Is Not The Audience

Is the code that you write making life easier or harder for the next person who has to work in the same area? Are you creating complexity and fragility that will slow them down, or platforms, patterns, and utilities that will speed them up?

Thursday, August 12, 2010

test modularity

Principle: keep your tests of functionality separate from your tests of business rules.

Example: I have some code that provisions licenses (bundles of permissions) to entities. I should have one suite of tests that verifies that it is possible to provision arbitrary combinations of permissions correctly - that is, the functionality. The clients of my code should have tests that verify that they are choosing to provision the particular combinations they expect - that is, the business rules. My tests should not fail when the business decision of what permissions to grant to which entities gets changed.
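A small sketch of that split, with everything in it invented for illustration since I can't show the real code: a `LicenseProvisioner` with three made-up permissions, a hypothetical client called `TrialSignup`, and plain methods standing in for test cases so that no test framework is assumed.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

enum Permission { READ, WRITE, ADMIN }

/** The provisioning code: knows how to grant permissions, not which to grant. */
final class LicenseProvisioner {
    private final Map<String, Set<Permission>> granted = new HashMap<>();

    void provision(String entity, Set<Permission> permissions) {
        Set<Permission> copy = EnumSet.noneOf(Permission.class);
        copy.addAll(permissions);
        granted.put(entity, copy);
    }

    Set<Permission> permissionsOf(String entity) {
        return granted.getOrDefault(entity, EnumSet.noneOf(Permission.class));
    }
}

/** A client of the provisioner: owns the business rule for one plan. */
final class TrialSignup {
    static void signUp(LicenseProvisioner provisioner, String entity) {
        provisioner.provision(entity, EnumSet.of(Permission.READ));
    }
}

final class ModularTests {
    /** Functionality: every combination of permissions can be provisioned. */
    static void provisionerGrantsArbitraryCombinations() {
        Permission[] all = Permission.values();
        for (int mask = 0; mask < (1 << all.length); mask++) {
            Set<Permission> combo = EnumSet.noneOf(Permission.class);
            for (int i = 0; i < all.length; i++) {
                if ((mask & (1 << i)) != 0) {
                    combo.add(all[i]);
                }
            }
            LicenseProvisioner provisioner = new LicenseProvisioner();
            provisioner.provision("entity", combo);
            if (!provisioner.permissionsOf("entity").equals(combo)) {
                throw new AssertionError("failed to provision " + combo);
            }
        }
    }

    /** Business rule: belongs to the client and changes when the plan changes. */
    static void trialSignupGrantsReadOnly() {
        LicenseProvisioner provisioner = new LicenseProvisioner();
        TrialSignup.signUp(provisioner, "acme");
        if (!provisioner.permissionsOf("acme").equals(EnumSet.of(Permission.READ))) {
            throw new AssertionError("trial plan should grant READ only");
        }
    }
}
```

If the business later decides trials also get WRITE, only `trialSignupGrantsReadOnly` needs to change; the provisioner's own test keeps passing untouched.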

Tuesday, August 10, 2010

Code ownership

As an aside: I've not written much lately because I'm working for a non-open-source company, which limits what I can talk about without crossing IP boundaries. I've decided to start posting again but will have to be a bit vague.

My company's roots are in web application development. This seems to contribute to a more horizontal rather than structured architectural style: all groups for themselves, each working on small user-facing features. There is a general principle that everyone is allowed to check code into anyone's area: code "ownership" is discouraged.

I find this a problem. At this point our codebase is quite large. No one, not even the principals of the company, understands it all; I routinely ask them questions and get back answers like "well, it's been a while since I worked on that so I'm not sure." But at the same time, no one can fully understand even the piece they work on, because it has been partied on by a myriad of developers, few of whom were well versed in its design, its test suite, its intentions, its history.

Tuesday, January 12, 2010

Dual dispatch

Computer programming is about describing the behavior of entities in various scenarios. For instance, if you're writing a data entry form, you might need to describe how it behaves when you click a certain button, how it behaves when you type into a field, and so forth. In object-oriented programming, we try to organize the code that describes an object's behavior together with the data that describes its state.

This gets messy when objects interact. Suppose you've got three Animals (Dog, Cat, Monkey), and three kinds of Food (DogChow, CatChow, MonkeyChow). You want your program to ensure that Monkeys can only eat Monkey Chow, Cats can only eat Cat Chow, but Dogs can eat all three kinds. So, you add a method to each of the three Animal classes, something like "boolean canEat(Food chow)". The implementation for Cat looks like "return chow instanceof CatChow", Dog looks like "return true", and so forth.

What happens when you add an animal? No problem, you just have to implement its canEat() method. What about when you add a new type of food? Well, you have to go through all the existing animals and make sure their implementation is still right. For instance, since chocolate is poisonous to dogs, the Dog implementation of canEat() is wrong. No compiler error, but your dog might die.
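Written out, the design so far looks roughly like the sketch below. The `Chocolate` class is a hypothetical name for the food added later; the rest follows the description above.

```java
interface Food {}

final class DogChow implements Food {}
final class CatChow implements Food {}
final class MonkeyChow implements Food {}

// Added later. Nothing forces anyone to revisit the existing canEat() methods.
final class Chocolate implements Food {}

interface Animal {
    boolean canEat(Food chow);
}

final class Cat implements Animal {
    public boolean canEat(Food chow) {
        return chow instanceof CatChow;
    }
}

final class Monkey implements Animal {
    public boolean canEat(Food chow) {
        return chow instanceof MonkeyChow;
    }
}

final class Dog implements Animal {
    // Correct when there were only three kinds of chow; silently wrong now.
    public boolean canEat(Food chow) {
        return true;
    }
}
```

Everything still compiles after `Chocolate` is added, and `new Dog().canEat(new Chocolate())` happily returns true.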

Or, you could flip the problem around, and put methods on all the Foods, saying what animals can eat it. Now, when you add a food, it's just a matter of implementing its canBeEatenBy(Animal animal) method. But when you add a new animal, you have to check all the Food implementations.

This problem of how to describe the interactions between entities is called "dual dispatch" (more commonly, "double dispatch"). I know of no particularly good general solution.
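For what it's worth, here is a sketch of the classic double-dispatch arrangement in Java, offered as a partial mitigation rather than a solution. It assumes a hypothetical `Chocolate` food alongside the three chows. Each Food hands itself back to the Animal, and Java picks the `canEat` overload from the static type of `this`.

```java
interface Food {
    boolean canBeEatenBy(Animal animal);
}

interface Animal {
    // One overload per kind of Food. Adding a Food means adding a method here,
    // and every Animal then fails to compile until it supplies an answer.
    boolean canEat(DogChow chow);
    boolean canEat(CatChow chow);
    boolean canEat(MonkeyChow chow);
    boolean canEat(Chocolate chow);
}

final class DogChow implements Food {
    public boolean canBeEatenBy(Animal animal) { return animal.canEat(this); }
}

final class CatChow implements Food {
    public boolean canBeEatenBy(Animal animal) { return animal.canEat(this); }
}

final class MonkeyChow implements Food {
    public boolean canBeEatenBy(Animal animal) { return animal.canEat(this); }
}

final class Chocolate implements Food {
    public boolean canBeEatenBy(Animal animal) { return animal.canEat(this); }
}

final class Dog implements Animal {
    public boolean canEat(DogChow chow) { return true; }
    public boolean canEat(CatChow chow) { return true; }
    public boolean canEat(MonkeyChow chow) { return true; }
    public boolean canEat(Chocolate chow) { return false; }
}

final class Cat implements Animal {
    public boolean canEat(DogChow chow) { return false; }
    public boolean canEat(CatChow chow) { return true; }
    public boolean canEat(MonkeyChow chow) { return false; }
    public boolean canEat(Chocolate chow) { return false; }
}

final class Monkey implements Animal {
    public boolean canEat(DogChow chow) { return false; }
    public boolean canEat(CatChow chow) { return false; }
    public boolean canEat(MonkeyChow chow) { return true; }
    public boolean canEat(Chocolate chow) { return false; }
}
```

This does not remove the work of revisiting every Animal when a Food is added; it only turns a silent gap into a compile error. The price is that the Animal interface now has to know about every Food, so the coupling runs in both directions.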