Startup Tip #3: Testing. Just Do it™
Ugghh! Testing… so lame. Do we really have to do this?!
But you know what sucks more than testing:
- Losing customers & partners after a massive outage
- Having to figure out what brought your system down at peak while clients and partners are yelling at you (aka debugging).
- Finding out too late that your last release has a huge memory leak and spending days constantly restarting downed servers.
I have learned the hard way how important testing is. Let’s just say… that I didn’t test a production push in my early days and… brought down a whole site for a few million people for an hour. Oops, there goes that partner.
If customers can be affected by the code, it needs to be tested thoroughly.
How to Test
Here are all the tests that are worth your time in the long run (in chronological order within a dev cycle).
- unit tests:
- integration tests:
- Answers: Does the business logic of this feature work?
- Here use cases from product specs are converted into automated tests. These tests should exercise code from as many components as would be realistic on a production system (e.g. UI logic, backend logic, and database logic).
- Sample bug that would be caught: “The UI team called a field ‘foo’ but the backend assumed it would be ‘Foo’”
- Integration tests should be clear enough for the product manager to read through.
- To run quickly and with minimal setup, integration tests should be run in a single process, e.g. swap in an in-process SQLite database instead of connecting over TCP to a real database. (see mock objects and dependency injection)
- smoke tests:
- Answers: Does the server boot? Does a basic request go through without throwing an error?
- This test (often manual) is a quick sanity check before handing a feature off to a dedicated QA team. Trust me, you will piss off your QA team if you say you are done and the server won’t even start.
- code reviews:
- Answers: Was proper testing done? Were coding standards followed? Where should the QA team be most concerned, especially around performance?
- Code review can become a heated topic (see points #8/#9 from this pdf). As lame as it sounds, I believe it’s critical to maintain a positive culture where more bugs found in code review is generally taken to mean that the developer’s task was more complex, not that the developer is worse.
- Suggested Tool: Review Board
- ui / backend manual testing:
- Answers: Does the feature work?
- Here you run through use cases by hand on a clone of the production system (aka a staging stack): using the UI and making requests. Do you get the correct results?
- These are essentially a repeat of integration tests.
- UI testing automation: If you are confident that parts of your UI layout won’t be changing soon, go ahead and build automated tests for them with Selenium.
- system testing and load testing:
- Answers: Does the new feature work after a long period of load? Do we have a memory leak?
- Here you set up a staging stack and try to simulate the real world as much as possible using the feature. The goal is to leave the feature running smoothly for a few hours.
- Replay production traffic or a set of traffic designed to test specific features.
- I recommend having a big script that populates your database with an object for each core feature to act on. For example, at Invite, our test script would generate campaigns that should only serve to US traffic and campaigns that had specific dollar budgets. After you run traffic, run through a saved checklist of what the system should look like if old and new features are working properly. At Invite, this meant that campaigns didn’t go over budget and served to the correct geographies.
- Watch out for rapid rates of memory growth (aka memory leaks)
- Don’t forget to review error logs post testing.
- Suggested Tools: ApacheBench (ab), httperf, something custom to log and replay traffic (Twisted makes building this easy)
- fuzz tests (aka negative tests):
- Answers: Do lots of malformed requests bring down the system? (Hint: they shouldn’t)
- Verify that malformed requests are logged gracefully and don’t bring down the system. For low latency systems, you likely don’t want every error to be written to disk, since that loses a lot of CPU time to I/O.
- performance test:
- Answers: Has performance degraded too much in this release? (Will serving costs get too high?)
- Replay a repeatable set of traffic. It can be custom or a replay of production traffic. The important part is that it’s the same set of requests used in each release for comparison purposes.
- Suggested Tools: ApacheBench (ab), httperf
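To make the integration test advice above a bit more concrete, here is a rough sketch of a single-process test that swaps a real database for in-memory SQLite via dependency injection. It assumes Python with the standard sqlite3 and unittest modules. Everything named here (SignupService, build_signup_form, the users table) is made up for illustration and stands in for your own UI and backend code.

```python
import sqlite3
import unittest


class SignupService:
    """Hypothetical backend logic. The DB connection is injected, not created here."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT)"
        )

    def handle_signup(self, form):
        # The backend relies on the exact field names the UI layer sends.
        email = form["email"].strip().lower()
        name = form["name"].strip()
        self.conn.execute(
            "INSERT INTO users (email, name) VALUES (?, ?)", (email, name)
        )
        self.conn.commit()
        return {"status": "ok", "email": email}

    def lookup(self, email):
        row = self.conn.execute(
            "SELECT name FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row[0] if row else None


def build_signup_form(email, name):
    """Hypothetical stand-in for the UI code that builds the request payload."""
    return {"email": email, "name": name}


class SignupIntegrationTest(unittest.TestCase):
    def setUp(self):
        # In-memory SQLite: no TCP connection, no setup, fresh state per test.
        self.conn = sqlite3.connect(":memory:")
        self.service = SignupService(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_new_user_can_sign_up_and_be_found(self):
        # Use case from the (imaginary) product spec, exercised UI -> backend -> DB.
        form = build_signup_form(" Ada@Example.com ", "Ada")
        result = self.service.handle_signup(form)
        self.assertEqual(result["status"], "ok")
        self.assertEqual(self.service.lookup("ada@example.com"), "Ada")

    def test_duplicate_email_is_rejected(self):
        form = build_signup_form("ada@example.com", "Ada")
        self.service.handle_signup(form)
        with self.assertRaises(sqlite3.IntegrityError):
            self.service.handle_signup(form)
```

Saved in a test file, this runs with `python -m unittest`. Because the UI helper and the backend share one code path in the test, a field-name mismatch like ‘foo’ vs ‘Foo’ would show up as a KeyError right away. One caveat: SQLite does not behave exactly like every production database, so treat this as a fast first line of defense rather than a replacement for testing on a staging stack.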
Continuous Integration (CI):
- The sooner you catch bugs in the development process, the easier they are to debug and fix. e.g. you know immediately which commit is the culprit of a breakage.
- Suggested Tool: Hudson
Who should be testing what?
Unit tests through code review (listed above) should be done by the development team.
Everything from “ui / backend manual testing” on down can be done by a dedicated QA team. Product managers should be involved in the UI / backend manual testing step.
What do you automate? Unit, integration, and some UI tests should go into your CI system. System, fuzz, and performance tests should leverage scripts but are tough to automate and run with CI.
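As an example of the kind of script a fuzz test can lean on, here is a minimal sketch in Python. It assumes a request handler that takes raw bytes and returns a status dict. handle_request, the campaign_id field, and the mutation strategies are all hypothetical; in practice you would point the loop at your own entry point or at a staging server.

```python
import json
import random


def handle_request(raw: bytes) -> dict:
    """Hypothetical handler: malformed input should yield a 400, never a crash."""
    try:
        payload = json.loads(raw.decode("utf-8"))
        if not isinstance(payload, dict):
            raise ValueError("payload must be a JSON object")
        campaign_id = int(payload["campaign_id"])
    except (ValueError, KeyError, TypeError, OverflowError) as exc:
        # UnicodeDecodeError and json.JSONDecodeError are both ValueError subclasses.
        return {"status": 400, "error": type(exc).__name__}
    return {"status": 200, "campaign_id": campaign_id}


def mutate(valid: bytes, rng: random.Random) -> bytes:
    """Damage a valid request in one of a few simple ways."""
    data = bytearray(valid)
    choice = rng.randrange(4)
    if choice == 0:
        # Overwrite a few bytes with random values.
        for _ in range(rng.randint(1, 5)):
            data[rng.randrange(len(data))] = rng.randrange(256)
    elif choice == 1:
        # Truncate at a random position (possibly to nothing).
        del data[rng.randrange(len(data) + 1):]
    elif choice == 2:
        # Insert a run of garbage bytes somewhere in the middle.
        pos = rng.randrange(len(data) + 1)
        data[pos:pos] = bytes(rng.randrange(256) for _ in range(rng.randint(1, 20)))
    else:
        # Replace the whole request with random bytes.
        data = bytearray(rng.randrange(256) for _ in range(rng.randint(0, 50)))
    return bytes(data)


def run_fuzz(iterations=1000, seed=0):
    """Send mutated requests and collect anything that escapes as an exception."""
    rng = random.Random(seed)  # fixed seed so a failing run can be reproduced
    valid = json.dumps({"campaign_id": 42}).encode("utf-8")
    counts = {200: 0, 400: 0}
    crashes = []
    for _ in range(iterations):
        raw = mutate(valid, rng)
        try:
            counts[handle_request(raw)["status"]] += 1
        except Exception as exc:  # an unhandled error is exactly the bug we want
            crashes.append((raw, repr(exc)))
    return counts, crashes
```

Calling `run_fuzz()` returns the status counts plus a list of inputs that crashed the handler; that list should be empty. Note that the sketch counts errors in memory instead of logging each one, in line with the earlier point about not writing every error to disk on low latency systems.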