Testing concurrency can be hard. When you fire up threads from within a test, it’s difficult to avoid introducing concurrency bugs into the test code itself, and just as difficult to be sure you’re actually exercising the code with the intended interleaving. There must be a better way…
Separate the Concurrency Policy from Behaviour
GOOS, amongst others, recommends separating the concurrency policy from the parts of the system that do the work. So, for example, if you have some form of “executor” which is responsible for farming work out concurrently, and the behaviour is defined separately, you can test each independently and then just verify that they collaborate to achieve the “worker” behaviour concurrently. Make sense?
As an example, this test from tempus-fugit demonstrates the idea. The Scheduler’s behaviour (which is essentially to “schedule” tasks) is independent of how it actually achieves this. In this case, it delegates to an Executor, so it doesn’t need to be tested with any threads. It’s a simple collaborator style test.
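A minimal sketch of the collaborator style, not the tempus-fugit test itself: the Scheduler class here is a hypothetical stand-in, and RecordingExecutor is a hand-rolled test double (a mocking library would do the same job). The point is that the check involves no threads at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical class under test: its only job is to hand tasks to an Executor.
class Scheduler {
    private final Executor executor;

    Scheduler(Executor executor) {
        this.executor = executor;
    }

    void schedule(Runnable task) {
        executor.execute(task);
    }
}

// Hand-rolled test double: records what it was asked to run, without running it.
class RecordingExecutor implements Executor {
    final List<Runnable> submitted = new ArrayList<>();

    @Override
    public void execute(Runnable task) {
        submitted.add(task);
    }
}

class SchedulerCollaborationExample {
    // Returns true if the scheduler passed exactly the given task to its executor.
    static boolean delegatesToExecutor() {
        RecordingExecutor executor = new RecordingExecutor();
        Runnable task = () -> { };
        new Scheduler(executor).schedule(task);
        return executor.submitted.size() == 1 && executor.submitted.get(0) == task;
    }

    public static void main(String[] args) {
        System.out.println("delegates to executor: " + delegatesToExecutor());
    }
}
```

The concurrency policy (which Executor is plugged in at runtime) can then be tested, or trusted, separately.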
Having said that, there may be times you actually want to run your class ‘in context’ in a multi-threaded way. The trick here is to keep the test deterministic. Well, I say that, but there are a couple of choices…
Deterministic
If you can set up your test to progress in a deterministic way, waiting at key points for conditions to be met before moving forward, you can try to simulate a specific interleaving to test. This means understanding exactly what you want to test (for example, forcing the code into a deadlock) and stepping through deterministically (for example, using abstractions like CountDownLatch to synchronise the moving parts).
When you attempt to make a multi-threaded test synchronise its moving parts, you can use whatever concurrency abstraction is available to you, but it’s difficult because it’s concurrent; things could happen in an unexpected order. Often people try to mitigate this in tests by introducing sleep calls. We generally don’t like to sleep in a test because it can introduce non-determinism. Just because a particular sleep amount usually causes the effect you’re looking for on one machine, it doesn’t mean it’ll do the same on the next machine. It’ll also make the test run slower, and when you’ve got thousands of tests to run, every millisecond counts. If you try to lower the sleep period, more non-determinism comes in. It’s not pretty.
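As a rough sketch of the alternative to sleeping, the worker below signals the test thread through a CountDownLatch when it is done. The class and method names are made up for illustration, and the five second timeout is an arbitrary assumption; it only bounds a failing run, while a passing run returns as soon as the worker signals.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class LatchInsteadOfSleep {
    // Starts a worker and waits for its signal rather than sleeping and hoping.
    static String resultFromWorker() {
        AtomicReference<String> result = new AtomicReference<>();
        CountDownLatch finished = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            result.set("done");     // stands in for the work under test
            finished.countDown();   // tell the test thread the work is complete
        });
        worker.start();

        try {
            // Blocks only until the worker counts down; the timeout guards a broken test.
            if (!finished.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(resultFromWorker());
    }
}
```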
Some examples of forcing a specific interleaving include:
- Forcing a deadlock using CountDownLatch
- Setting up a thread to be interrupted
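The second case might be sketched as follows. This is an illustration under assumptions rather than a library example: the latch named neverReleased stands in for whatever blocking call the code under test makes, and all names are hypothetical. The worker announces that it is about to block, and only then does the test thread interrupt it, so the outcome does not depend on timing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptInterleaving {
    // Returns true if the worker observed the interrupt while blocked.
    static boolean workerSawInterrupt() {
        CountDownLatch aboutToBlock = new CountDownLatch(1);
        CountDownLatch neverReleased = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            aboutToBlock.countDown();      // signal: the worker is running
            try {
                neverReleased.await();     // stand-in for a blocking call under test
            } catch (InterruptedException e) {
                interrupted.set(true);     // the behaviour we want to verify
            }
        });
        worker.start();

        try {
            if (!aboutToBlock.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker never started");
            }
            // Even if this lands just before await(), await() still throws
            // InterruptedException because the interrupt flag is already set.
            worker.interrupt();
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("test thread interrupted", e);
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("worker saw interrupt: " + workerSawInterrupt());
    }
}
```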
Another gotcha is the main test thread finishing before the newly spawned threads under test complete. This is an easy trap to fall into with UI testing. Waiting for a specific condition, rather than allowing the test thread to finish, often helps; for example, using WaitFor. See the article Be Explicit with the UI Thread for more details around this for UI testing.
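To illustrate the idea of waiting for a condition, here is a small hand-written helper. It is not the tempus-fugit WaitFor API; waitUntil, its signature and the polling interval are assumptions made for this sketch. The short pause only paces the polling; the test moves on as soon as the condition holds and fails cleanly if the timeout passes.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class WaitForCondition {
    // Hypothetical helper: polls until the condition holds or the timeout elapses.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;              // timed out without seeing the condition
            }
            try {
                Thread.sleep(5);           // pace the polling, not a guess at timing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    // The test thread waits for the background thread's effect instead of finishing early.
    static boolean backgroundWorkObserved() {
        AtomicBoolean updated = new AtomicBoolean(false);
        new Thread(() -> updated.set(true)).start();
        return waitUntil(updated::get, Duration.ofSeconds(5));
    }

    public static void main(String[] args) {
        System.out.println("observed: " + backgroundWorkObserved());
    }
}
```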
Soak / Load Testing
Another choice is to bombard your classes in an attempt to overload them and force them to betray some subtle concurrency issue. Here, just as in the other style, you’ll need to set up specific assertions so that you can tell if and when the classes betray themselves. Of course, there is no guarantee that you’ll provoke a problem; you might never see the unlikely timing needed.
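A hand-rolled sketch of this style, under stated assumptions: an AtomicInteger stands in for the class under test, and the thread and iteration counts are arbitrary. Many threads are released at once and the assertion is the expected total; a counter that is not thread-safe would tend to lose updates here, though a passing run proves nothing on its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SoakExample {
    // Hammers a shared counter from several threads and returns the final count.
    static int hammer(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();   // stand-in for the class under test
        CountDownLatch startTogether = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    startTogether.await();             // line everyone up first
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();
                }
            });
        }

        startTogether.countDown();                     // release all threads at once
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(8, 10_000));
    }
}
```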
The tempus-fugit library offers a declarative way to set up tests to run repeatedly and in parallel; see Load / Soak Tests.