Notes on Erlang QuickCheck - Part 2

Yesterday I had the pleasure of spending a few extra hours with John Hughes. As usual, it has been an impressive learning experience, so I decided - once again - to share some of the highlights from our discussions.

We modeled one of our internal Erlang applications, using a Quickcheck statem. The application was relatively small (i.e. <1000 LOC), it offered a pretty straightforward API and its code was already covered by a fair amount of unit tests, with code coverage surpassing 90%.

As already happened to someone else, I didn’t expect to find too many bugs, if not in the Quickcheck model itself. Once again, Quickcheck proved me to be very wrong, finding five or six corner cases which caused the tests to fail. It also revealed some interesting behaviour for the application which, even if ineherently correct, was not clearly documented.

It took me a couple of days to get familiar with the application and its API, to implement a decent model for it and to wrap up everything into a pull request and into these notes.

So, let’s start go through a couple of learnt lessons.

Is ‘zero’ a natural number?

There is no universal agreement about whether to include zero in the set of natural numbers. In Erlang QuickCheck, zero is considered a natural number, so it can be generated by the eqc_gen:nat/0 generator. In fact, a nat() normally shrinks to 0. This is easily verifiable by means of the eqc_gen:sampleshrink/1 function which, given a generator G, prints a value generated by G, followed by one way of shrinking it:

> eqc_gen:sampleshrink(eqc_gen:nat()).
3
--> [0,2]
ok
> eqc_gen:sampleshrink(eqc_gen:nat()).
8
--> [0,2,6,7]
ok

If you want to exclude the zero from the possible outcomes, you can define something like:

non_zero_nat() ->
    ?SUCHTHAT(N, eqc_gen:nat(), N > 0).

Code coverage for Erlang Quickcheck

Quickcheck does not provide code coverage information out of the box, so you need to use an external coverage analysis tool such as cover to obtain that piece of information.

If you use the rebar build tool to compile and test your Erlang applications, there is an easy solution to the problem. In fact, even if rebar’s cover support is currently limited to the EUnit testing framework, that is going to be enough in most of the cases. The trick is to wrap your Quickcheck properties into EUnit test suites.

Simply add the EUnit header file after the Quickcheck one:

-include_lib("eqc/include/eqc.hrl").
...
-include_lib("eunit/include/eunit.hrl").

And wrap each Quickcheck property into a EUnit test case. The result should look something like this, where the timeout is expressed in seconds:

prop_sample_test_() ->
  { "prop_sample", {timeout, 10, ?_assert(eqc:quickcheck(?MODULE:prop_sample()))}}.

That’s it. You can now verify your Quickcheck properties via Eunit:

./rebar skip_deps=true eunit

And get your coverage reports in .eunit/index.html.

Generic servers, the application controller and the terminate/1 function

One of the most interesting issues which emerged while implementing our Quickcheck statem model was related to the semantics of the terminate/1 callback function for the gen_server OTP behaviour.

In fact, while adding commands to our statem, we noticed a rarely occurring issue. I say “rarely” since it only appeared after increasing the number of generated test cases from the 100 (default value) to 1000:

eqc:quickcheck(eqc:numtests(1000, sample_eqc:sample_prop())).

The issue seemed to happen on an application restart (i.e. a stop command immediately followed by a start command). Both commands were using the application module to start and stop the Erlang application. Some basic tracing revealed that the application was not cleaning up things properly upon termination. The guilty process seemed to be a gen_server. We immediately noticed that the process was not trapping exits, which prevented the terminate/1 from executing. So far, nothing too interesting. We enabled the trap_exit flag for the gen_server and we re-run the tests.

At this point the tests passed, but they were taking very long time, which was a bit surprising. The terminate/2 function, containing the cleanup code, seemed harmless at a first glance. Nonethless, the gen_server process was hanging somewhere, so that the supervisor had to kill it after the configured shutdown time, which was set to 5 seconds.

A second look at the cleanup function pointed to a suspicious application:set_env/3 call. The server was trying to set an application environment variable before dying. A rapid exploration of the OTP for the application_controller module revealed the mystery.

In OTP, the application environment is handled via a ETS table, which is accessed via the application_controller process. Whilst reading an environment variable is implemented as a mere lookup operation, writes are serialized through the application controller. In other words, it is not possible to set an environment variable from a terminate/1 callback function in a gen_server which traps exits, since in case of application termination the application controller is busy shutting down the application itself and cannot handle other requests, causing a deadlock. A minimal example showing the deadlock can be found in the following gist.

Even if the terminate/1 was covered by unit tests, the issue never emerged since in that case the terminate function was not called via the application controller, but as the consequence of a normal exit signal.

Yet another example of a buggy function 100% covered by unit tests has been found. Praises to Quickcheck!

Show states

When implementing the model for a Quickcheck statem, you often end up with counter-examples which are a direct consequence of bugs in your properties, missing preconditions and postconditions or incorrect state updates. In such cases, it’s a good idea to include model states into the standard Quickcheck output. You can customize Quickcheck pretty-printing by using eqc_gen:with_parameters/3 and enabling the show_states flag:

prop_sample() ->
  ?FORALL(Cmds, commands(?MODULE),
  begin
    {H, S, Res} = run_commands(?MODULE, Cmds),
    with_parameters(
      [{show_states, true}],
      pretty_commands(
        ...

Alternatively, you can display the model states by using the eqc_statem:show_states/1 function, useful if you are quick-checking your properties from the shell:

eqc:quickcheck(eqc_statem:show_states(Prop)).

By including model states into the output you’ll tend to have some quite verbose output, so the suggestion is to keep the flag disabled by default and to enable it only when needed.

Generating unique lists

Sometimes you have a complex data structure (let’s call it an item) and you want to generate a list of items, ensuring that the list does not contain duplicates. You’re going to add the generated list to your state and you want to ensure that any element of the list already exists in the state.

The trick to solve the above in a way that ensures a decent shrinking behaviour for your generator is to split the existing problem into smaller ones. For the sake of simplicity, let’s assume that a complex item is a 2-elements tuple and that the first element of the tuple is an id which identifies a tuple. Let’s also assume that an id is a natural number.

What you can do is to generate a list of unique ids. You can use the lists:usort/1 function to ensure there are not duplicates. The condition ensures that the generated element is not already part of the state.

uids(S) ->
  ?LET(L, list(nat()),
       [X || X <- lists:usort(L), not lists:keymember(X, 1, S#state.items)]).

Starting from the generated list of unique ids, you can then construct a list of complex items:

items(S) ->
  ?LET(Ids, uids(S), [item(Id) || Id <- Ids]).

Where the item/1 function creates a complex data structure starting from an identifier”

item(Id) ->
  ?LET(Value, some_value_generator(), {Id, Value}).

No spikes in time

One of the properties you may want to verify is that a single API request doesn’t take longer than a certain amount of time. A simple way to check this is to use the timer:tc/1 function. There are a couple of important considerations to be made about this function. For example, it is advised to run it from a newly spawned process and to call erlang:garbage_collect/1 just before using it, to lower the possibilities the Erlang garbage collector starts its job during the measurement. Nonethless, if you don’t need extremely accurate measurements, the following approach may be good enough for you.

Let’s assume you have an exec command in your Quickcheck statem:

exec_command(S) ->
  Args = ...
  {call, ?MODULE, exec, [Args]}.

Your ?MODULE:exec/1 function should then look something like:

exec(Args) ->
  timer:tc(server, exec, [Args]).

And the postcondition would look something like:

exec_post(S, [Args], {Time, Result}) ->
  Time < expected_time(exec, Args) andalso Result ...

Where:

expected_time(exec, Args) -> 10 * 1000. %% 10 ms

As a plus, you may want collect some statistics about times (which are now part of the command results) in your property, via the aggregate/2,3 and collect/2,3 functions.

Dealing with race conditions

In our case, it proved to be really hard to provoke the race condition which caused our property to fail. This is why we wrapped our property into a ?ALWAYS(N, Prop) to repeat the same test several time:

prop_sample() ->
  ?FORALL(Attempts, ?SHRINK(1, [100]),
    ?FORALL(Cmds, commands(?MODULE),
      ?ALWAYS(Attempts,
        begin
          ...
        end))).

Notice how we abused the ?SHRINK macro to force Quickcheck to test each case once while searching for a failure and then 100 times for each shrinking attempt.

Amending the counterexample

In case of a failing property, Quickcheck stores a copy of the counter-example found. The last counter-example can be retrieved by the eqc:counterexample/0 function. Once fetched, it can be used to repeat a test or to test a different property in the same case, for example using the eqc:check(Prop) function.

The eqc:counterexample/0 function returns a counterexample() abstract data type. With the current version of Quickcheck, it’s possible to decompose the return value into a list of two elements:

[N,C] = eqc:counterexample().

The second element (C) is the actual counter-example, whilst the first element (N) represents the number of attempts you want to re-try that test. If you want to re-try the counterexample a certain amount of times, you can simply do:

eqc:check(sample_eqc:prop_sample(), [1000, B]).