Shaking out test interdependence
Tags: Software, Python, Testing, Intermediate Python
I’ve been battling bugs recently; this one came up and it seemed like a fun exercise in test suite hygiene. I take the view that test hygiene is everyone’s responsibility regardless of the level of the engineer, and the subject is simple enough that anyone, whatever their skill level, can implement the ideas here.
Flaky tests are the bane of many a developer’s existence; even worse, however, are test interdependencies. Let’s say we’ve got the following code: some function that marks a data structure as being completed.
from datetime import datetime

def notify(id, name):
    ...

def mark_complete(data):
    data['complete'] = True
    data['completed_at'] = datetime.utcnow()
    notify(data['id'], data['name'])
    return data
And then an accompanying test:
from uuid import uuid4

DATA = {
    "id": uuid4(),
    "name": "potato"
}
def test_mark_complete():
    completed_data = mark_complete(DATA)
    assert completed_data['complete'] is True
    assert completed_data['completed_at'] is not None
The above test checks the functionality fine. We take an existing data structure and test that we can insert data into it successfully. The problem is that we’re leaving state behind in the application: the DATA structure now includes a complete and a completed_at key for the rest of the run. Right now this has no unintended side effects; in the future, maybe it will…
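To make that leftover state concrete, this is roughly what the module-level DATA holds once test_mark_complete has run (angle brackets are placeholders, not real values):

{
    "id": <the original uuid>,       # unchanged
    "name": "potato",                # unchanged
    "complete": True,                # leaked in by mark_complete
    "completed_at": <a timestamp>,   # leaked in by mark_complete
}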
Let’s add some more functionality…
from uuid import uuid4

def clear_name_and_rerandomise(data):
    data['id'] = uuid4()
    del data['name']
    return data
And a test again…
def test_clear_name_and_rerandomise():
    starting_id = DATA['id']
    completed_data = clear_name_and_rerandomise(DATA)
    assert completed_data['id'] != starting_id
    assert completed_data.get('name') is None
This mutates the data object that’s passed in by removing an existing field. Our mark_complete test will no longer pass if this runs against the DATA object first. This is a contrived example that wouldn’t really fool anyone, but these things happen in data-driven applications where state is stored remotely and application behaviour depends upon the data.
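To see why, imagine the runner happens to pick the opposite order (a minimal sketch; the exact traceback will vary):

# test_clear_name_and_rerandomise runs first and deletes DATA['name']…
test_clear_name_and_rerandomise()
# …so mark_complete's lookup for the notification now blows up.
test_mark_complete()  # KeyError: 'name'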
My favourite cheap trick for shaking out test interdependence is to randomise the order. In Python this is so simple that you can install a test runner plugin to do it for you; if you’re using pytest, then pytest-random-order is probably a good place to get started.
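Getting going is a line or two (assuming a recent version of the plugin, which leaves ordering untouched until you opt in):

pip install pytest-random-order
# Shuffle the run; the plugin reports the seed it used in the test header.
pytest --random-order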
If you run the tests above with the plugin enabled, you’ll find they fail about 50% of the time. With only two tests it’s pretty clear what the problem is, but in a suite of 1,000 tests it might just look like another flaky test. You might want to pass in a seed and run the tests twice: if the same order fails the same way both times, you’re looking at a test dependency rather than flaky data.
# Create a seed.
export RANDOM_SEED=$(shuf -i 0-100000 -n 1)
# Run twice with the same seed; the order is reproducible, so a genuine
# test dependency fails identically on both runs.
pytest --random-order-seed=$RANDOM_SEED
pytest --random-order-seed=$RANDOM_SEED
This isn’t the most beautiful solution, but it’s better than the alternative, which is building an app or test suite full of implicit dependencies and having to fix them all down the line…
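Once a random order has shaken a dependency out, the cheapest fix is usually to stop sharing the state at all. A minimal sketch using a pytest fixture, so each test builds its own copy of the data:

import pytest
from uuid import uuid4

@pytest.fixture
def data():
    # Built fresh per test, so nothing one test does can leak into the next.
    return {"id": uuid4(), "name": "potato"}

def test_mark_complete(data):
    completed_data = mark_complete(data)
    assert completed_data['complete'] is True
    assert completed_data['completed_at'] is not None

def test_clear_name_and_rerandomise(data):
    starting_id = data['id']
    completed_data = clear_name_and_rerandomise(data)
    assert completed_data['id'] != starting_id
    assert completed_data.get('name') is None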