
June 3, 2014

Test Automation Tips: 2

#2 If you update or delete test data in a test, you need to isolate the data so it isn’t used by any other tests.

A major cause of flaky tests (tests that fail for indeterminate reasons) is bad test data. This is often caused by a previous test altering the test data in a way that causes subsequent tests to fail. It is best to isolate test data for scenarios that change the test data. You should also have an easy way to isolate read-only data so that you can ensure it isn’t corrupted by your tests.

Delta Seed

One way I like to do this is with what I call a delta test data seed. The delta seed loads all read-only test data at the start of a test suite run. Any test data that needs to be updated or deleted is created per test. Mutable test data seeds are run after the delta seed. So, in addition to the delta seed, I will have a suite, feature, or scenario seed.
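Here is a rough sketch of the idea in Python, using an in-memory SQLite database so it runs anywhere. The table names (`countries`, `orders`) and the helper names (`delta_seed`, `create_order`, `delete_order`) are made up for illustration; your own seed would load whatever read-only data your suite needs.

```python
import sqlite3


def delta_seed(conn):
    """Load read-only reference data once, at the start of the suite run."""
    conn.execute("CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO countries VALUES (?, ?)",
        [("US", "United States"), ("CA", "Canada")],
    )
    # The mutable table exists after the delta seed, but it starts out empty.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, status TEXT)"
    )
    conn.commit()


def create_order(conn, country, status="new"):
    """Create mutable data for exactly one test and return its id."""
    cur = conn.execute(
        "INSERT INTO orders (country, status) VALUES (?, ?)", (country, status)
    )
    conn.commit()
    return cur.lastrowid


def delete_order(conn, order_id):
    """Remove the data a test created so no other test can pick it up."""
    conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
    conn.commit()


# Suite start: run the delta seed once.
conn = sqlite3.connect(":memory:")
delta_seed(conn)

# Inside a test that changes data: create it, change it, clean it up.
order_id = create_order(conn, "US")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = ?", (order_id,))
delete_order(conn, order_id)
```

The point of the split is that nothing in a test ever updates or deletes a row that `delta_seed` loaded; tests only read from `countries` and only write to rows they created themselves.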

Suite Seed

The suite seed is run right after the delta seed, usually by the same process that runs the delta seed. Because the suite seed data is available to every test being run, it is the riskiest seed. It is also the least efficient unless you are running all of your tests, because otherwise you may not need all of the data being loaded. I say risky because it opens up the scenario where someone writes a test against mutable data that should only be used by the test that changes it.

Feature Seed

The feature seed runs at the beginning of a feature test, during test fixture setup. It loads all the data used by the tests in the feature. This has some of the same issues as the suite seed: all of the data is available to all tests in the feature, and if someone gets lazy and writes a test against the mutable data instead of creating new data specifically for that test, the result may be flaky tests.
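As a minimal sketch of a feature seed, here is a `unittest` fixture that loads all of a feature's data once in `setUpClass`. The `carts` table and the `CheckoutFeatureTest` class are hypothetical. Naming each row after the one test that owns it is just one convention I am assuming here to make the ownership obvious; nothing in the framework enforces it.

```python
import sqlite3
import unittest


class CheckoutFeatureTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Feature seed: runs once, before any test in this feature.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute(
            "CREATE TABLE carts (id INTEGER PRIMARY KEY, owner TEXT, items INTEGER)"
        )
        # One row per test that needs mutable data, labeled with its owner.
        cls.conn.executemany(
            "INSERT INTO carts (owner, items) VALUES (?, ?)",
            [("test_empty_cart", 0), ("test_remove_item", 2)],
        )
        cls.conn.commit()

    @classmethod
    def tearDownClass(cls):
        # Dropping the in-memory database removes all feature data.
        cls.conn.close()

    def _cart(self, owner):
        return self.conn.execute(
            "SELECT id, items FROM carts WHERE owner = ?", (owner,)
        ).fetchone()

    def test_empty_cart(self):
        # Reads only the row seeded for this test.
        _, items = self._cart("test_empty_cart")
        self.assertEqual(items, 0)

    def test_remove_item(self):
        # Mutates only the row seeded for this test.
        cart_id, _ = self._cart("test_remove_item")
        self.conn.execute(
            "UPDATE carts SET items = items - 1 WHERE id = ?", (cart_id,)
        )
        self.assertEqual(self._cart("test_remove_item")[1], 1)
```

Both rows are still visible to both tests, which is exactly the risk described above: the convention helps, but a lazy test can still reach for a row it does not own.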

Scenario Seed

The scenario seed runs at the beginning of an individual test in the feature test fixture. This is the safest in terms of risk, as the data is loaded for the test and deleted after the test, so no other test can use it. The problem I have with this is that when you have a lot of tests, creating hundreds of database connections and seeding inside each test can have an impact on both overall and individual test time. If not implemented properly, this type of data seeding can also create a maintenance nightmare. I like to use test timing as an indicator of issues with tests. If you can’t separate the time to seed the data from the time to run the test, the seed time is factored into the test time and can affect it in a way that has nothing to do with what is being tested. So, you have to be careful not to pollute your test timing with seeding.
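One way to keep seeding out of the test timing is sketched below, again with `unittest` and an in-memory SQLite database. The `accounts` table and the `seed_seconds` and `test_seconds` dictionaries are names I made up for the example. It assumes you are happy to share one connection per fixture, which avoids opening a new connection for every test, and it times the seed in `setUp` separately from the test body.

```python
import sqlite3
import time
import unittest


class AccountScenarioTest(unittest.TestCase):
    seed_seconds = {}  # time spent seeding, keyed by test id
    test_seconds = {}  # time spent in the test body, keyed by test id

    @classmethod
    def setUpClass(cls):
        # One shared connection per fixture instead of one per test.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute(
            "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
        )

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def setUp(self):
        # Scenario seed: create the data for this one test, and time it.
        started = time.perf_counter()
        cur = self.conn.execute("INSERT INTO accounts (balance) VALUES (100)")
        self.account_id = cur.lastrowid
        self.seed_seconds[self.id()] = time.perf_counter() - started
        # The clock for the test body starts only after seeding is done.
        self._test_started = time.perf_counter()

    def tearDown(self):
        self.test_seconds[self.id()] = time.perf_counter() - self._test_started
        # Delete the scenario data so no other test can use it.
        self.conn.execute("DELETE FROM accounts WHERE id = ?", (self.account_id,))

    def _balance(self):
        return self.conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (self.account_id,)
        ).fetchone()[0]

    def test_withdraw(self):
        self.conn.execute(
            "UPDATE accounts SET balance = balance - 30 WHERE id = ?",
            (self.account_id,),
        )
        self.assertEqual(self._balance(), 70)

    def test_deposit(self):
        self.conn.execute(
            "UPDATE accounts SET balance = balance + 50 WHERE id = ?",
            (self.account_id,),
        )
        self.assertEqual(self._balance(), 150)
```

With the two timings recorded separately, a slow seed shows up in `seed_seconds` and a slow test shows up in `test_seconds`, so a jump in one is not mistaken for a problem in the other.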

Which Seed?

Which seed to use depends on multiple factors, starting with how you run your tests. If you are running all tests, it may be efficient to use a suite seed. If you run features in parallel, you may want a feature seed to quickly load all feature data at one time. If you run tests based on dependencies in a committed change, you may want to keep seeding granular with a scenario seed. There are many other factors that you can take into account, and trial and error is a good way to go as you optimize test data seeding.

Conclusion

The thing to take away is that you need a strategy for isolating test data according to whether the data should be immutable. Tests that don’t alter test data should use read-only data. Tests that alter test data should use test data specifically created for the test. If you allow a test to use data altered by another test, you open yourself up to flaky test syndrome.

View more tips.

When I am writing and maintaining large functional UI tests, I often realize some things that would make my life easier. I decided to write this series of posts to describe some of the tips I have for myself, in hopes that they prove helpful to someone else. What are some of your tips?
