Category: Quality

Test Automation Tips: 2

#2 If you update or delete test data in a test, you need to isolate the data so it isn’t used by any other tests.

A major cause of flaky tests (tests that fail for indeterminate reasons) is bad test data. This is often the result of a previous test altering test data in a way that causes subsequent tests to fail. It is best to isolate test data for scenarios that change it, and you should have an easy way to isolate read-only data so you can ensure it isn’t corrupted by your tests.

Delta Seed

One way I like to do this is to have what I call a delta test data seed. The delta seed loads all read-only test data at the start of a test suite run. Any test data that needs to be updated or deleted is created per test, and those mutable test data seeds are run after the delta seed. So, in addition to the delta seed I will have a suite, feature, or scenario seed.
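
Here is a minimal sketch of how the delta seed could be wired up, assuming NUnit; DeltaSeed and SuiteSeed are hypothetical helpers standing in for whatever actually loads and removes your data.

using NUnit.Framework;

//Runs once per test run, before any fixture in the assembly
[SetUpFixture]
public class TestRunSeeds
{
    [OneTimeSetUp]
    public void SeedBeforeAnyTests()
    {
        //Load all read-only test data once at the start of the run
        DeltaSeed.Load();

        //Mutable, run-wide data (the suite seed) loads right after the delta seed
        SuiteSeed.Load();
    }

    [OneTimeTearDown]
    public void CleanUpAfterAllTests()
    {
        //Only the mutable data needs to be removed; the delta seed stays read-only
        SuiteSeed.Unload();
    }
}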

Suite Seed

The suite seed is run right after the delta seed, usually by the same process that runs the delta seed. Because the suite seed data is available to every test in the run, it is the riskiest seed, and it is the least efficient unless you are running all of your tests, since you may not need all of the data being loaded. I say risky because it opens up the scenario where someone writes a test against mutable data that should only be used by the test that changes it.

Feature Seed

The feature seed runs at the beginning of a feature test during test fixture setup and loads all the data used by tests in that feature. It has some of the same issues as the suite seed: all of the data is available to every test in the feature, and if someone gets lazy and writes a test against the mutable data instead of creating new data specifically for that test, the result can be flaky tests.
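
A minimal sketch of a feature seed, again assuming NUnit; FeatureSeed is a hypothetical helper that loads the data shared by the tests in this one fixture.

using NUnit.Framework;

[TestFixture]
public class ProductCatalogTests
{
    [OneTimeSetUp]
    public void SeedFeatureData()
    {
        //Runs once before any test in this fixture (the feature)
        FeatureSeed.Load();
    }

    [OneTimeTearDown]
    public void RemoveFeatureData()
    {
        //Remove the mutable feature data so it can't leak into other features
        FeatureSeed.Unload();
    }
}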

Scenario Seed

The scenario seed runs at the beginning of an individual test in the feature test fixture. This is the safest in terms of risk because the data is loaded for the test and deleted after the test, so no other test can use it. The problem I have with it is that when you have a lot of tests, creating hundreds of database connections and handling seeding inside the tests can have an impact on overall and individual test time. If not implemented properly, this type of data seeding can also create a maintenance nightmare. I like to use test timing as an indicator of issues with tests, and if you can’t separate the time spent seeding the data from the time spent running the test, the seed affects the timing in a way that has nothing to do with what is being tested. So, be careful not to pollute your test timing with seeding.
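
A minimal sketch of a scenario seed, assuming NUnit; ScenarioSeed and the coupon objects are hypothetical. Keeping the seeding in SetUp and TearDown makes it easier to keep its cost out of the test body.

using NUnit.Framework;

[TestFixture]
public class CouponTests
{
    private ScenarioSeed _seed;

    [SetUp]
    public void SeedScenarioData()
    {
        //Data created here belongs to this test only
        _seed = ScenarioSeed.Create();
    }

    [TearDown]
    public void RemoveScenarioData()
    {
        //Deleting the data after the test keeps other tests from depending on it
        _seed.Delete();
    }

    [Test]
    public void Coupon_IsMarkedRedeemed_AfterUse()
    {
        //The test body only touches the data created in SetUp
        _seed.Coupon.Redeem();
        Assert.That(_seed.Coupon.Redeemed, Is.True);
    }
}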

Which Seed?

Which seed to use depends on multiple factors. How do you run your tests? If you run all of your tests, it may be efficient to use a suite seed. If you run features in parallel, you may want a feature seed to load all of the feature data at one time. If you run tests based on dependencies in a committed change, you may want to keep seeding granular with a scenario seed. There are many other factors you can take into account, and trial and error is a good way to go as you optimize test data seeding.

Conclusion

The thing to take away is that you need a strategy for isolating test data based on the desired immutability of the data. Tests that don’t alter test data should use read-only data. Tests that do alter test data should use data created specifically for that test. If you allow a test to use data altered by another test, you open yourself up to flaky test syndrome.

View more tips.

When I am writing and maintaining large functional UI tests, I often notice some things that would make my life easier. I decided to write this series of posts to describe the tips I keep for myself, in hopes that they prove helpful to someone else. What are some of your tips?

An Easy Win for Testable Methods

One of our developers was having an issue testing a service. Basically, he was having a hard time hitting the service because it is controlled by an external company and we don’t have firewall rules that let us reach it easily from our local environment. I suggested mocking the service, since we really weren’t testing getting a response from the service, but what we do with the response. I was told that the code is not conducive to mocking. So, I took a look and they were right, but the fix to make it testable was a very simple refactor. Here is the gist of the code:

public SomeResponseObject GetResponse(SomeRequestObject request)
{
    //Set some additional properties on the request
    request.Id = "12345";

    //Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    //Do something with the response
    if (response != null)
    {
        //Do some awesome stuff to the response
        response.LogId = "98765";
        //Id is a string, so compare it as one
        if (string.CompareOrdinal(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }

    return response;
}

What we want to test is the “Do something with the response” section of the code, but this method is doing so many things that we can’t isolate that section and test it…or can we? To make this testable, we simply move everything that concerns “Do something with the response” into a separate method.

public SomeResponseObject GetResponse(SomeRequestObject request)
{
    //Set some additional properties on the request
    request.Id = "12345";

    //Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    return ProcessResponse(response);
}

public SomeResponseObject ProcessResponse(SomeResponseObject response)
{
    //Do something with the response
    if (response != null)
    {
        //Do some awesome stuff to the response
        response.LogId = "98765";
        //Id is a string, so compare it as one
        if (string.CompareOrdinal(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }

    return response;
}

Now we can test the changes to the ProcessResponse method in isolation, away from the service calls. Since there were no changes to the service or the service client, we didn’t have to worry about testing them for this specific change. We don’t care what gets returned from the service; we just want to know whether the response was properly processed and logged. We still have a hard dependency on LogResponse’s connection to the database, but I will live with this as an integration test and fight for unit tests another day. This is a quick win for testability and a step closer to making this class SOLID.
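
For example, a couple of tests against the new method might look like this, assuming NUnit; ResponseService is a stand-in name, since the class that holds GetResponse isn’t named above. As noted, LogResponse still talks to the database, so these effectively run as integration tests.

using NUnit.Framework;

[TestFixture]
public class ProcessResponseTests
{
    [Test]
    public void ProcessResponse_SetsLogId_WhenResponseIsNotNull()
    {
        var service = new ResponseService();
        var response = new SomeResponseObject { Id = "12345" };

        var result = service.ProcessResponse(response);

        //The processing logic ran without ever calling the external service
        Assert.That(result.LogId, Is.EqualTo("98765"));
    }

    [Test]
    public void ProcessResponse_ReturnsNull_WhenResponseIsNull()
    {
        var service = new ResponseService();

        Assert.That(service.ProcessResponse(null), Is.Null);
    }
}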

Test Automation Tips: 1

#1 Don’t test links unless you are testing links.

Unless you are specifically testing menus or navigation links, don’t automate the clicking of links to navigate to the actual page under test. Instead, take the shortest route to set up the test.

Let’s say I have a test that starts by clicking the product catalog link and then clicks on a product to get to product details, just so I can test that a coupon appears on product details. This test exercises too many concepts that are outside the intent of the test. I am trying to test coupon visibility, not links. Instead, I should navigate directly to the product details page and make my assertion.
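
Something like the following sketch, assuming Selenium WebDriver and NUnit; the URL and element locator are made up, but the point is that the test goes straight to the page under test.

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class ProductDetailsTests
{
    [Test]
    public void Coupon_IsVisible_OnProductDetails()
    {
        using (IWebDriver driver = new ChromeDriver())
        {
            //Go directly to product details; no catalog or navigation links involved
            driver.Navigate().GoToUrl("https://example.com/products/12345");

            IWebElement coupon = driver.FindElement(By.Id("coupon-banner"));
            Assert.That(coupon.Displayed, Is.True);
        }
    }
}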

If you use the link method, fooling yourself into believing that your test is acting like a user, then when links break your test will fail for a reason other than the one you are testing. If you use the same broken link in multiple tests, you will have multiple failures and will have to fix and rerun the tests to get feedback on the features you are really trying to test. If the broken tests are in a nightly run, you just lost a lot of test coverage, and you don’t know whether the features are passing or failing because they never got tested. Using the link method can cause a dramatic loss of test coverage. So, test links in navigation tests, and only when the links are front and center in the purpose of the test. For all other tests, take the shortest route to the page under test.

View more tips.

When I am writing and maintaining large functional UI tests, I often notice some things that would make my life easier. I decided to write this series of posts to describe the tips I keep for myself, in hopes that they prove helpful to someone else. What are some of your tips?

Test Automation Tips

When I am writing and maintaining large functional UI tests, I often notice some things that would make my life easier. I decided to write this series of posts to describe the tips I keep for myself, in hopes that they prove helpful to someone else. What are some of your tips?

#1 Don’t test links unless you are testing links.

#2 If you update or delete test data in a test, you need to isolate the data so it isn’t used by any other tests.

#3 Don’t become complacent in testing when a change is simple.

What Makes a Good Candidate for Test Automation?

Writing large UI-based functional tests can be expensive in terms of money and time. It is sometimes hard to know where to focus your test budget. New features are good candidates, especially the most common successful and exceptional paths through the feature. But when you have a monster legacy application with little to no coverage, it can be hard to ascertain where to get the biggest bang for your buck.

Bugs, Defects, Issues…It Doesn’t Work

I believe bugs are good candidates for automation, especially if regression is a problem for you. Even if regression is not an issue, it’s always good to protect against regressions, so automating bugs is kind of a win-win in terms of risk assessment. Hopefully, when a bug is found, whoever finds it or whoever adds it to the bug database provides reproduction steps. If the steps are a good candidate for automation, automate them.

Analyzing Bugs

What makes a bug a good candidate for test automation? When analyzing bugs for automated testing, I like to evaluate them against four basic criteria, in descending order of precedence:

  • The steps are easy to model in the test framework.
  • The steps are maintainable as an automated test.
  • The bug was found before.
  • The bug caused a lot of pain to users or the company.

It is just common sense that “the bug caused a lot of pain” would be the top candidate. If a bug caused a lot of pain, you don’t want to repeat it, unless you like pain. Yet, if the painful bug is a maintenance nightmare as an automated test, the steps are hard to model, and the bug wasn’t found before, you may want to just mark it for manual regression. If a bug matches two or more of the criteria, I’d say it is a high-priority candidate for test automation.

Conclusion

These are just my opinions, and there is no study to prove any of it. I know this has been thought about and pondered, maybe even researched, by someone. If you know where I can find some good discussions on this topic, or if you want to start one, please let me know.


Finding Bad Build Culprits…Who Broke the Build!

I found an interesting Google talk on finding culprits automatically in failing builds – https://www.youtube.com/watch?v=SZLuBYlq3OM. This is actually a lightning talk from GTAC 2013 given by grad students Celal Ziftci and Vivek Ramavajjala. First, they gave an overview of how culprit analysis is done on build failures triggered by small and medium-sized tests.

CL, or change list, is a term I first heard in “How Google Tests Software” and refers to a logical grouping of changes committed to the source tree. This would be like a git feature branch.

Build and Small Test Failures

When the build fails because of a build issue, we build the CLs separately until a CL fails the build. When the failure is a small test (unit test), we do the same thing: build the CLs separately and run the tests against them to find the culprit. In both cases, we can do the analysis in parallel to speed it up. This is what I covered in my post on Bisecting Our Code Quality Pipeline, where git bisect is used to recurse through the CLs.

Medium Tests

Ziftci and Ramavajjala define these tests as taking less than 8 minutes to run and suggest using a binary search to find the culprit. Target the middle CL and build it; if it fails, the culprit is that CL or one to its left, so we recurse to the left until we find the culprit. If it passes, we recurse to the right.

CL 1 – CL 2 – CL 3 – CL 4 – CL 5 – CL 6

CL 1 is the last known passing CL. CL 6 was the last CL in the failing build. We start by analyzing CL 4, and if it fails, we move left and check CL 3. If CL 3 passes, we mark CL 4 as the culprit. If CL 3 fails, we still have two suspects, so we check CL 2: if CL 2 passes, CL 3 is the culprit, and if CL 2 fails, CL 2 is the culprit, because we already know CL 1 was good.

If CL 4 passed, we would move right and test CL 5; if it fails, we mark CL 5 as the culprit. If it passes, we mark CL 6 as the culprit, because it is the last suspect and we don’t have to waste resources analyzing it.
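
Here is a rough sketch of that search in C#. buildAndTest is a hypothetical delegate that builds the tree at a given CL, runs the failing test, and returns true if it passes, and suspects holds the CLs after the last known good build, in commit order. Run against the example above (CL 2 through CL 6), the first probe lands on CL 4, just as the talk describes.

using System;
using System.Collections.Generic;

public static class CulpritFinder
{
    //Returns the first CL at which the failing build or test starts to fail
    public static string FindCulprit(IList<string> suspects, Func<string, bool> buildAndTest)
    {
        int low = 0;
        int high = suspects.Count - 1;

        while (low < high)
        {
            int mid = low + (high - low) / 2;

            if (buildAndTest(suspects[mid]))
            {
                //This CL still passes, so the break came later
                low = mid + 1;
            }
            else
            {
                //This CL fails, so the culprit is this CL or an earlier one
                high = mid;
            }
        }

        return suspects[low];
    }
}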

Large Tests

They define these tests as taking longer than 8 minutes to run, and these were the primary focus of Ramavajjala and Ziftci’s research. They are developing heuristics that let a tool identify likely culprits by pattern matching. For example, one heuristic analyzes each CL for the number of files changed and gives a higher ranking to CLs with more files changed.

They also have a heuristic that calculates the distance of the code in a CL from base libraries, like the distance from the core Python library, for example. The closer the code is to the core, the more likely it is a core piece of code that has had more rigorous evaluation, because many projects may depend on it.

They seemed to be investing a lot of time in ensuring that they can do this fast, stressing caching and optimization. It sounds interesting, and once they have had a chance to run their tool and heuristics against the massive number of tests at Google (they both became Google employees), hopefully they can share the heuristics that prove most adept at finding culprits at Google and maybe anywhere.

Thoughts

They did mention possibly using a heuristic that looks at the logs generated by build failures to identify keywords that may provide more detail on who the culprit may be. I had a similar thought after I wrote the git bisect post.

Many times when a larger test fails, there are clues left behind that we would normally inspect manually to find the culprit. If the test has good messaging on its assertions, that is the first place to look. In a large end-to-end test there may be many places for the test to fail, but if the failure message gives a clue about what failed, it helps in finding the culprit. Although, they spoke of 2-hour tests, and I have never seen one test that takes 2 hours, so what I was thinking about and what they are dealing with may be another animal.

There is also the test itself. If the test covers a feature and I know that only one file in one CL is included in the dependencies involved in the feature test, then I have a candidate. There are also application logs and system logs. The goal, as I saw it, is to find a trail that leads back to a class, method, or file that matches a CL.

The problem with me seriously trying to solve this is that I don’t have a PhD in Computer Science; actually, I don’t have a degree except from the school of hard knocks. When they talked about the binary search for medium-sized tests, it sounded great. I kind of know what a binary search is. I have read about it and remember writing a simple one years ago, but if you ask me to articulate the benefits of using a quad tree instead of a binary search, or to write a particular one on the spot, I will fumble. So, trying to find an automated way to analyze logs in a thorough, fast, and resource-friendly manner is a lot for my brain to handle. Yet, I haven’t let my shortcomings stop me before, so I will continue to ponder the question.

We are talking about parsing and matching strings, not rocket science. This may be a chance for me to use or learn a new language more adept at working with strings than C#.

Conclusion

At any rate, I find this stuff fascinating and useful in my new role. Hopefully, I can find more on this subject.


How Much Does Automated Test Maintenance Cost?

I saw this question on a forum and it made me pause for a second to think about it. The quick answer is that it varies. The sarcastic answer is that it costs as much as you spend on it, or how about this: it costs as much as you didn’t spend on creating a maintainable automation project.

I have only been involved in two other test automation projects prior to my current position, and in both I also had feature development responsibility. On one project, compared against time spent developing features, I spent about 10-15% of my time maintaining tests and about 25% writing them, so maintenance was roughly 30-40% of my total test time. Based on what I know today, some of my past tests weren’t that good, so maybe the numbers should have been higher or lower. On the other project, test maintenance was closer to 50%, and that was because of a poor tool choice. I can state these numbers because I tracked the time I spent. I could not use them as benchmarks to estimate maintenance cost on my current project or any other unless the context was very similar and I could easily draw the comparison.

I have seen where someone might say “it’s typically between this and that percentage of development cost,” or something similar. Trying to quantify maintenance costs is hard, very hard, and it depends on the context. You can try to estimate based on someone else’s guess at a rough percentage and hope it pans out, but in the end the cost depends on execution and environment. An application that changes often vs. one that rarely changes, poorly written automated tests, a bad choice of automation framework, the skill of the automated tester…there is a lot that can change the cost from project to project. I am curious whether someone has a formula to estimate cost across all projects, but an insane focus on the maintainability of your automated test suites can significantly reduce costs in the long run. So a better focus, IMHO, is on getting the best test architecture, tools, framework, and people, and making maintainability a high-priority goal. Also, properly tracking maintenance in the project management or bug tracking system provides a more valuable measure of cost across the life of a project. If you properly track maintenance cost (time), you get a benchmark that is customized to your context. Trying to calculate cost up front with nothing to base the calculation on but a wild, uneducated guess can lead to a false sense of security.

So, if you are trying to plan a new automation project and you ask me about cost, the answer is, “The cost of having automated tests…priceless. The cost of maintaining automated tests…I have no idea.”