Category: Quality

Bisecting Our Code Quality Pipeline

I want to implement gated check-ins, but it will be some time before I can restructure our process and tooling to accomplish it. What I really want is to keep the source tree green and, when it is red, provide feedback to quickly get it green again. I want to run tests on every commit and give developers feedback on their failing commits before they pollute the source tree. Unfortunately, running the tests as they stand today would take too long to do on every commit. I came across a quick blog post by Ayende Rahien on bisecting RavenDB, where his team used git bisect to find the culprit that failed a test. The post gave no information on how it actually works, just a tease that they are doing it. I left a comment to see if they would share some of the secret sauce behind their solution, but until I get a response I wanted to ponder it for a moment.

Git Bisect

To speed up testing and also identify the culprit behind a test failure with git bisect, we would need a custom test runner that can figure out which tests to run and run them. We don't run tests on every commit; we run tests nightly against all the commits that occurred during the day. When a test fails, it can be difficult to identify the culprit(s) that broke it. This is where Ayende steps in with his team's idea to use bisect to help identify the culprit. Bisect works by traversing commits. It starts at the commit we mark as the last known good commit and works toward the last commit that was included in the failing nightly test run. As bisect iterates over the commits, it pauses at each one and lets you test it and mark it as good or bad. In our case we could run the tests against a single commit. If they pass, tell bisect it's good and move to the next. If they fail, save the commit and its failing test(s) as a culprit, tell bisect it's bad, and move to the next. This results in a list of culprit commits and their failing tests that we can use for reporting and for bashing the culprit owners over the head (just kidding…not).
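
Ayende didn't share any code, so take this with a grain of salt: below is a rough, untested sketch of how I imagine the bisect driving could work from .NET. A small harness shells out to git, runs tests at the commit bisect stops on, and records culprits. The class, the method names, and the overall shape are my own assumptions, not their solution.

using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical sketch of driving git bisect from code: start a bisect session,
// test the commit bisect stops on, record it as a culprit if tests fail, and
// tell bisect whether the commit was good or bad.
public static class BisectHarness
{
    public static IList<string> FindCulprits(string repoPath, string lastGoodCommit, string lastBadCommit)
    {
        var culprits = new List<string>();

        Git(repoPath, "bisect start " + lastBadCommit + " " + lastGoodCommit);

        // A real harness would loop until git reports that the bisect session is
        // finished; only the shape of a single iteration is shown here.
        string currentCommit = Git(repoPath, "rev-parse HEAD").Trim();
        bool passed = RunTestsForCommit(repoPath, currentCommit);

        if (!passed)
        {
            culprits.Add(currentCommit);
        }

        Git(repoPath, passed ? "bisect good" : "bisect bad");
        Git(repoPath, "bisect reset");

        return culprits;
    }

    private static bool RunTestsForCommit(string repoPath, string commit)
    {
        // Placeholder: run only the tests that cover the files changed in this commit.
        return true;
    }

    private static string Git(string repoPath, string arguments)
    {
        var startInfo = new ProcessStartInfo("git", arguments)
        {
            WorkingDirectory = repoPath,
            RedirectStandardOutput = true,
            UseShellExecute = false
        };

        using (var process = Process.Start(startInfo))
        {
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }
}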

Custom Test Runner

The test runner has to be intelligent enough to run all of the tests that exercise the code included in a commit. It has to look for testable code files in the commit change log, in our case .cs files. When it finds a code file, it identifies the class in the file and finds the test class that targets it. We are assuming one class per code file and one unit test class per code file class. If this convention isn't enforced, then some tests may be missed or we have to do a more complex search. Once all of the test classes are found for the commit's code files, we run the tests. If a test fails, we save the test name and maybe the failure results, exception, stack trace… so it can be associated with the culprit commit. Once all of the tests have run, if any of them failed, we mark the commit as a culprit. After the test run and culprit identification are complete, we tell bisect to move to the next commit. As I said before, this results in a list of culprits and failing test info that we can use in our feedback to the developers.
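
As a rough illustration of that convention-based mapping, here is a minimal sketch. The "Tests" suffix is an assumed naming convention for this example, not necessarily what our codebase uses.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch of convention-based test selection: one class per .cs file,
// one test class per code class, named "<ClassName>Tests".
public static class TestSelector
{
    public static IEnumerable<string> FindTestClasses(IEnumerable<string> changedFiles)
    {
        return changedFiles
            .Where(file => file.EndsWith(".cs", StringComparison.OrdinalIgnoreCase))
            .Select(file => Path.GetFileNameWithoutExtension(file)) // assume class name == file name
            .Select(className => className + "Tests");              // assumed test naming convention
    }
}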

Make It Faster

We could make this fancy and look for the specific methods that were changed in the commit's code file classes. We would then only find tests that exercise the methods that were changed. This would make testing laser focused and even faster. We could probably employ Roslyn to handle the code analysis and make finding tests easier. I suspect tools like ContinuousTests (MightyMoose) do something like this, so it's not that far-fetched an idea, but there is definitely a mountain of things to think about.
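
For example, pulling the class and method names out of a changed file with Roslyn is only a few lines. This is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp package; mapping those names to tests is left out.

using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Sketch: list the classes and methods declared in a changed .cs file so a
// test runner could map them to matching tests by convention.
public static class ChangedCodeInspector
{
    public static void ListMembers(string codeFilePath)
    {
        var tree = CSharpSyntaxTree.ParseText(File.ReadAllText(codeFilePath));

        foreach (var cls in tree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>())
        {
            Console.WriteLine("Class: " + cls.Identifier.Text);

            foreach (var method in cls.DescendantNodes().OfType<MethodDeclarationSyntax>())
            {
                Console.WriteLine("  Method: " + method.Identifier.Text);
            }
        }
    }
}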

Conclusion

Well, this is just a thought, a thesis if you will, and if it works, it will open up all kinds of possibilities to improve our Code Quality Pipeline. Thanks, Ayende, and please think about open sourcing that bisect.ps1 PowerShell script 🙂

TestPipe Test Automation Framework Release Party

Actually, you missed the party I had with myself when I unchecked private, clicked save on GitHub, and officially released TestPipe. You didn't miss your chance to check out TestPipe, though: a little open source project with the goal of making automated browser-based testing more maintainable for .NET'ters. The project source code is hosted on GitHub and the binaries are hosted on NuGet.

If you would like to become a TestPipe Plumber and contribute, I’ll invite you to the next party :).

[Image: results of my personal "make a logo in 10 minutes" challenge]

Trust No One or a Strange Automated Test

Nullius in verba (Latin for “on the word of no one” or “Take nobody’s word for it”)
http://en.wikipedia.org/wiki/Nullius_in_verba

This is the motto of the Royal Society, the UK's academy of science. I bring this up because I inherited an automated test suite, and I am in the process of clearing the current errors and developing a maintenance plan for it. As I went through the tests I questioned whether I could trust them. In general it's difficult to trust automated tests, and it's worse when I didn't write them. Then I remembered "nullius in verba" and decided that although I will run these tests, fix them, and maintain them, I cannot trust them. In fact, since I am now responsible for all automated tests, I can't put any value in any test unless I watch it run, understand its purpose, and ascertain the validity of its assumptions. This is not to say that the people who wrote the tests I maintain are incompetent or untrustworthy; many of these tests were crafted by highly skilled professionals. I just trust no one and want to see for myself.

Even after evaluating automated tests, I can't really trust them because I don't watch every automated test run. I can't say for certain that they passed or failed, or that a pass wasn't a false positive. Since I don't watch every test run, I can only hope they are OK. Given the fallibility of man, I can't even fully trust someone else's manual testing, so I can't trust an automated check written by an imperfect human either. So, I view automated tests like manual tests: they are tools in the evaluation of the software under test.

It would be impractical to manually run every test covered by the automated suite, so a good set of tests provides more coverage than manual execution alone. One way automated tests provide value is when they uncover issues that point to interesting aspects of the system that warrant further investigation. Failing tests or unusually slow tests give me a marker to focus on in manual exploration of the software. This is only true if the tests are good: focused on one concept, not flaky (sometimes passing, sometimes failing), and so on. If the tests are bad, their failures may not be real, and they take away all value from the automated suite because I have to waste time investigating them. In fact, an automated test suite plagued with bad tests can increase the maintenance effort so much that it negates any value the tests provide. Maintainability is a primary criterion I evaluate when I inherit tests from someone else, and I have to see for myself that each test is good and maintainable before I can place any value in it.

So, my current stance is to not trust anyone else's tests. I also do not treat automated tests as de facto proof that the software works. Yet I find value in automated tests as another tool in my investigation of the quality of the software. If they don't cost much to maintain or run, they provide value in my evaluation of software quality.

Nullius in verba

Scientific Exploration and Software Testing

Test ideas by experiment and observation,

build on those ideas that pass the test,

reject the ones that fail.

Follow the evidence wherever it leads

and question everything.

Astronomer Neil deGrasse Tyson, Cosmos, 2014

This was part of the opening monologue to the relaunch of the Cosmos television series. It provides a nice interpretation of the scientific method, but also fits perfectly with one of my new roles as software tester. Neil finishes this statement with

Accept these terms and the cosmos is yours. Now come with me.

It could be said, "Accept these terms and success in software testing is yours." What I have learned so far about software testing falls firmly in line with the scientific method. I know software testing isn't as vast as exploring billions of galaxies, but with millions of different pathways through a computer program, software testing still requires rigor similar to that of any scientific exploration.

IE WebDriver Proxy Settings

I recently upgraded to the NuGet version of the IE WebDriver (IEDriverServer.exe) and started noticing that when I ran my tests locally I could no longer browse the internet. I found myself having to go into internet settings to reset my proxy. My first thought was that the patch I had just received from corporate IT had botched a rule for setting the browser proxy. After going through the dance of running tests and resetting the proxy a few times, I got pretty tired and finally came to the realization that it must be the driver and not IT.

My first stop was to check Bing for tips on setting the proxy for WebDriver. I found lots of great stuff for Java, but no help for .NET. Then I stumbled upon a message in the Selenium source change log that said, "Adding type-safe Proxy property to .NET InternetExplorerOptions class." A quick browse of the source code and I had my solution.

In the code that creates the web driver, I added a Proxy configured to auto-detect.

Proxy proxy = new Proxy();
proxy.IsAutoDetect = true;
proxy.Kind = ProxyKind.AutoDetect;

This sets up a new Proxy configured for auto-detect. Next, I added two properties, Proxy and UsePerProcessProxy, to the InternetExplorerOptions:

var options = new OpenQA.Selenium.IE.InternetExplorerOptions
{
    EnsureCleanSession = true,
    Proxy = proxy,
    UsePerProcessProxy = true
};

Proxy is set to the proxy we previously set up. UsePerProcessProxy tells the driver that we want this configuration applied per process, NOT GLOBALLY, thank you. Shouldn't this be the default? I'm just saying. EnsureCleanSession clears the cache when the driver starts; it isn't necessary for the proxy configuration and is something I already had set.

Anyway, with this set up all we have to do is feed it to the driver.

var webDriver = new OpenQA.Selenium.IE.InternetExplorerDriver(options);

My test coding life is back to normal, for now.

Agile Browser Based Testing with SpecFlow

The title may be a little misleading, as you might think I am going to lay out some sort of Scrum-ish methodology for browser-based testing with SpecFlow. Actually, this is more about how I implemented a feature to make browser-based testing more realistic in a CI build.

Browser-based testing is slow, real slow. So, if you want to integrate this type of testing into your CI build process, you need a way to make the tests run faster or they may add considerable time to the feedback loop given to developers. My current solution is to only run the tests for the current sprint. To do this I use a mixture of SpecFlow and my own home-grown test framework, TestPipe, to identify which tests to run and which to ignore.

The solution I’m using at the moment centers on SpecFlow Tags. Actually, I have blogged about this before in my SpecFlow Tagging post. In this post I want to show a little more code that demonstrates how I accomplish it.

Common Scenario Setup

The first step is to use a common Scenario setup method. I add the setup as a static method to a class accessible by all Step classes.

public class CommonStep
{
    public static void SetupScenario()
    {
        try
        {
            Runner.SetupScenario();
        }
        catch (CharlesBryant.TestPipe.Exceptions.IgnoreException ex)
        {
            Assert.Ignore(ex.Message);
        }
    }
}
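
For context, this is roughly how a SpecFlow hook class could call the shared setup. The hook class itself isn't part of my snippet above, so treat the wiring as an assumption based on standard SpecFlow usage.

using TechTalk.SpecFlow;

[Binding]
public class ScenarioHooks
{
    // Runs before every scenario and delegates to the shared setup shown above.
    [BeforeScenario]
    public static void BeforeScenario()
    {
        CommonStep.SetupScenario();
    }
}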

TestPipe Runner Scenario Setup

The method calls the TestPipe Runner method SetupScenario, which handles the tag processing. If SetupScenario determines that the scenario should be ignored, it throws the exception that is caught above. We handle the exception by asserting that the test is ignored using the test framework's ignore method (in this case NUnit's Assert.Ignore). We also pass the ignore method the exception message, as there are a few reasons why a test may be ignored and we want the reason included in our reporting.

SetupScenario includes this bit of code

if (IgnoreScenario(tags))
{
    throw new IgnoreException();
}

Configuring Tests to Run

This is similar to what I blogged about in the SpecFlow Tagging post, but I added a custom exception. Below are the interesting methods in the call stack walked by the SetupScenario method.

public static bool IgnoreScenario(string[] tags)
{
    if (tags == null)
    {
        return false;
    }

    string runTags = GetAppConfigValue("test.scenarios");
    runTags = runTags.Trim().ToLower();

    return Ignore(tags, runTags);
}

This method gets the tags that we want to run from configuration. For each sprint there is a code branch, and the app.config for the tests in that branch contains the tags for the tests we want to run for the sprint on the CI build. There is also a regression branch, run weekly, that runs all of the tests. All of the feature files are kept together, so being able to tag specific scenarios in a feature file gives us the ability to run specific tests for a sprint while keeping all of the features together.
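
GetAppConfigValue isn't shown here; a minimal version (my sketch, assuming System.Configuration and an appSettings entry) could be as simple as:

using System.Configuration;

public static class Config
{
    // Reads a run-tag setting such as "test.scenarios" from app.config appSettings.
    public static string GetAppConfigValue(string key)
    {
        return ConfigurationManager.AppSettings[key] ?? string.Empty;
    }
}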

Test Selection

Here is the selection logic.

public static bool Ignore(string[] tags, string runTags)
{
    if (string.IsNullOrWhiteSpace(runTags))
    {
        return false;
    }

    // If runTags has a value, the scenario's tags must match or the scenario is ignored.
    if (tags == null)
    {
        throw new IgnoreException("Ignored tags is null.");
    }

    if (tags.Contains("ignore", StringComparer.InvariantCultureIgnoreCase))
    {
        throw new IgnoreException("Ignored");
    }

    if (tags.Contains("manual", StringComparer.InvariantCultureIgnoreCase))
    {
        throw new IgnoreException("Manual");
    }

    if (runTags == "all" || runTags == "all,all")
    {
        return false;
    }

    if (tags.Contains(runTags, StringComparer.InvariantCultureIgnoreCase))
    {
        return false;
    }

    return true;
}

This is the meat of the solution, where most of the logic lives. As you can see, the exceptions carry messages for when a test is explicitly ignored with the Ignore or Manual tag. The manual tag identifies features that are defined but can't be automated. This way we still have a formal definition that can guide our manual testing.

The variable runTags holds the value retrieved from configuration. If the config defines "all" or "all,all", we run all the tests that aren't explicitly ignored. The "all,all" value is a special case used when ignoring tests at the Feature level, but this post is about Scenario-level ignoring.

The final check compares the scenario's tags to the runTags config. If the tags include the runTags value, we run the test. Any tests that don't match are ignored. For scenarios this only works with a single runTag. We might name the tag after the sprint, a sprint ticket, or whatever else, as long as it is unique to the sprint. I like the idea of tagging with a ticket number, as it gives traceability to tickets in the project management system.

Improvements and Changes

I have contemplated using a feature file organization similar to SpecLog's. They advocate a separate folder to hold feature files for the current sprint. Then, I believe (but am not sure), they tag the current sprint feature files so they can be identified and run in isolation. The problem with this is that the current sprint features have to be merged back in with the rest of the features after the sprint is complete.

Another question I have asked myself is whether I want to allow some kind of test selection through command line parameters. I am not really sure yet. I will put that thought on hold for now, or until a need for command line configuration makes itself evident.

Lastly, another improvement would be to allow specifying multiple runTags. We would have to iterate over the run tags and compare them, or come up with a performant way of doing it. Performance would be a concern because this check runs on every test, and for a large project there could be thousands of tests, each one already carrying the inherent cost of running in a browser.
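
A rough sketch of that multiple-runTag comparison, assuming a comma-separated config value (this is speculative and not in TestPipe today):

using System;
using System.Linq;

public static class TagMatcher
{
    // Returns true if any configured run tag matches any of the scenario's tags.
    public static bool MatchesAnyRunTag(string[] tags, string runTags)
    {
        var configured = runTags
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(tag => tag.Trim());

        return configured.Any(runTag =>
            tags.Contains(runTag, StringComparer.InvariantCultureIgnoreCase));
    }
}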

Conclusion

Well that’s it. Sprint testing can include browser based testing and still run significantly faster than running every test in a test suite.

C# MEF BrowserFactory for Browser Based Testing

In my browser-based test framework I use MEF to help abstract the concept of a browser. I do this so that I am not tied to a specific browser driver framework (e.g., WebDriver, WatiN, System.Net.WebClient). This allows me to change drivers without having to touch my test code. Here's how I do it.

Browser Interface

First I created an interface that represents a browser. I used a mixture of interfaces from WebDriver and WatiN.

namespace CharlesBryant.TestPipe.Interfaces
{
    using System;
    using System.Collections.Generic;
    using System.Collections.ObjectModel;
    using CharlesBryant.TestPipe.Browser;
    using CharlesBryant.TestPipe.Enums;

    public interface IBrowser
    {
        IBrowserSearchContext BrowserSearchContext { get; }

        BrowserTypeEnum BrowserType { get; }

        string CurrentWindowHandle { get; }

        string PageSource { get; }

        string Title { get; }

        string Url { get; }

        ReadOnlyCollection<string> WindowHandles { get; }

        IElement ActiveElement();

        void Close();

        void DeleteAllCookies();

        void DeleteCookieNamed(string name);

        Dictionary<string, string> GetAllCookies();

        bool HasUrl(string pageUrl);

        void LoadBrowser(BrowserTypeEnum browser, BrowserConfiguration configuration = null);

        void Open(string url, uint timeoutInSeconds = 0);

        void Quit();

        void Refresh();

        void SendBrowserKeys(string keys);

        void TakeScreenshot(string screenshotPath);

        void AddCookie(string key, string value, string path = "/", string domain = null, DateTime? expiry = null);
    }
}

Pretty basic stuff, although BrowserSearchContext took some thought to get working. Basically, that abstraction provides the facility to search for elements. A lot of the concepts here are borrowed from WebDriver and WatiN; they are just a way to wrap their functionality and use it without being directly dependent on them. To use this, you have to change your tests from directly using a browser driver to using this abstraction. At the start of your tests you use the BrowserFactory to get the specific implementation of this interface that you want to test with.

Browser Factory

Then I created a BrowserFactory that uses MEF to load browsers that implement the browser interface. When I need a browser, I call Create on the BrowserFactory to get the browser driver I want to test with. To make this happen I have to create wrappers around the browser drivers I want available. One caveat about MEF is that it needs to be able to find your extensions, so you have to tell it where to look. To make the browsers available to the factory, I added the MEF class attribute [Export(typeof(IBrowser))] to my browser implementations. Then I added a post-build event to the browser implementation projects to copy their DLLs to a central folder:

copy $(TargetPath) $(SolutionDir)\Plugins\Browsers\$(TargetFileName)

Then I added an appSettings key to the config of my clients that use the BrowserFactory, with a value that points to this plugin directory. Now I can reference this config value to tell MEF where to load browsers from. Below is roughly how I use the factory with MEF.

namespace CharlesBryant.TestPipe.Browser
{
    using System;
    using System.ComponentModel.Composition;
    using System.ComponentModel.Composition.Hosting;
    using System.Configuration;
    using System.IO;
    using System.Reflection;
    using CharlesBryant.TestPipe.Enums;
    using CharlesBryant.TestPipe.Interfaces;

    public class BrowserFactory
    {
        [Import(typeof(IBrowser))]
        private IBrowser browser;

        public static IBrowser Create(BrowserTypeEnum browserType)
        {
            BrowserFactory factory = new BrowserFactory();
            return factory.Compose(browserType);
        }

        private IBrowser Compose(BrowserTypeEnum browserType)
        {
            this.browser = null;

            try
            {
                AggregateCatalog aggregateCatalogue = new AggregateCatalog();
                aggregateCatalogue.Catalogs.Add(new DirectoryCatalog(ConfigurationManager.AppSettings["browser.plugins"]));

                CompositionContainer container = new CompositionContainer(aggregateCatalogue);
                container.ComposeParts(this);
            }
            catch (FileNotFoundException)
            {
                //Log
            }
            catch (CompositionException)
            {
                //Log
            }

            // Note: browser will still be null here if composition failed.
            this.browser.LoadBrowser(browserType);
            return this.browser;
        }
    }
}

namespace CharlesBryant.TestPipe.Enums
{
    public enum BrowserTypeEnum
    {
        None,
        IE,
        Chrome,
        FireFox,
        Safari,
        Headless,
        Remote,
        Other
    }
}
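
To round it out, here is roughly what using the factory looks like from a test, assuming a wrapper decorated with [Export(typeof(IBrowser))] has been dropped into the configured plugin folder (the URL is just a placeholder):

using CharlesBryant.TestPipe.Browser;
using CharlesBryant.TestPipe.Enums;
using CharlesBryant.TestPipe.Interfaces;

public static class BrowserFactoryUsage
{
    public static void Example()
    {
        // MEF composes whichever IBrowser export it finds in the plugin
        // directory and loads the requested browser type.
        IBrowser browser = BrowserFactory.Create(BrowserTypeEnum.Chrome);

        browser.Open("http://example.com");
        // ... drive the page through browser.BrowserSearchContext ...
        browser.Quit();
    }
}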

Conclusion

Well, that's the gist of it. I have untethered my tests from browser driver frameworks. This is not fully tested across a broad range of scenarios, so there may be issues, but so far it's doing OK for me.

The examples above are not production code, use at your own risk.