An Easy Win for Testable Methods

One of our developers was having an issue testing a service. He was having a hard time hitting the service because it is controlled by an external company and we don’t have firewall rules that let us reach it easily from our local environment. I suggested mocking the service, since we really weren’t testing getting a response from the service, but what we do with the response. I was told that the code was not conducive to mocking. So, I took a look and he was right, but the fix to make it testable was a very simple refactor. Here is the gist of the code:

public SomeResponseObject GetResponse(SomeRequestObject request)
{
    // Set some additional properties on the request
    request.Id = "12345";

    // Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    // Do something with the response
    if (response != null)
    {
        // Do some awesome stuff to the response
        response.LogId = "98765";
        if (string.CompareOrdinal(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }

    return response;
}

What we want to test is the “Do something with the response” section of the code, but this method is doing so many things that we can’t isolate that section and test it…or can we? To make this testable, we simply move everything that concerns “Do something with the response” to a separate method.

public SomeResponseObject GetResponse(SomeRequestObject request)
{
    // Set some additional properties on the request
    request.Id = "12345";

    // Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    return ProcessResponse(response);
}

public SomeResponseObject ProcessResponse(SomeResponseObject response)
{
    // Do something with the response
    if (response != null)
    {
        // Do some awesome stuff to the response
        response.LogId = "98765";
        if (string.CompareOrdinal(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }

    return response;
}

Now we can test changes to the ProcessResponse method in isolation, away from the service calls. Since there were no changes to the service or the service client, we didn’t have to worry about testing them for this specific change. We don’t care what gets returned; we just want to know that the response was properly processed and logged. We still have a hard dependency on LogResponse’s connection to the database, but I will live with that as an integration test and fight for unit tests another day. This is a quick win for testability and a step closer to making this class SOLID.
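To make that concrete, here is a minimal sketch of what a test against ProcessResponse might look like, using NUnit. SomeService is my stand-in name for whatever class actually hosts these methods:

using NUnit.Framework;

[TestFixture]
public class ProcessResponseTests
{
    [Test]
    public void ProcessResponse_IdGreaterThan999_MarksResponseSpecial()
    {
        // Arrange: no service or client involved; we build the response ourselves.
        var service = new SomeService(); // hypothetical class hosting GetResponse/ProcessResponse
        var response = new SomeResponseObject { Id = "9999" };

        // Act
        var result = service.ProcessResponse(response);

        // Assert: the "do something with the response" logic ran.
        Assert.That(result.Special, Is.True);
        Assert.That(result.LogId, Is.EqualTo("98765"));
    }

    [Test]
    public void ProcessResponse_NullResponse_ReturnsNull()
    {
        var service = new SomeService();
        Assert.That(service.ProcessResponse(null), Is.Null);
    }
}

Note that LogResponse still runs inside ProcessResponse, so until that dependency is extracted this is really the integration test mentioned above.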

Optimizing the Software Delivery Pipeline: Deployment Metrics

Currently, I have no easy way of determining which build version is deployed to an environment. This made me take more interest in metrics about deployments, of which we basically have none. I can look at the CD (continuous deployment) server and see what time a deployment was done, and I can look at the builds on the server and sort of deduce which build was deployed, but I have to manually check the server to verify my assumptions. It made me wonder what else I am missing. Am I flying blind? Should I know more?

Metrics in the Software Delivery Pipeline

I am part of a work group that is exploring software quality metrics, so my first instinct was to think about deployment quality metrics. After some soul searching, I decided that what would be most helpful to me is knowing where our bottlenecks are. We have an assembly line, or pipeline, of stages our software goes through as it makes its way to public consumption. Develop, build, deploy, test, and release are the major phases of our software delivery pipeline (I am not including planning or analysis right now, as that is another animal).

I believe that metrics focused on reducing time in our software delivery pipeline will be more effective than metrics focused only on reducing defects or increasing quality. If we can reduce defects and increase quality across faster delivery iterations, defects and poor quality will have less impact. That is the point of quality metrics in the first place: reducing the effects of poor quality on our customers and the business. Focusing on reducing time in the pipeline also supports our quality initiatives, because the tools that reduce time, like automated CI and testing, not only shorten iterations but also improve quality. Faster release iterations will let us address quality issues more quickly. This is not to say that other metrics should be ignored; I just think that, since we have no real metrics at the moment, starting with metrics that support speeding up the pipe is a worthy first step.

Deployment Metrics

Back to the point. What metrics should I capture for deployments? If my goal is to increase throughput in the pipeline, I need to identify bottlenecks. So, I need some timing data.

  • How long does deployment take?
  • How long do the individual deployment steps take?
  • How do we report this over time so we can identify issues?

This is pretty simple and I can extract it from the deployment log on the server. Reporting would be just a matter of querying this data and displaying deployment time totals over time.

Additional Deployment Metrics

In addition to the timing data, it may be worthwhile to capture other metrics, like the size of the deployment. Deploying involves pushing packages across wires, and the size of the packages can affect deployment time. Issues with individual servers can also affect deployment time, so knowing which servers were deployed to can help identify server issues. Along with the timing data, we can also capture:

  • The version of the build being deployed
  • The environment being deployed to
  • The individual servers being deployed to
  • The size and version of the packages being deployed to a server

Deployment Data

So, my first iteration of metrics centers around timing, but would also include other data to give a more robust picture of deployments. This is a naive first draft of what the data schema could look like. I suspect it can all be captured on most CI/CD servers and augmented with data generated by the reporting tool:

  • Deployment Id – a unique identifier for the deployment, generated by the reporting tool
  • Environment Id – a unique identifier for the environment deployed to, generated by the reporting tool
  • Build Version – the build version captured on the server
  • Timestamp – the date/time the deployment record was created
  • Start – the date/time the deployment started
  • End – the date/time the deployment completed
  • Tasks – the individual steps taken by the deployment script; there may be only one, depending on how deployment is scripted
    • Deployment Task Id – a unique identifier for the task, generated by the reporting tool
    • Server Id – a unique identifier for the physical server deployed to, generated by the reporting tool
    • Packages – the group of files pushed to the server; normally a zip or NuGet package in my scenarios
      • Package Version – the version of the package being pushed; this may differ from the software version and is generated outside of the reporting tool
      • Package Size – the physical size of the package in KB or MB (not sure which is better)
    • Start – the date/time the deployment to the server started
    • End – the date/time the deployment to the server ended

Imagine the above as some beautiful XML, JSON, or ProtoBuf, because I am too lazy to write it.
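If it helps, here is roughly that hierarchy sketched as C# classes instead; every name is a first-draft invention, not a finished design:

using System;
using System.Collections.Generic;

public class Deployment
{
    public Guid DeploymentId { get; set; }    // generated by the reporting tool
    public Guid EnvironmentId { get; set; }   // generated by the reporting tool
    public string BuildVersion { get; set; }  // captured from the CI/CD server
    public DateTime Timestamp { get; set; }   // when the deployment record was created
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public List<DeploymentTask> Tasks { get; set; }
}

public class DeploymentTask
{
    public Guid DeploymentTaskId { get; set; } // generated by the reporting tool
    public Guid ServerId { get; set; }         // generated by the reporting tool
    public List<Package> Packages { get; set; }
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
}

public class Package
{
    public string PackageVersion { get; set; } // versioned outside the reporting tool
    public long PackageSizeKb { get; set; }    // going with KB until proven otherwise
}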

If my goal is to increase throughput in the pipe, I should probably think about a higher level of abstraction in the hierarchy so that I can relate metrics from other parts of the pipeline. For now, I will focus on this as a first step, to prove that the idea is doable and provides some value.

All I need to do is create a data parsing tool that can be called by the deployment server once a deployment is done. The tool will receive the server log and store it, parse the log to generate a data structure similar to the above, then store the data in a database. Then I have to create a reporting tool that can present graphs and charts of the data for easy analysis. Lastly, create an API that will allow other tools to consume the data. This may be a job for CQRS and event sourcing. Easy, right :). I know there is a tool for that, but I am a sucker for punishment.
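As a very rough sketch of that first piece, this is the shape the parsing tool could take; the repository interface and all names here are hypothetical, and Deployment is the class sketched earlier:

using System;

public interface IDeploymentRepository
{
    void SaveRawLog(string rawLog);             // keep the original log for auditing
    void SaveDeployment(Deployment deployment); // store the parsed structure for reporting
}

public class DeploymentLogProcessor
{
    private readonly IDeploymentRepository _repository;

    public DeploymentLogProcessor(IDeploymentRepository repository)
    {
        _repository = repository;
    }

    // Called by the deployment server once a deployment is done.
    public void Process(string rawDeploymentLog)
    {
        _repository.SaveRawLog(rawDeploymentLog);
        Deployment deployment = Parse(rawDeploymentLog);
        _repository.SaveDeployment(deployment);
    }

    private Deployment Parse(string rawLog)
    {
        // Parsing is entirely specific to the CI/CD server's log format.
        throw new NotImplementedException();
    }
}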

Conclusion

This post will take more time than I thought, so I will make it a series. I will cover my thoughts on metrics for development, build, test, and release in upcoming posts (if I can remember). Then possibly some posts on how the metrics and tools can be used to optimize the pipeline. Pretty ambitious, but it sounds like fun to me.

Test Automation Tips: 1

#1 Don’t test links unless you are testing links.

Unless you are specifically testing menus or navigation links, don’t automate clicking links to navigate to the page under test. Instead, take the shortest route to set the test up.

Let’s say I have a test that starts by clicking the product catalog link, then clicks on a product to get to product details, just so I can test that a coupon appears in product details. This test is exercising concepts outside the intent of the test. I am trying to test coupon visibility, not links. Instead, I should navigate directly to the product details page and make my assertion.

If you use the link method and fool yourself into believing that your test is acting like a user, then when a link is broken your test will fail for a reason other than the one you are testing. If you use the same broken link in multiple tests, you will have multiple failures and will have to fix and rerun the tests to get feedback on the features you are really trying to test. If the broken tests are in a nightly run, you just lost a lot of test coverage, and you don’t know if the features are passing or failing because they never got tested. The link method can cause a dramatic loss of test coverage. So, test links only in navigation tests, when the links are front and center in the purpose of the test. For all other tests, take the shortest route to the page under test.
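For example, with Selenium WebDriver in C#, the coupon test above could skip the links entirely; the URL and locator here are made up for illustration:

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class ProductDetailsCouponTests
{
    [Test]
    public void ProductDetails_DisplaysCoupon()
    {
        using (IWebDriver driver = new ChromeDriver())
        {
            // Shortest route: go straight to the page under test.
            driver.Navigate().GoToUrl("http://localhost/products/12345"); // hypothetical URL

            // Assert only the concept under test: coupon visibility.
            IWebElement coupon = driver.FindElement(By.Id("coupon")); // hypothetical locator
            Assert.That(coupon.Displayed, Is.True);
        }
    }
}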

View more tips.

When I am writing and maintaining large functional UI tests, I often notice things that would make my life easier. I decided to write this series of posts to describe some of the tips I have for myself, in hopes that they prove helpful to someone else. What are some of your tips?

Test Automation Tips

What Makes a Good Candidate for Test Automation?

Writing large UI-based functional tests can be expensive in terms of money and time, and it is sometimes hard to know where to focus your test budget. New features are good candidates, especially the most common successful and exceptional paths through the feature. But when you have a monster legacy application with little to no coverage, where to get the biggest bang for the buck can be hard to ascertain.

Bugs, Defects, Issues…It Doesn’t Work

I believe bugs make good candidates for automation, especially if regression is a problem for you. Even if regression is not an issue, it’s always good to protect against regressions, so automating bugs is kind of a win-win in terms of risk assessment. Hopefully, whoever finds a bug, or whoever adds it to the bug database, provides reproduction steps. If the steps are a good candidate for automation, automate it.

Analyzing Bugs

What makes a bug a good candidate for test automation? When analyzing bugs for automated testing, I like to evaluate them against 4 basic criteria. In ascending order of precedence:

  • The steps are easy to model in the test framework.
  • The steps are maintainable as an automated test.
  • The bug was found before.
  • The bug caused a lot of pain to users or the company.

It is just common sense that “bug caused a lot of pain” is the top criterion. If a bug caused a lot of pain, you don’t want to repeat it, unless you like pain. Yet, if the painful bug is a maintenance nightmare as an automated test, the steps are hard to model, and the bug wasn’t found before, you may want to just mark it for manual regression. If a bug matches 2 or more of the criteria, I’d say it is a high-priority candidate for test automation.

Conclusion

These are just my opinions, and there is no study to prove any of it. I know this has been thought about and pondered, maybe even researched, by someone. If you know where I can find some good discussions on this topic, or if you want to start one, please let me know.


Finding Bad Build Culprits…Who Broke the Build!

I found an interesting Google Talk on automatically finding culprits in failing builds – https://www.youtube.com/watch?v=SZLuBYlq3OM. It is actually a lightning talk from GTAC 2013 given by grad students Celal Ziftci and Vivek Ramavajjala. First they gave an overview of how culprit analysis is done on build failures triggered by small and medium-sized tests.

CL, or change list, is a term I first heard in “How Google Tests Software”; it refers to a logical grouping of changes committed to the source tree. It is something like a git feature branch.

Build and Small Test Failures

When the build fails because of a build issue, we build the CLs separately until a CL fails the build. When the failure is in a small test (unit test), we do the same thing: build the CLs separately and run the tests against them to find the culprit. In both cases, we can do the analysis in parallel to speed it up. This is what I covered in my post on Bisecting Our Code Quality Pipeline, where git bisect is used to recurse the CLs.

Medium Tests

Ziftci and Ramavajjala define these tests as taking less than 8 minutes to run and suggest using a binary search to find the culprit. Target the middle CL and build it; if it fails, the culprit is at that CL or to its left, so we recurse to the left until we find the culprit. If it passes, we recurse to the right.

CL 1 – CL 2 – CL 3 – CL 4 – CL 5 – CL 6

CL 1 is the last known passing CL. CL 6 was the last CL in the failing build. We start by analyzing CL 4, and if it fails, we move left and check CL 3. If CL 3 passes, we mark CL 4 as the culprit. If CL 3 fails, we check CL 2: if CL 2 passes, CL 3 is the culprit; if CL 2 fails, CL 2 is the culprit, because we know CL 1 was good and don’t need to continue analyzing.

If CL 4 passed, we would move right and test CL 5; if it fails, we mark CL 5 as the culprit. If it passes, we mark CL 6 as the culprit, because it is the last suspect and we don’t have to waste resources analyzing it.
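Here is a sketch of that search in C#. It assumes exactly one culprit, that the first CL is known good, and that the last CL is known bad; buildAndTest is a stand-in for building a CL and running the tests against it:

using System;
using System.Collections.Generic;

public static class CulpritFinder
{
    // cls[0] is the last known passing CL; cls[cls.Count - 1] was in the failing build.
    public static string FindCulprit(IList<string> cls, Func<string, bool> buildAndTest)
    {
        int good = 0;            // highest index known to pass
        int bad = cls.Count - 1; // lowest index known to fail

        while (bad - good > 1)
        {
            // Round up so the first probe in the six-CL example above is CL 4.
            int middle = (good + bad + 1) / 2;
            if (buildAndTest(cls[middle]))
                good = middle; // culprit is to the right
            else
                bad = middle;  // culprit is this CL or to the left
        }

        // The first failing CL right after the last passing one is the culprit.
        return cls[bad];
    }
}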

Large Tests

They define these tests as taking longer than 8 minutes to run. This was the primary focus of Ramavajjala and Ziftci’s research. They are developing heuristics that let a tool identify likely culprits by pattern matching. For example, one heuristic analyzes a CL for the number of files changed and gives a higher (more suspicious) ranking to CLs with more files changed.

They also have a heuristic that calculates the distance of the code in a CL from base libraries, like the distance from the core Python library, for example. The closer code is to the core, the more likely it is a well-vetted piece of code that has had rigorous evaluation, because many projects may depend on it.
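Purely as illustration, a toy version of that kind of ranking might look like the following; the scoring is my invention, not theirs:

using System.Collections.Generic;
using System.Linq;

public class ChangeList
{
    public string Id { get; set; }
    public List<string> FilesChanged { get; set; }
    public int DistanceFromCoreLibraries { get; set; } // smaller = closer to core code
}

public static class SuspectRanker
{
    // Order CLs so the most suspicious are analyzed first.
    public static IEnumerable<ChangeList> RankSuspects(IEnumerable<ChangeList> cls)
    {
        return cls.OrderByDescending(cl =>
            cl.FilesChanged.Count            // more files changed => more suspicious
            + cl.DistanceFromCoreLibraries); // farther from well-vetted core => more suspicious
    }
}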

They seemed to be investing a lot of time in ensuring they can do this fast, stressing caching and optimization. It sounds interesting, and once they have had a chance to run their tool and heuristics against the massive number of tests at Google (they both became Google employees), hopefully they can share the heuristics that prove most adept at finding culprits at Google, and maybe anywhere.

Thoughts

They did mention possibly using a heuristic that looks at the logs generated by build failures to identify keywords that may provide more detail on who the culprit may be. I had a similar thought after I wrote the git bisect post.

Many times when a larger test fails, there are clues left behind that we would normally inspect manually to find the culprit. If the test has good messaging on its assertions, that is the first place to look. In a large end-to-end test there may be many places for the test to fail, but if the failure message gives a clue about what failed, it helps to find the culprit. Then again, they spoke of 2-hour tests, and I have never seen one test that takes 2 hours, so what I was thinking about and what they are dealing with may be another animal.

There is also the test itself. If the test covers a feature and I know that only one file in one CL is included in the dependencies involved in the feature test, then I have a candidate. There are also application logs and system logs. The goal, as I saw it, was to find a trail that leads back to a class, method, or file that matches a CL.

The problem with me seriously trying to solve this is that I don’t have a PhD in Computer Science; actually, I don’t have a degree at all, except from the school of hard knocks. When they talked about the binary search for medium-sized tests, it sounded great. I kind of know what a binary search is. I have read about it and remember writing a simple one years ago, but if you ask me to articulate the benefits of using a quad tree instead of a binary search, or to write a particular one on the spot, I will fumble. So, trying to find an automated way to analyze logs in a thorough, fast, and resource-friendly manner is a lot for my brain to handle. Yet, I haven’t let my shortcomings stop me yet, so I will continue to ponder the question.

We are talking about parsing and matching strings, not rocket science. This may be a chance for me to use or learn a new language more adept at working with strings than C#.

Conclusion

At any rate I find this stuff fascinating and useful in my new role. Hopefully, I can find more on this subject.


Configure Remote Windows 2012 Server

Have you ever needed to inspect or configure a server and didn’t want to go through the hassle of remoting into it? Me too. Well, as I take a deeper dive into the bowels of PowerShell 4, I found a cmdlet that allows me to issue PowerShell commands on my local machine and have them run on a remote server. I know you’re excited; I couldn’t contain myself either. You will need PowerShell 4 and a Windows 2012 server that you have login rights to control. I am going to give you the commands to get you started, and then you can Bing the rest, but it’s pretty simple. Once you’ve established the connection, you issue PowerShell commands just as if you were running them locally. Basically, you can configure your remote server from your local machine. You don’t even need to activate the GUI on the server; you can drive it all from PowerShell and save the resources the GUI needs.

Security

Is it secure? About as secure as remoting into the server through a GUI, though there are differences in the vulnerabilities you have to deal with. Security will always be an issue. This is something I will have to research more, but I do know that you can encrypt the traffic and keep the messages deep inside your DMZ.

Code

Note: Anything before the > is part of the command prompt.

PS C:\> Enter-PSSession -ComputerName server01
[server01]: PS C:\Users\CharlesBryant\Documents>

This starts the session. Notice that the command prompt now has the server name in brackets and that I am in my Documents folder on the server.

[server01]: PS C:\Users\CharlesBryant\Documents> hostname
server01

Here I issue the hostname command to make sure I’m not dreaming and I am actually on the server. Yup, this is really happening.

[server01]: PS C:\Users\CharlesBryant\Documents> Get-EventLog -List | Where-Object {$_.LogDisplayName -eq "Application"}
Max(K) Retain OverflowAction Entries Log
------ ------ -------------- ------- ---
4,096 0 OverwriteAsNeeded 3,092 Application

Yes…I just queried the event log on a remote server without having to go through the remote desktop dance. BooYah! Ending your session is even easier.

[server01]: PS C:\Users\CharlesBryant\Documents> Exit

Enjoy.

How Much Does Automated Test Maintenance Cost?

I saw this question on a forum, and it made me pause for a second to think about it. The quick answer is that it varies. The sarcastic answer is that it costs as much as you spend on it, or how about this: it costs as much as you didn’t spend on creating a maintainable automation project.

I have only been involved in 2 other test automation projects prior to my current position, and in both I also had feature development responsibility. On one of the projects, compared against time developing features, I spent about 10-15% of my time maintaining tests and about 25% writing them, so maintenance was about 30-40% of my total test time. Based on my knowledge today, some of my past tests weren’t that good, so maybe the numbers should have been higher or lower. On the other project, test maintenance was closer to 50%, and that was because of a poor tool choice. I can state these numbers because I tracked my time. I could not use them as benchmarks to estimate maintenance cost on my current project, or any other, unless the context was very similar and I could easily draw the comparison.

I have seen people say “it’s typically between this and that percentage of development cost,” or something similar. Trying to quantify maintenance cost is hard, very hard, and it depends on the context. You can estimate based on someone else’s guess at a rough percentage and hope it pans out, but in the end it depends on execution and environment. An application that changes often vs. one that rarely changes, poorly written automated tests, a bad choice of automation framework, the skill of the automated tester…there is a lot that can change cost from project to project.

I am curious whether someone has a formula to estimate cost across all projects, but an insane focus on the maintainability of your automated test suites can significantly reduce costs in the long run. So a better focus, IMHO, is on getting the best test architecture, tools, framework, and people, and making maintainability a high-priority goal. Also, properly tracking maintenance in the project management or bug tracking system can provide a more valuable measure of cost across the life of a project. If you properly track maintenance cost (time), you get a benchmark that is customized for your context. Trying to calculate cost up front, with nothing to base the calculation on but a wild uneducated guess, can lead to a false sense of security.

So, if you are trying to plan a new automation project and you ask me about cost the answer is, “The cost of having automated tests…priceless. The cost of maintaining automated tests…I have no idea.”

Configure MSDTC with PowerShell 4.0

Continuing on the PowerShell theme from my last post, I wanted to save some knowledge on working with DTC in PowerShell. I am not going to list every command, just what I’ve used recently to configure DTC. You can find more information on MSDN, http://msdn.microsoft.com/en-us/library/windows/desktop/hh829474%28v=vs.85%29.aspx, or TechNet, http://technet.microsoft.com/en-us/library/dn464259.aspx.

View DTC Instances

Get-Dtc will print a list of DTC instances on the machine.

PS> Get-Dtc

Stop and Start DTC

Stop

PS> Stop-Dtc -DtcName Local

Stopping DTC will abort all active transactions. So, you will get asked to confirm this action unless you turn off confirmation.

PS> Stop-Dtc -DtcName Local -Confirm:$False

Start

PS> Start-Dtc -DtcName Local

Status

You could use a script to confirm that DTC is started or stopped. When you call Get-Dtc and pass it an instance name, the returned object has a property named “Status” that tells you whether the DTC instance is Started or Stopped.

PS> Get-Dtc -DtcName Local
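For example, a quick check built on that Status property (assuming the Started/Stopped values above):

PS> if ((Get-Dtc -DtcName Local).Status -ne "Started") { Write-Error "DTC is not running" }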

Network Settings

You can view and adjust DTC Network Settings.

View

To view the network settings:

PS> Get-DtcNetworkSetting -DtcName Local

-DtcName is the name of the DTC instance.

Set

To set the network settings:

PS> Set-DtcNetworkSetting -DtcName Local -AuthenticationLevel Mutual -InboundTransactionsEnabled $True -LUTransactionsEnabled $True -OutboundTransactionsEnabled $True -RemoteAdministrationAccessEnabled $False -RemoteClientAccessEnabled $True -XATransactionsEnabled $False

Here we name the instance to configure, then list the property/value pairs we want to set. $True and $False are built-in PowerShell variables for the Boolean values true and false. If you run this set command, you will get a message asking if you want to stop DTC. I tried stopping DTC first and then running the command, and it still presented the confirmation message. You can add -Confirm:$False to turn off the confirmation message.

Conclusion

There is a lot more you can do, but this fits my automation needs. The only thing I couldn’t figure out is how to set the DTC Logon Account. There may be a magical way of finding the registry keys and setting them, or something, but I couldn’t find anything on it. If you know, please share…I’ll give you a cookie.

http://www.sqlha.com/2013/03/12/how-to-properly-configure-dtc-for-clustered-instances-of-sql-server-with-windows-server-2008-r2/ – has some nice info on DTC, and on DTC in a clustered SQL Server environment. He even has a PowerShell script to automate configuration…kudos. Sadly, his script doesn’t set the Logon Account.


Bisecting Our Code Quality Pipeline

I want to implement gated check-ins, but it will be some time before I can restructure our process and tooling to accomplish it. What I really want is to keep the source tree green and, when it is red, provide feedback to quickly get it green again. I want to run tests on every commit and give developers feedback on their failing commits before they pollute the source tree. Unfortunately, running the tests as we have them today would take too long on every commit. I came across a quick blog post by Ayende Rahien on Bisecting RavenDB, where his team used git bisect to find the culprit that failed a test. The post gave no information on how it actually works, just a tease that they are doing it. I left a comment to see if they would share some of the secret sauce behind their solution, but until I get a response I wanted to ponder it for a moment.

Git Bisect

To speed up testing and also allow test-failure culprit identification with git bisect, we would need a custom test runner that can identify which tests to run and run them. We don’t run tests on every commit; we run tests nightly against all the commits that occurred during the day. When a test fails, it can be difficult to identify the culprit(s) that failed it. This is where Ayende steps in, with his team’s idea to use bisect to help identify the culprit. Bisect works by traversing commits, starting at the commit we mark as the last known good commit and ending at the last commit included in the failing nightly test. As bisect iterates over the commits, it pauses at each one and lets you test it and mark it good or bad. In our case, we could run a test against a single commit. If it passes, tell bisect it’s good and to move to the next. If it fails, save the commit and its failing test(s) as a culprit, tell bisect it’s bad, and move to the next. This results in a list of culprit commits and their failing tests that we can use for reporting and for bashing the culprit owners over the head (just kidding…not).

Custom Test Runner

The test runner has to be intelligent enough to run all of the tests that exercise the code included in a commit. The custom test runner has to look for testable code files in the commit change log, in our case .cs files. When it finds a code file, it will identify the class in the file and find the test class that targets it. We are assuming one class per code file and one unit test class per code file class; if this convention isn’t enforced, some tests may be missed or we have to do a more complex search. Once all of the test classes are found for the commit’s code files, we run the tests. If a test fails, we save the test name and maybe the failure results, exception, stack trace, and so on, so it can be associated with the culprit commit. Once all of the tests have run, if any of them failed, we mark the commit as a culprit. After the test and culprit identification is complete, we tell bisect to move to the next commit. As I said before, this results in a list of culprits and failing-test info that we can use in our feedback to the developers.
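To make that concrete, here is a rough C# sketch of a runner that git bisect could drive. A zero exit code tells bisect the commit is good, non-zero tells it the commit is bad; the naming convention and the RunTests stand-in are assumptions:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class BisectTestRunner
{
    public static int Main()
    {
        // Code files changed by the commit bisect currently has checked out.
        string[] changedFiles = RunGit("diff --name-only HEAD~1 HEAD")
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        // Assumes one class per .cs file and a FooTests class for every Foo class.
        var testClasses = changedFiles
            .Where(f => f.EndsWith(".cs") && !f.EndsWith("Tests.cs"))
            .Select(f => Path.GetFileNameWithoutExtension(f) + "Tests")
            .Distinct()
            .ToList();

        if (!testClasses.Any())
            return 0; // nothing testable changed; treat the commit as good

        bool allPassed = testClasses.All(RunTests);
        return allPassed ? 0 : 1;
    }

    private static string RunGit(string arguments)
    {
        var startInfo = new ProcessStartInfo("git", arguments)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var process = Process.Start(startInfo))
        {
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }

    private static bool RunTests(string testClass)
    {
        // Stand-in: shell out to NUnit/xUnit/etc. with a filter for testClass,
        // capture the failure details for reporting, and return pass/fail.
        throw new NotImplementedException();
    }
}

Wired up, that would look something like git bisect start, git bisect bad on the failing nightly commit, git bisect good on the last green commit, then git bisect run against this tool and let git walk the commits.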

Make It Faster

We could make this fancy and look for the specific methods that were changed in the commit’s code file classes. We would then only find tests that test the methods that were changed. This would make testing laser focused and even faster. We could probably employ Roslyn to handle the code analysis and make finding tests easier. I suspect tools like ContinuousTests (MightyMoose) do something like this, so it’s not that far-fetched an idea, but there is definitely a mountain of things to think about.
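On the Roslyn point, the parsing side is fairly approachable. A minimal sketch that lists the class and method names declared in a code file, which could then be matched against the commit’s diff to pick tests:

using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ChangedCodeInspector
{
    // Prints Class.Method for every method declared in the file.
    public static void PrintMethods(string codeFilePath)
    {
        var tree = CSharpSyntaxTree.ParseText(File.ReadAllText(codeFilePath));
        var root = tree.GetRoot();

        foreach (var cls in root.DescendantNodes().OfType<ClassDeclarationSyntax>())
        {
            foreach (var method in cls.Members.OfType<MethodDeclarationSyntax>())
            {
                Console.WriteLine("{0}.{1}", cls.Identifier.Text, method.Identifier.Text);
            }
        }
    }
}

Mapping a changed method back to the tests that exercise it is the hard part; this only handles the easy discovery half.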

Conclusion

Well, this is just a thought, a thesis if you will, and if it works, it will open up all kinds of possibilities to improve our Code Quality Pipeline. Thanks, Ayende, and please think about open sourcing that bisect.ps1 PowerShell script 🙂