Category: Pipeline
Install IIS with PowerShell
Here is another PowerShell command. This one is for installing IIS.
First I establish a session with the server I want to install to:
PS> enter-pssession -computername winbuildserver1
Next we just need to run a simple command:
winbuildserver1: PS> Install-WindowsFeature Web-Server -IncludeManagementTools -IncludeAllSubFeature -Source E:\sources\sxs
In the example I am installing the IIS web server, including the management tools and all sub-features, and I am installing from a specific source path, easy-peasy.
Of course, you can achieve more fine-grained control of the install, and you can get more information on that at:
Manage Windows Services with PowerShell
This is just a quick post to document some PowerShell commands so I don’t forget where they are. One of them wasn’t as easy to find as I thought it should be (Mr. Delete Service). If you want to delete a Windows service, how do you do it with PowerShell? You can use WMI, but PowerShell also includes some friendlier methods for working with services that aren’t that hard to find.
Delete Service
PS> (Get-WmiObject win32_service -filter "name='Go Agent 2'").Delete()
Here I am deleting one of my Go.cd Agent services. The only item I change from service to service in this command is the “name=” filter; everything else has been boilerplate so far, but there are other parameters you can set. One thing I noticed is that if the service is started, you have to stop it first for the delete to complete; otherwise it is just marked for deletion.
You can get more info on PowerShell WMI here:
http://msdn.microsoft.com/en-us/library/dd315295.aspx
http://msdn.microsoft.com/en-us/library/aa384832(v=vs.85).aspx
New Service
PS> New-Service -Name "Go Agent 2" -Description "Go Agent 2" -BinaryPathName "`"D:\Go Agents\2\cruisewrapper.exe`" -s `"D:\Go Agents\2\config\wrapper-agent.conf`""
Here I am creating the Go Agent service. Notice that I am able to set additional command parameters in the -BinaryPathName, like the -s to set my config file above. I use the backtick (`) to escape quotes.
Start Service
PS> start-service -name "Go Agent 2"
This is a simple command that just needs the service name. You only need the double quotes if your name has spaces.
Stop Service
PS> stop-service -name "Go Agent 2"
This is another simple one just like start.
Conclusion
Don’t remote into your server anymore to manage your services. Run remote PowerShell commands.
Update
They say “Reading is Fundamental,” and the delete-service answer I was looking for was at the bottom of the page where I learned about creating services, http://technet.microsoft.com/en-us/library/hh849830.aspx. It even lists another command to delete services:
PS> sc.exe delete "Go Agent 2"
Failing Test Severity Index
Have you ever been to the emergency room and gone through triage? They have a standard way of identifying patients that need immediate or priority attention vs. those that can wait. Actually, they have an algorithm for this called the Emergency Severity Index (ESI). Well, I have a lot of failing tests in my test ER, with their bloody redness in test reports begging to be fixed. Every day I get a ping from them: “I’ve fallen and I can’t get up.” What do I do? Well, fix them of course. So, I decided to record my actions in fixing a few of them to see if any patterns arise that I can learn from. In doing so, I formalized my steps so that I can have a repeatable process for fixing tests. I borrowed from ESI and decided to call my process the Failing Test Severity Index (FTSI, pronounced “foot-see”). It sounds official, like something every tester should use, and it comes with an acronym. If there is no acronym, it’s not official, right?
Triage
The first step is to triage the failing tests to identify the most critical tests. I want to devote my energy to the most important, or the tests that provide the most bang for the buck, and triage gives me an official stage in my workflow for prioritizing the tests that need to be fixed.
As a side note, I have so many failures because I inherited the tests; they were originally deployed and never added to a maintenance plan. So, they have festered with multiple issues: problems with test structure, and code changes in the system under test that need to be addressed in the tests. With that said, there are a lot of failures, hundreds of them. This gave me an excellent opportunity to draft and tune my FTSI process.
During triage I categorize the failures to give them some sort of priority. I take what I know about the tests, the functionality they test, and the value they provide and grade them “critical”, “fix now”, “fix later”, “flaky”, and “ignore”.
Critical
Critical tests cover critical functionality. These failing tests cover functionality that if broken would cripple or significantly harm the user, the system, or the business.
Fix Now
Fix now are tests that provide high value, but wouldn’t cause a software apocalypse. These are usually easy fixes or easy wins with a known root cause for the failure that I can tackle after the critical failures.
Fix Later
Fix later are important tests, but not necessary to fix in the near term. These are usually tests where it is a little harder to understand why they are failing. Sometimes I will mark a test as fix later, investigate the root cause, and once I have a handle on what it takes to fix it, move it to fix now. Sometimes I will tag a failure as fix later when a test has a known root cause, but the time to fix is not in line with the value the test provides compared to other tests I want to fix now.
Flaky
Flaky tests are tests that fail for indeterminate reasons. They fail one time and pass another, but nothing has changed in the functionality they cover. This is a tag I am questioning whether I should keep, because I could end up with a ton of flaky tests being ignored and cluttering the test report with a bunch of irritating yellow (almost as bad as red).
Ignore
Ignore are tests that can be removed from the test suite. These are usually tests that are not providing any real value because they no longer address a valid business case or are too difficult to maintain. If I triage and tag a test as ignore, it is because I suspect the failure can be ignored, but I want to spend some extra time investigating whether I should remove it. When a test is ignored, it is important not to let it stay ignored for long. Either remove it from the test suite or fix it so that it is an effective test again.
Test Tagging and Ticketing
Once the test failures are categorized, I tag them in the test tool so they are ignored during the test run. This also gives me a way to quickly identify the test code that is causing me issues. I also add the failing tests to our defect tracker. For now, failing tests are not related to product releases, so they are just entered as defects on the test suite for the system under test and are released when they are fixed. I am not adding any details on the reason for the failure besides the test failure message and stack trace, if available. I may also add cursory information that gives clues to the possible root cause, such as a test timing out, a config issue, a data issue…obvious reasons for failures. This allows me to focus on the initial triage and not spend time investigating issues that may not be worth my time.
This process helps get rid of the red and replace it with yellow (ignore) in the test reports, so there is still an issue if we are not diligent and efficient in the process. It does clear the report of red so that we can easily see new failures. If we leave the tests as failures, we can get in the habit of overlooking the red and miss a critical new failure.
Critical Fix
After I complete triage and tagging, I focus on fixing the critical test failures. Depending on the nature of a critical failure, I will forgo triage on the other failures and immediately fix it. Fixing critical tests has to be the most important task because the life of the feature or system under test is in jeopardy.
Investigate
Once I have my most critical tests fixed, I begin investigating. I select the highest-priority tests that don’t have a known root cause and begin looking for the root cause of the failure. I start by reading through the scenario covered by the test. Then I read the code to see what it is doing. Of course, if I am very familiar with a test I may not have to do this review, but if I have any doubts I start with gaining an understanding of the test.
Now I run the test and watch it fail. If I can clearly tell why the test fails, I move on to the next step. If not, since I have access to the source code for the system under test, I open a debug session and step through the test. If it still isn’t obvious, I may get a developer involved, or ask a QA, BA, or even an end user about their experience with the scenario.
Once I have the root cause identified, I tag the test as fix now. This is not gospel; if the fix is very easy, I will forgo the fix now tag and just fix the test.
Fix
Once I have run through triage and tagging, fixed critical tests, and investigated new failures, I focus time on fixing any other criticals, then move on to the fix now list. During the fix I also work on improving the tests through refactoring, looking at things like maintainability and performance. The goal during the fix is not only to solve the problem, but to simplify future maintenance and improve the performance of the test. I want the test to be checked in better than I checked it out.
Conclusion
This is not the most optimal process at the moment, but it is the result of going through hundreds of failures, and it works for now. I have to constantly review the list of failures, and as new failures are identified, some tests could end up perpetually listed on the fix later list. For now this is our process, and we can learn from it to form an opinion on how to do it better. It’s not perfect, but it’s more formal than fixing tests all willy-nilly with no clear strategy for achieving the best value for our efforts. I have to give props to the medical community for the inspiration for the Failing Test Severity Index; now we just have to formalize the process algorithm like ESI and look for improvements.
Let me know what you think, and send links to other processes you have had success with.
GoCD: Integrating Bug Tracking
Go.cd allows you to integrate with your tracking tools. You can define how to handle ticket numbers in your source control commit messages. Go.cd parses the message looking for a specific pattern and transforms matches into links to your tracking tool.
We use Axosoft OnTime, and integration wasn’t as straightforward as I envisioned. OnTime uses two query strings to identify the ticket number and the type of ticket (defect, feature, incident…).
View Defect: https://name.ontimenow.com/viewitem.aspx?id=102578&type=defects
Create Defect: https://name.ontimenow.com/edititem.aspx?type=defects
From what I can tell, Go.cd only allows the use of one parameter and has no facility to expand the regex and parameters to work with the various patterns for the type of ticket. Example: we may have a defect ticket OTD 102578 and a feature OTF 87984. When commits are related to either of these tickets, the ticket number, including the prefix, is added to the commit message. To turn this into a link in Go.cd, we have to parse the OT* prefix and correlate that to defect or feature, depending on the value of *, and add that to the type query string parameter. Next we have to grab the numbers after the ticket type prefix and add them to the id query string parameter.
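For context, this is roughly what the built-in tracking tool integration looks like in cruise-config.xml (the regex here is a placeholder, not our real config). Note that the link template only exposes a single ${ID} capture, which is the limitation described above; the type has to be hard-coded per pipeline:

```xml
<!-- Inside a <pipeline> element in cruise-config.xml.
     Go.cd replaces ${ID} with the regex capture from the commit message;
     there is no second placeholder for the ticket type. -->
<trackingtool
    link="https://name.ontimenow.com/viewitem.aspx?id=${ID}&amp;type=defects"
    regex="OTD (\d+)" />
```

So a pipeline configured like this would link OTD tickets correctly but could not also link OTF tickets to type=features.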
I am not really sure how to get this to work, aside from messing around with the source code. Did I say source code? Solved! Well maybe…tune in.
GoCD: Pipeline Parameters
I am currently getting a Go Continuous Delivery server stood up, and I am reusing scripts from our CCNET continuous integration server to save time getting the new server up. In doing this, I am also reviewing the scripts for improvements. When defining builds for my shiny new Go server, I noticed that I have duplicate properties across tasks when calling NAnt targets.
I have a property that defines the server environment the task should work with. This basically sets properties in the script so that they target the correct environment. If I set the property to dev, the script will set the server names, paths, and more to point to the dev environment.
There is a property that tells the task what source code repository branch to use. In the context of a task, this mainly has to do with paths to the branch already checked out and updated on the Go server and not controlling the actual branch updates. The branch name is concatenated in a common path so the task knows where to get and save source files.
There are more duplicated properties, but it got me wondering if there is a better way to do this than repeating the same value over and over for each task. When I need a new task or a new pipeline with different values, this can become a maintenance nightmare. Not to mention, this duplication will make it hard to take advantage of Go Pipeline Templates, so I need to solve this if I want to gain the ease in creating new pipelines that templates afford.
Go has the concept of an Environment Variable, a common value saved on an agent. That is a nice feature, as it lets me define values common to every pipeline running on an agent targeting a specific environment. To fix my issue, I need to be able to set values at the job level. The agent level is too broad, as I can have multiple pipelines target specific agents with different values for the properties I want to abstract. I wonder if we can set up a common variable at the job or pipeline level that the tasks can use?
Go Pipeline Parameters
Oh look, there is something called a Parameter in the pipeline configuration.
Parameters help reduce repetition within your configurations and combined with templates allow you to setup complex configurations. More..
Well, let’s look into that and see what we come up with. The “More” link above gives the details of how to use parameters in Go. Basically, you define a parameter and then you can use it in your tasks. To use the parameter in a task, just wrap it with #{}. Example: I have a parameter named ENV, and I would use it in my script by tokenizing it like so: #{ENV}.
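As a sketch, the pipeline configuration ends up looking something like this in cruise-config.xml (the pipeline name, parameter values, and NAnt arguments are made up for illustration, and material/other required elements are omitted):

```xml
<pipeline name="deploy-app">
  <params>
    <param name="ENV">dev</param>
    <param name="BRANCH">trunk</param>
  </params>
  <stage name="deploy">
    <jobs>
      <job name="run-nant">
        <tasks>
          <!-- #{ENV} and #{BRANCH} are expanded before the task runs -->
          <exec command="nant">
            <arg>-D:environment=#{ENV}</arg>
            <arg>-D:branch=#{BRANCH}</arg>
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>
</pipeline>
```

Change the value of ENV once at the top, and every task in the pipeline picks it up.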
This gets rid of a bunch of unnecessary duplication, improves the maintainability of build scripts, and opens up the possibility of creating templates.
Conclusion
There is so much more to understand about Go. One thing that was reinforced in this exercise was to remember to constantly look for pain points in solutions and processes and search for ways to cure the pain.
If It Looks Like a Defect is It a Defect?
Our software quality metrics work group had a discussion today, and metrics around defects became an interesting topic. One of the work group members said that the concept of a defect is not relevant to agile teams. This was clarified as defect metrics within the confines of an agile sprint. I felt kind of dumb because I didn’t know this, and it appeared that there may be a consensus on it. Maybe I misunderstood, but the logic was that there are no defects in sprint because once a problem is found it is immediately fixed in the sprint. I wanted to push for defect metrics from check-in through production. The later in the software delivery pipeline a defect is found, the more it will cost, so you have to know where it was caught. I didn’t get to dig into the topic with the group because I was contemplating whether I needed to revisit my understanding of agile and I didn’t want to slow the group down. I already feel like a lightweight in the ring with a bunch of heavyweights :).
Defects Cost Money
After pondering it a bit, I am still of the opinion that defects exist whether you name them something else, quietly fix them before anyone notices, or collectively as a team agree not to track them. Defects are an unavoidable artifact of software development. Defect, bug, issue…it doesn’t work as expected; name it what you like or obscure it in the process, defects are always there and will be until humans and computers become perfect beings. Defects cost money when more than one person has to deal with them. If a defect is caught in an exploratory test and it is acknowledged that it must be fixed in sprint, then it will have to be retested after the fix. Pile this double testing cost on top of the development cost, and defects can get expensive.
Not to mention, defects slow sprints down. When you estimated a story at a certain number of points, let’s say 2, and ultimately the story was an 8 because of misunderstandings and bad coding practices, there is a cost associated with that. Maybe estimates are stable or perfect in mature, hard-core agile teams, or maybe defects are just another chore in the process that doesn’t warrant tracking or analysis. For new teams just making the transition to agile, tracking defects provides an additional signal that something is wrong in the process. If you are unable to see where your estimate overruns are occurring, you can’t take action to fix them.
Conclusion
If someone besides the developer finds a defect, the story should be rejected. At the end of the sprint we should be able to see how many rejections there were and at what stage in the pipeline the rejections occurred. If these numbers are high or trending up, especially later in the pipeline, something needs to be done, and you know there is a problem because you tracked defects. It may be my lack of experience in a hard-core agile team, but I just can’t see a reason to ignore defects just because they are supposed to be fixed in sprint.
Can someone help me see the light? I thought I was agile’ish. I am sure there is an agile expert out there that can give me another view of what defects mean in agile and how my current thought process is out of place. I think my fellow group members are awesome, but I usually look for a second opinion on topics I am unsure about.
Scripting New Server Builds
What I am going to talk about is probably common knowledge in IT land, and maybe even common knowledge to a lot of developers, but it just recently occurred to me as a good idea. Scripting new server builds hit me like a ton of bricks when I saw it being done by one of our developers; we will call him Foo, since I didn’t ask permission to use his name.
Revelation
Foo was recording the scripts he used to build the server in a text document, along with the additional steps he took to get the server up. He also stored this text document in source control so that the instructions are versioned. Did I say genius…very smart guy. Maybe I’m just a developer caveman, and watching him burning that tree made me curious and want to share it with my family. So, I grab a burning branch, run back to the cave and look…I invented fire! OK, maybe not that deep, but it was a revelation that had missed connecting with me, although I have been exposed to the idea before.
Framework
In my quest to become a better Automation Engineer, I learned about server virtualization, how to optimize server images, strategies for provisioning and teardown of virtual instances, and more. I even learned about scripting and automating server configuration, but I didn’t dive into the depths of any of the subjects. As I looked through the text files that Foo had in source control, a lightbulb went off and everything connected. So, I grab the text files, blog about them, and wait for it…I invented scripting new server builds! At least I invented it in the small world in my mind.
The idea I am exploring now is to use the same logic that Foo used and expand upon it so that it can be used in provisioning virtual instances. I could cozy up to Chef, Puppet, or System Center and stop writing this post and do some research to figure out best practices and various strategies for doing this, but where is the fun in that? So, let’s blog it out, get the basics, then find a better way as I feel the overwhelming weight of what I got myself into. Even if I do end up using a boxed tool, knowing how to do this manually and hand rolling my own basic automation will make me that much more dangerous when I get the power of a tool in my hand. So, I enter the 36th Chamber.
Requirements
First thing I want to do is reorganize how this is done, so let’s set out some requirements.
- I want scripts that are runnable. Right now I have to copy and paste them into a script file to get them to run, so the scripts should be in script files that can be easily run by a script engine.
- I want the scripts to be reusable, so I won’t create a file that contains every step, but many files that can be called by other scripts and customized by passing in arguments. This gives me a way to compose a server build without having to duplicate major functionality, which is a lot easier to maintain and optimize.
- I want the scripts to be generic enough to use across multiple types of server instances, but custom enough that we don’t create a massive mess that is just a dumb abstraction on top of the scripting engine and its plugins. It is important that the scripts do more than just call another external function in the script engine or a plugin; there should be some additional logic that makes it worthwhile to script.
Additional goals:
- I want to log tasks and exceptions while running the scripts, especially timing and contextual data, so that I can monitor and analyze script execution.
- I want to notify engineers when there is a problem or a manual step that has to be done. When we hit a step that is in distress or needs manual intervention, I want the script to send an email, IM, tweet…or something to the engineer or group managing the provisioning of the server.
- I want this to be scalable and performant, but the initial iterations should focus on getting it to work for provisioning just one instance. Scaling may be better solved with a third-party tool, and I will face that issue when I hit the scaling problem, or at least at a point where I can project and understand the impending scaling issue.
Workflow
I guess that is enough to get me going. So, I take on a couple of the steps to stand the server up: installing and starting services.
- I run the manual steps to install a service
- I script the manual install steps
- I run the install script on the server
- I run the manual steps to uninstall the service
- I script the manual uninstall steps
- I run the install script on the server (again, so there is something to uninstall)
- I run the uninstall script on the server
As I take each step, I research how to best script it and address issues, because it rarely goes as planned. This is a poor man’s script, debug, and test methodology. I am sure there must be some fancy IDE that can help with this. I am configuring the servers remotely with PowerShell from my local environment. I’m a DotNetter, but I can see myself doing this with any scripting engine, on any platform, with any supporting tools to make it easier.
Iterate & Elaborate
I repeat the workflow to script out service start and stop. After I am satisfied, I save the scripts in a file named config_services.ps1 and change them so they can accept arguments. Now I have a script whose focus is to manage services. Then I check it into source control.
Next, I create another script whose job is to orchestrate the workflow to configure the server using scripts like config_services.ps1. I hard-code the arguments in the call to the install service function, but you know I’m thinking about how to get away from the hard coding; I just don’t want to go deeper down the rabbit hole than I have to. Speaking of the rabbit hole, how do I unit test a PowerShell script? I save this file as configure_server.ps1 and commit it.
That was fun, but we need to do a lot more to configure a server. So, I take another task, configuring DTC, and follow my development workflow to script out the manual steps. This involved a little registry manipulation, so I also created a script to manage the registry. Then I added calls to these scripts in configure_server.ps1, inside the same function that calls the install and start services functions. Now I have three of the steps to configure this server instance scripted with somewhat generic, encapsulated functions. This satisfies the major goals in my requirements.
Although I have ideas for refactoring this, I stop at this branch and switch gears to stub out a script that can log messages and send alerts. Then I add calls to it in all of the scripts to get some messaging instrumented and ready for when I am done with the feature. I’m feeling good about myself, and I continue working in this manner until I have a solution to automate the configuration of a server instance.
Conclusion
That’s it for now. I know you are like, wait…where are the damn scripts? I might share them on GitHub if someone needs them, but I didn’t feel like digging them up and cleaning them up to add to this post.
If you are just getting into scripting servers like me, I hope this helps spark a flame for you as you think it through. If you are a Monk of the 36th Chamber and you see all kinds of issues and naive assumptions that I shouldn’t be publicizing to unknowing newbies, please let me know. If you are looking to become better at automating the software delivery pipeline, drop me a line, I am always looking for someone to spar and train with.
Test Automation Tips: 2
#2 If you update or delete test data in a test, you need to isolate the data so it isn’t used by any other tests.
A major cause of flaky tests (tests that fail for indeterminate reasons) is bad test data. This is often caused by a previous test altering the test data in a way that causes subsequent tests to fail. It is best to isolate test data for scenarios that change the data. You should also have an easy way to isolate read-only data so that you can ensure it isn’t corrupted by your tests.
Delta Seed
One way I like to do this is with what I call a delta test data seed. The delta seed loads all read-only test data at the start of a test suite run. Any test data that needs to be updated or deleted is created per test. Mutable test data seeds are run after the delta seed. So, in addition to the delta seed, I will have a suite, feature, or scenario seed.
Suite Seed
The suite seed is run right after the delta seed, usually by the same process that runs the delta seed. Because the suite seed data is available to all tests being run, it is the riskiest seed, and the least efficient unless you are running all of your tests, as you may not need all of the data being loaded. I say risky because it opens up the scenario where someone writes a test against mutable data when that data should only be used by the test that will be changing it.
Feature Seed
The feature seed runs at the beginning of a feature test, during test fixture setup. This basically loads all the data used by tests in the feature. It has some of the same issues as the suite seed: all of the data is available to all tests in the feature, and if someone gets lazy and writes a test against the mutable data instead of creating new data specifically for the test, the result may be flaky tests.
Scenario Seed
The scenario seed runs at the beginning of an individual test in the feature test fixture. This is the safest in terms of risk, as the data is loaded for the test and deleted after it, so no other tests can use it. The problem I have with this is that when you have a lot of tests, creating hundreds of database connections and dealing with seeding inside the test can have an impact on overall and individual test time. If not implemented properly, this type of data seeding can also create a maintenance nightmare. I like to use test timing as an indicator of issues with tests. If you can’t separate the time to seed the data from the time to run the test, having the seed factored into the test can affect the timing in a way that has nothing to do with what is being tested. So, you have to be careful not to pollute your test timing with seeding.
Which Seed?
Which seed to use depends on multiple factors, starting with how you run your tests. If you are running all tests, it may be efficient to use a suite seed. If you run features in parallel, you may want a feature seed to quickly load all feature data at one time. If you run tests based on dependencies in a committed change, you may want to keep seeding granular with a scenario seed. There are many other factors you can take into account, and trial and error is a good way to go as you optimize test data seeding.
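Our tests are .NET, but the seed layering maps onto most xUnit-style frameworks. Here is a minimal sketch of the idea using Python’s unittest, where setUpModule plays the role of the delta/suite seed, setUpClass the feature seed, and setUp/tearDown the scenario seed (all the names and the in-memory “database” are made up for illustration):

```python
import unittest

SEEDED = []  # stand-in for the test database


def load_delta_seed():
    """Delta seed: read-only data loaded once at the start of the suite run."""
    SEEDED.append("readonly-customers")


def setUpModule():
    # Suite seed: runs once, right after the delta seed.
    load_delta_seed()
    SEEDED.append("suite-mutable-data")


class OrderFeatureTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Feature seed: data shared by every test in this feature.
        SEEDED.append("order-feature-data")

    def setUp(self):
        # Scenario seed: data owned by exactly one test,
        # created before it runs and removed afterward.
        self.record = "order-123"
        SEEDED.append(self.record)

    def tearDown(self):
        SEEDED.remove(self.record)

    def test_delete_order(self):
        # Safe: this test mutates only the data it created.
        self.assertIn(self.record, SEEDED)
```

Because the scenario data is created and destroyed per test, a test that deletes order-123 can never break a neighboring test; the read-only delta data is the only thing shared everywhere.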
Conclusion
The thing to take away is you need a strategy to manage the isolation of test data by desired immutability of the data. Tests that don’t alter test data should use read only data. Tests that alter test data should use test data specifically created for the test. If you allow a test to use data altered by another test, you open yourself up to flaky test syndrome.
When I am writing and maintaining large functional UI tests, I often realize some things that would make my life easier. I decided to write this series of posts to describe some of the tips I have for myself, in hopes that they prove helpful to someone else. What are some of your tips?
An Easy Win for Testable Methods
One of our developers was having an issue testing a service. Basically, he was having a hard time hitting the service, as it is controlled by an external company and we don’t have firewall rules that allow us to easily reach it from our local environment. I suggested mocking the service, since we really weren’t testing getting a response from the service, but what we do with the response. I was told that the code is not conducive to mocking. So, I took a look and they were right, but the fix to make it testable was a very simple refactor. Here is the gist of the code:
public SomeResponseObject GetResponse(SomeRequestObject request)
{
    // Set some additional properties on the request
    request.Id = "12345";

    // Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    // Do something with the response
    if (response != null)
    {
        // Do some awesome stuff to the response
        response.LogId = "98765";
        if (string.Compare(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }
    return response;
}
What we want to test is the “Do something with the response” section of the code, but this method is doing so many things that we can’t isolate that section and test it…or can we? To make this testable, we simply move everything that concerns “Do something with the response” to a separate method.
public SomeResponseObject GetResponse(SomeRequestObject request)
{
    // Set some additional properties on the request
    request.Id = "12345";

    // Get the response from the service
    SomeResponseObject response = Client.SendRequest(request);

    return ProcessResponse(response);
}

public SomeResponseObject ProcessResponse(SomeResponseObject response)
{
    // Do something with the response
    if (response != null)
    {
        // Do some awesome stuff to the response
        response.LogId = "98765";
        if (string.Compare(response.Id, "999") > 0)
        {
            response.Special = true;
        }
        LogResponse(response);
    }
    return response;
}
Now we can test changes to the ProcessResponse method in isolation, away from the service calls. Since there were no changes to the service or the service client, we don’t have to worry about testing them for this specific change. We don’t care what gets returned; we just want to know whether the response was properly processed and logged. We still have a hard dependency on LogResponse’s connection to the database, but I will live with this as an integration test and fight for unit tests another day. This is a quick win for testability and a step closer to making this class SOLID.
Optimizing the Software Delivery Pipeline: Deployment Metrics
Currently, I have no way of easily determining which build version is deployed to an environment. This made me take more interest in metrics about deployments; we basically have none. I can look at the CD (continuous deployment) server to see what time a deployment was done, and I can look at the builds on the server and sort of deduce which build was deployed, but I have to manually check the server to verify my assumptions. I wondered what else I am missing. Am I flying blind? Should I know more?
Metrics in the Software Delivery Pipeline
I am part of a work group that is exploring software quality metrics. So, my first instinct was to think about deployment quality metrics. After some soul searching, I decided what would be most helpful to me is to know where our bottlenecks are. We have an assembly line, or pipeline, that consists of the various stages our software goes through on its way to public consumption. Develop, build, deploy, test, and release are the major phases of our software delivery pipeline (I am not including planning or analysis right now, as that is another animal).
I believe that metrics focused on reducing time in our software delivery pipeline will be more effective than metrics focused only on reducing defects or increasing quality. If we can reduce defects or increase quality across faster delivery iterations, defects and poor quality will have less of an impact. That is the point of quality metrics in the first place: reducing the effects of poor quality on our customers and the business. Focusing on reducing time in the pipeline also supports our quality initiatives, because the tools that reduce time, like automated CI and testing, not only shorten iterations but improve quality. Faster release iterations will let us address quality issues more quickly. This is not to say that other metrics should be ignored; I just think that, since we have no real metrics at the moment, starting with metrics that support speeding up the pipe is a worthy first step.
Deployment Metrics
Back to the point. What metrics should I capture for deployments? If my goal is to increase throughput in the pipeline, I need to identify bottlenecks. So, I need some timing data.
- How long does deployment take?
- How long do the individual deployment steps take?
- How do we report this over time so we can identify issues?
This is pretty simple, and I can extract it from the deployment log on the server. Reporting would just be a matter of querying this data and displaying deployment time totals over time.
Additional Deployment Metrics
In addition to the timing data, it may be worthwhile to capture other metrics, like the size of the deployment. Deploying involves pushing packages across wires, and the size of those packages can affect deployment time. Issues with individual servers can also affect deployment time, so knowing which servers are being deployed to can help identify server issues. Along with the timing data, we can also capture:
- The version of the build being deployed
- The environment being deployed to
- The individual servers being deployed to
- The size and version of the packages being deployed to a server
Deployment Data
So, my first iteration of metrics centers around timing, but would also include other data to give a more robust picture of deployments. This is a naive first draft of what the data schema could look like. I suspect this can all be captured on most CI/CD servers and augmented with data generated by the reporting tool:
- Deployment Id – a unique identifier for the deployment, generated by the reporting tool
- Environment Id – a unique identifier for the environment deployed to, generated by the reporting tool
- Build Version – build version should be the version captured on the server
- Timestamp – the date/time the deployment record was created
- Start – the date/time the deployment started
- End – the date/time the deployment completed
- Tasks – the individual steps taken by the deployment script; it is possible that there is only one step, it all depends on how deployment is scripted
  - Deployment Task Id – a unique identifier for the task, generated by the reporting tool
  - Server Id – a unique identifier for the physical server deployed to, generated by the reporting tool
  - Packages – the group of files pushed to the server; this is normally a zip or NuGet package in my scenarios
    - Package Version – the version of the package being pushed; this may be different than the software version and is generated outside of the reporting tool
    - Package Size – the physical size of the package in KB or MB (not sure which is better)
  - Start – the date/time the deployment to the server started
  - End – the date/time the deployment to the server ended
Imagine the above as some beautiful XML, JSON, or ProtoBuf, because I am too lazy to write it.
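Well, mostly imagine it. As a rough illustration, a single deployment record from the schema above might serialize to JSON something like this (every identifier and value here is made up):

```json
{
  "deploymentId": "d-001",
  "environmentId": "env-qa",
  "buildVersion": "1.4.2.318",
  "timestamp": "2016-03-01T10:15:00Z",
  "start": "2016-03-01T10:00:00Z",
  "end": "2016-03-01T10:12:40Z",
  "tasks": [
    {
      "deploymentTaskId": "t-001",
      "serverId": "srv-web-01",
      "packages": [
        { "packageVersion": "1.4.2", "packageSizeKb": 10240 }
      ],
      "start": "2016-03-01T10:00:05Z",
      "end": "2016-03-01T10:04:30Z"
    }
  ]
}
```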
If my goal is to increase throughput in the pipe, I should probably think about a higher level of abstraction in the hierarchy so that I can relate metrics from other parts of the pipeline. For now, I will focus on this as a first step to prove that it is doable and provides some value.
All I need to do is create a data parsing tool that can be called by the deployment server once a deployment is done. The tool will receive the server log and store it, parse the log and generate a data structure similar to the above, then store the data in a database. Then I have to create a reporting tool that can present graphs and charts of the data for easy analysis. Lastly, create an API that will allow other tools to consume the data. This may be a job for CQRS and event sourcing. Easy, right :). I know there is a tool for that, but I am a sucker for punishment.
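The parsing step is the only genuinely mechanical part, so here is a very rough sketch of it in Python. The pipe-delimited log format and all field names are pure assumptions for illustration, not the output of any real CD server:

```python
# Hypothetical sketch: turn timestamped deployment log lines into task
# records with durations. The "timestamp|event|task" line format is an
# assumption, not any real CD server's log format.
from datetime import datetime

def parse_deployment_log(lines):
    starts, tasks = {}, []
    for line in lines:
        stamp, event, task = line.strip().split("|")
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if event == "start":
            starts[task] = when
        elif event == "end":
            tasks.append({
                "task": task,
                "start": starts[task].isoformat(),
                "end": when.isoformat(),
                "seconds": (when - starts[task]).total_seconds(),
            })
    return tasks

log = [
    "2016-03-01 10:00:05|start|copy-packages",
    "2016-03-01 10:04:30|end|copy-packages",
    "2016-03-01 10:04:31|start|restart-services",
    "2016-03-01 10:05:01|end|restart-services",
]
records = parse_deployment_log(log)
```

From there, storing the records and charting deployment time totals over time is a standard database-and-dashboard problem.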
Conclusion
This post will take more time than I thought, so I will make it a series. I will cover my thoughts on metrics for development, build, test, and release in upcoming posts (if I can remember). Then possibly some posts on how the metrics and tools can be used to optimize the pipeline. Pretty ambitious, but it sounds like fun to me.
