Category: Planning

February 25, 2025

How Do You Steer Towards Success? The North Star Metric

I have been thinking a lot about aligning product teams with product strategy. Here’s a post about one tool in the alignment arsenal. The North Star.

What’s a North Star Metric Anyway?

A North Star Metric (NSM) is the one metric that matters most. It represents the core value your product delivers to users and serves as your guiding light for long-term success. It keeps your team aligned, focused, and moving in the right direction. Simple as that.

Why Bother With a North Star Metric?

Having a North Star Metric means you’re not chasing random numbers that look good but don’t really matter. Here’s why it’s a game-changer:

1. Keep Your Eye on the Prize

Forget vanity metrics. Your NSM ensures you’re tracking what really moves the needle for users and the business.

2. Get Everyone on the Same Page

From product to marketing to ops, everyone should be rowing in the same direction. A solid NSM helps teams sync up.

3. Build for the Long Haul

Short-term wins are great, but sustainable growth is the goal. A good NSM makes sure you’re scaling in the right way.

How to Pick a North Star Metric That Works

A great NSM should be simple, actionable, and tied to real business outcomes. Here’s how you figure it out:

1. Understand Your Core Value

Ask yourself: What’s the main reason people use our product? Your NSM should reflect the value users get from it.

2. Connect It to Growth

If your NSM improves but your business isn’t growing, you’ve got the wrong metric. Pick something that’s directly tied to success.

3. Make It Measurable

If you can’t track it, you can’t improve it. Your NSM should be easy to monitor and analyze.

4. Don’t Ignore Other Metrics

A North Star Metric is important, but it’s not the only thing you should track. Pair it with other KPIs for a complete picture.

Real-World North Star Metrics

Some of the biggest companies out there rely on their North Star Metrics to guide growth:

Airbnb → Nights booked (measures marketplace health and user value)
Spotify → Minutes streamed (tracks engagement and content value)
Slack → Messages sent per user (measures engagement and product dependency)

Each of these metrics is directly linked to user experience and business success. They’re not just numbers—they tell the story of product value.

Picking the Wrong North Star Metric? Here’s What Happens

Messing up your NSM can lead to some bad decisions. Avoid these common pitfalls:

1. Chasing Vanity Metrics

Page views, downloads, or social media followers might look great, but they don’t necessarily mean you’re delivering value.

2. Making It Too Complex

If it takes a whole team just to calculate your NSM, it’s too complicated. Keep it simple.

3. Ignoring User Experience

A metric focused purely on revenue might drive bad decisions—like aggressive upselling—that hurt user trust.

4. Choosing a Short-Term Fix

A good NSM isn’t about short-term wins. It should reflect the bigger picture and long-term success.

How to Use a North Star Metric to Actually Get Results

Having an NSM is one thing. Making it work for you is another. Here’s how to put it to good use:

1. Let It Guide Your Decisions

Use your NSM to prioritize product updates, marketing campaigns, and operational strategies.

2. Track It Like a Hawk

Measure your NSM over time to understand trends and make data-driven decisions.

3. Keep Everyone in the Loop

Make sure the whole company knows what the NSM is and why it matters.

4. Be Willing to Adapt

If your NSM isn’t driving the right behaviors, change it. Business evolves, and so should your metric.

Wrapping Up

A well-chosen North Star Metric keeps teams focused, drives meaningful growth, and ensures your product delivers real value. But remember, it’s not set in stone.

Bottom Line

As a team, stay focused; Stay aligned; Keep moving towards success. Your North Star Metric is your roadmap, make sure it’s taking you where you actually want to go.

Don’t have an NSM, let’s talk about setting one.

October 26, 2017

An Agile Transformation

I wrote this a few years ago, but I’m going through a similar agile transformation right now. Although, every agile transformation is different, this still makes sense to me although it is just a draft post. I figured I’d just post it because I never search my drafts for nuggets of knowledge :).

If we are going to do Kanban we shouldn’t waste time formally planning sprints. Just like we don’t want to do huge upfront specifications because of waste caused by unknowns that invalidate specs, we don’t want to spend time planning a sprint because the work being done in the sprint can change anytime the customer wants to reprioritize.

We should have a backlog of prioritized features. The backlog is regularly prioritized (daily, weekly…) to keep features available to work. If we want to deliver a specific set of features or features in two weeks, prioritize them and the team will do those features next.

There is a limit on the number of features the team can have in progress (work in progress or WIP). Features are considered WIP until they pass UAT. Production would be a better target, but saying a feature is WIP until production is a little far fetched if you aren’t practicing “real” continuous delivery. So, for our system, production is considered passing UAT. When the team is under their WIP limit they are free to pull the next feature from highest priority features in the backlog.

This is going to most likely reduce resource utilization, but will increase throughput and improve quality. Managers may take issue at developers not being used at full capacity, but there is a reason for this madness and hopefully I can explain it.

Having features pulled into the pipeline from a prioritized backlog instead planning a sprint allows decisions on what features to be worked to be deferred until the last possible moment. This provides more agility in the flow of work in the pipeline and the product owner is able to respond quickly to optimize the product in production. Isn’t agile what we’re going for?

Pulling work with WIP limits also gives greater risk management. Since batch sizes are smaller, problems will only affect a limited amount of work in progress and risk can be mitigated as new work is introduced in the pipeline. This is especially true if we increase the number of production releases. If every change results in a production release we don’t have to worry about the branch and hotfix dance.

Focusing on a limited amount of work improves the speed at which work is done. There is no context switching and there is a single focus on moving a one or limited amount work items through the system at one time. This increases the flow of work even though there may be times when a developer is idle.

The truth is the system can only flow as fast as its slowest link, the constraint. Having one part of the system run at full capacity and overload the constraint introduces a lot of potential waste in the system. If the idle parts of the system worked to help the bottlenecked part of the system, the entire system improves. So having a full system focus is important.

On my current team, we have constraints that determine how quickly we can turn around a feature. Currently, code review and QA are constraints. QA is the largest constraint that limits faster deployment cycles, but more on that later. To optimize our constraints we could follow the five basic steps outlined in the Theory of Constraints (TOC) from the book The Goal:

Identify the constraint(s) – in this instance it’s code review and manual testing
Exploit the constraint to maximize productivity – focus on improvements on the constraint
Subordinate all other steps or processes to speed up or reduce capacity of the constraint – no new work may enter as WIP until the constraint has WIP available
Elevate the constraint – prioritize work that helps remove the constraint.
Repeat

To help with the code review constraint the plan is to have developers do code reviews any time WIP stops the movement of work. With this time developers can dig in and do more thoughtful code reviews and look for ways to refactor and improve the code base. Since we are touching code, why not make recommendations to make the code better. So, we can improve what an acceptable pull request is: good syntax, style, logic, tests… everything we can think of to make the codebase more maintainable and easy to validate.

To remove the QA constraint, the plan focuses on developers creating automated tests to help lessen the work that QA has to do. The reason we don’t first focus on optimizing QA processes directly is because focusing on simply optimizing QA processes would actually increases the capacity for QA without increasing the speed at which we can flow work to production. We don’t want to increase the number of features that QA can handle because it is important to take the proper time in testing. What we want to do is remove manual regression checks for QA. Exploiting QA for us means increasing QAs effectiveness freeing up time to do actual testing instead of just following a regression script. Having developers automate regression opens us up to deliver new features to production faster because automation runs these test much faster than QA. QA can focus on what they do best, testing and not running mundane scripted checks. Trick here is how do we convince developers to write automated tests without causing a revolt.

In summary, we would have to wait for a manual regression test cycle to occur and couldn’t introduce new work because it would invalidate the regression test. With automation handling +80% of regression QA can move faster, actually test more, and we can not only increase throughput through the entire system, but the overall quality of the product is also increased.

Monitoring Delivery Pipeline

We track work through the delivery pipeline as features. A feature in this sense is any change, new function, change existing function, or to fix a defect. Features are requested on features kept in a central database. We monitor the delivery pipeline by measuring:

Inventory
Lead Time
Quantity: Unit of Production
Production Rate

Inventory

Inventory (V) is any work that has not been delivered to the customer. This is the same as work in progress (WIP). This counts all work from the backlog to a release awaiting production deployment. Whenever there is undelivered work and we have to cancel the work for some reason, we considered it an Operational Expense. Canceled work won’t be delivered to production because of defect, incorrect specs, the customer pivoted or otherwise doesn’t want it. Cancelled work is wasted effort and in some cases can also cause expensive un-budgeted rework. In traditional cost accounting inventory is seen as an asset, but in TOC it is a potential Operational Expense if it is not eventually delivered to customer so turning inventory as fast as possible without injecting defects is a goal.

Quantity

Quantity (Q) is the total number of units that have moved through our delivery pipeline. Our unit of production is a feature. When a feature is deployed to production we can increase quantity by one unit. A feature is still considered inventory until it has been delivered to the customer in production. If a customer decides they don’t want the feature or some other reason to stop the deployment of the feature, it is counted as an Operational Expense and not quantity.

Flow Time

Flow time (FT) is the time it takes to move a feature, one unit, from submission to the backlog to deployed to a customer in production.

Production Rate

Production rate (PR) is the number of units delivered during a time period. This is the same as throughput. If we we deliver 3 features to production in a month our production rate is 3 features per month.

Optimize Delivery Pipeline for Flow Time

We should strive to optimize the delivery pipeline for flow time instead of production rate or throughput. The Theory Of Constraints – Productivity Metrics in Software Development posted on lostechies.com explains this well.

Let’s say our current flow time (FT) is 1 unit (Q) in a week or a production rate (PR) of 4 Q per month. If we optimize FT to 1 Q in 3 days, we will see a jump in PR to 6.67 Q per month or a 59% increase.

If we focus on optimizing PR, we may still see improvement in FT, but it can also lead to only an increase in inventory as WIP increases. The PR optimization may increase Q that is undeliverable because of some bottleneck in our system so the Q sits as inventory, ironically in a queue. The longer a feature sits in inventory the more it costs to move it through the pipeline and address any issues found in later stages of the pipeline. So, old inventory can also cause delay down stream as the team must take time to ramp up to address issues after they have moved on to another task.

So, to make sure we are optimizing for FT we focus on reducing waste or inventory in the pipeline by reducing WIP. The delivery team keeps a single purposed focused on one unit or a limited amount of work in progress to deliver what the customer needs right now, based on priority in the backlog. Reducing inventory reduces Operation Expense. (Excuse me if I am allowing some lean thinking into this TOC explanation)

Metrics

Investment

Investment (I) is the total cost invested in the pipeline. In our case we will count this as time invested. We can sum the time invested on each unit in inventory in the pipeline to see how much is invested in WIP. We could count hours in timecards to determine this, but time cards are an evil construct. If we are good about moving cards, or even automated movement of cards based on some event (branch created, PR submitted, PR approved…), we could assign the time a card sits in some state to a standard investment amount in the time it sat. I’m still pondering this, but I feel like time investment based on card movement is way better than logging time.

Operating Expense

Operating expense (OE) is the cost of taking an idea and developing it to a deliverable. This is not to be confused with operational expense which is a loss in inventory or loss in investment. Any expense, variable or fixed, that is a cost to deliver a unit is considered OE. We will just use salaries of not only developers, but BA, QA, IT as our OE. Not sure how we will divide up our fixed salaries, maybe a function that includes time and investment. Investment would be a fraction of OE because all of a developers time is not invested in delivering features (still learning).

Throughput

Throughput (T) in this sense is the amount earned per unit. Traditionally, this is that same as production rate as explained earlier, but in terms of cost, we calculate throughput by taking the amount earned on production rate, features delivered to production, minus the cost of delivering the features or the investment.

Throughput Accounting

To maximize ROI and net profit (NP) we need to increase T while decreasing I and OE.

NP = (T – OE)

ROI = NP/ I

Average Cost Per Feature

Average cost per feature (ACPF) is the average amount spent in the pipeline to create a feature.

ACPF = OE/Q

There are more metrics that we can gather, monitor, and analyze; but we will keep it simple for now and learn to crawl first.

Average Lead Time Per Feature

The average time it takes to move a feature from the backlog to production. We also calculate the standard deviation to get a sense on how varying work sizes in the pipeline affects lead time.

Bonus: Estimating Becomes Easier

When we begin to monitor our pipeline with these metrics estimating becomes simpler. Instead of estimating based on time we switch to estimating based on size of feature. Since we are tracking work, we have a history to base our future size estimates on.

Issues in Transformation

Our current Q is a release, a group of features that have been grouped together for a deployment. We will build up an inventory of features over a month at times before they are delivered to production. This causes an increase in inventory. It would be better to use a feature instead of a release as our Q. When a feature is ready, deliver it. This reduces inventory and increase the speed at which we get feedback.

To change our unit, Q, to feature we have to attack our largest constraint, QA. Currently, we have to sit on features or build up inventory to get enough to justify a QA test cycle. We don’t want to force a two week regression on one feature that took a couple days to complete. So, reducing the test cycle is paramount with this approach.

References

The Goal: A Process of Ongoing Improvement, by Eliyahu M. Goldratt

The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win, by Gene Kim, Kevin Behr, and George Spafford.
https://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/B00VATFAMI/ref=sr_1_1?s=books&ie=UTF8&qid=1509016772&sr=1-1&keywords=the+phoenix+project

The Metrics in TOC: Productivity Metrics In Software Development, by erick Bailey, https://lostechies.com/wp-content/uploads/2011/04/TheoryOfConstraints-ProductivityMetricsInSoftwareDevelopment.pdf
Agile Management for Software Engineering, by David J. Anderson
Reaching The Goal, by John Arthur Ricketts
Applying Theory of Constraints to Manage Bottlenecks, by Kamran Khan, http://www.isixsigma.com/methodology/theory-of-constraints/applying-theory-constraints-manage-bottlenecks/
http://chronologist.com/blog/2012-07-27/theory-of-constraints-and-software-engineering/
http://chronologist.com/blog/2012-10-04/buffer-management-and-risk-management-in-TOC/
https://www.timecockpit.com/blog/2013/08/30/Project-Reporting-in-Agile-Projects

November 27, 2015

GTP for BDD

Graphical Test Plan

I read a little about graphical test planning created by Hardeep Sharma and championed by David Bradley, both from Citrix. It’s a novel idea and sort of similar to the mind map test planning I have played around with. The difference is your not capturing features or various heuristics and test strategies in a mind map, you are mapping expected behavior only. Then you derive a test plan from the graphical understanding of the expected behavior of the system. I don’t know a lot about GTP, so this is a very watered down explanation. I won’t attempt to explain it, but you can read all about it:

Plan Business Driven Development with GTP

What interested me was the fact that I could abstract how we currently spec features into a GTP type model. I know the point of GTP is not to model features, but our specs model behavior and they happen to be captured in feature files. Its classic Behavior Driven Development (BDD) with Gherkin. We have a feature that defines some aspect of value that the system is expected to provide to users. In the feature we have various scenarios that describe the expected behaviors of the feature. Scenarios have steps that define pre-condition, action, and expectations (PAE) or in Gherkin, Given-When-Then (GWT) that define how a user would execute the scenario. We also have feature backgrounds which is a feature wide pre-condition that is shared by all scenarios in the feature.

I said we use Gherkin, but our new test runner transcends just GWT. We can define PAE in plain English without the GWT constraints, we can select the terms to describe PAE instead of being forced to use GWT which sometimes causes us to jump through hoops to force the GWT wording to sound correct.

GTP Diagram

If we applied something like GTP we would model the scenarios, but there would be more hierarchy before we define the executable scenarios. We currently use tagging to group similar scenarios that exercise a specific subset of a feature’s scenarios. This allows us to provide faster feedback by running checks for just a subset instead of the entire feature when we are only concerned with changes to the subset.

In a GTP’ish model the left most portion of the diagram would hold generalized behavior specs, similar to how we use tagging, and as we go to the right the behavior becomes more granular until we hit a demarcation point for executable scenarios that can then be expressed in a linked test case diagram (TCD). In the GTP there are ways to capture meta data like related requirement/ticket ID for traceability back to requirements. Also, meta for demarcation point (can’t think of better name) to link to the TCD or feature file that further defines it.

Test Case Diagram

The test case diagram would define various scenarios that define the behavior of the demarcation points in the GTP. The TCD diagram would also include background preconditions and the steps to execute the scenario. At this point it feels like this is an extra step. We have to write the TCD in a feature file so diagramming it is creating a redundant document that has to be maintained.

In the TCD there are shapes for behavior, preconditions, steps, and expectations. I think there should be additional shapes or meta to express tags because this is important in how we categorize and control running of scenarios. It may help if there is also meta to link back to the GTP that the TCD is derived from so we can flow back and forth between the diagrams. Meta in the TCD is important because it gives us the ability extract understanding outside of just the test plan and design. We could have shapes, meta descriptions and links to

execute automated checks
open a manual exploratory test tool
view current test state (pass/fail)
view historical data (how many times has this step failed, when was the last failure of this scenario…)
view flake analysis or score
view delivery pipeline related to an execution
view team members responsible for plan, develop, test and release
view related requirement or ticket
much more…

Since we also define manual tests by just tagging features or scenarios with a manual tag or creating exploratory test based feature files, we could do this for both automated checks and manual tests.

GTP-BDD Binding

To get rid of the TCD redundancy we could generate the feature file from the diagram or vice-versa. Being able to bind GTP to BDD would make GTP more valuable to me.

We would need an abstract object graph that could be used to generate both the diagram and the feature file (Excel spread sheet, HTML page or whatever else). We are almost here, we have a tool that can generate feature files from persisted objects and vice versa. We would just have to figure out how to generate the diagram and express it as an interactive UI and not just a static picture.

What we have been struggling with is the ability to manually edit feature files and keep that in sync with the persisted objects. With a centralized UI this is easy because everyone uses the UI to update the objects. When people are updating features files from a source code repository we have to worry about merge conflicts (yuck) and if we consider the feature file or the persisted object as the source of truth. So, we may have to reduce flexibility and force everyone to use the UI only. Everyone would have to have discipline and not touch the feature files even though we have nice tools built into our IDE to help write and manage them. The tool would have to detect when someone has violated the policy and so on…I digress.

Conclusion

With a graphical UI modeled on GTP/TCD to manage BDD we can provide an arguably simpler way to visualize tests and provide the ability to drill down to see different aspects of test plans and designs and their related current and historical execution. With 2-way binding from diagram to feature file we have a new way to manage our executable specifications. This model could provide a powerful tool to not only aide test planning, but test management as a whole. The end result would hopefully be a better understanding for the team, increased flow in delivery pipeline, enhanced feedback, and more value to the customer and the business.

Now lets ask Google if something like this already exists so I don’t have to add it to my ever increasing backlog of things I want to build. Thanks to Hardeep Sharma, David Bradley, and Citrix for sharing GTP.

November 12, 2015

Monitoring Change Tickets in Delivery Pipelines

DevOps sounds cool like some covert special operations IT combat team, but it is missing the boat in many implementations because it only focuses on the relationship between Dev and Ops and is usually only championed by Ops. The name alienates important contributors on the software delivery team. The team is responsible for software delivery including analysis, design, development, build, test, deploy, monitoring, and support. The entire team needs to be included in DevOps and needs visibility in to delivery pipelines from end-to-end. This is an unrelated rant, but this lead me to thinking about how a delivery team can monitor changes in delivery pipelines.

Monitor Change

I believe it is important that the entire team be able to monitor changes as they flow through delivery pipelines.. There are ticket management systems that help capture some of the various stages that a change goes through, but its mostly various project management related workflow stages and they have to be changed manually. I’d like a way to automatically monitor a change as if flows from change request all the way to production and monitor actions that take place outside of the ticket or project management system.

Normally, change is captured in some type of ticket maybe in a project management system or bug database (e.g. Jira, Bugzilla). We should be able to track various activities that take place as tickets make their way to production. We need a way trace various actions on a change request back to the change request ticket. I’d like a system where activities involved in getting a ticket to production automatically generate events that are related to ticket numbers and stored in a central repository.

If a ticket is created in Jira, a ticket created event is created. A developer logs time on a ticket, a time logged activity event is created that links back to the time log or maybe holds data from the time log for the ticket number.

When an automated build that includes the ticket happens, then a build stated activity event is created with the build data is triggered. As various jobs and tasks happen in the automated build a build changed activity event is triggered with log data for the activity. When the build completes a build finished activity event is triggered. There may be more than one ticket involved in a build so there would be multiple events with similar data captured, but hopefully changes are small and constrained to one or a few tickets… that’s the goal right, small batches failing fast and early.

We may want to capture the build events and include every ticket involved instead of relating the event directly to the ticket, not sure; I am brainstorming here. The point is I want full traceability across my software delivery pipelines from change request to production and I’d like these events stored in a distributed event store that I can project reports from. Does this already exists? Who knows, but I felt like thinking about it a little before I search for it.

Ticket Events

Ticket Created Event
Ticket Activity Event
Ticket Completed Event

A ticket event will always include the ticket number and a date time stamp for the event, think Event Sourcing. Ticket created occurs after the ticket is created in the ticket system. Ticket completed occurs once the ticket is closed in the ticket system. The ticket activities are captured based on the activities that are configured in the event system.

Ticket Activity Events

A ticket activity is an action that occurs on a change request ticket as it makes its way to production. Ticket activities will have an event for started, changed, and finished. Ticket activity events can include relevant data associated with the event for the particular type of activity. There may be other statuses included in each of these ticket activity events. For example a finish event could include a status of error or failed to indicate that the activity finished but it had an error or failed.

{Ticket Activity} Started
{Ticket Activity} Changed
{Ticket Activity} Finished

Deploy Started that has deploy log, Build Finished that has the build log, Test Changed that has new test results from an ongoing test run.

Maybe this is overkill? Maybe this should be simplified where we only need one activity event per activity and it includes data for started, changed, finished, and other statuses like error and fail. I guess it depends on if we want to stream activity event statuses or ship them in bulk when an activity completes; again I’m brainstorming.

Activities

Every ticket won’t have ticket activity events triggered for every activity that the system can capture. Tickets may not include every event that can occur on a ticket. Activity events are triggered on a ticket when the ticket matches the scope of the activity. Scope is determined by the delivery team.

Below are some of the types of activity events that I could see modeling for events on my project, but there can be different types depending on the team. So, ticket activity events have to be configurable. Every team has to be able to add and remove the types of ticket activity events they want to capture.

Analysis
1. Business Analysis
2. Design Analysis
  1. User Experience
  2. Architecture
3. Technical Analysis
  1. Development
  2. DBA
  3. Build
  4. Infrastructure
4. Risk Analysis
  1. Quality
  2. Security
  3. Legal
Design
Development
Build
Test
1. Unit
2. Integration
3. End-to-end
4. Performance
5. Scalability
6. Load
7. Stress
8. …
Deploy
Monitor
Maintain

Reporting and Dashboards

Once we have the events captured we can make various projections to create reports and dashboards to monitor and analyze our delivery pipelines. With the ticket event data we can also create reports at other scopes. Say we want to report on a particular sprint or project. With the ticket Id we should be able to gather this and relate other tickets in the same project or sprint. It would take some though as to whether we would want to capture project and sprint in the event data or leave this until the time when we make the actual projection, but with ticket Id we can expand our scope of understanding and traceability.

Conclusion

The main goal with this exploration into my thoughts on a possible application is to explore a way to monitor change as it flows through our delivery pipelines. We need a system that can capture the raw data for ticket create and completed events and all of the configured ticket activity events that occur in between. As I look for this app, I can refer to this to see if it meets what I envisioned or if there may be a need for this.

December 15, 2014

Sev1 Incident

I read a book called the Phoenix Project. A surprisingly good book about a company establishing a DevOps culture. One of the terms in the book that I had no experience with was Sev1 incident. I have since heard it repeated and have come to find out that it is part of a common grading of incident severity. Well, I decided to finally research it about a year after I read the book and put more thought into a formalized incident reporting, triage, mitigation, and postmortem workflow. Which is similar to the thoughts I had on triaging failing automated tests.

Severity Levels

So, first to define the severity levels. Fortunately, David Lutz has a good break down on his blog – http://dlutzy.wordpress.com/2013/10/13/incident-severity-sev1-sev2-sev3-sev4-sev5/.

Severity Levels

Sev1 Complete outage
Sev2 Major functionality broken and revenue affected
Sev3 Minor problem, bug
Sev4 Redundant component failure
Sev5 False alarm or alert for something you can’t fix

Identify Levels

With that I need to define how to identify the levels. IBM has a break down that simplifies it on their Java SDK site – http://publib.boulder.ibm.com/infocenter/javasdk/v1r4m2/index.jsp?topic=%2Fcom.ibm.java.doc.diagnostics.142%2Fhtml%2Fbugseverity.html:

Sev 1

In development: You cannot continue development.
In service: Customers cannot use your product.

Sev 2

In development: Major delays exist in your development.
In service: Users cannot access a major function of your product.

Sev 3

In development: Major delays exist in your development, but you have temporary workarounds, or can continue to work on other parts of your project.
In service: Users cannot access minor functions of your product.

Sev 4

In development: Minor delays and irritations exist, but good workarounds are available.
In service: Minor functions are affected or unavailable, but good workarounds are available.

Severity Analysis

Now that we have more guidance on identifying the severity of an incident, how should it be reported? I believe that anyone can report an incident, bug, something not working, but it is up to an analyst to determine the severity level of the report.

So, the first step is for the person who discovered the issue to open a ticket. Of course if it is a customer and we don’t have a self-support system, they will probably report it to an employee in support or sales and the employee will create the ticket for the customer. All tickets should be auto routed to the analyst team where it is assigned to an analyst to triage. The analyst will assign the severity level and assign to engineering support where the ticket will be reviewed, discussed and prioritized. The analyst in this instance can be a QA, BA, even a developer assigned to the task, but the point is to have a dedicated team/person responsible.

During the analysis, a time line of the the failure should be established. What led up to the failure, the changes, actions taken, and people involved should all be laid out in chronological order. Also, during triage, a description of how to recreate the failure should be written if possible. The goal is to collect as much information about the failure as possible in one place so that the team can review and help investigate. Depending on the Sev level various degrees of details and speed in which feedback is given should be established.

Conclusion

This is turning out to be a lot deeper than I care to dive into right now, but this gives me food for thought. My take aways so far are to

formalize severity levels
define how to identify the levels
assign someone to do the analysis and assign the levels

May 26, 2014

What Makes a Good Candidate for Test Automation?

Writing large UI based functional tests can be expensive in terms of money and time. It is sometimes hard to know where to focus your test budget. New features are good candidates, especially the most common successful and exceptional paths through the feature. But, when you have a monster legacy application with little to no coverage, where to get the biggest bang for the buck can be hard to ascertain.

Bugs, Defects, Issues…It Doesn’t Work

I believe bugs provide a good candidate for automation, especially if regression is a problem for you. Even if regression is not an issue, its always good to protect against regressions. So, automating bugs are kind of a win-win in terms of risk assessment. Hopefully, when a bug is found whoever finds the bug or whoever adds it to the bug database provides reproduction steps. If the steps are a good candidate for automation, automate it.

Analyzing Bugs

What makes a bug a good candidate for test automation? When analyzing bugs for automated testing I like to evaluate on 4 basic criteria. In descending order of precedence:

The steps are easy to model in the test framework.
The steps are maintainable as an automated test.
The bug was found before.
The bug caused a lot of pain to users or the company.

It is just common sense that “bug caused a lot of pain” is the top candidate. If a bug caused a lot of pain, you don’t want to repeat it, unless you like pain. Yet, if the painful bug is a maintenance nightmare as an automated test, the steps are hard to model, and the bug wasn’t found before you may want to just mark it for manual regression. If your test matches 2 or more of the criteria I’d say it is a high priority candidate for test automation.

Conclusion

These are just my opinion and there is no study to prove any of it. I know this has been thought of and pondered, maybe even researched by someone. If you know where I can find some good discussions on this topic or if you want to start one, please let me know.

May 22, 2014

How Much Does Automated Test Maintenance Cost?

I saw this question on a forum and it made me pause for a second to think about it. The quick answer is it varies. The sarcastic answer is it costs as much as you spend on it, or how about, it cost as much as you didn’t spend on creating a maintainable automation project.

I have only been involved in 2 other test automation projects prior to my current position. In both I also had feature development responsibility. On one of the projects, comparing against time developing features, I spent about 10-15% of my time maintaining tests and about 25% writing them. So, that is about 30-40% of my total test time on maintenance. Based on my knowledge today, some of my past tests weren’t that good so maybe the numbers should have been higher or lower. On the other project, test maintenance was closer to 50% and that was because of poor tool choice. I can state the numbers because I tracked my time spent. I could not use these as benchmarks to estimate maintenance cost on my current project or any other unless the context was very similar and I can easily draw the comparison.

I have seen where someone might say “it’s typically between this and that percentage of development cost,” or something similar. Trying to quantify maintenance costs is hard, very hard and it depends on the context. You can try to estimate based on someone else’s guess of a rough percentage and hope it pans out, but in the end it is dependent on execution and environment. An application that changes often vs. one that rarely changes, poorly written automated tests, bad choice of automation framework, skill of the automated tester…there is a lot that can change cost from project to project. I am curious if someone has a formula to calculate an estimate across all projects, but having an insane focus on the maintainability of your automated test suites can significantly reduce costs in the long run. So a better focus, IMHO, is on getting the best test architecture, tools, framework, people and make maintainability a high priority goal. Also properly tracking maintenance in the project management or bug tracking system can provide a more valuable measure of cost across the life of a project. If you properly track maintenance cost (time), you get a benchmark that is customized for your context. Trying to calculate cost up front with nothing to base the calculations on but a wild uneducated guess can lead to a false sense of security.

So, if you are trying to plan a new automation project and you ask me about cost the answer is, “The cost of having automated tests…priceless. The cost of maintaining automated tests…I have no idea.”

June 12, 2013

Specifications?

This is more of a “what I’m thinking” post than something full of good information. I am trying to frame my idea of feature specifications and I need to answer some questions. This seemed like as good a place as any to store my questions, but you are more than welcome to provide answers and opinions. Continue reading →

What’s a North Star Metric Anyway?

Why Bother With a North Star Metric?

1. Keep Your Eye on the Prize

2. Get Everyone on the Same Page

3. Build for the Long Haul

How to Pick a North Star Metric That Works

1. Understand Your Core Value

2. Connect It to Growth

3. Make It Measurable

4. Don’t Ignore Other Metrics

Real-World North Star Metrics

Picking the Wrong North Star Metric? Here’s What Happens

1. Chasing Vanity Metrics

2. Making It Too Complex

3. Ignoring User Experience

4. Choosing a Short-Term Fix

How to Use a North Star Metric to Actually Get Results

1. Let It Guide Your Decisions

2. Track It Like a Hawk

3. Keep Everyone in the Loop

4. Be Willing to Adapt

Wrapping Up

Bottom Line

Share this:

Monitoring Delivery Pipeline

Inventory

Quantity

Flow Time

Production Rate

Optimize Delivery Pipeline for Flow Time

Metrics

Investment

Operating Expense

Throughput

Throughput Accounting

Average Cost Per Feature

Average Lead Time Per Feature

Bonus: Estimating Becomes Easier

Issues in Transformation

References

Share this:

Graphical Test Plan

Plan Business Driven Development with GTP

GTP Diagram

Test Case Diagram

GTP-BDD Binding

Conclusion

Share this:

Monitor Change

Ticket Events

Ticket Activity Events

Activities

Reporting and Dashboards

Conclusion

Share this:

Severity Levels

Identify Levels

Severity Analysis

Conclusion

Share this:

Bugs, Defects, Issues…It Doesn’t Work

Analyzing Bugs

Conclusion

Share this:

Share this:

Share this: