Data Lord: Metrics

Metrics are, and always have been, a topic at every test conference and in every conversation between QAs anywhere. They are the basis of your reporting, the numbers you care about. The eternal question is always “What metric should I use?”

You get what you measure

A good old saying is that “you get what you measure”. I couldn’t agree more. I’m convinced that you will rarely get what you hope for unless you make some measurements, so people can see the state of progress towards the goal you set.

The funny thing is that you shouldn’t do it primarily for the purpose of control; you should do it to tell a positive story about the great things you and your employees achieve. It is a great motivator to see, in the form of hard factual numbers, that your team is doing well. So if you find the right metrics to measure, you will see people trying to achieve them, and you end up with an additional motivator.

The flip side of the coin is that the wrong metrics can leave you with some really demotivated employees, for example when a metric is unachievable. This is especially true if you use negative reinforcement to reach the goal, such as salary or bonus reductions. You also risk creating a metric which contradicts your current process, which leads to chaos and confusion.

When deciding on a metric, you also have to consider how hard it is to get valid data of high quality. If it requires elaborate manual labour to get the data, you are asking for trouble. It will tire your employees, leading to neglect, data rot and demotivation. Sometimes you do want to trade some manual work for a set of data; you just have to be aware of the cost upfront.

So be very aware of what you measure and how you reinforce it. It can make you or break you.

How I do it

I’ll go through some of the metrics I use and give my interpretation of them as well. I’ll also use the term dimension, a BI term used in data cubes for a way to slice a metric, e.g. Number of Bugs sliced on Status=Active is the same as Active Bugs.
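To make the idea of slicing concrete, here is a minimal sketch in Python. The `Bug` record, its field names and the sample data are invented for illustration; they are not the schema of any real bug tracker or of my warehouse.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    # Hypothetical, simplified bug record: one field per dimension.
    id: int
    status: str
    area: str


def slice_count(bugs, dimension):
    """Count bugs per value of one dimension (an attribute name on Bug)."""
    return Counter(getattr(bug, dimension) for bug in bugs)


# Made-up sample data.
bugs = [
    Bug(1, "Active", "Editor"),
    Bug(2, "Resolved (Fixed)", "Editor"),
    Bug(3, "Active", "Graphics"),
    Bug(4, "Resolved (Duplicate)", "Graphics"),
]

# "Number of Bugs" sliced on Status; picking Status=Active gives "Active Bugs".
by_status = slice_count(bugs, "status")
active_bugs = by_status["Active"]
print(active_bugs)
```

The same function slices on any other dimension, for example `slice_count(bugs, "area")`.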

Number of Bugs

The most intuitive and common metric. Sadly, it also very easily becomes muddied as the product ages, multiple versions are released and features get closed or refactored. You can’t afford to have any discussion about whether a bug is really a bug (this was the case at Unity before we instituted a very strict regime where the only bugs are those positively reproduced by a QA), because any report you make using this number will trigger a debate about the number itself instead of the conclusion you draw from it.

I want to slice this number on a set of dimensions:

Status: Active, Resolved (Fixed), Resolved (Duplicate) etc. This should really be the same set of statuses used in any software project. Active bugs should never be assigned to a developer; resolved bugs should always be assigned to the person who reproduced the bug when creating it, and this person should verify the resolution cause before closing the bug.

Area: Which part of the application do these bugs belong to? Bug clustering is impossible without this information, but it is also very hard to get good data quality on this dimension. There’s a tendency to invent too many areas and to mix platforms and areas, so a firm hand is needed here.

Assigned To: Who has the bug assigned right now? Can be aggregated to team level. Used for resource allocation and occasionally for shaming people into action.

Version: You want to track the version in which the bug was registered. Newer bugs tend to get higher priority when you go through them all in a bug scrub and eventually you close out very old bugs because they never got enough priority.

Milestone: Purely for scheduling purposes, so you can make the burndown charts for release managers to track how the final stabilization is going.

Priority: A schedule-dependent number setting the priority within the current release or milestone. At Unity this is set exclusively by developers because we have no project managers. QA lets go of a report after having reproduced it and converted it to a bug. In other companies I was used to being a member of a triage team consisting of a project manager, a developer and a QA, which attended all triage and scrubbing of bugs.

Severity: In contrast to Priority, this should be a release-agnostic number stating how severe the bug is for customers. Set primarily by QA.

Is Regression: Tracking how many regressions you have in an area, version or milestone is important if you have a codebase with a lot of dependencies and integration points between teams and developers, because it highlights communication and process problems. You might start out with QAs being frustrated about regressions without having any hard data, and then institute a more rigorous process of reproducing the bugs in order to set this flag.

Customer Found: Was the bug found by a customer? With the active community Unity has, I use this to identify clusters of functionality where QAs are finding too few of the bugs compared to how many our customers find. The community finds about 30% of the bugs on our 4.x releases, so we have sufficient data to make this a meaningful statistic. We recently found a cluster where QA simply wasn’t doing a good enough job, and we are now taking active measures to relieve our customers of this pain.

These are just some of the most important dimensions; I have a lot more in the warehouse I built. Combining these numbers and dimensions, I can do a lot of ad-hoc analysis of the situation on the product.
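As one example of such an ad-hoc analysis, the sketch below combines Number of Bugs with the Area and Customer Found dimensions to spot areas where customers find an unusually large share of the bugs. The dictionary keys, the sample data and the 0.5 threshold are all assumptions made up for illustration; pick a threshold that fits your own baseline.

```python
from collections import defaultdict


def customer_found_share(bugs):
    """Return {area: fraction of bugs in that area found by customers}."""
    totals = defaultdict(int)
    customer = defaultdict(int)
    for bug in bugs:
        totals[bug["area"]] += 1
        if bug["customer_found"]:
            customer[bug["area"]] += 1
    return {area: customer[area] / totals[area] for area in totals}


def flag_weak_areas(shares, threshold):
    """Areas where the customer-found share exceeds the chosen threshold."""
    return sorted(area for area, share in shares.items() if share > threshold)


# Made-up sample: 4 bugs per area.
sample = [
    {"area": "Editor", "customer_found": False},
    {"area": "Editor", "customer_found": False},
    {"area": "Editor", "customer_found": True},
    {"area": "Editor", "customer_found": False},
    {"area": "Graphics", "customer_found": True},
    {"area": "Graphics", "customer_found": True},
    {"area": "Graphics", "customer_found": False},
    {"area": "Graphics", "customer_found": True},
]

shares = customer_found_share(sample)
weak = flag_weak_areas(shares, threshold=0.5)  # hypothetical threshold
print(shares, weak)
```

A flagged area is only an indicator: it tells you where to look, not what went wrong.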

Other metrics of interest include the number of times an editor has been started, which I get from Google Analytics. The number of test cases, manual or automated, on an area can help determine some rough coverage. I also look at automated tests which have been disabled due to known issues or which are known not to work on a specific platform, as well as the number of customer-reported incidents we haven’t touched yet, how many are converted to real bugs, and so on.

The list goes on and on, but in the end, I don’t really measure anyone’s performance solely on metrics. They are indicators which have to be interpreted and this interpretation has to be in the context of the current situation in time, organization, goals and management, so you get a balanced picture of the numbers. “Lies, damned lies, and statistics” is something anyone can relate to, but don’t be fooled: Numbers make you credible like nothing else in a world of engineers.

Wednesday, 10 January 2018