
Metrics that Matter – Cisco Blogs

In large, complex organizations, sometimes the only metric that seems to matter is mean time to innocence (MTTI). When a system breaks down, MTTI is the tongue-in-cheek measure of how long it takes to prove that the breakdown was not your fault. Somehow, MTTI never makes it into the slide deck for the quarterly board meeting.

With the explosion of tools available today (observability platforms for gathering system telemetry, CI/CD pipelines with test suite timings and application build times, and real user monitoring to track performance for the end user), organizations are blessed with a wealth of metrics. And cursed with a lot of noise.

Every team has its own set of metrics. While every metric might matter to that team, only a few of those metrics may have significant value to other teams and the organization at large. We’re left with two challenges:

  1. Metrics within a team are often siloed. Nobody outside the team has access to them or even knows that they exist.
  2. Even if we can break down the silos, it’s unclear which metrics actually matter.

Breaking down silos is a complex topic for another post. In this one, we’ll focus on the easier problem: highlighting the metrics that matter. What metrics does a technology organization need to ensure that, in the big picture, things are working well? Are we good to push that change, or might the update make things worse?

Availability Metrics

Humans like big, simple metrics: the Dow Jones, heartbeats per minute, number of shoulder massages you get per week. To get the big picture in IT, we also have simple, easily understood metrics.


Uptime

Expressed as a percentage of availability, uptime is the simplest metric of all. We’d all guess that anything less than 99% is considered poor. But chasing those last few nines can get expensive. Complex systems designed to avoid failure can cause failure in their own right, and the cost of implementing 99.999% availability (“five nines”) may not be worth it.
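To make the cost of each extra nine concrete, here is a minimal Python sketch that converts an availability target into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; they do not come from any particular monitoring tool.

```python
# Illustrative only: how much downtime per year does an availability target allow?
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at this availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} minutes per year")
```

Going from two nines to five nines shrinks the budget from about 5,256 minutes (roughly 3.7 days) to a little over 5 minutes per year, which is why those last few nines get expensive.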

Mean Time Between Failures (MTBF)

MTBF is the average time between failures in a system. The beauty of MTBF is that you can actually watch your boss start to twitch as you approach MTBF: Will the system fail before the MTBF? After? Perhaps it’s less stressful to throw the breakers intentionally, just to enjoy another 87 days!

Mean Time To Recovery (MTTR)

MTTR is the average time to fix a failure and can be thought of as the flip side of MTBF. Both Martin Fowler and Jez Humble have quoted the phrase, “If it hurts, do it more often,” and that principle seems like it could apply to MTTR as well. Rather than avoiding changes, and generally treating your systems with kid gloves to try to keep MTBF high, why not get better at recovery? Work to reduce your MTTR. Paradoxically, you can enjoy more uptime by caring about it less.
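One way to see why faster recovery pays off is to compute both numbers from the same outage log. The sketch below assumes a simplified model: a fixed observation window, a list of outage durations, MTBF taken as uptime divided by the number of failures, and availability taken as MTBF / (MTBF + MTTR). The names `ReliabilitySummary` and `summarize` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class ReliabilitySummary:
    mtbf_hours: float
    mttr_hours: float
    availability: float  # fraction between 0 and 1


def summarize(window_hours: float, outage_hours: list[float]) -> ReliabilitySummary:
    """Derive MTBF, MTTR, and availability from outage durations in a window."""
    if not outage_hours:
        raise ValueError("need at least one outage to compute MTBF and MTTR")
    total_down = sum(outage_hours)
    failures = len(outage_hours)
    mttr = total_down / failures                   # average time to recover
    mtbf = (window_hours - total_down) / failures  # average uptime per failure
    return ReliabilitySummary(mtbf, mttr, mtbf / (mtbf + mttr))


# A 30-day window (720 hours) with three outages.
before = summarize(720, [2.0, 1.0, 3.0])
# Same number of failures, but each one is fixed twice as fast.
after = summarize(720, [1.0, 0.5, 1.5])

print(f"before: MTBF={before.mtbf_hours}h MTTR={before.mttr_hours}h "
      f"availability={before.availability:.4%}")
print(f"after:  MTBF={after.mtbf_hours}h MTTR={after.mttr_hours}h "
      f"availability={after.availability:.4%}")
```

In this toy data the failure count never changes, yet halving the recovery time alone raises availability.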

Development Metrics

For years, an important improvement metric used by developers was Product Owner Glares Per Day. Development in the twenty-first century has given us new ways to understand developer productivity, and a growing body of research points to the metrics we need to focus on.

Deployment Frequency

The excellent work of Nicole Forsgren, Jez Humble, and Gene Kim in Accelerate demonstrates that teams that can deploy frequently experience fewer change failures than teams that deploy infrequently. It would be a brave move to try to game this metric by deploying every hour from your CI/CD pipeline. However, capturing and understanding this metric will help your team investigate its impediments.

Cycle Time

Cycle time is measured from the time a ticket is created to the healthy deployment of the resulting fix in production. If you needed to fix an HTML tag, how long would it take to get that single change deployed? If you have to start calling meetings about the deployment outages, you know that the value of that metric, in your organization, is too high.

Change Failure Rate

Of all your organization’s deployments, how many need to be rolled back or followed up with an emergency bugfix? This is your change failure rate, and it’s an excellent metric to try to improve. Improving your change failure rate helps developers to proceed more confidently. This will improve the deployment frequency in turn.
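Both deployment frequency and change failure rate can be derived from a plain deployment log. The following Python sketch assumes a hypothetical `Deployment` record with a date and a flag for "rolled back or needed an emergency fix"; a real pipeline would pull these from its CI/CD or incident tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deployment:
    day: date
    needed_fix: bool  # True if rolled back or followed by an emergency bugfix


def change_failure_rate(deploys: list[Deployment]) -> float:
    """Fraction of deployments that had to be rolled back or hot-fixed."""
    if not deploys:
        raise ValueError("need at least one deployment")
    return sum(d.needed_fix for d in deploys) / len(deploys)


def deployments_per_week(deploys: list[Deployment]) -> float:
    """Average weekly deployments over the span from first to last deploy."""
    if not deploys:
        raise ValueError("need at least one deployment")
    first = min(d.day for d in deploys)
    last = max(d.day for d in deploys)
    span_days = (last - first).days + 1  # count both endpoints
    return len(deploys) * 7 / span_days


history = [
    Deployment(date(2024, 3, 1), False),
    Deployment(date(2024, 3, 5), True),
    Deployment(date(2024, 3, 9), False),
    Deployment(date(2024, 3, 14), False),
]

print(f"change failure rate: {change_failure_rate(history):.0%}")
print(f"deployments per week: {deployments_per_week(history):.1f}")
```

Tracking the two side by side makes it harder to game one at the expense of the other.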

Error Rate

How many errors per hour does your code create at runtime? Is that better or worse since the last deployment? This is a great metric to expose to stakeholders: Since many demos only show the UI of an application, it’s helpful to see what’s blowing up behind the scenes.

Platform Team Metrics

Metrics often originate from the platform team because metrics help raise the maturity level of their team and other teams. So, which metrics are most helpful? While uptime and error rate matter here too, monthly active users and latency are also important.

Monthly Active Users

Being able to plan capacity for infrastructure is a gift. Monthly active users is the metric that can make this happen. Developers need to understand the load their code may have at runtime, and the marketing team will be incredibly grateful for these metrics.


Latency

Just like ordering coffee at Starbucks, sometimes you have to wait a little while. The more you value your coffee, the longer you might be willing to wait. But your patience has limits.

For application requests, latency can destroy the end-user experience. What’s worse than latency is unpredictable latency: If a request takes 100ms one time but 30s another time, then the impact on systems that create the request will be multiplied.
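Unpredictable latency tends to hide behind averages, which is why percentiles are often reported instead. The sketch below uses a simple nearest-rank percentile over made-up samples that mirror the 100ms versus 30s example; the `percentile` helper is written for this illustration and is not taken from any monitoring library.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# 98 fast requests at 100 ms and 2 slow ones at 30 s (30,000 ms).
latencies_ms = [100] * 98 + [30_000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean: {mean_ms} ms")                        # 698.0 ms
print(f"p50:  {percentile(latencies_ms, 50)} ms")   # 100 ms
print(f"p99:  {percentile(latencies_ms, 99)} ms")   # 30000 ms
```

The mean suggests every request is a little slow, while the p50 and p99 together show the real picture: most requests are fast and a few are painfully slow.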

UX Metrics

Senior and non-technical leadership tend to focus on what they can see in demos. They can be prone to nitpicking the frontend because that’s what’s visible to them and the end users. So, how does a UX team nudge leadership to focus on the achievements of the UX instead of the placement of pixels?

Conversion Rate

The organization always has a goal for the end user: register an account, log in, place an order, buy some coins. It’s important to track these goals and see how users perform. Test different versions of your application with A/B testing. An improvement in conversion rate can mean the difference between profit and loss.

Time on Task

Even if you’re not making an application for employees, the amount of time spent on a task matters. If your users are being distracted by colleagues, kids, or pets, it helps if their interactions with you are as efficient as possible. If your end user can complete an order before they have to help the kids with their homework or get Bob unstuck, that’s one less shopping cart abandoned.

Net Promoter Score (NPS)

NPS comes from asking an incredibly simple question: On a scale of 0 to 10, how likely is it that you would recommend this website (or application or system) to a friend or colleague? Embedding this survey into checkout processes or receipt emails is easy. Given sufficient volume of response, you can work out if a recent change compromised the experience of using a product or service.

If you can compare NPS scores for different versions of your application, then that’s even more helpful. For example, maybe the navigation that the marketing manager insisted on really is less intuitive than the previous version. NPS comparisons can help identify these impacts on the end user.
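The post does not spell out how the score is calculated, so the sketch below assumes the conventional scheme: ratings on a 0-to-10 scale, 9 and 10 counted as promoters, 6 and below counted as detractors, and the score reported as the percentage of promoters minus the percentage of detractors. The survey responses are invented for illustration.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """Percentage of promoters minus percentage of detractors (range -100..100)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical responses collected before and after a navigation redesign.
old_navigation = [10, 9, 9, 8, 7, 6, 10, 9, 3, 8]
new_navigation = [9, 7, 6, 5, 8, 10, 4, 6, 7, 9]

print(f"old navigation NPS: {net_promoter_score(old_navigation)}")
print(f"new navigation NPS: {net_promoter_score(new_navigation)}")
```

With only ten responses per version the difference would not be trustworthy; the comparison becomes meaningful only with enough response volume.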

Security Metrics

Security is a discipline that touches everything and everyone, from the developer inadvertently creating an SQL injection flaw because Jenna can’t let the product launch slip, to Bob allowing the physical pen tester into the data center because they smiled and asked him about his day. Fortunately, several security metrics can help an organization get a handle on threats.

Number of Vulnerabilities

Security teams are used to playing whack-a-mole with vulnerabilities. Vulnerabilities are built into new code, discovered in old code, and sometimes inserted deliberately by unscrupulous developers. Tackling the discovery of vulnerabilities is a great way to show management that the security team is on the job squashing threats. This metric can also show, for example, how pushing the devs to hit that summer deadline caused dozens of vulnerabilities to crop up.

Mean Time To Detect (MTTD)

MTTD measures how long an issue had been in production before it was discovered. An organization should always be striving to improve how it handles security incidents. Detecting an incident is the first priority. The more time an adversary has inside your systems, the harder it will be to say that the incident is closed.

Mean Time To Acknowledge (MTTA)

Sometimes, the smallest signal that something is wrong turns out to be the red-alert indicator that a system has been compromised. MTTA measures the average time between the triggering of an alert and the start of work to address that issue. If a junior team member raises concerns but is told to put those on ice until after the big launch, then MTTA goes up. As MTTA goes up, potential security incidents have more time to escalate.

Mean Time To Contain (MTTC)

MTTC is the average time, per incident, it takes to detect, acknowledge, and resolve a security incident. Ultimately, this is the end-to-end metric for the overall handling of an incident.
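The three security timings can all be read off the same incident timeline. This Python sketch assumes each incident records four timestamps (when the issue began, when it was detected, when work started, and when it was resolved) and treats MTTC as the end-to-end average from beginning to resolution, following the definition above. The `Incident` record and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    began: datetime         # issue entered production
    detected: datetime      # alert fired or issue discovered
    acknowledged: datetime  # someone started working on it
    resolved: datetime      # incident contained and closed


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Average a list of durations."""
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    return mean_delta([i.detected - i.began for i in incidents])


def mtta(incidents: list[Incident]) -> timedelta:
    return mean_delta([i.acknowledged - i.detected for i in incidents])


def mttc(incidents: list[Incident]) -> timedelta:
    return mean_delta([i.resolved - i.began for i in incidents])


incidents = [
    Incident(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 4, 0),
             datetime(2024, 1, 1, 4, 30), datetime(2024, 1, 1, 10, 0)),
    Incident(datetime(2024, 2, 1, 0, 0), datetime(2024, 2, 1, 2, 0),
             datetime(2024, 2, 1, 3, 30), datetime(2024, 2, 1, 6, 0)),
]

print(f"MTTD: {mttd(incidents)}")  # 3:00:00
print(f"MTTA: {mtta(incidents)}")  # 1:00:00
print(f"MTTC: {mttc(incidents)}")  # 8:00:00
```

Because MTTC spans the whole timeline, any improvement in detection or acknowledgement shows up in it as well.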

Sign, Not Noise

Amidst the noise of countless metrics available to teams today, we’ve highlighted specific metrics at different points in the application stack. We’ve looked at availability metrics for the IT team, followed by metrics for the developer, platform, UX, and security teams. Metrics are a fantastic tool for turning chaos into managed systems, but they’re not a free ride.

First, setting up your systems to gather metrics can require a significant amount of work. However, data gathering tools and automation can help free up teams from the task of collecting metrics.

Second, metrics can be gamed, and metrics can be confounded by other metrics. It’s always worth checking out the full story before making business decisions solely based on metrics. Sometimes, the look of rigor in data-driven decision-making is just that.

At the end of the day, the goal for your organization is to track down those metrics that actually matter, and then build processes for illuminating and improving them.



