Why DORA Metrics Are a System, Not a Checklist

There is a specific way engineering teams tend to fail at DORA metrics.

They look at the four measurements:

Deployment frequency
Lead time for changes
Change failure rate
Mean time to recovery

And treat them as four separate goals.

They:

Assign an owner to each one.
Set improvement targets for each one.
Track progress on each one independently.
Celebrate when individual numbers move in the right direction.

Six months later:

Deployment frequency has improved significantly.
Lead time has come down.
Change failure rate has gotten worse.
Mean time to recovery has gotten worse.

The team has two metrics that look better and two that look worse, and nobody is sure whether things have improved overall.

This is what happens when DORA metrics are treated as a checklist rather than as a system.

What Makes Them a System

The DORA metrics are not independent measurements.

They are interconnected signals about a single underlying thing:

How well a software delivery system converts developer work into reliable production outcomes.

The relationships between them are structural.

Improving one metric through an approach that ignores the others almost always creates pressure on the others in ways that are predictable in hindsight and invisible in the moment.

Deployment Frequency and Change Failure Rate

Deployment frequency and change failure rate have the most direct relationship.

Deploying more often creates more opportunities for regressions to reach production.

When teams improve deployment frequency without corresponding improvements to testing infrastructure, change failure rate rises.

The pipeline is faster. The software arriving through it is less reliably validated. More failures reach production.

Lead Time and Change Failure Rate

Lead time and change failure rate have a similar relationship.

When lead time is reduced by removing friction from the review and merge process without asking why that friction existed, the validation that the friction provided stops happening.

Code moves faster. Failures that were being caught in review reach production instead.

Mean Time to Recovery and Reliability

Mean time to recovery and reliability have a reinforcing relationship.

Teams that invest in observability infrastructure recover from failures faster.

Faster recovery reduces the cost of individual failures.

Lower cost per failure changes how teams think about acceptable change failure rate.

The system develops a tolerance for failures that might otherwise be prevented.

None of these relationships make it impossible to improve DORA metrics sustainably.

They make it impossible to improve them sustainably by treating each metric as an isolated optimization target.

The Correct Reading of Each Metric

Reading DORA metrics as a system means understanding what each one is actually telling you rather than what the number appears to say on its own.

Deployment Frequency

Deployment frequency is a signal about delivery cadence.

High deployment frequency indicates that the team has removed barriers between completed work and production delivery.

But the signal only means what it appears to mean when change failure rate is also healthy.

High deployment frequency with high change failure rate means the team is delivering failures frequently, not value frequently.

Lead Time for Changes

Lead time for changes is a signal about flow efficiency.

Short lead time indicates that work moves smoothly from development through review, testing, and deployment without accumulating in queues.

But short lead time achieved by skipping validation steps is not flow efficiency.

It is validation debt that will appear in change failure rate and mean time to recovery.

Change Failure Rate

Change failure rate is the signal most directly connected to engineering quality practices.

Low change failure rate indicates that the validation infrastructure is catching behavioral regressions before they reach production.

It is the metric that testing investment most directly influences and the metric that reveals most clearly whether improvements to deployment frequency and lead time are genuine or illusory.

Mean Time to Recovery

Mean time to recovery is a signal about organizational resilience.

Fast recovery indicates that the team can identify failures quickly and restore service efficiently.

It is influenced by:

Observability infrastructure
Incident response practices
Deployment architecture

It does not substitute for low change failure rate.

Recovering quickly from failures that should not have happened is not the same as not having those failures.
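
One way to keep these four readings together is to compute them from the same deployment records in a single pass. The sketch below is a minimal illustration, not a standard implementation: the `Deployment` record, its field names, and the `dora_snapshot` function are all hypothetical, and it assumes each deployment carries its own commit, deploy, and restore timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; real data will come from your own tooling.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_snapshot(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four delivery metrics together over one observation window."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.failed]
    # Only failures with a recorded restore time contribute to recovery time.
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_week": len(deployments) / (window_days / 7),
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Illustrative data only.
t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=9)),
]
snapshot = dora_snapshot(sample, window_days=7)
```

Returning all four values in one dictionary is a deliberate choice here: it makes it harder to report deployment frequency without the change failure rate that gives it meaning.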

Reliability

Reliability, the fifth metric in the current DORA framework, is the cumulative signal.

It captures whether the system as a whole is maintaining stability under the pace of change.

A team can have individually acceptable scores on the other four metrics and still have degrading reliability if the aggregate pace of change consistently exceeds what the validation and recovery infrastructure can absorb.

The System Behavior That Produces Sustainable Improvement

Teams that improve all five DORA metrics sustainably tend to make changes to the underlying delivery system rather than to the individual metrics.

The changes that produce sustainable improvement share a common characteristic:

They address the root causes of metric values rather than the metric values themselves.

Improving Change Failure Rate

Change failure rate improves when regression testing infrastructure catches more failures before deployment.

Not when:

Deployment gates are tightened to block more releases.
Rollback processes are improved.
Teams simply become more conservative about shipping.

Those changes affect other metrics but do not reduce the number of failures reaching production.

The sustainable improvement comes from improving validation quality.

Improving Lead Time

Lead time improves when code review bottlenecks are addressed upstream.

Common causes include:

Large pull requests that are difficult to review quickly.
Unclear change scope.
Flaky tests that require investigation before approval.

Addressing those root causes reduces lead time without removing validation.

Improving Deployment Frequency

Deployment frequency increases sustainably when change failure rate is low enough that more frequent deployment does not produce proportionally more failures.

The frequency improvement that lasts is the one that follows testing infrastructure improvement rather than preceding it.
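
The arithmetic behind this ordering is simple: expected failures per week are roughly deployment frequency multiplied by change failure rate. The sketch below uses hypothetical numbers and a hypothetical helper name, purely to show why raising frequency alone multiplies failures while raising it after a testing improvement does not.

```python
def expected_failures_per_week(deploys_per_week: float, change_failure_rate: float) -> float:
    """Expected number of failed deployments per week."""
    return deploys_per_week * change_failure_rate


# Hypothetical numbers for illustration only.
before = expected_failures_per_week(5, 0.20)           # about 1 failure a week
faster_only = expected_failures_per_week(20, 0.20)     # about 4 failures a week
testing_first = expected_failures_per_week(20, 0.05)   # about 1 failure a week
```

In this example, quadrupling frequency at an unchanged failure rate quadruples weekly failures. Quadrupling it after the failure rate has dropped to a quarter of its previous value leaves weekly failures where they started.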

Improving Mean Time to Recovery

Mean time to recovery improves when observability investment precedes deployment frequency increases.

Teams that add observability tooling after reliability problems emerge are instrumenting a system they cannot yet see clearly.

Teams that invest in observability before accelerating delivery are building the foundation for fast recovery before it becomes urgently necessary.

What Treating Them as a System Actually Looks Like

In practice, treating DORA metrics as a system means making improvement decisions based on the relationships between metrics rather than on individual metric values.

Scenario 1: Low Deployment Frequency, Low Change Failure Rate

When deployment frequency is low but change failure rate is also low, the bottleneck is likely in the deployment process itself.

Potential causes include:

Approval gates
Manual deployment steps
Pipeline architecture limitations

Increasing deployment frequency is appropriate.

Scenario 2: Low Deployment Frequency, High Change Failure Rate

When deployment frequency is low and change failure rate is high, the bottleneck is in testing infrastructure.

Increasing deployment frequency before addressing change failure rate will produce more frequent failures.

The correct intervention is testing infrastructure improvement first.
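
The first two scenarios can be written as a small decision rule. This is a sketch under simplifying assumptions: the function name is hypothetical, and it takes "low" and "high" as already-decided booleans, because where a team draws those lines depends on its own context.

```python
def first_intervention(deploy_frequency_low: bool, change_failure_rate_high: bool) -> str:
    """Pick where to invest first, reading the two metrics together."""
    # Change failure rate is checked first: shipping more often on top of a
    # high failure rate only produces more frequent failures.
    if change_failure_rate_high:
        return "improve testing infrastructure first"
    if deploy_frequency_low:
        return "remove deployment process friction"
    return "no change indicated by these two signals"


scenario_1 = first_intervention(deploy_frequency_low=True, change_failure_rate_high=False)
scenario_2 = first_intervention(deploy_frequency_low=True, change_failure_rate_high=True)
```

The order of the checks carries the point: the same low deployment frequency leads to opposite interventions depending on the change failure rate beside it.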

Scenario 3: Long Lead Time

When lead time is long, the cause determines the intervention.

Examples include:

Slow code reviews → Improve review workflow.
Slow pipeline execution → Improve test execution architecture.
Frequent deployment gate blocks → Improve change failure rate.

The metric alone does not reveal the solution. The surrounding system behavior does.
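
One way to find the cause is to split lead time into stages and see where the time goes. The sketch below assumes stage durations are already measured. The stage names, the `INTERVENTIONS` mapping, and the function name are hypothetical labels for the three examples above.

```python
from datetime import timedelta

# Hypothetical stage names mapped to the interventions listed above.
INTERVENTIONS = {
    "review": "improve review workflow",
    "pipeline": "improve test execution architecture",
    "gate_blocked": "improve change failure rate",
}


def lead_time_bottleneck(stage_durations: dict[str, timedelta]) -> tuple[str, str]:
    """Return the slowest stage and the intervention associated with it."""
    stage = max(stage_durations, key=stage_durations.get)
    return stage, INTERVENTIONS.get(stage, "investigate further")


# Illustrative durations only.
stage, action = lead_time_bottleneck({
    "review": timedelta(hours=30),
    "pipeline": timedelta(hours=2),
    "gate_blocked": timedelta(hours=5),
})
```

Two teams with the same total lead time can get different answers from this breakdown, which is why the total alone is not enough to choose an intervention.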

Scenario 4: High Mean Time to Recovery

When mean time to recovery is high, both observability infrastructure and incident response processes deserve examination.

The weaker of the two is usually the highest-leverage place to invest.

Scenario 5: Degrading Reliability

When reliability is degrading while other metrics appear acceptable, the pace of change is likely exceeding what the validation and recovery infrastructure can safely absorb.

The appropriate response is not necessarily to slow delivery.

The better response is often to invest in:

Testing infrastructure
Observability systems
Recovery processes

These investments make the current delivery pace sustainable.

The Diagnostic Value of Reading Them Together

The most useful thing about DORA metrics read as a system is that the pattern of values across all five metrics is more informative than any individual value.

Pattern: Fast Delivery, Poor Reliability

Characteristics:

High deployment frequency
Low lead time
High change failure rate
High mean time to recovery

Diagnosis:

Testing infrastructure problem.

The delivery process is fast. The validation process is insufficient.

Pattern: High Quality, Slow Releases

Characteristics:

Low deployment frequency
High lead time
Low change failure rate
Low mean time to recovery

Diagnosis:

Unnecessary friction problem.

Quality is good. Something is blocking releases that does not need to be blocking them.

Pattern: Reliable Delivery, Slow Recovery

Characteristics:

High deployment frequency
Low lead time
Low change failure rate
High mean time to recovery

Diagnosis:

Observability problem.

The team is shipping reliably but recovering slowly when failures occur.

Each pattern points to a different root cause and a different intervention.

None of those interventions are visible when metrics are read in isolation.
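
The three patterns can be sketched as a lookup keyed on several metrics at once. This is a simplified illustration: the function name is hypothetical, it keys only on deployment frequency, change failure rate, and mean time to recovery, and it assumes each value has already been judged "high" or "low" for the team in question.

```python
def diagnose(deploy_frequency: str, change_failure_rate: str, time_to_recovery: str) -> str:
    """Map a pattern of 'high'/'low' readings to the diagnosis described above."""
    diagnoses = {
        ("high", "high", "high"): "testing infrastructure problem",
        ("low", "low", "low"): "unnecessary friction problem",
        ("high", "low", "high"): "observability problem",
    }
    pattern = (deploy_frequency, change_failure_rate, time_to_recovery)
    # Combinations outside the three named patterns are left undiagnosed.
    return diagnoses.get(pattern, "pattern not covered; examine the metrics together")


fast_but_unreliable = diagnose("high", "high", "high")
reliable_but_slow_to_recover = diagnose("high", "low", "high")
```

The lookup key is the whole tuple. A high mean time to recovery appears in two of the three patterns and leads to a different diagnosis in each, so that one value cannot identify the problem by itself.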

Why DORA Metrics Are a System

That is why DORA metrics are a system.

Not because someone decided to group several measurements together.

But because the underlying delivery system they measure is itself a system.

And systems are only legible when you look at how their components relate to each other rather than at each component alone.

Source: Viblo
