Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

Beyond uptime: How we redefined observability to protect performance, profits and people

A few years ago, my team hit a major milestone: a 10% decrease in p99 latency on our core APIs. According to our dashboards, the work had been executed flawlessly. Another metric, however, told a much scarier story: a 40% increase in after-hours alerts for the team supporting those same APIs.

We were making the system faster for customers but demonstrably worse for our engineers. The data was clear: our metrics were at war with each other. This discord forced us to ask a harder question: What value is a high-performance system if the human architecture sustaining it is brittle? That question was our starting point for a more holistic philosophy, one that balanced not two but three pillars of value.

First leap: Linking technology to business 

Our story began in the same place as many others. We had plenty of data (CPU utilization, request latencies, availability trackers, error rates) but little real insight. My team and I had built a best-in-class monitoring and observability stack, yet we could not give meaningful answers to basic questions about business impact.

Our breaking point was a “minor” 300-millisecond slowdown of our product recommendation engine. It was technically within our SLOs, but when we traced the customer impact, we found it had cost us close to $30,000 in revenue over 48 hours. That experience pushed us to get serious about mapping technical performance to business KPIs. We met with marketing, sales and finance, and learned the vocabulary of conversion rates, customer lifetime value and paused conversions. We learned to look at how value flows through our systems instead of simply monitoring them.

It was our first evolution, and we were finally connecting the dots. 

And then came the breakthrough: the observability trifecta.

But over time, I grew increasingly concerned about something neither discipline was measuring. As we accelerated the pace of change to drive business outcomes, the systems were becoming more complex, and the cognitive load was weighing on my engineers. They were delivering new features faster, but at a growing cost in mental strain, burnout, and the time and effort needed to support those features. We were improving our business metrics, but at the expense of our most valuable asset: our engineering talent.

We were optimizing the ‘how’ (system performance) and the ‘why’ (business outcomes), but not the ‘who’ (our developers). This was the point at which we realized that our true north was the observability trifecta.

We concluded that a sustainable, high-performing system requires a holistic view of three distinct but interdependent pillars:

  • System performance: The traditional technical metrics. Is the system fast, reliable and available? This is the baseline.
  • Business outcomes: The financial and customer-facing metrics. Is the system making money, improving conversion and delighting users? This is the purpose.
  • Developer experience (DX): The human-centric metrics. How easy is it to develop, test, deploy and operate a service? We started tracking metrics modeled on DORA and SPACE: What is our lead time for changes? How much time do we spend on unplanned work and operational toil? Which systems generate the most alerts and the heaviest on-call cognitive load? This pillar drives sustainability and speed of innovation.
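As a rough illustration of the DX pillar, the sketch below computes two of the signals named above: a DORA-style median lead time for changes and the share of alerts that fire after hours. The function names, the definition of business hours and the sample timestamps are all assumptions made for this example, not details of our actual tooling.

```python
from datetime import datetime
from statistics import median

# Assumption: business hours are 09:00-17:59 on weekdays; adjust to your on-call policy.
BUSINESS_START_HOUR = 9
BUSINESS_END_HOUR = 18


def median_lead_time_hours(changes):
    """Median hours from commit to production deploy (a DORA-style lead time)."""
    durations = [
        (deployed - committed).total_seconds() / 3600
        for committed, deployed in changes
    ]
    return median(durations)


def is_after_hours(ts):
    """True if the alert fired on a weekend or outside business hours."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (BUSINESS_START_HOUR <= ts.hour < BUSINESS_END_HOUR)


def after_hours_alert_share(alert_times):
    """Fraction of alerts that fired outside business hours."""
    if not alert_times:
        return 0.0
    return sum(is_after_hours(ts) for ts in alert_times) / len(alert_times)


# Hypothetical sample data, not figures from the article.
changes = [
    (datetime(2025, 3, 3, 10, 0), datetime(2025, 3, 3, 16, 0)),  # 6 hours
    (datetime(2025, 3, 4, 9, 0), datetime(2025, 3, 5, 9, 0)),    # 24 hours
    (datetime(2025, 3, 5, 14, 0), datetime(2025, 3, 6, 2, 0)),   # 12 hours
]
alerts = [
    datetime(2025, 3, 3, 11, 30),  # Monday, business hours
    datetime(2025, 3, 4, 2, 15),   # Tuesday, overnight
    datetime(2025, 3, 8, 13, 0),   # Saturday
    datetime(2025, 3, 6, 17, 45),  # Thursday, business hours
]

print(f"Median lead time: {median_lead_time_hours(changes)} hours")
print(f"After-hours alert share: {after_hours_alert_share(alerts):.0%}")
```

Tracking the after-hours share per service, rather than only the total alert count, is one way to surface the kind of conflict described at the start of this article, where latency improved while on-call pain grew.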

Putting the trifecta into action 

By adopting this three-pillar view, we were able to move from reactive to proactive and strategic.

1. From business-aware SLOs to top-down BLOs 

We stopped writing technical SLOs first and then trying to justify their business implications. Instead, we worked with leadership to identify top-level business-level objectives (BLOs). With a defined BLO such as “Improve new user sign-up success rate from 95% to 98%” as our north star, my teams could derive the technical SLOs required of the authentication service, the database and the front-end client. The work was no longer based on bottom-up discovery but on top-down direction and purpose.
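One simple way to picture the top-down derivation is to treat a sign-up as a serial chain: it succeeds only if every component succeeds. The sketch below assumes that the three components fail independently and that the error budget is split equally among them. Both are simplifying assumptions for illustration, and the function names are hypothetical, not the method we actually used.

```python
def per_component_target(blo_target, n_components):
    """Success rate each of n serial, independent components needs
    so that their product meets the end-to-end BLO target."""
    if not 0 < blo_target <= 1:
        raise ValueError("blo_target must be in (0, 1]")
    if n_components < 1:
        raise ValueError("n_components must be at least 1")
    return blo_target ** (1 / n_components)


def end_to_end_success(component_rates):
    """End-to-end success rate of a serial chain of independent components."""
    result = 1.0
    for rate in component_rates:
        result *= rate
    return result


components = ["authentication service", "database", "front-end client"]
target = per_component_target(0.98, len(components))

for name in components:
    print(f"{name}: success-rate SLO >= {target:.2%}")

# Sanity check: three components at this rate multiply back to the BLO.
print(f"End-to-end: {end_to_end_success([target] * len(components)):.2%}")
```

Under these assumptions each component needs roughly a 99.33% success rate to support a 98% end-to-end target, which shows why a modest-sounding BLO can imply noticeably stricter SLOs for the services underneath it.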

2. Observability-driven product strategy 

The trifecta became a powerful input for our product strategy. In one review, we saw that a legacy payments service had poor DX metrics (high cognitive load and slow deployments) and mediocre performance. However, the business metrics showed that it accounted for only a small part of our overall recurring revenue. Given this holistic view, we made the strategic decision not to invest in fixing it, but to actively migrate the handful of remaining customers to our modern platform and then deprecate the legacy service. Without the DX pillar, we might have spent months trying to improve a very low-impact system. With the trifecta, we freed up an entire team to work on high-value, revenue-generating product innovation.

3. Making impact real: The cost of delay dashboard 

To make this tangible for everyone, we worked with our data science team to build a new kind of dashboard on Datadog and Amplitude. Alongside latency metrics, we now display a real-time dollar amount: the “cost of delay” metric. For every 100 milliseconds of latency added to our process, the model estimates the revenue impact. Once an engineer sees that a small change in performance costs the company $150-$200 per hour, the urgency around fixing it becomes both personal and shared.
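The arithmetic behind such a metric can be sketched with a deliberately simple linear model. The example below backs out a dollars-per-hour-per-100-ms rate from the recommendation-engine incident described earlier (300 ms, 48 hours, about $30,000) and then applies it to other latency changes. The linear assumption and the function names are illustrative only; our production model was built by the data science team and is not reproduced here.

```python
def hourly_rate_per_100ms(revenue_lost, hours, added_latency_ms):
    """Back out a linear cost-of-delay rate (dollars per hour per 100 ms)
    from one observed incident."""
    if hours <= 0 or added_latency_ms <= 0:
        raise ValueError("hours and added_latency_ms must be positive")
    return revenue_lost / hours / (added_latency_ms / 100)


def cost_of_delay(added_latency_ms, hours, rate_per_100ms_hour):
    """Estimated revenue impact of sustained extra latency under the linear model.
    Latency improvements (negative values) are treated as zero cost."""
    return max(added_latency_ms, 0) / 100 * rate_per_100ms_hour * hours


# Calibrate from the incident: about $30,000 lost over 48 hours at +300 ms.
rate = hourly_rate_per_100ms(30_000, 48, 300)
print(f"Calibrated rate: ${rate:.2f} per hour per 100 ms")

# Apply the rate to a hypothetical regression of +100 ms lasting one working day.
print(f"+100 ms for 8 hours: ${cost_of_delay(100, 8, rate):,.2f}")
```

A single incident is a thin basis for calibration, so a rate derived this way is best read as an order-of-magnitude signal rather than a forecast.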

The future: Composable view of value 

As we move forward, we aim to build a truly composable view of value. We are building tools to help product managers and technical leads model the trade-offs across the three pillars before any code is written.

Do you want to introduce a feature that will increase conversions by 3%? Our models will show the expected impact on system load and the estimated increase in operational complexity for the team that owns the feature. This would shift observability from a rearview mirror into a predictive, strategic tool for the whole enterprise.

Profits, platforms AND people! 

While closing the initial gap between tech and business was an important step, it was not the end of the journey. The greatest shift was accepting a holistic view of value that placed our people on the same level of consideration as our profits and platforms.

The most gratifying experience in my career came not from a dashboard but from a planning meeting that included our head of product, our VP of engineering and a lead architect, in which they all used the same vocabulary. They weren’t debating features; they were weighing trade-offs between immediate revenue impact, system reliability and developer velocity. They were talking about value in three dimensions, together. That was the moment I knew we were no longer simply a platform team but a core business driver.

This article is published as part of the Foundry Expert Contributor Network.

Category: News
July 15, 2025