Hidden Benefits of SLOs
(Cross-posted from certomodo.io)
There are many articles online about Service Level Objectives (SLOs), particularly on the value they provide to customers as part of a Service Level Agreement (SLA).
Let’s discuss some of the benefits of SLOs that aren’t apparent at first glance.
Before we do, let’s quickly review the terminology from the source:
SLI: a service level indicator—a carefully defined quantitative measure of some aspect of the level of service that is provided.
SLO: a service level objective, meaning a target value or range of values for a service level that is measured by an SLI.
Simply put: an SLI is a metric that you deem important enough to track when considering the health of your service, and an SLO is the acceptable range of values for that metric.
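To make the distinction concrete, here is a minimal sketch in Python. The `SLO` class, the `availability_sli` function, and the 99.9% target are all hypothetical names and values chosen for illustration, not part of any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Acceptable range of values for one SLI (hypothetical structure)."""
    name: str
    minimum: float  # inclusive lower bound
    maximum: float  # inclusive upper bound

    def is_met(self, sli_value: float) -> bool:
        return self.minimum <= sli_value <= self.maximum


def availability_sli(successful: int, total: int) -> float:
    """The SLI: fraction of requests that were served successfully."""
    if total == 0:
        raise ValueError("no requests observed")
    return successful / total


# Assumed target: at least 99.9% of requests succeed.
availability_slo = SLO("availability", minimum=0.999, maximum=1.0)

sli = availability_sli(successful=99_950, total=100_000)
print(f"SLI = {sli:.4%}, SLO met: {availability_slo.is_met(sli)}")
```

The SLI is the measured number; the SLO is the range you compare it against.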
With that covered, let’s begin!
SLOs reveal what truly matters to customers
Many vendors will offer an SLA based on a 99.95% ‘uptime’ metric or something similar, but that provides an incomplete picture of the customer experience.
There are important metrics other than uptime: latency and error rate, to name a couple. A service can respond to every request and still be too slow, or return errors instead of useful results.
If customers provide feedback on other aspects of the product that aren’t covered by the SLA, that is a clear signal that the SLOs/SLIs backing it are insufficient.
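As a rough illustration of how an uptime-only view can hide problems, the sketch below computes three SLIs from the same set of requests. The sample data and the 500 ms latency threshold are made up for the example.

```python
# Hypothetical sample: (HTTP status, latency in milliseconds) per request.
requests = [
    (200, 120), (200, 950), (200, 130), (500, 90), (200, 1400),
    (200, 110), (200, 1250), (200, 140), (500, 85), (200, 100),
]

LATENCY_THRESHOLD_MS = 500  # assumed target; choose one that matches your users

total = len(requests)

# Every request received a response, so an uptime-only view sees no problem.
responded = total
fast_enough = sum(1 for _, ms in requests if ms <= LATENCY_THRESHOLD_MS)
errors = sum(1 for status, _ in requests if status >= 500)

uptime_sli = responded / total
latency_sli = fast_enough / total
error_rate_sli = errors / total

print(f"uptime:     {uptime_sli:.0%}")
print(f"fast:       {latency_sli:.0%} of requests within {LATENCY_THRESHOLD_MS} ms")
print(f"error rate: {error_rate_sli:.0%}")
```

With this data the service looks fully "up", yet three in ten requests are slow and two in ten fail, which is the gap that latency and error rate SLOs are meant to expose.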
SLOs create a common understanding in the organization about reliability
The process of defining SLOs can involve these perspectives:
The customer (What level of service do I want?)
Sales (What level of service will close the deal?)
The engineering team (What level of service can I reasonably build for?)
Support organizations (What level of support can I reasonably provide?)
Developing SLOs, when done well, is a cross-functional effort. As these groups discuss and collaborate toward the versions that get published, they begin to understand what’s important to the customer and the business. It’s a powerful tool for breaking down silos and thinking about reliability as a single team.
SLOs require investment into improved observability
Take for example the task of calculating an error rate SLI. This requires collecting all of the request logs from the service, parsing the messages, distinguishing which are errors, and then calculating the percentage of those errors over a time period. In addition, these logs need to be retained in order to assess changes in the error rate over time.
Therefore: infrastructure and application features for logging, aggregation, and reporting need to be built and maintained for an error rate SLI to be calculated at all.
Meaning: SLOs can be a means to finally improve how your infrastructure is managed and monitored, which pays dividends in the long run by maturing multiple operational responsibilities.
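The error rate calculation described above can be sketched in a few lines of Python. The log format, the `error_rate` function, and the choice to count only 5xx responses as errors are assumptions made for this example; a real service would substitute its own log schema and error definition.

```python
import re
from datetime import datetime, timedelta

# Hypothetical log format: "<ISO timestamp> <method> <path> <status>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})$"
)


def error_rate(lines, window_start, window_end):
    """Return the percentage of requests in the window that were errors.

    Only 5xx responses count as errors here (an assumption).
    Returns None when the window contains no requests.
    """
    total = errors = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not request logs
        timestamp = datetime.fromisoformat(match["ts"])
        if not (window_start <= timestamp < window_end):
            continue  # outside the reporting period
        total += 1
        if match["status"].startswith("5"):
            errors += 1
    if total == 0:
        return None
    return 100.0 * errors / total


logs = [
    "2024-05-01T12:00:01 GET /api/items 200",
    "2024-05-01T12:00:02 GET /api/items 500",
    "2024-05-01T12:00:03 POST /api/items 201",
    "2024-05-01T12:00:04 GET /api/items 503",
    "service restarted",                       # not a request log
    "2024-05-01T13:30:00 GET /api/items 500",  # outside the window
]

start = datetime(2024, 5, 1, 12, 0, 0)
rate = error_rate(logs, start, start + timedelta(hours=1))
print(f"error rate: {rate:.1f}%")
```

Even this toy version depends on logs being collected in one place, written in a parseable format, and retained for the whole reporting window, which is where the observability investment comes in.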
SLOs prompt decisions about risk management… and risk-taking
Assuming that SLOs have executive buy-in, they are a powerful forcing function. Since they measure customer success, unmet objectives will empirically show that the product roadmap isn’t pointing in the right direction, requiring it to change. Perhaps the engineering team needs to focus on those CI/CD improvements that were on the back burner, after all!
On the other hand, consider a team that is consistently meeting SLOs by a wide margin. That is an opportunity to either:
Offer a higher reliability guarantee to the customer via tighter SLOs; OR
Be more aggressive with shipping new features. After all, error budgets are meant to be spent!
Monitoring SLO performance over time provides valuable feedback on what risks are acceptable in project planning!
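One way to turn this into a number is to track how much of the error budget has been consumed in a given window. The sketch below assumes a request-based success-rate SLO; the function name `error_budget_report` and the 50% and 100% decision thresholds are illustrative choices, not an established standard.

```python
def error_budget_report(slo_target: float, total_requests: int,
                        failed_requests: int) -> dict:
    """Summarize error budget usage for a success-rate SLO over one window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the number of failures the SLO tolerates.
    allowed_failures = (1 - slo_target) * total_requests
    consumed = failed_requests / allowed_failures

    # Assumed thresholds; tune them to your own risk appetite.
    if consumed >= 1.0:
        suggestion = "budget exhausted: prioritize reliability work"
    elif consumed <= 0.5:
        suggestion = "wide margin: tighten the SLO or ship more aggressively"
    else:
        suggestion = "on track: keep the current pace"

    return {
        "allowed_failures": allowed_failures,
        "consumed_fraction": consumed,
        "suggestion": suggestion,
    }


report = error_budget_report(slo_target=0.999,
                             total_requests=1_000_000,
                             failed_requests=250)
print(f"budget used: {report['consumed_fraction']:.0%} -> {report['suggestion']}")
```

Reviewing this figure at each planning cycle gives the team a shared, data-backed basis for choosing between reliability work and feature work.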
Conclusion
Implementing Service Level Objectives can provide secondary benefits that can change companies from the inside. In addition to helping better understand customers, SLOs can foster cross-functional collaboration, improve the technology stack, and balance reliability vs features on the product roadmap.
This insight comes from experience: I’ve guided multiple teams through their SLO implementations and understand the factors that contribute to a successful outcome.
Interested in discussing how I can help your team implement SLOs? Schedule an intro call with me!