SLA Metric


We are in the process of implementing IT SLAs, and our first go-around will have two metrics: application availability and response time for problem tickets.

There are two sides to my question: 1.) For application availability, where we have dependencies on other IT groups within the company (i.e. in the business), how can we best depict the inter-dependencies between IT groups (i.e. Central IT vis-a-vis the business unit IT groups)? My thought is to make whichever group has primary responsibility for the application the primary SLA owner, and list the dependencies in a matrix.

2.) For the first go-around, how do we determine a fair availability percentage? If we go by history, the organization has changed in light of some recent cutbacks, so that may not be an accurate indicator. Two other thoughts are: set percentages that are artificially low (60% to 65%) until we have a fair history of what our current support staff can do, or state no percentages at all for now and work on a best-effort basis until we know what our support staff can do.

Any visibility/insight would be appreciated.

-- Len Brown (Len.Brown@avnet.com), July 29, 2002

Answers

QUESTION 1.) For application availability, where we have dependencies on other IT groups within the company (i.e. in the business), how can we best depict the inter-dependencies between IT groups (i.e. Central IT vis-a-vis the business unit IT groups).

ANSWER - there are two ways to handle this: (1) use operational level agreements between the IT groups. This is somewhat cumbersome and adds a layer of administration (not to mention monitoring and reporting), or (2) use a consolidated SLA that lays out the roles and responsibilities of each IT functional group involved, along with their associated service level objectives and metrics.

QUESTION 2.) For the first go around, how do we determine a fair availability percentage?

Step 1 - what does the business require? (realistic and quantifiable)
Step 2 - what can IT deliver?
Step 3 - can a business case be made for expending the money to close the gap?
Step 4 - get consensus among all stakeholders.

In other words, this is driven by business requirements, not IT's ideas of what those requirements are. It's hard work (and requires negotiating skills). Consider this: 99% availability in a 24x7 environment = roughly 88 hours of downtime a year - over an hour a week! 65% availability is as good as an unstable service.
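To make the downtime arithmetic concrete, here is a minimal sketch that converts an availability percentage into allowed downtime. It assumes a 24x7 service and a 365-day year; the function name and the sample percentages are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760 hours; assumes 24x7 and ignores leap years
HOURS_PER_WEEK = 24 * 7     # 168 hours


def downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Return the hours of downtime implied by an availability fraction.

    availability is a fraction between 0 and 1 (0.99 means 99%).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours


if __name__ == "__main__":
    # Sample targets chosen for illustration, including the 65% floated above.
    for target in (0.65, 0.99, 0.999):
        per_year = downtime_hours(target)
        per_week = downtime_hours(target, HOURS_PER_WEEK)
        print(f"{target:.1%}: {per_year:,.1f} h/year, {per_week:.2f} h/week")
```

At 99% this works out to about 87.6 hours a year (about 1.7 hours a week); at 65% it is over 3,000 hours a year.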

More importantly, when computing availability, you have to consider availability from all IT groups. If the database is 99%, the network is 99%, and the server is 99%, then the aggregate availability is just a hair over 97% (0.99 x 0.99 x 0.99 = 0.9703).
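The same multiplication can be sketched for any number of components. This assumes the application needs every component to be up at once and that the components fail independently of each other; the component names are just the three from the example.

```python
from math import prod  # available in Python 3.8+


def serial_availability(components):
    """Multiply per-component availabilities for a chain of dependencies.

    Assumes the application is up only when every component is up,
    and that component failures are independent of each other.
    """
    return prod(components.values())


if __name__ == "__main__":
    stack = {"database": 0.99, "network": 0.99, "server": 0.99}
    total = serial_availability(stack)
    print(f"aggregate availability: {total:.4f}")  # 0.9703
```

Each extra dependency in the chain lowers the figure the primary SLA owner can realistically commit to, which is why every group's number has to be on the table.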

Fun, huh?

-- Mike Tarrani (mtarrani@pacbell.net), October 02, 2002.

