
Operational Service Level Agreements (SLAs): What Seasoned CIOs Know

As with most things in business, the devil is in the details.

While all cloud service providers (CSPs) offer some type of operational availability, it’s important to understand what the provider is actually measuring.

Seasoned chief information officers (CIOs) get this point.

They know there are two types of operational service level agreements (SLAs):

  1. Infrastructure availability
  2. Application availability

What separates a seasoned CIO from their peers is the ability to understand which one they’re getting.

Often, CSPs advertise a high availability number such as 99.99 percent. When you dig into it, however, they’re really only talking about the infrastructure.

As a CIO, it’s important to ask yourself this question:

“Does it really matter if the ESX host or the storage is up?”

Sure, the application relies upon the underlying infrastructure to run, but what good does it do if the servers are up, but the application is not?
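To put rough numbers on that gap, here is a minimal sketch. It assumes the application is up only when every layer beneath it is up, and that the layers fail independently of one another, so their availabilities multiply. The percentages and the function name are hypothetical, chosen only for illustration; they are not figures from any provider.

```python
from math import prod

def serial_availability(*layer_percents: float) -> float:
    """Availability (percent) of a stack in which every layer must be up,
    assuming the layers fail independently of one another."""
    return 100 * prod(p / 100 for p in layer_percents)

# Hypothetical figures: hosts and storage are covered by the SLA, while the
# application layer (database, middleware, the app itself) is not.
infrastructure = serial_availability(99.99, 99.99)
application = serial_availability(99.99, 99.99, 99.5)

print(f"infrastructure only:    {infrastructure:.3f}%")
print(f"with application layer: {application:.3f}%")
```

With these made-up inputs, the infrastructure comes out at roughly 99.98 percent while the application lands near 99.48 percent. An infrastructure-only SLA can be met in full while the thing your business actually uses falls well short of it.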

Knowing this, you’ll want to make sure your CSP is offering an application availability SLA.

Why?

Because it ensures they have some skin in the game when it comes to what’s important to your business.

What’s a good operational SLA for your applications?

Well, that really depends on the application. Software as a Service (SaaS) applications like Salesforce.com, Microsoft Office 365, and SAP Concur can achieve very high levels of availability due to their web-scale architecture.

In many cases, levels can be as high as 99.999 percent.

By contrast, traditional business applications like enterprise resource planning (ERP) and SAP Business Warehouse (BW) don’t have that luxury.

As a result, it’s very difficult and costly to truly architect them for a 99.999 percent, or even a 99.99 percent, SLA.

Yes, there are plenty of deployments out there where ERP has never been down, but this is primarily due to good operational and change management practices rather than the underlying technical architecture.

Let’s assume you’re like many CIOs out there: SAP ERP is a core business application and you simply can’t have it go down. After all, if it’s down, your production line stops, you can’t ship product, and you can’t process any orders.

As a CIO, what do you do?

Odds are you insist that your provider give you 99.99 percent availability, because your business processes can probably tolerate about four minutes of unplanned downtime a month.

It’s a bit inconvenient, but you can live with it.
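For reference, the four-minute figure falls straight out of the arithmetic. The sketch below assumes a 30-day month (real months run from 28 to 31 days) and uses a hypothetical helper name; it simply converts an availability percentage into a monthly downtime budget.

```python
# Minutes in a 30-day month (an assumption; real months vary in length).
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200

def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the unplanned downtime a given availability allows per 30-day month."""
    return MINUTES_PER_MONTH * (1 - availability_percent / 100)

for level in (99.5, 99.9, 99.99, 99.999):
    print(f"{level}% -> {downtime_budget_minutes(level):.2f} minutes per month")
```

On that basis, 99.99 percent allows about 4.3 minutes a month, 99.9 percent about 43 minutes, and 99.5 percent about 3.6 hours. Each extra nine shrinks the budget by a factor of ten, which is why the jump from a standard offering to 99.99 percent is so expensive.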

Be careful what you ask for

And so, the service provider does exactly what you ask. They double up on everything and put in every bell and whistle it takes to architect your application and achieve 99.99 percent availability.

This of course costs you more money. A lot more money.

Since most CSPs provide 99.5 percent (or in the best case 99.9 percent) availability for SAP, this becomes a one-off deployment, which in turn makes it much more difficult to operate.

Why?

Because rather than having 30 operations people who can fix it when it breaks, you’re likely down to only one or two information technology (IT) pros who really know how it is set up.

And in IT, what can go wrong usually does, but never, ever, in the way it was tested to fail. (Hey, if nothing else, Ol’ Murphy has a sense of humor!)

So now with this 99.99 percent “super-duper” architecture, when something breaks you have to find one of those two people who set it up to fix it. Instead of facing an issue that should have taken 30 minutes to solve, you’re now looking at eight hours of escalations and headaches.
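A back-of-the-envelope sketch shows how much repair time matters. It assumes a 30-day month and, purely for illustration, a single incident in that month; the function name and the incident count are hypothetical, and a redundant design would presumably aim to have fewer incidents in the first place.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month

def achieved_availability(incidents_per_month: float, minutes_to_repair: float) -> float:
    """Availability (percent) actually delivered, given how often things
    break and how long each incident takes to fix."""
    downtime = incidents_per_month * minutes_to_repair
    return 100 * (1 - downtime / MINUTES_PER_MONTH)

# Hypothetical: one incident in the month, fixed in 30 minutes by a large
# operations team versus eight hours of escalations on a one-off design.
standard = achieved_availability(1, 30)
one_off = achieved_availability(1, 8 * 60)

print(f"30-minute fix: {standard:.3f}%")
print(f"8-hour fix:    {one_off:.3f}%")
```

Under these assumptions, the 30-minute fix still delivers about 99.93 percent for the month, while a single eight-hour outage drops the month to roughly 98.89 percent. That is a long way from the 99.99 percent the architecture was designed to promise.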

Are you getting more availability or more complexity?

Clearly, you can spend a lot of money architecting traditional applications to promise high availability, but it comes with an extra cost: the risk of adding a lot of operational complexity.

More often than not, it’s the fat finger that’s at fault rather than the architecture.

As you work with your CSP, your best bet is to ensure they give you an application availability SLA that fits their standard offering, the one their whole operations team knows how to run. You also want them to establish a business continuity plan for you should bad things happen.

A seasoned CIO will look at it in this light:

“My business applications are important, and we will put all the components in place to make sure they are up and running, but if they are down, we have a plan to remedy the situation.”

Moral of the story?

A seasoned CIO knows that running a business solely on the promise of a cloud service provider’s SLA is a recipe for disaster.