Carrier Service Level Agreements (SLAs) Tutorial
Definition
Tutorial Overview
This tutorial is a guide for companies looking to create the ideal SLA for their business environment. The tutorial
introduces the concept of an SLA, discusses its relevance to almost every business, and contains guidelines for
companies to follow when forging an SLA with a carrier.
Topics
1. Introduction
2. Why Are SLAs Important?
3. What Are the Basic Components of Frame Relay SLAs?
4. What Service Levels, by Components, Should Be Negotiated with the Service Provider?
5. Carrier Compensation: What Does It Mean?
6. How Can We Ensure That the Service Provider Is Meeting the Agreed-upon Service Levels?
7. What Are the Next Steps in Implementing and Monitoring Frame Relay Service Levels with the Service Provider?
8. Self-Test
9. Acronym Guide
10. Related Products and Services (Web TecPreviews)
1. Introduction
The emergence of distributed processing and intranet-based applications has resulted in increasing reliance on wide area
networks (WANs) for delivery of critical business information. The WAN has, therefore, evolved into a key corporate
asset. Network performance and reliability impact not just messaging and simple database transactions but information
from suppliers required for just-in-time manufacturing as well as inventory, accounting, and sales information.
The proliferation of distributed application and database systems also has resulted in the emergence of public network
services such as frame relay. Frame relay was designed to meet the needs of network managers struggling with the
necessity to build redundant, reliable networking services nationwide and globally. This technology has brought cost-effective, global, redundant, high-performance networking to corporations that, in the past, would have built costly,
meshed private-line networks.
Unfortunately, the migration to public services does come with challenges. By utilizing frame-relay services, network
managers lost the ability to control the performance and reliability of the public-network infrastructure. This has resulted
in unhappy end users, angry executives, and pressure on network managers contending with the complexities of router,
switch, and server management and maintenance.
Today, carrier services with embedded management capabilities are emerging to enable a process for WAN service-level
management. These services help network managers in three areas: determining what the WAN service levels need to be
(planning), monitoring these service levels to ensure compliance (verification), and isolating what the problem is when
service levels are not met (troubleshooting). The key management component that documents what WAN service levels
should be is the service level agreement (SLA).
Figure 1: The WAN Service Level Management Process
http://www.webproforum.com/visualnetworks/full.html
2/10/99
Network managers who are considering implementing SLAs for their WANs should look closely at these new carrier
services. This guide will assist the reader in negotiating meaningful SLAs with their service providers.
availability
delay
throughput
customer service
cost
When leased lines were the only means of delivering high-speed data services, network managers had standard
throughput and availability statistics at hand. To increase availability, they simply ordered meshing or backup links. To
increase throughput, they ordered more bandwidth. Public services such as frame relay promised performance that was
comparable to leased lines at reduced cost. However, frame relay removed critical performance factors from the view
and control of network managers. This resulted in confusion, protracted network outages, erratic network performance,
and contentious relationships between business users and technical support personnel. The network was no longer an
asset; it was a liability.
The advent of new frame-relay services that include service guarantees provides a means for network managers to ensure
that their crucial business data is delivered in a reliable, consistent manner. Carriers are implementing these services as
"managed network transport" or other such names, where the carrier manages the service all the way to the customer
site. This service usually includes the channel service unit/data service unit (CSU/DSU) and may or may not include
other customer premises equipment (CPE) such as the router or frame relay access device (FRAD). The CSU/DSU has
the embedded service-level-management tools and provides the basis for implementing SLAs.
Service-level guarantees also require frame relay service providers to provide some type of economic relief should they
fail to meet their contractual obligations. These penalties have spurred service providers to engineer their networks to
meet parameters that exceed those offered to their customers, reducing the chances that service providers will be
obligated to compromise revenues through customer penalties.
Two key SLA implementation issues directly affect how useful the agreements are to the network manager. The first
issue is where the measurements are taken: end-to-end (from the customer premises) or just within the cloud
(switch-to-switch). The local loop (or last mile) can have a profound impact on network performance but is ignored in a
switch-to-switch implementation. That is why performance measurements (and troubleshooting) must be performed
end-to-end.
Figure 2:
The second issue is utilizing a measurement system that is independent of the network being measured. The switch or
router cannot provide all the statistics needed to support WAN service level management and may include router delay
not indicative of WAN service problems. Using an objective system that is not biased toward switch or router
architectures ensures the validity of SLA measurements end-to-end.
How this information is presented is almost as important as the information itself. Clear and concise reports are
necessary to meet the needs of the network operators, managers, and executive management. Ideally, these reports are
created by the customer on-demand and presented on-line via a Web interface.
SLAs are critical to ensuring that a frame-relay network enables corporate-business processes. How they are measured is
as important as what is measured. Using a managed network service with end-to-end measurements, independent of the
router or switches, provides the best overall SLA measurement capability.
Average Round-trip Network Delay measured over a month as a calculation of the following:
Average Round-trip PVC Delay measured over a month as a calculation of the following:
Cumulative Samples of End-to-End, Round-trip Delay for the PVC
--------------------------------------------------------------
Number of Samples
Effective PVC Throughput (Frame Delivery Ratio) measured over a month as a calculation of the
following:
Egress Frame Count
---------------------------------------------------------------------------
Ingress Frame Count (Frames above Committed Burst Size + Excess Burst Size)
Mean Time to Respond measured as a monthly average of the time from inception of trouble ticket until
repair personnel is on site as follows:
Total Time (in Hours) to Respond
--------------------------------
Total Number of Trouble Tickets
Mean Time to Repair or Restore measured as a monthly average of the time from inception of trouble ticket
until outage is repaired to customer satisfaction as follows:
Total Outage Time (in Hours)
-------------------------------
Total Number of Trouble Tickets
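The monthly calculations above can be sketched in a few lines of Python. This is only an illustration: the function names and the TroubleTicket record are hypothetical, and the frame delivery ratio assumes that frames offered above committed burst size plus excess burst size are subtracted from the ingress count before dividing, which the formula above does not state explicitly. Carriers may define each term differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TroubleTicket:
    """Hypothetical trouble-ticket record; both durations are in hours."""
    hours_to_respond: float  # ticket opened until repair personnel are on site
    hours_of_outage: float   # ticket opened until the outage is repaired


def average_round_trip_delay(delay_samples_ms: Sequence[float]) -> float:
    """Cumulative round-trip delay samples divided by the number of samples."""
    if not delay_samples_ms:
        raise ValueError("at least one delay sample is required")
    return sum(delay_samples_ms) / len(delay_samples_ms)


def frame_delivery_ratio(egress_frames: int, ingress_frames: int,
                         frames_above_burst: int = 0) -> float:
    """Egress frame count divided by the ingress frames the guarantee covers.

    Assumption: frames above committed + excess burst size are excluded
    (subtracted) from the ingress count.
    """
    eligible = ingress_frames - frames_above_burst
    if eligible <= 0:
        raise ValueError("no eligible ingress frames")
    return egress_frames / eligible


def mean_time_to_respond(tickets: Sequence[TroubleTicket]) -> float:
    """Total hours to respond divided by the number of trouble tickets."""
    if not tickets:
        raise ValueError("at least one trouble ticket is required")
    return sum(t.hours_to_respond for t in tickets) / len(tickets)


def mean_time_to_repair(tickets: Sequence[TroubleTicket]) -> float:
    """Total outage hours divided by the number of trouble tickets."""
    if not tickets:
        raise ValueError("at least one trouble ticket is required")
    return sum(t.hours_of_outage for t in tickets) / len(tickets)
```

For example, two tickets with response times of 1 and 3 hours give a mean time to respond of 2 hours for the month, regardless of how long each outage lasted.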
These SLA components, although calculated in a fairly standard manner among carriers, are implemented inconsistently.
When working with a carrier to determine service levels, it is important to understand in detail the meaning of each of
those measurements.
Network Availability
Most carriers are committing to a monthly availability guarantee of at least 99.5%. The availability guarantee includes all
components of the carrier frame relay network, the carrier-provided local loop to the frame relay network, and the
carrier-provided CSU/DSU.
Figure 3:
Note the distinction between network-based and site-based availability. For a ten-site network, in a thirty-day month, a
99.5% average network availability would allow thirty-six total hours of downtime (not including the exclusions
mentioned above). If the SLA is written around a site-based availability instead of being network-based, then any one
site can only be down for 3.6 hours in the month. This distinction is very important when computing downtime. Often,
availability SLAs that are network- or site-based do not meet the business requirements, so PVC availability is a better
metric.
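The downtime arithmetic above is easy to reproduce. The following minimal sketch (the function name and parameters are illustrative, not part of any carrier's contract language) converts an availability percentage into allowed downtime hours, either pooled across all sites or for a single site.

```python
def allowed_downtime_hours(availability_pct: float,
                           days_in_month: int = 30,
                           sites: int = 1) -> float:
    """Hours of downtime permitted by an availability guarantee.

    With sites > 1 the result is the pooled, network-based allowance;
    with sites == 1 it is the allowance for a single site.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    total_hours = days_in_month * 24 * sites
    return total_hours * (100 - availability_pct) / 100


# Ten-site network, thirty-day month, 99.5% guarantee
network_based = allowed_downtime_hours(99.5, days_in_month=30, sites=10)
site_based = allowed_downtime_hours(99.5, days_in_month=30, sites=1)
print(f"network-based: {network_based} h, site-based: {site_based} h")
```

Under a network-based guarantee, the pooled allowance could in principle be consumed entirely by one site, which is why the wording of the guarantee matters when computing downtime.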
PVC Availability
It is better to commit your carriers to a monthly PVC availability. Guarantees are typically 99.5%. This has the desired
impact of restricting the amount of downtime any one PVC will sustain. PVC availability is critical if a company is
utilizing a frame-relay network to carry delay- and application-sensitive traffic such as SNA or voice. If SNA sessions are
lost, reestablishing them could take longer if the frame-relay outage resulted in front-end or application timeouts.
Companies should make sure their carrier provides the PVC availability their specific application traffic requires.
Included and excluded components are identical to those of network availability.
Round-trip Network and PVC Delay
A wide range of delay guarantees is available, depending on the carriers' specific configuration guidelines and
backbone technology. Round-trip delay guarantees range from less than 110 milliseconds to nearly 300 milliseconds.
Some carriers provide guarantees based on access-line speed, offering much lower delay guarantees for T1 access lines
than for 56 Kbps. However, the most useful delay guarantees are those based on the type of application or type of
traffic. SNA, voice, and video traffic are much more sensitive to round trip delays than FTP or HTTP TCP/IP sessions.
Guarantees specifically targeted at those application types are critical to the success of a frame-relay implementation.
Table 1 illustrates an example of an application-specific guarantee. (Companies should work with carriers to understand
their specific offerings and the requirements to gain the lowest committed round-trip delay.)
Additionally, some carriers expect the customer to be highly involved in the measurement process, often to the extent
of the customer being held responsible for initiating packet Internet groper (PING) tests in concert with carrier
technicians during times of light network utilization. Also, round-trip network delay often is based on predetermined
frame sizes (200 to 250 bytes). Companies should understand how the process works during negotiations. An IP PING
test is not recommended, because it is an inaccurate measurement of frame-relay delay. There are two reasons for this
inaccuracy: first, IP PING has low network priority, and second, IP PING delay measurements include router delay.
The best test of network delay is based on frame-relay frames, is independent of the router, and measures consistently
throughout the day. Traffic load and delay should be measured when their impact is at its highest (peak traffic load), with
this information collected and processed for trending analysis. Included and excluded components are identical to those
of network availability.
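As a rough illustration of measuring delay at peak traffic load, the sketch below groups delay samples by hour of day, finds the hour with the heaviest offered load, and averages the delay seen in that hour. The Sample record and its fields are assumptions made for this example; a real measurement system would supply its own data format.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Tuple


class Sample(NamedTuple):
    """Hypothetical measurement record for one sampling interval."""
    hour: int            # hour of day, 0-23
    frames_offered: int  # traffic load during the interval
    delay_ms: float      # measured round-trip delay


def peak_hour_delay(samples: Iterable[Sample]) -> Tuple[int, float]:
    """Return (busiest hour, average round-trip delay in that hour)."""
    load = defaultdict(int)     # total frames offered per hour
    delays = defaultdict(list)  # delay samples per hour
    for s in samples:
        load[s.hour] += s.frames_offered
        delays[s.hour].append(s.delay_ms)
    if not load:
        raise ValueError("at least one sample is required")
    peak = max(load, key=load.get)
    return peak, sum(delays[peak]) / len(delays[peak])
```

Storing the per-hour results each day gives the raw material for the trending analysis mentioned above.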
Throughput
A wide range of throughput guarantees is available and is dependent upon the carriers' specific configuration guidelines
and backbone technology. Carrier guarantees range from 99% to 99.999%, dependent on the carrier and their specific
guarantee language. Some carriers base the percentage of delivered frames on committed information rate (CIR) or on whether the
frames are labeled "discard eligible" (DE). Others base the calculation on committed burst size versus excess burst size
(uncommitted data in excess of committed burst size). Guarantees are therefore difficult to compare without significant
study and understanding of the carrier's specific terminology. Carriers often require a minimum number of PVCs per
link (often four or more). In addition, carriers exclude configurations where it is determined that the egress (destination)
port has not been configured with enough bandwidth or CIR. However, the means of verifying this fact is undefined.
Strong process language must be in place to ensure proper guarantee execution.
Additionally, companies must ensure that the carrier has redundant means of gathering SLA data. Some carriers exclude
lost data from throughput calculations. Other excluded items may include the following:
l
l
l
l
l
recurring transmission costs (by identifying available bandwidth without increasing CIR through
determination of unnecessary WAN traffic caused by spurious broadcasts, over-polling, and incorrect
window-sizing)
bandwidth growth by identifying rogue users or applications that are monopolizing WAN bandwidth
inappropriately
warning indication of degradation before users are affected
It is clear that even if hours are spent scouring carrier reports, there still might not be sufficient data to determine
whether or not the carrier is meeting the service-level guarantee. Additionally, key areas affecting the performance and
reliability, as well as operation costs, of the network have been overlooked. Is it therefore possible to optimize
network investment and ensure that contracted service-level guarantees are met? Yes, but only if network-management
tools designed specifically to monitor and manage service levels on carrier frame relay networks are utilized.
1. Continuously determine what WAN service levels are needed. To do this, users (subscribers and providers)
need to profile usage and baseline normal operation so they can plan WAN bandwidth capacity. Users also
need to base these service levels on actual traffic profiles (i.e., per port/circuit, host, and application). Because
traffic can be burst-intensive, they must measure these usage profiles with fine granularity.
2. Once service levels are established, they must be verified. To verify service levels, users need to monitor
performance in real time, report historical performance, and assess quality-affecting trends. These trends can
only be tracked through an historical database of performance information, including the ability to report on a
WAN SLA.
3. If service levels are not being met, then users must quickly determine why they are not being met to rapidly
resolve the problem. To do this, users need to analyze, both historically and in real time, the physical and logical
access to the cloud. Finally, the analysis of upper-layer protocol traffic on the WAN is important as the
configuration of how these protocols are to be run directly impacts their performance over the WAN service in
use.
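The verification step can be pictured as a comparison of measured monthly values against the negotiated targets. The sketch below is a simplified illustration: the ServiceLevel class, the metric names, and the sample figures are all hypothetical, and an actual SLA would also have to account for the contract's exclusions.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class ServiceLevel:
    """One negotiated service level (hypothetical structure)."""
    name: str
    target: float
    higher_is_better: bool  # True for availability, False for delay

    def is_met(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def find_violations(levels: Sequence[ServiceLevel],
                    measured: Dict[str, float]) -> List[str]:
    """Return the names of the service levels that were not met."""
    return [lvl.name for lvl in levels if not lvl.is_met(measured[lvl.name])]


# Illustrative targets and one month of measurements
sla = [
    ServiceLevel("pvc_availability_pct", 99.5, higher_is_better=True),
    ServiceLevel("round_trip_delay_ms", 140.0, higher_is_better=False),
    ServiceLevel("frame_delivery_pct", 99.0, higher_is_better=True),
]
month = {
    "pvc_availability_pct": 99.7,
    "round_trip_delay_ms": 155.0,
    "frame_delivery_pct": 99.2,
}
print(find_violations(sla, month))
```

Keeping each month's measurements in a historical store lets the same comparison double as input for trend reports.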
Each step is critical in implementing and monitoring frame relay service levels. Accordingly, it is recommended that
companies follow these guidelines:
Determine frame relay network requirements and traffic patterns. Baseline the network. Understand
applications, peak times, and areas of concentration. Are there some applications that require large frame sizes?
Are there critical sites that require redundancy? Use this baseline to design the network with the carrier. Show
the carriers how network requirements were determined.
Determine network configuration. Involve the carrier from the start. Understand their backbone technology
and how their configurations will impact the performance of applications over the WAN.
Negotiate service level agreements. Analyze and compare. Read the fine print and do the calculations. If the
negotiated network-availability guarantee is 99.5%, how many hours of outage does that mean for the network
per month? If the round-trip network delay is 140 milliseconds for a voice packet, will the conversation between
the chairman of the board and the vice president of marketing sound like a call from one of the first airline
telephones? Will my SNA transaction time out during the end-of-month accounting close? If my throughput
guarantee is 99%, how often is the application retransmitting data? How does that impact server performance?
Will the president's e-mail download freeze up while waiting for the server to be available? How are the
agreements to be reported? Web? What timeframes? What are the methods of alleviation and penalties? Is there
a maximum penalty monthly, yearly? When are the service levels implemented . . . thirty, sixty days after
installation? What is included and excluded? Does this meet company needs?
Determine a plan for monitoring the carrier. Will the carrier be monitored through a trouble-ticket system?
Through user complaints? An implementation of automated planning, reporting, troubleshooting, and problem
resolution tools is recommended. A company should let the carrier know the plan for monitoring their services.
Baseline the network during the frame-relay implementation. Utilize a system to determine baseline areas such
as the following:
Discuss the results of the baseline process with the carrier. Make adjustments to network configuration if
necessary. Many companies can actually reduce CIR on some PVCs.
Analyze network performance and reliability weekly. Companies should understand the nature of application
and user traffic and be sensitive to change. Look for trends. Is performance degrading on specific links or
PVCs? Has a new application been added? What is the impact to the network and other applications?
Meet with the carrier monthly and compare statistics. Determine courses of action if there are trouble spots.
Seek compensation for service-level guarantees that are unmet. Review possible optimization plans and set the
process in motion for change.
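One of the calculations suggested in the negotiation guideline above, how often an application retransmits under a given throughput guarantee, can be approximated as follows. This is a back-of-the-envelope sketch that assumes frame losses are independent and that losing any frame of a message forces a retransmission; real traffic and real protocols often behave differently, so treat the result as a rough indicator only.

```python
def retransmission_probability(delivery_ratio: float,
                               frames_per_message: int) -> float:
    """Chance that at least one frame of a message is lost.

    Assumption: each frame is delivered independently with probability
    delivery_ratio (for example 0.99 for a 99% throughput guarantee).
    """
    if not 0.0 <= delivery_ratio <= 1.0:
        raise ValueError("delivery_ratio must be between 0 and 1")
    if frames_per_message < 1:
        raise ValueError("frames_per_message must be at least 1")
    return 1.0 - delivery_ratio ** frames_per_message


for frames in (1, 10, 100):
    p = retransmission_probability(0.99, frames)
    print(f"{frames:>3} frames per message: {p:.1%} chance of a retransmission")
```

The point of the exercise is that a guarantee that looks generous per frame can still mean frequent retransmissions for applications that send large, multi-frame messages.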
The partnership with a service provider through these steps will help ensure a successful network implementation. The
added functionality of an end-to-end monitoring system assures that the complete business benefits of a WAN will be
realized.
8. Self-Test
1. The growing importance of WANs in corporations has created the need for effective SLAs.
a. True
b. False
2. Frame relay has enabled network managers to keep an eye on and control various critical performance factors.
a. True
b. False
3. Service-level guarantees require frame relay service providers to provide some type of economic relief should
they fail to meet their contractual obligations.
a. True
b. False
4. SLAs covering individual components unfortunately may contribute to higher downtime than those covering
entire networks.
a. True
b. False
5. Most carriers demand a minimum number of sites (usually ten) before they enter into negotiation of SLAs.
a. True
b. False
6.
d. "acts of God"
7.
b. customer service
c. cost
e. b and c only
8.
b. PVC availability
9.
b. 80%
c. 99.5%
d. 45%
10.
b. monitored
c. verified
d. ignored
Correct Answers
1. The growing importance of WANs in corporations has created the need for effective SLAs.
a. True
b. False
See Topic 1.
2. Frame relay has enabled network managers to keep an eye on and control various critical performance factors.
a. True
b. False
See Topic 2.
3. Service-level guarantees require frame relay service providers to provide some type of economic relief should
they fail to meet their contractual obligations.
a. True
b. False
See Topic 2.
4. SLAs covering individual components unfortunately may contribute to higher downtime than those covering
entire networks.
a. True
b. False
See Topic 3.
5. Most carriers demand a minimum number of sites (usually ten) before they enter into negotiation of SLAs.
a. True
b. False
See Topic 4.
6.
7.
8.
9.
10.
9. Acronym Guide
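CIR: committed information rate
CPE: customer premises equipment
HTTP: hypertext transfer protocol
PING: packet Internet groper
SLA: service level agreement
SNA: systems network architecture
SNMP: simple network management protocol
TCP/IP: transmission control protocol/Internet protocol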
Last modified Wednesday, 10-Feb-1999 13:30:24 CST
Copyright 1998
The International Engineering Consortium