Unit III & IV
6. Associate M1 through M5 interfaces in ATM management (Figure 10.9 and Figure 10.10) with TMN reference
points and TMN service interfaces.
7. CMISE services are listed in Table A.3. Map these services, wherever possible, to SNMPv1 services.
9. TMN can be applied to ATM switch management using either SNMP or CMIP specifications. Research the ATM
Forum specifications referenced in Table 12.3 and identify the OBJECT IDENTIFIERS for the two modules,
atmfM4CmipNEView and atmfM4SnmpView.
10. The ATM objects are defined under the node informationModule(0), which is the subclass of atmfCmipNEView.
Five managed object classes are defined under the informationModule, which are atmfM4ObjectClass,
atmfM4Package, atmfM4Attribute, atmfM4NameBinding, and atmfM4Action. Draw a naming tree for these,
explicitly identifying the object ID.
11. Figure 10.18 shows mapping of eTOM Level 2 processes to M.3400 Configuration Function. Do similar mapping
of eTOM Level 2 processes to M.3400 specifications for:
a. Fault Management
b. Performance Management
c. Accounting Management
Objectives
Network management and system management
Network management
o Configuration
o Fault
o Performance
o Security
o Accounting
Configuration management
o Configuration management
o Service/network provisioning
o Inventory management
Fault management
o Fault detection and isolation
o Correlation techniques for root cause analysis
Performance management
o Performance metrics
o Data monitoring
o Problem isolation
o Performance statistics
Security management
o Security policies and procedures
o Security threats
o Firewall
o Cryptography: keys, algorithms, authentication, and authorization schemes
o Secure message transfer methods
Accounting management
Report management
Policy-based management
Service level management
o Quality of service, QoS
o Service level agreement, SLA
The management of networked information services involves management of network and system resources. OSI
defines network management as a five-layer architecture. We have extended the model to include system
management and have presented the integrated architecture in Figure 11.1. At the highest level of TMN are the
functions associated with managing the business, business management. This applies to all institutions, be it a
commercial business, educational institute, telecommunications service provider, or any other organization that
uses networked systems to manage their business.
The third layer of TMN deals with network management or system management. Network management manages
the global network by aggregating and correlating data obtained from the element management systems. Likewise,
the system management aggregates and coordinates system resources by acquisition of data from the resource
management systems. The complementary functions of network and system management manage the networked
information system composed of network elements and system resources.
Our focus in this chapter will be on network management applications. As we learned in Chapter 3, there are five
different categories of applications: configuration management, fault management, performance management,
security management, and accounting management. Others [Leinwand and Conroy, 1996] have treated the five
categories of applications and presented simple and complex tools to manage them.
The subject of configuration management may be looked at not only from an operational viewpoint, but also from
engineering and planning viewpoints. In our treatment of configuration management in Section 11.1, we have
included network provisioning and inventory management. This is in addition to the configuration of network
topology, which is part of traditional network management.
Fault management involves detection of a fault as it occurs in the network, and subsequently locating the source of
the problem. We should finally isolate the root cause of the problem. This is covered in Section 11.2.
It is harder to define performance of a network in quantitative terms than in qualitative terms. For example, when a
user observes that the network performance is slow, we need to define what slowness is, and which segment of the
network is slow. On the other hand, it could be that the application, which could be running on a server, is
behaving slowly. We discuss performance management in Section 11.3. We will discuss performance metrics and
learn how to monitor a network for performance. Performance statistics play a very important part in network
management, and several system tools available for gathering statistics will be covered.
When a fault occurs in a network, either due to failure of a component or due to performance, it may manifest itself
in many places. Thus, from a centralized management system, we observe alarms coming from multiple locations.
Correlating these alarm events and finding the root cause of the problem is a challenge. We will discuss the various
correlation technologies in Section 11.4.
Security in a network is concerned with preventing illegal access to information by unauthorized personnel. It
involves not only technical issues, but also establishment of well-defined policies and procedures. We will discuss
the various issues associated with authentication and authorization in Section 11.5. We will also deal with the
establishment of secure (i.e., without illegal monitoring and manipulation) communication between the source and
the receiver.
The business health of an institution or corporation depends on well-maintained accounting management and
reporting. Reports for management have a different purpose from that of reports generated for day-to-day network
operation. There are also reports needed for the user to measure the quality of service to be provided by the service
level agreement (SLA). These are covered in Sections 11.6 and 11.7.
We have addressed the five layers of management with network elements being at the lowest level and business
management being at the top. Element management at the second layer maintains the network. Network
management at the third level and service management at the fourth level are based not just on technical issues but
also on policy issues. Once policies are established, some of them could be implemented in the system. For
example, if a network is congested due to heavy traffic, should the network parameters be automatically adjusted to
increase bandwidth, or should traffic into the network be decreased? A policy decision that has been made on this
can be implemented as part of the management system. We will discuss this in Section 11.8.
Service level management is an important aspect of network and system management. It goes beyond managing
resources. It is concerned with the SLA between the customer and the service provider regarding the quality of
service of network, systems, and business applications. This is covered in Section 11.9.
Network provisioning, also called circuit provisioning in the telephone industry, is an automated process. The
design of a trunk (circuit from the originating switching center to the destination switching center) and a special
service circuit (customized for customer specifications) is done by application programs written in operation
systems. Planning systems and inventory systems are integrated with design systems to build a system of systems.
Thus, a circuit designed for the future automatically derives its turn-up date from the planning system and ensures
that the components are available in the inventory system. Likewise, when a circuit is to be disconnected, it is
coordinated with the planning system and the freed-up components are added to the inventory system. Thus, the
design system is made aware of the availability of components for future designs.
An example of a circuit provisioning system is a system of systems developed by Bell System (before it was split),
called Trunk Integrated Record Keeping System (TIRKS). TIRKS is used in automated circuit provisioning of
trunks. A trunk is a logical circuit between two switching offices and it traverses over many facilities. TIRKS is an
operations system in the context of Telecommunications Management Network (TMN) that we dealt with in
Chapter 10. Given the requirements of a trunk, such as transmission loss and noise, type of circuit, availability
date, etc., as input to the system, the system automatically designs the components of the trunk. The designed
circuit will identify transmission facilities between switching offices and equipment in intermediate and end
offices. The equipment will be selected based on what would be available in the future when the circuit needs to be
installed.
Network provisioning in a computer communications network has different requirements. Instead of circuit-
switched connections, we have packet-switched paths for information to be transmitted from the source to the
destination. In a connectionless packet-switched circuit, each packet takes an independent path, and the routers at
various nodes switch each packet based on the load in the links. The links are provisioned based on average and
peak demands. In store-and-forward communication, excess packets can be stored in buffers in routers or
retransmitted in the event the packets are lost or discarded. In the connection-oriented circuit requirements,
permanent and switched virtual circuit demands need to be accommodated for end-to-end demands on the various
links. Network provisioning for a packet-switched network is based on performance statistics and quality of service
requirements.
Network provisioning in broadband wide area network (WAN) communication using ATM technology is more
complex. The virtual-circuit concept is always used and has to be taken into account in the provisioning process.
The switches are cell-based, in contrast to frame-based packet switching. Each ATM switch has knowledge of the
virtual path-virtual circuit (VP-VC) of each session connection only to the neighboring nodes and not end-to-end.
Each ATM switch vendor has built their proprietary assignment of VP-VC for end-to-end design into the ATM
switch. The architecture of end-to-end provisioning of ATM circuits could be either centralized or distributed, and
is based on whether the circuit is a permanent virtual circuit (PVC) or a switched virtual circuit (SVC).
Commercial products, which provision PVCs across multiple vendor products, have recently been introduced in the
market.
We have addressed the importance of inventory management in circuit provisioning. An efficient database system
is an essential part of the inventory management system. We need to be aware of all the details associated with
components, which should be accessible using different indices. Some of the obvious access keys are the
component description or part number, components that match a set of characteristics, components in use and in
spare, and components to be freed up for future use.
In Section 11.1.1 we cited the example of TIRKS, which is a system of systems. Two of the systems that TIRKS
uses are equipment inventory (E1) and facilities inventory (F1). The E1 system has an inventory of all equipment
identifying what is currently available and what will become available in the future with dates of availability.
Similar information is maintained on facilities by the F1 system. With such a detailed inventory system, TIRKS
can anticipate circuit provisioning for the future with components that would be available.
Legacy inventory management systems use hierarchical and scalar-based database systems. Such databases limit
the addition of new components and the extension of the properties of existing components by adding new fields.
These limitations can be removed by using relational database technology. Further, new NMSs, such as in OSI CMIP
and Web-based management, use object-oriented technology. These manage object-oriented managed objects. An
object-oriented relational database is helpful in configuration and inventory management in such an environment.
Network management is based on knowledge of network topology. As a network grows, shrinks, or otherwise
changes, the network topology needs to be updated automatically. This is done by the discovery application in the
NMS as discussed in Section 9.4.4. The discovery process needs to be constrained as to the scope of the network
that it discovers. For example, the arp command can discover any network component that responds with an IP
address, which can then be mapped by the NMS. If this includes workstations that are turned on only when they are
in use, the NMS would indicate failure whenever they are off. Obviously, that is not desirable. In addition, some
hosts should not be discovered for security reasons. These should be filtered out during the discovery process, so
the discovery application should have the capability to set filter parameters to implement these constraints.
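The filtering constraint described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name filter_discovered, the scope and exclusion parameters, and the sample addresses (taken from documentation address ranges) are all hypothetical and do not come from any particular NMS.

```python
import ipaddress

def filter_discovered(addresses, scope, excluded=()):
    """Keep addresses that lie inside `scope` and outside every `excluded` network.

    `scope` and `excluded` are hypothetical filter parameters for this sketch.
    """
    scope_net = ipaddress.ip_network(scope)
    excluded_nets = [ipaddress.ip_network(n) for n in excluded]
    kept = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr not in scope_net:
            continue  # outside the part of the network we want to discover
        if any(addr in net for net in excluded_nets):
            continue  # e.g., intermittently powered workstations or sensitive hosts
        kept.append(text)
    return kept

# Sample addresses as they might be read from an ARP cache (illustrative only).
discovered = ["192.0.2.1", "192.0.2.77", "192.0.2.200", "198.51.100.9"]
mapped = filter_discovered(discovered, scope="192.0.2.0/24",
                           excluded=["192.0.2.64/26"])
print(mapped)
```

Here the excluded block stands in for a workstation subnet that should not be mapped, and the address outside the scope is dropped before it reaches the topology map.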
Autodiscovery can be done using the broadcast ping on each segment and following up with further SNMP queries
to gather more details on the system. The more efficient method is to look at the ARP cache in the local router. The
ARP cache table is large and contains the addresses of all the recently communicated hosts and nodes. Using this
table, subsequent ARP queries could be sent to other routers. This process is continued until information is
obtained on all IP addresses defined by the scope of the autodiscovery procedure. A map, showing network
topology, is presented by the autodiscovery procedure after the addresses of the network entities have been
discovered.
The autodiscovery procedure becomes more complex in the virtual local area network (LAN) configuration. Figure
11.2 shows the physical configuration of a conventional LAN. The router in the figure can be visualized as part of
a backbone (not shown). There are two LAN segments connected to the router, Segment A and Segment B. They
are physically connected to two physical ports in the router (i.e., there is one port for each segment used on the
interface card). They are identified as Port A and Port B, corresponding to Segment A and Segment B,
respectively. Both LANs are Ethernet LANs and use hub configuration. Two hosts, A1 and A2, are connected to
Hub 1 on LAN Segment A and two hosts, B1 and B2, are connected to Hub 2 on LAN Segment B.
[Figure 11.2: physical configuration of a conventional LAN]
Figure 11.3 shows the logical configuration for Figure 11.2. The logical configuration is what the autodiscovery
process detects. It is very similar to the physical configuration. Segment A corresponds to LAN on Hub 1 with the
hosts A1 and A2. It is easy to conceptually visualize this and easy to configure.
[Figure 11.3: logical configuration of the LAN shown in Figure 11.2]
Let us now contrast Figure 11.2 with Figure 11.4, which shows the physical configuration of two virtual LANs
(VLAN). We notice that only one physical port, Port A, is used in the router, not two as in the case of a traditional
LAN. Hosts A1 and A2 are configured to be on VLAN 1, and hosts B1 and B2 are configured to be on VLAN 2.
Although VLAN grouping can be done on different criteria, let us assume that it is done on port basis on the
switch. Thus, the two ports marked Segment A on the switch are grouped as VLAN 1. The other two ports, marked
Segment B, are grouped as VLAN 2. Thus, Segment A corresponds to VLAN 1 and Segment B corresponds to
VLAN 2. We observe that VLAN 1 and VLAN 2 are spread across the two physical hubs, Hub 1 and Hub 2. With
a layer-2 bridged network, the VLAN network is efficient. As IEEE 802.3 standards are established and widely
adopted, this configuration has been deployed more and more, along with a backbone network.
The logical view of the physical VLAN configuration shown in Figure 11.4 is presented in Figure 11.5. We see
that Hosts A1 and A2 still belong to Segment A, but are on different hubs. Likewise, Hosts B1 and B2 belong to
Segment B. The autodiscovery process would not detect the physical hubs that are identified in Figure 11.5. In
many situations, the switch would also be transparent, as there are no IP addresses associated with switch ports.
Consequently, it would be harder to associate the logical configuration with the physical configuration.
This makes the network management task a complex one. First, two separate maps must be maintained on an
ongoing basis as changes to the network are made. Second, when a new component is added and autodiscovered by
the system, a manual procedure is needed to follow up on the physical configuration.
In the example above, we talked about grouping of VLAN based on the ports on the switches. We could also group
VLAN based on MAC address, IP address, or protocol type. Grouping by IP address has some benefits in the
management of VLAN network. The logical grouping of components based on IP network segments makes sense.
In addition, as a policy the sysLocation entity in the system group should be filled in for easier management.
Fault detection is accomplished using either a polling scheme (the NMS polling management agents periodically
for status) or by the generation of traps (management agents based on information from the network elements
sending unsolicited alarms to the NMS). An application program in NMS generates the ping command periodically
and waits for response. Connectivity is declared broken when a preset number of consecutive responses are not
received. The frequency of pinging and the preset number for failure detection may be optimized for balance
between traffic overhead and the rapidity with which failure is to be detected.
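The consecutive-miss rule in the polling scheme can be illustrated with a short Python sketch. The function name detect_failure and the parameter max_missed are hypothetical, and the list of booleans stands in for the outcomes of real ping requests, which are not issued here.

```python
def detect_failure(responses, max_missed=3):
    """Return the index of the poll at which connectivity is declared broken.

    `responses` is a sequence of booleans (True = ping answered).
    Returns None if the preset number of consecutive misses is never reached.
    """
    missed = 0
    for i, answered in enumerate(responses):
        if answered:
            missed = 0  # any response resets the consecutive-miss count
        else:
            missed += 1
            if missed >= max_missed:
                return i
    return None

# Two isolated misses are tolerated; the third consecutive miss triggers detection.
polls = [True, False, False, True, False, False, False, True]
print(detect_failure(polls, max_missed=3))
```

A smaller max_missed or a shorter polling interval detects failures sooner, at the cost of more false alarms or more polling traffic, which is the balance the text describes.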
The alternative detection scheme is to use traps. For example, the generic trap messages linkDown and
egpNeighborLoss in SNMPv1 can be set in the agents giving them the capability to report events to the NMS with
the legitimate community name. One of the advantages of traps is that failure detection is accomplished faster with
less traffic overhead.
After having located where the fault is, the next step is to isolate the fault (i.e., determine the source of the
problem). First, we should delineate the problem between failure of the component and the physical link. Thus, in
the above example, the interface card may be functioning well, but the link to the interface may be down. We need
to use various diagnostic tools to isolate the cause.
Let us assume for the moment that the link is not the problem but that the interface card is. We then proceed to
isolate the problem to the layer that is causing it. It is possible that excessive packet loss is causing disconnection.
We can measure packet loss by pinging, if pinging can be used. We can query the various Management
Information Base (MIB) parameters on the node itself or other related nodes to further localize the cause of the
problem. For example, error rates calculated from the interface group parameters, ifInDiscards, ifInErrors,
ifOutDiscards, and ifOutErrors with respect to the input and output packet rates, could help us isolate the problem
in the interface card.
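As a rough illustration of the error-rate calculation, the following Python sketch computes input error and discard rates from two successive samples of interface counters. It is a sketch under assumptions: the packet count is taken from ifInUcastPkts (a MIB-II interface counter not named in the text above), the sample values are invented, and counter wraparound is ignored.

```python
def input_error_rates(prev, curr):
    """Compute input error and discard rates between two counter samples.

    Each sample is a dict of interface-group counters. Returns None when no
    packets were received in the interval, since the rates are then undefined.
    """
    packets = curr["ifInUcastPkts"] - prev["ifInUcastPkts"]
    if packets <= 0:
        return None
    return {
        "error_rate": (curr["ifInErrors"] - prev["ifInErrors"]) / packets,
        "discard_rate": (curr["ifInDiscards"] - prev["ifInDiscards"]) / packets,
    }

# Invented sample values for two polls of the same interface.
prev = {"ifInUcastPkts": 100_000, "ifInErrors": 10, "ifInDiscards": 5}
curr = {"ifInUcastPkts": 110_000, "ifInErrors": 60, "ifInDiscards": 105}
rates = input_error_rates(prev, curr)
print(rates)
```

The same calculation applies to the output side with ifOutErrors and ifOutDiscards. A rate that stays high on one interface while its neighbors are clean points toward the interface card rather than the network at large.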
The ideal solution to locating and isolating the fault is to have an artificial intelligence solution. By observing all
the symptoms, we might be able to identify the source of the problem. There are several techniques to accomplish
this, which we will address in Section 11.4.
We have already addressed performance management applications directly and indirectly under the various
headings. We discussed two popular protocol analyzers, Sniffer and NetMetrix, in Chapter 9. In Section 9.2, we
used the protocol analyzer as a system tool for traffic monitoring on Ethernet LANs, which is in the realm
of performance management. We looked at load monitoring based on various parameters such as source and
destination addresses, protocols at different layers, etc. We addressed traffic statistics collected over periods
ranging from hours to a year using the Multi Router Traffic Grapher (MRTG) tool in Section 9.2.4. The statistics
obtained using a protocol analyzer as a remote monitoring (RMON) tool were detailed in the case study in Section 8.6.
We noticed how we were able to obtain the overall trend in Internet-related traffic and the type of traffic.
Performance of a network is a nebulous term, which is hard to define or quantify in terms of global metrics. The
purpose of the network is to carry information and thus performance management is really (data) traffic
management. It involves the following: data monitoring, problem isolation, performance tuning, analysis of
statistical data for recognizing trends, and resource planning.
The parameters that can be attributed to defining network performance on a global level are throughput, response
time, network availability, and network reliability. The metrics on these are dependent on what, when, and where
the measurements are made. Real-time traffic performance metrics are latency (i.e., delay) and jitter, which are
addressed in Section 11.3.4.
These macro-level parameters can be defined in terms of micro-level parameters. Some of the parameters that
impact network throughput are bandwidth or capacity of the transmission media, its utilization, error rate of the
channel, peak load, and average load of the traffic. These can be measured at specific points in the network. For
example, bandwidth or capacity will be different in different segments of the network. An Ethernet LAN with a
capacity of 10 Mbps can function to full capacity with a single workstation on it, but reaches full capacity with a
utilization factor of 30-40% when densely populated with workstations. This utilization factor can further be
defined in terms of collision rate, which is measurable. In contrast, in a WAN, the bandwidth is fully utilized
except for the packet overhead.
The response time of a network not only depends on the throughput of the network, but also on the application; in
other words, it depends on both the network and system performance. Thus, in a client-server environment, the
response time as seen by the client could be slow either due to the server being heavily used, or the network traffic
being overloaded, or a combination of both. According to Feldmeir [1997], "the application responsiveness on the
network, more than any other measure, reflects whether the network is meeting the end users' expectations and
requirements." He defines three types of metrics to measure application responsiveness: application availability,
response time between the user and the server, and the burst frame rate, which is the rate at which the requested
data arrive at the user station.
The IETF Network Working Group has developed several Requests for Comments (RFCs) on traffic flow measurement.
RFC 2063 defines the architecture for the measurement and reporting of network traffic flows. The network is
characterized as traffic passing through four representative levels, as shown in Figure 11.6. Backbone networks are
those that are typically connected to other networks, and do not have individual hosts connected to them. A
regional network is similar to a backbone, but smaller. It may have individual hosts connected to it. Regional
networks are subscribers to a backbone network. Stub/enterprise networks connect hosts and LANs and are
subscribers to regional and backbone networks. End systems or hosts are subscribers to all of the above.
[Figure 11.6: representative levels for traffic flow measurement; recoverable labels: International, Backbones/National, Stub/Enterprise]
The architecture defines three entities for traffic flow measurements: meters, meter readers, and managers. Meters
observe network traffic flows and build up a table of flow data records for them. Meter readers collect traffic flow
data from meters. Managers oversee the operation of meters and meter readers. RFC 2064 defines the MIB for the
meter. RFC 2123 describes NeTraMet, an implementation of a flow meter based on RFC 2063 and RFC 2064.
Abnormal performance behavior in the network, such as high collision rate in Ethernet LAN or excessive packet
drop due to overload, is detected by traps generated by agents and RMON. Performance-related issues are
detected primarily using trap messages generated by RMON probes. Thresholds are set for
important SNMP parameters in the RMON, which then generates alarms when the preset thresholds are crossed.
For example, the parameters in the alarm group and the event group in RMON MIB [RFC 1757] may be set for the
object identifier to be monitored. The time interval over which the data are to be collected for calculation and the
rising and falling thresholds are specified. In addition, the community names are set for who would receive the
alarm. Although we classify such alarms under performance category, they could also be defined under fault
category as is done in Section 9.4.6.
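The use of paired rising and falling thresholds can be sketched as follows. This is a simplified Python model of the hysteresis idea, not the RMON alarm group itself: the function name threshold_events, the sample values, and the choice to start with both directions armed are assumptions made for illustration.

```python
def threshold_events(samples, rising, falling):
    """Return (index, kind) events for rising and falling threshold crossings.

    After a rising event, no further rising event is generated until the value
    has dropped to the falling threshold, and vice versa. This hysteresis keeps
    a value that hovers near one threshold from producing a flood of alarms.
    """
    events = []
    armed_rising = True   # assumption: both directions are armed at start
    armed_falling = True
    for i, value in enumerate(samples):
        if value >= rising and armed_rising:
            events.append((i, "rising"))
            armed_rising = False
            armed_falling = True
        elif value <= falling and armed_falling:
            events.append((i, "falling"))
            armed_falling = False
            armed_rising = True
    return events

# Invented utilization samples (percent), one per sampling interval.
samples = [60, 85, 90, 70, 95, 40, 30, 88]
alarms = threshold_events(samples, rising=80, falling=50)
print(alarms)
```

Note that the samples at 90 and 95 do not generate new rising events, because the value has not yet fallen to the falling threshold since the first alarm.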
NMSs generally report all events selected for display including alarms. Alarms are set for criticality and the icon
changes color based on the criticality. Depending on the implementation, the alarm is either automatically cleared
when the alarm condition clears or is manually cleared by an operator. The latter case is useful for alerting the
operations personnel as to what happened.
Problem isolation for performance-related issues depends on the type of problem. As we have indicated before, a
high percentage of packet loss will cause loss of connectivity, which could be intermittent. In this situation,
monitoring the packet loss over an extended period will isolate the problem. Another example is the performance
problem associated with large delay. This may be attributable to an excessive drop of packets. We can identify the
source of the packet delay from a route-tracing procedure and then probe for the packet discards at that node. Refer
to [Rose and McCloghrie, 1995] for a detailed analysis of the performance degradation cases in various
components and media.
As in fault management, problems could occur at multiple locations simultaneously. These could be reported to the
central management system as multiple independent events although they may be correlated. For example, an
excessive drop in packets in one of the links may switch the traffic to an alternate route. This could cause an
overload in that link, which will be reported as an alarm. A more sophisticated approach using the correlation
technology is again required here, which we will discuss in Section 11.4.
Performance statistics are used in tuning a network, validating the SLA (which will be covered in Section 11.9),
analyzing use trends, planning facilities, and functional accounting. Data are gathered by means of an RMON
probe and RMON MIB for statistics. Statistics, to be accurate, require large amounts of data sampling, which
create overhead traffic on the network and thus impact its performance. One of the enormous benefits of using
an RMON probe for collecting statistics is that it can be done locally without degrading the overall performance of the
network. An RMON MIB contains the history and statistics groups (see Section 8.4) for various media and can be
used efficiently to collect the relevant data and store them for current or future analysis.
One application of the results obtained from performance statistics is to tune the network for better performance.
For example, two segments of the network may be connected by a gateway and excessive intersegment traffic
could produce excessive performance delay. Error statistics on dropped packets on the gateway interfaces would
manifest this problem. The solution to resolve this problem is to increase the bandwidth of the gateway by either
increasing its capacity or by adding a second gateway between the segments. Of course, adding the extra gateway
could cause configuration-related problems and hence reconfiguration of traffic may be needed.
Various error statistics at different layers are gathered to measure the quality of service and to do performance
improvement, if needed. Some of the other performance parameters that can be tuned by monitoring network
statistics are bandwidth of links, utilization of links, and controlling peak-to-average ratio of inherently bursty data
traffic. In addition, traffic utilization can be improved by redistributing the load during the day, with essential
traffic occupying the busy hours and non-essential traffic the slack hours, with the latter being store and forward.
An important performance criterion in real-time traffic in broadband service is the latency or delay caused by
dispersion in large bandwidth signal. This affects the quality of service due to performance degradation.
Another important statistic, especially in real-time broadband services, is the variation in network delay, otherwise
known as jitter. This impacts the quality of service (QoS) guaranteed to the customer by the SLA. For example, in
a cable modem, a managed object docsQoSServiceClassMaxJitter (see the DOCSIS Quality of Service MIB in
Table 13.3) is used to monitor the jitter.
Another performance application is validation of SLA between the service user and the service provider. The SLA
may require limiting input to the service provider network. If the packet rate is tending toward the threshold of
SLA, one may have to control the bandwidth of the access to the service provider network. This can be achieved by
implementing application interfaces that use algorithms such as the leaky bucket or the token bucket [Tanenbaum,
1996]. The leaky bucket algorithm limits the maximum output data rate, and the token bucket algorithm controls its
average value. Combining the two, we can tune the peak-to-average ratio of the output. Some ATM switches have
such interfaces built into them, which are easily tunable. This would be desirable if a service provider's pricing is
based on peak data rate usage instead of average rate.
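To make the token bucket idea concrete, here is a minimal discrete-time sketch in Python. It is an illustration under assumptions, not the interface of any ATM switch: the class name TokenBucket, the parameters rate and capacity, and the choice to drop (rather than queue) packets that find no token are all hypothetical.

```python
class TokenBucket:
    """Discrete-time token bucket: one token admits one packet."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per time step (long-run average rate)
        self.capacity = capacity  # maximum stored tokens (largest admitted burst)
        self.tokens = capacity    # assumption: the bucket starts full

    def tick(self, arrivals):
        """Advance one time step and return how many arriving packets are admitted."""
        self.tokens = min(self.capacity, self.tokens + self.rate)
        admitted = min(arrivals, self.tokens)
        self.tokens -= admitted
        return admitted  # packets beyond this are dropped in this sketch

bucket = TokenBucket(rate=2, capacity=5)
arrivals = [8, 0, 0, 6, 1]
admitted = [bucket.tick(n) for n in arrivals]
print(admitted)
```

The burst of 8 packets is clipped to the bucket capacity of 5, while idle steps let tokens accumulate for the next burst. Over a long run the admitted traffic cannot exceed `rate` packets per step on average, which is the sense in which the token bucket controls the average value.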
Performance statistics are also used to project trends in traffic use and to plan future resource
requirements. Statistical data on traffic are collected and periodic reports are generated to study use trends and
project needs. Thus, trend analysis is helpful for future resource planning.
Statistics can be gathered, as we saw in Section 9.2, on the network load created by various users and applications.
These can be used to do functional accounting so that the cost of operation of a network can be charged to the users
of the network or at least be justified.
We have illustrated some simple methods to diagnose and isolate the source of a problem in fault and performance
management. When a centralized NMS receives a trap or a notification, it is called receiving an event. A single
problem source may cause multiple symptoms, and each symptom detected is reported as an independent event to
the management system. Obviously, we do not want to treat each event independently and act to resolve it. Thus, it
is important that the management system correlates all these events and isolates the root cause of the problem. The
techniques used for accomplishing this are called event correlation techniques.
There are several correlation techniques used to isolate and localize fault in networks. All are based on (1) detecting
and filtering of events, (2) correlating observed events to isolate and localize the fault either topologically or
functionally, and (3) identifying the cause of the problem. In all three cases, there is intelligence or reasoning
behind the methods. The reasoning methods distinguish one technique from another.
We will discuss six approaches to correlation techniques. They are (1) rule-based reasoning, (2) model-based
reasoning, (3) case-based reasoning, (4) codebook, (5) state transition graph model, and (6) finite state machine
model. See Lewis [1999] for a detailed comparison of the various methods.
11.4.1 Rule-Based Reasoning
Rule-based reasoning (RBR) is the earliest form of correlation technique. It is also known by many other names
such as rule-based expert system, expert system, production system, and blackboard system. It has a knowledge
base, working memory, and an inference engine, as shown in Figure 11.7 [Cronk et al., 1988; Lewis, 1994]. The
three levels representing the three components are the knowledge level, the data level, and the control level,
respectively. Cronk et al. are also a good source for review of network applications of RBR. The knowledge base
contains expert knowledge as to (1) definition of a problem in the network and (2) action that needs to be taken if a
particular condition occurs. The knowledge base information is rule-based in the form of if-then or condition-
action, containing rules that indicate which operations are to be performed when. The working memory contains,
as working memory elements, the topological and state information of the network being monitored. When the
network goes into a faulty state, it is recognized by the working memory. The inference engine, in cooperation with
the knowledge base, compares the current state with the left side of the rule base and finds the closest match to
output the right side of the rule. The knowledge base then executes an action on the working memory element.
Figure 11.7. Basic Rule-Based Reasoning Paradigm
In Figure 11.7, the rule-based paradigm is interactive between the three components and is iterative. There are
several strategies for the rule-based paradigm. A specific strategy is implemented in the inference engine.
Choosing a specific rule, an action is performed on the working memory element, which could then initiate another
event. This process continues until the correct state is achieved in the working memory.
Rules are made up in the knowledge base from the expertise of the experts in the field. The rule is an exact match
and the action is very specific. If the antecedent in the rule does not match, the paradigm breaks and it is called
"brittle." However, it can be fixed by adding more rules, which would increase the database size and degrade the
performance, referred to as knowledge acquisition bottleneck. There is an exponential growth in size as the number
of working memory elements grows.
In addition, the action is specific, which could cause unwanted behavior. For example, we can define the alarm
condition for packet loss as follows:
The left side conditions are the working memory elements, which if detected would execute the appropriate rule
defined in the rule-base. As we can see, this could cause the alarm condition to flip back and forth in boundary
cases. An application of fuzzy logic is used to remedy this problem [Lewis, 1994], but it is harder to implement.
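The boundary behavior can be seen in a small sketch. The threshold values (10% and 15%) and the function names below are illustrative assumptions, not the rule set referred to above; the point is only that crisp, exact-match conditions change the alarm color on every sample when the measured value hovers around a threshold.

```python
# Hypothetical crisp thresholds, chosen only for illustration.
YELLOW_AT = 10.0  # percent packet loss (assumed)
RED_AT = 15.0     # percent packet loss (assumed)


def alarm_color(packet_loss_pct):
    """Exact-match rule set: each condition maps to exactly one alarm color."""
    if packet_loss_pct >= RED_AT:
        return "red"
    if packet_loss_pct >= YELLOW_AT:
        return "yellow"
    return "green"


def count_flips(samples):
    """Count how often the alarm color changes over a series of samples."""
    colors = [alarm_color(s) for s in samples]
    return sum(1 for a, b in zip(colors, colors[1:]) if a != b)


# Packet loss hovering around the assumed 10% boundary.
samples = [9.9, 10.1, 9.8, 10.2, 9.9]
print([alarm_color(s) for s in samples])  # green, yellow, green, yellow, green
print(count_flips(samples))               # 4
```

A fuzzy-logic or hysteresis-based rule would damp these flips, at the cost of a more involved rule definition.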
The RBR is used in Hewlett-Packard OpenView Element Management Framework [Hajela, 1996]. Figure 11.8 is
an adaptation of the scenario from [Hajela, 1996] to illustrate an implementation of RBR. It shows a four-layer
network. Backbone Router A links to Router B. Hub C, connected to Router B, has four servers, D1 through D4, in
its LAN. Without a correlation engine, failure in the interface of Router A will generate an alarm. This fault then
propagates to Router B, Hub C, and finally to Servers D1 through D4. It is important to realize that there is a time
delay involved in the generation of alarms. In general, propagation of faults and time delay associated with them
need to be recognized as such in fault management.
Four correlation rules are specified in Figure 11.9. Rule 0 has no condition associated with it. Rules 1-3 are
conditional rules. In order to allow for the propagation time, a correlation window of 20 seconds is set.
The inference engine at the control level interprets the above rules and takes the actions shown in Figure 11.10.
Figure 11.10. Control Actions for the RBR Example of Figure 11.8
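A minimal sketch of window-based correlation for this scenario follows. The rule form used here (suppress an alarm when the upstream component raised one within the correlation window) is an assumption made for illustration and is not a transcription of the rules in Figure 11.9; the topology table and function name are likewise hypothetical.

```python
CORRELATION_WINDOW = 20.0  # seconds, as stated in the text

# Upstream neighbor of each component in the Router A / Router B / Hub C / Servers D1-D4 scenario.
UPSTREAM = {"B": "A", "C": "B", "D1": "C", "D2": "C", "D3": "C", "D4": "C"}


def correlate(alarms):
    """alarms: list of (time_seconds, component), sorted by time.

    Returns the alarms to display. An alarm is suppressed when its upstream
    component raised an alarm within the correlation window (assumed rule form).
    """
    last_alarm_time = {}
    displayed = []
    for t, comp in alarms:
        parent = UPSTREAM.get(comp)
        parent_time = last_alarm_time.get(parent)
        related = parent_time is not None and t - parent_time <= CORRELATION_WINDOW
        if not related:
            displayed.append((t, comp))
        last_alarm_time[comp] = t
    return displayed


# A failure at Router A propagates downstream with some delay.
alarms = [(0, "A"), (5, "B"), (9, "C"), (14, "D1"), (15, "D2"), (16, "D3"), (17, "D4")]
print(correlate(alarms))  # [(0, 'A')]
```

Only the root-cause alarm from Router A is displayed; an alarm from Router B arriving well after the window would be treated as a separate problem.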
Several commercial systems have been built using RBR. Some examples are Computer Associates TNG and Tivoli
TME.
An event correlator based on model-based reasoning is built on an object-oriented model associated with each
managed object. A model is a representation of the component it models. The model, in the traditional
object-oriented representation, has attributes, relations to other models, and behaviors. The relationship between
objects is reflected in a similar relationship between models.
Let us picture a network of hubs that are connected to a router, as shown in the left half of Figure 11.11. The right
half shows the corresponding model in the event correlator in the NMS. The NMS pings every hub and the router
(really, a router interface to the backbone network) periodically to check whether each component is working. We
can associate communication between the NMS and a managed component as between a model (software object)
in the NMS/correlator and its counterpart of managed object. Thus, in our example, the model of each hub
periodically pings its hub and the model of the router pings the router. As long as all the components are working,
no additional operation is needed.
If Hub 1 fails, it is recognized by the H1 model. Let us assume that the H1 model is programmed to wait for lack of
response in three consecutive pings. After three pings with no response, the H1 model suspects a failure of Hub 1.
However, before it declares a failure and displays an alarm, it analyzes its relation to other models and recognizes
that it should query the router model. If the router model responds that the router is working, only then the Hub 1
alarm is triggered. If the router model responds that it is not receiving a response from the router, then the Hub 1
model deduces that the problem is with the router and not Hub 1. At least, it cannot definitively determine a Hub 1
failure as long as it cannot communicate with Hub 1 because of the router failure.
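The hub and router models can be sketched as two small classes. This is an illustrative sketch only: the class names, the return strings, and the `ping` callables (stand-ins for real probes) are assumptions, not part of any product's interface.

```python
class RouterModel:
    """Software model of the router; ping is a stand-in for a real probe."""

    def __init__(self, ping):
        self.ping = ping  # callable returning True when the router responds

    def is_working(self):
        return self.ping()


class HubModel:
    """Software model of a hub that consults the router model before alarming."""

    MAX_MISSED = 3  # consecutive missed pings before a failure is suspected

    def __init__(self, name, ping, router_model):
        self.name = name
        self.ping = ping                  # callable returning True on response
        self.router_model = router_model  # relation to another model
        self.missed = 0

    def poll(self):
        """Ping once; return 'ok', 'waiting', 'hub-alarm', or 'router-problem'."""
        if self.ping():
            self.missed = 0
            return "ok"
        self.missed += 1
        if self.missed < self.MAX_MISSED:
            return "waiting"
        # Failure suspected: query the related router model before alarming.
        if self.router_model.is_working():
            return "hub-alarm"
        return "router-problem"


router_up = RouterModel(ping=lambda: True)
h1 = HubModel("Hub1", ping=lambda: False, router_model=router_up)
print([h1.poll() for _ in range(3)])  # ['waiting', 'waiting', 'hub-alarm']
```

With a router model whose ping fails, the third poll returns 'router-problem' instead, and no hub alarm is raised.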
The above example is modeled after [Lewis, 1999], who presents an interesting scenario of a classroom with a
teacher. Outside the classroom is a computer network with a router and workstations. Each student is a model
(software mirror) of the workstation. The teacher is a model of the router. Each student communicates with his or
her real-world counterpart, which is the workstation outside the classroom. The teacher communicates with the
router. If a student fails to communicate with his or her workstation, he or she queries the teacher as to whether the
teacher could communicate with the teacher's router. Depending on the yes or no answer of the teacher, the student
declares a "fail" (yes) or "no-fail" (no) condition, respectively. Model-based reasoning is implemented in
Cabletron Spectrum.
Case-based reasoning (CBR) overcomes many of the deficiencies of RBR. In RBR, the unit of knowledge is a rule;
whereas in CBR, the unit of knowledge is a case [Lewis, 1995]. The intuition of CBR is that situations repeat
themselves in the real world, and that what was done in one situation is applicable to others in a similar, but not
necessarily identical, situation. Thus, when we try to resolve a trouble, we start with the case that we have
experienced before [Kolodner, 1997; Lewis, 1995]. Kolodner treats CBR from an information management
viewpoint, and Lewis applies it specifically to network management.
The general CBR architecture is shown in Figure 11.12 [Lewis, 1995]. It consists of four modules: input, retrieve,
adapt, and process, along with a case library. The CBR approach uses the knowledge gained before and extends it
to the current situation. The former episodes are stored in a case library. If the current situation, as received by the
input module, matches one that is present in the case library (as compared by the retrieve module), it is applied. If
it does not, the closest situation is chosen by the adapt module, and adapted to the current episode to resolve the
problem. The process module takes the appropriate action(s). Once the problem is resolved, the newly adapted case
is added to the case library.
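The retrieve, adapt, and process cycle can be sketched as follows. The case contents, the attribute-counting similarity measure, and all function names are assumptions made for illustration; a real system would use a richer matching and adaptation strategy.

```python
# Hypothetical case library: each case pairs a trouble description with a resolution.
case_library = [
    {"trouble": {"symptom": "slow_transfer", "segment": "lan1"}, "resolution": "reduce_load"},
    {"trouble": {"symptom": "no_response", "segment": "lan2"}, "resolution": "restart_hub"},
]


def similarity(a, b):
    """Number of attribute values two trouble descriptions share."""
    return sum(1 for k in a if k in b and a[k] == b[k])


def retrieve(trouble):
    """Retrieve module: return the closest stored case (exact or partial match)."""
    return max(case_library, key=lambda case: similarity(trouble, case["trouble"]))


def adapt(case, trouble):
    """Adapt module: reuse the stored resolution for the current trouble."""
    return {"trouble": dict(trouble), "resolution": case["resolution"]}


def process(new_case):
    """Process module: act on the resolution, then store the new case."""
    if new_case not in case_library:
        case_library.append(new_case)
    return new_case["resolution"]


def resolve(trouble):
    """Input module entry point: run the full cycle for one trouble report."""
    return process(adapt(retrieve(trouble), trouble))


print(resolve({"symptom": "slow_transfer", "segment": "lan3"}))  # reduce_load
print(len(case_library))                                         # 3
```

The second print shows that the adapted case has been added to the library, so the same trouble is an exact match the next time it occurs.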
Lewis also describes the application of CBR in a trouble-tracking system, CRITTER [Lewis, 1996]. The CRITTER
application has evolved into a CBR application for network management named SpectroRx built by Cabletron.
When a trouble ticket is created on a network problem, it is compared to similar cases in the case library containing
previous trouble tickets with resolutions. The current trouble is resolved by adapting the previous case in one of
three ways: (1) parameterized adaptation, (2) abstraction/respecialization adaptation, and (3) critic-based
adaptation. The resolved trouble ticket is then added to the case library. We will use the examples given in the
reference to illustrate the three adaptation methods.
Parameterized Adaptation. Parameterized adaptation is used when a similar case exists in the case library, but the
parameters may have to be scaled to resolve the current situation. Consider the current trouble with
file_transfer_throughput, which matches the following trouble ticket in the case library, shown in Figure 11.13.
In the parameterized adaptation of the trouble ticket shown in Figure 11.14, variable F has been modified to F' and
the relationship between network load adjustment variable A' and F' remains the same as between A and F. In the
default situation, where there is an exact match, F' and A' are F and A.
Trouble: file_transfer_throughput=F'
Additional data: none
Resolution: A'=f(F'), adjust_network_load=A'
Resolution status: good
Abstraction/Respecialization Adaptation. Figure 11.15 shows three trouble tickets. The first two are two cases from
the case library that matched the current problem we have been discussing. The first option adjusts the network
load, and the second option adjusts the bandwidth of the network. The user or the system has the option of adapting
either of the two based on restrictions to be placed on adjusting the workload or adjusting the bandwidth. Let us
choose the option of not restricting the network load, which implies that we have to increase the bandwidth. We
can add this as additional data to the trouble ticket that chooses the bandwidth option and create a new trouble
ticket, which is shown as the third trouble ticket in Figure 11.15. This is now added to the case library.
Trouble: file_transfer_throughput=F
Additional data: none
Resolution: A=f(F), adjust_network_load=A
Resolution status: good
Trouble: file_transfer_throughput=F
Additional data: none
Resolution: B=g(F), adjust_network_bandwidth=B
Resolution status: good
Trouble: file_transfer_throughput=F
Additional data: adjust_network_load=no
Resolution: B=g(F), adjust_network_bandwidth=B
Resolution status: good
This CBR adaptation is referred to as abstraction/respecialization adaptation. Choosing to adjust bandwidth and not
load is a policy decision, which we will discuss in Section 11.8.
Critic-Based Adaptation. The third adaptation, critic-based adaptation, is one where a critic or a craft person
decides to add, remove, reorder, or replace an existing solution. Figure 11.16 shows an example where the
network_load has been added as an additional parameter in adjusting the network load, and resolution A is a
function of two variables, F and N. This is added as a new case to the case library.
Trouble: file_transfer_throughput=F
Additional data: network_load=N
Resolution: A=f(F, N), adjust_network_load=A
Resolution status: good
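The three adaptation methods can be contrasted in a short sketch that treats a trouble ticket as a dictionary. The functions f and g below are hypothetical placeholders (the text does not define them), and the ticket field names are assumptions chosen to mirror the tickets above.

```python
def f(F):
    """Hypothetical load adjustment as a function of throughput F."""
    return 100.0 / F


def g(F):
    """Hypothetical bandwidth adjustment as a function of throughput F."""
    return 2.0 * F


load_ticket = {"F": 10.0, "action": "adjust_network_load",
               "function": f, "value": f(10.0), "additional_data": {}}
bw_ticket = {"F": 10.0, "action": "adjust_network_bandwidth",
             "function": g, "value": g(10.0), "additional_data": {}}


def parameterized(ticket, new_F):
    """Same resolution function, re-evaluated for the new throughput F'."""
    return {**ticket, "F": new_F, "value": ticket["function"](new_F)}


def abstraction_respecialization(candidates, additional_data):
    """Pick the stored case whose action is not ruled out by the additional data."""
    for ticket in candidates:
        if additional_data.get(ticket["action"]) != "no":
            return {**ticket, "additional_data": additional_data}
    return None


def critic_based(ticket, new_function, N):
    """A critic replaces the resolution function with one that also uses N."""
    return {**ticket, "additional_data": {"network_load": N},
            "function": new_function, "value": new_function(ticket["F"], N)}


t1 = parameterized(load_ticket, 20.0)
t2 = abstraction_respecialization([load_ticket, bw_ticket], {"adjust_network_load": "no"})
t3 = critic_based(load_ticket, lambda F, N: f(F) + N, 3.0)
print(t1["value"], t2["action"], t3["value"])  # 5.0 adjust_network_bandwidth 13.0
```

In each case the returned dictionary is the new ticket that would be added to the case library.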
CBR-Based CRITTER. The architecture of CRITTER is shown in Figure 11.17 [Lewis, 1996]. It is integrated with
the NMS, Spectrum. The core modules of CRITTER are the four basic modules of the CBR system shown in
Figure 11.12: input, retrieve, adapt, and process. There is a fifth additional module, propose, which displays
potential solutions found by the reasoning module and allows the user to inspect and manually adapt these
solutions.
The input module receives its input from the fault detection module of Spectrum. The process module updates the
ticket library with the new experience. The retrieve module uses determinators to retrieve a group of tickets from
the library that are similar to an outstanding ticket. The initial set of determination rules is based on expertise
knowledge and is built into the determinators module. The application technique is the strategy used by the adapt
module. User-based adaptation is the interface module for the user to propose critic-based adaptation.
Comparing RBR and CBR. [Kolodner, 1997] distinguishes the differences between RBR and CBR. In RBR, the
retrieval is done on exact match, whereas in CBR the match is done on a partial basis. RBR is applied to an
iterative cycle of microevents. CBR is applied as a total solution to the trouble and then adapted to the situation on
hand.
Algorithms have been developed to correlate events that are generated in networks based on modeling of the
network and the behavior of network components. Because they are based on algorithms, claims are made that they
do not require expert knowledge to associate the events with problems. Although this is true, we still need expert
knowledge in selecting the right kinds of input that are to be fed to the correlator to develop an efficient system.
Figure 11.18 [Kliger et al., 1995] shows the architecture of a model-based event correlation system. We should
caution that the "model-based event correlation" should not be confused with the term "model-based reasoning
approach" that we discussed in Section 11.4.2. As the heading states, we will refer to this as codebook correlation.
Monitors capture alarm events and input them to the correlator. The configuration model contains the configuration
of the network. The event model represents the various events and their causal relationships (we will soon define
the causality relationship). The correlator correlates the alarm events with the event model and determines the
common problems that caused the alarm event.
One of the correlation algorithms based on generic modeling is a coding approach to event correlation [Kliger et
al., 1995]. In this approach, problem events are viewed as messages generated by a system and "encoded" in sets of
alarms that they cause. The function of the correlator is to "decode" those problem messages to identify the
problems. Thus, the coding technique comprises two phases. In the first phase, called the codebook selection
phase, problems to be monitored are identified and the symptoms or alarms that each of them generates are
associated with the problem. (As we stated at the beginning of this approach, this is where expert knowledge is
needed.) This produces a problem-symptom matrix. In the second phase, the correlator compares the stream of
alarm events with the codebook and identifies the problem.
In order to generate the codebook matrix of problem-symptom, let us first consider a causality graph, which
represents symptom events caused by other events. An example of such a causality graph is shown in Figure 11.19.
Each node in the graph represents an event. Nodes are connected by directed edges, with edges starting at a causing
event and terminating at a resulting event. For example, event E1 causes events E4 and E5. Notice that events E1,
E2, and E3 have the directed edges only going out from them and none coming into them. We can identify these
nodes as problem nodes and the rest as symptom nodes, as they all have at least one directed edge pointing inward.
With problems labeled as Ps and symptoms as Ss, the newly labeled causality graph of Figure 11.19 is shown in
Figure 11.20. There are three problem nodes, P1, P2, and P3, and four symptom nodes S1, S2, S3, and S4. We
have eliminated those directed arrows where one symptom causes another symptom, as it does not add any
additional information to the overall causality graph.
We can now generate a codebook of problem-symptom matrix for the causality graph of Figure 11.20 (we will
drop the qualifier "labeled" from now on). This is shown in Figure 11.21 with three columns as problems and four
rows as symptoms.
     P1  P2  P3
S1   1   1   0
S2   1   1   1
S3   0   1   1
S4   0   0   1
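The two phases can be sketched as follows, using the problem-to-symptom relationships read off the matrix above. The dictionary layout and function names are assumptions made for illustration, and the decoder shown requires an exact codeword match.

```python
# Problem -> symptoms it causes, read off the problem-symptom matrix above.
CAUSES = {
    "P1": {"S1", "S2"},
    "P2": {"S1", "S2", "S3"},
    "P3": {"S2", "S3", "S4"},
}
SYMPTOMS = ["S1", "S2", "S3", "S4"]


def codebook(causes, symptoms):
    """Phase 1: return {problem: tuple of 0/1 per symptom}, the matrix columns."""
    return {p: tuple(int(s in caused) for s in symptoms) for p, caused in causes.items()}


def decode(observed, book, symptoms):
    """Phase 2: return the problems whose codeword equals the observed symptoms."""
    vector = tuple(int(s in observed) for s in symptoms)
    return [p for p, code in book.items() if code == vector]


book = codebook(CAUSES, SYMPTOMS)
print(book["P2"])                                  # (1, 1, 1, 0)
print(decode({"S2", "S3", "S4"}, book, SYMPTOMS))  # ['P3']
```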
In general, the number of symptoms will exceed the number of problems and hence, the codebook can be reduced
to a minimal set of symptoms needed to uniquely identify the problems. It is easy to show that two rows are
adequate to uniquely identify the three problems in the codebook shown in Figure 11.21. We will keep row S1 and
try to eliminate subsequent rows, one at a time. At each step, we want to make sure that the remaining codebook
distinguishes between the problems. You can prove to yourself that eliminating rows S2 and S3 does not preserve
the uniqueness (P1 and P2 then share the same column), whereas eliminating rows S2 and S4 does. The reduced
codebook, called the correlation matrix, is shown in Figure 11.22.
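The uniqueness check can be verified mechanically. The sketch below uses a brute-force search over row subsets rather than the one-row-at-a-time elimination described above; the names are assumptions, and the codebook values are those of the four-symptom, three-problem matrix.

```python
from itertools import combinations

# Columns of the problem-symptom matrix, in row order S1..S4.
CODEBOOK = {"P1": (1, 1, 0, 0), "P2": (1, 1, 1, 0), "P3": (0, 1, 1, 1)}
ROWS = ["S1", "S2", "S3", "S4"]


def distinguishes(kept_rows):
    """True if the kept symptom rows give every problem a distinct codeword."""
    idx = [ROWS.index(r) for r in kept_rows]
    codes = [tuple(col[i] for i in idx) for col in CODEBOOK.values()]
    return len(set(codes)) == len(codes)


def minimal_row_sets():
    """Smallest sets of rows that still distinguish all problems (brute force)."""
    for size in range(1, len(ROWS) + 1):
        found = [rows for rows in combinations(ROWS, size) if distinguishes(rows)]
        if found:
            return found
    return []


print(distinguishes(("S1", "S4")))  # False: P1 and P2 both read (1, 0)
print(distinguishes(("S1", "S3")))  # True
print(minimal_row_sets())           # [('S1', 'S3'), ('S3', 'S4')]
```

Keeping row S1, as the text does, selects the (S1, S3) pair.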
Drawing the causality graph based on the correlation matrix of Figure 11.22, we derive the graph shown in
Figure 11.23, which is called the correlation graph.
We will now reduce the causality graph to a correlation graph. Symptoms 3, 4, and 5 form a cycle of causal
equivalence and can be replaced by a single symptom, 3. Symptoms 7 and 10 are caused, respectively, by
symptoms 3 and 5 and hence can be ignored. Likewise, symptom 8 can be eliminated as it is an intermediate
symptom node between problem node 1 and symptom node 9, which is also directly related to problem node 11.
We thus arrive at the correlation graph shown in Figure 11.25 and the correlation matrix shown in Figure 11.26.
Notice that in the particular example the model is unable to distinguish between problems 1 and 11 as they produce
identical symptoms in the correlation graph based on the event model.
     P1  P2  P11
S3   1   1   1
S6   0   1   0
S9   1   0   1
Further refinements can be made in the codebook approach to event correlation in terms of tolerance to spurious
noises and probability relationship in the causality graph. We have derived the correlation matrix to be the minimal
causal matrix. Thus, each column in the code matrix is differentiated from other columns by at least one bit (i.e.,
value in one cell). From coding theory, this corresponds to a Hamming distance of one. Any spurious noise in the
event detection could change one of the bits and thus a codeword would identify a pair of problems. This could be
avoided by increasing the Hamming distance to two or more, which would increase the number of symptoms in the
correlation matrix. Also, the relationship between a problem and symptoms could be defined in terms of
probability of occurrence, and the correlation matrix would be a probabilistic matrix.
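The effect of a minimum Hamming distance of one can be checked with a few lines. The sketch assumes the two-row correlation matrix obtained by keeping symptoms S1 and S3 of the three-problem example; the function names are illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(book):
    """Smallest pairwise Hamming distance among the problem codewords."""
    return min(hamming(a, b) for a, b in combinations(book.values(), 2))


def flip(code, i):
    """Simulate spurious noise by inverting bit i of a codeword."""
    return tuple(bit ^ 1 if k == i else bit for k, bit in enumerate(code))


# Assumed reduced codebook: rows S1 and S3 of the three-problem example.
reduced = {"P1": (1, 0), "P2": (1, 1), "P3": (0, 1)}
print(min_distance(reduced))                    # 1
print(flip(reduced["P1"], 1) == reduced["P2"])  # True
```

The second result shows that a single spurious S3 alarm makes the codeword of P1 indistinguishable from that of P2, which is why a larger minimum distance, obtained by adding symptoms, improves noise tolerance.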
The codebook correlation technique has been implemented in the InCharge system developed by System Management
ARTS (SMARTS) [Yemini et al., 1996].
A state transition graph model is used by Seagate's NerveCenter correlation system. This could be used as a
stand-alone system or integrated with an NMS, which HP OpenView and some other vendors have done.
A simple state diagram with two states for a ping/response process is shown in Figure 11.27. The two states are
ping node and receive response. When an NMS sends a ping, it transitions from the ping node state to the receive
response state. When it receives a response, it transitions back to the ping node state. As you know by now, this
method is how the health of all the components is monitored by the NMS.
Figure 11.27. State Transition Diagram for Ping/Response
It is best to illustrate with an example of how a state transition diagram could be used to correlate events in the
network. Let us choose the same example as in model-based reasoning, Figure 11.11. An NMS is pinging the hubs
that are accessed via a router. Let us follow through the scenario of the NMS pinging a hub. When the hub is
working and the connectivity to the NMS is good, a response is received for each ping sent, say every minute, by
the NMS. This is represented by the top two states, ping hub and receive response, on the left side of Figure 11.28.
Let us now consider the situation when a response for a ping is not received before the next ping is ready to be
sent. NMS typically expects a response in 300 milliseconds (we are not pinging some obscure host in a foreign
country!). An action is taken by the NMS and the state transitions from receive response to pinged twice (referred
to as ground state by NerveCenter). It is possible that a response is received for the second ping and in that
situation the state transitions back to the normal ping hub state.
However, if there is no response for the second ping, NMS pings a third time. The state transition is now pinged
three times. The response for this ping will cause a transition to the ping hub state. However, let us consider the
situation of no response for the third ping. Let us assume that the NMS is configured to ping three times before it
declares that there is a communication failure between it and the hub. Without any correlation, an alarm will be
triggered and the icon representing the hub would turn red.
However, the hub may actually be working and the workstations on it may all be communicating with each other.
From the topology database, the correlator in the NMS is aware that the path to the hub is via the router. Hence, on
failure of the third ping, an action is taken and the system transitions to the ping router state. The router is pinged
and the system transitions to the receive response from router state.
There are two possible outcomes now. The connectivity to the router is lost and no response is received from the
router. The system takes no action, which is indicated by the closed loop in the ping router state. (How does the
router icon turn red in this case?)
The second possibility is that a response is received from the router. This means that the connection to the hub is
lost. Now, the correlator in the NMS triggers an alarm that turns the hub icon red.
We notice that in the scenario of a router connectivity failure, only the router icon turns red and none of the hubs
connected to it turn red, thus identifying the root cause of the problem.
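The hub and router state transitions can be traced with a small function. This is a sketch under simplifying assumptions: ping results are supplied as booleans rather than measured, the state names follow the text, and the function name and return format are hypothetical rather than NerveCenter's own.

```python
def correlate_hub_pings(hub_responses, router_responds):
    """Walk the hub/router state graph.

    hub_responses: iterable of booleans, one per hub ping (True = response).
    router_responds: boolean result of the router ping, used only if needed.
    Returns (state, action).
    """
    misses = 0
    for responded in hub_responses:
        if responded:
            misses = 0      # any response returns to the ping hub state
            continue
        misses += 1
        if misses == 3:     # third miss: ping the router instead of alarming
            if router_responds:
                return "receive response from router", "hub alarm"
            return "ping router", "no action"
    state = {0: "ping hub", 1: "pinged twice", 2: "pinged three times"}[misses]
    return state, "no action"


print(correlate_hub_pings([True, True], True))           # ('ping hub', 'no action')
print(correlate_hub_pings([False, False, False], True))  # ('receive response from router', 'hub alarm')
print(correlate_hub_pings([False, False, False], False)) # ('ping router', 'no action')
```

The last case corresponds to the router connectivity failure, where no hub alarm is raised.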
Another model-based fault detection scheme uses the communicating finite state machine [Miller, 1998]. The main
claim of this process is that it is a passive testing system. It is assumed that an observer agent is present in each
node and reports abnormality to a central point. We can visualize the node observer as a Web agent and the central
point as the Web server. An application on the server correlates the events. A failure in a node or a link is indicated
by the state machine associated with the component entering an illegal state.
A simple communicating finite state machine for a client-server system is shown in Figure 11.29. It presents
communication between a client and server via a communication channel. For simplicity, both the client and the
server are assumed to have two states each. The client, which is in send request state, sends a request message to
the server, and transitions to receive the response state. The server is currently in the receive request state. The
server receives the request and transitions to the send response state. After processing the request, it sends the
response and transitions back to the receive request state. The client then receives the response from the server and
transitions to the send request state.
If either the client or the server enters an illegal state during the transitions, the system has encountered a fault. For
example, after sending a response, if the server does not transition to receive a request state, it is in a failed state. A
message is sent to a central location under a fault condition either by the component itself or by the one
communicating with the failed component. This is a passive detection scheme similar to the trap mechanism.
We can observe a similarity between the finite state machine model and the state transition graph model with
regard to state transitions. However, the main difference is that the former is a passive system and the latter is an
active one.
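A passive observer for the two-state client and server machines can be sketched as follows. The event names, the transition tables, and the Observer class are assumptions made for illustration; the observer only follows events it sees and never injects test traffic.

```python
# Legal transitions: (current state, observed event) -> next state.
CLIENT = {("send_request", "send_req"): "receive_response",
          ("receive_response", "recv_resp"): "send_request"}
SERVER = {("receive_request", "recv_req"): "send_response",
          ("send_response", "send_resp"): "receive_request"}


class Observer:
    """Passive observer: follows events and reports illegal transitions."""

    def __init__(self, name, transitions, start):
        self.name = name
        self.transitions = transitions
        self.state = start
        self.reports = []  # messages that would be sent to the central point

    def observe(self, event):
        nxt = self.transitions.get((self.state, event))
        if nxt is None:
            # No legal transition for this event in this state: report a fault.
            self.reports.append((self.name, self.state, event))
            self.state = "failed"
        else:
            self.state = nxt


server = Observer("server", SERVER, "receive_request")
for event in ["recv_req", "send_resp", "send_resp"]:
    server.observe(event)
print(server.state)    # failed
print(server.reports)  # [('server', 'receive_request', 'send_resp')]
```

The third event is illegal because the server sends a response while it should be waiting for a request, so the observer reports the abnormality.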
Security management is both a technical and an administrative issue in information management. It involves
securing access to the network and information flowing in the network, access to data stored in the network, and
manipulating the data that are stored and flowing across the network. The scope of network and access to it not
only covers enterprise intranet network, but also the Internet that it is connected to.
Another area of great concern in secure communication is communication with mobile stations. There was an
embarrassing case of a voice conversation from the car-phone of a politician being intercepted by a third party
traveling in an automobile. Of course, this was an analog signal. However, this could also happen in the case of a
mobile digital station such as a hand-held stock trading device. An intruder could intercept messages and alter trade
transactions either to benefit by it or to hurt the person sending or receiving them.
In Chapter 7 we covered several of the security issues associated with SNMP management as part of SNMPv3
specifications and discussed possible security threats. Four types of security threats to network management were
identified: modification of information, masquerade, message stream modification, and disclosure. They are
applicable to security in the implementation of security subsystems in the agent (authoritative engine) and in the
manager (non-authoritative engine). The SNMPv3 security subsystem is the User-Based Security Model (USM). It
has two modules: an authentication module and a privacy module. The former addresses data integrity and data
origin; the latter is concerned with data confidentiality, message timeliness, and limited message protection. The
basic concepts discussed in Chapter 7 are part of generalized security management in data communications.
Security management goes beyond the realm of SNMP management. In this section, we will address policies and
procedures, resources to prevent security breaches, and network protection from software attacks. Policies and
procedures should cover preventive measures, steps to be taken during the occurrence of a security breach, and
post-incident measures. Because the Internet is so pervasive and everybody's network is part of it, all government
and private organizations in the world are concerned with security and privacy of information traversing it.
In this introductory textbook, we will not be going into the depth of security management that it deserves. For
additional information, you are advised to pursue the innumerable references available on the subject [Cooper et
al., 1995; Kaufman et al., 1995; Leinwand and Conroy, 1996; RFC 2196; Wack and Carnahan, 1994].
11.5.1. Policies and Procedures
The IETF workgroup that generated RFC 2196 defines a security policy as "a formal statement of the rules by
which people who are given access to an organization's technology and information assets must abide." Corporate
policy should address both access and security breaches. Access policy is concerned with who has access to what
information and from what source. SNMP management addressed this in terms of a community access policy for
network management information. An example of access policy in an enterprise network could be that all
employees have full access to the network. However, not everyone should have access to all corporate information,
and thus accounts are established for appropriate employees to have access to appropriate hosts and applications in
those hosts. These policies should be written so that all employees are fully aware of them.
However, illegal entry into systems and accessing of networks must be protected against. The policies and
procedures for site security management on the Internet are dealt with in detail elsewhere [RFC 2196; NIST,
1994]. The National Computer Security Center (NCSC) has published what is known as the Orange Book, which
defines a rating scheme for computers. It is based on the security design features of the computer. The issues for
corporate site security using the intranet are the same as for the Internet and are applicable to them equally. It is a
framework for setting security policies and procedures.
4. Implement measures, which will protect your assets in a cost-effective manner
5. Review the process continuously and make improvements to each item if a weakness is found
The assets that need to be protected should be listed including hardware, software, data, documentation, supplies,
and people who have responsibility for all of the above. The classic threats are from unauthorized access to
resources and/or information, unintended and/or unauthorized disclosure of information, and denial of service.
Denial of service is a serious attack on the network. The network is brought to a state in which it can no longer
carry legitimate users' data. This is done either by attacking the routers or by flooding the network with extraneous
traffic.
We addressed the policies and procedures in the last section. In this section, we will discuss various security
breaches that are attempted to access data and systems, and the resources available to protect them.
Figure 11.30 shows a secure communication network, which is actually a misnomer. There is no fully secure
system in the real world; there are only systems which are hard and time-consuming to break into, as we shall
describe. Figure 11.30 shows two networks communicating with each other via a WAN, which has just one router.
Server A and Client A shown in Network A are communicating with each other; and Client B in Network B is also
communicating (or trying to communicate) with Server A in Network A.
Let us look at the security breach points in this scenario. Hosts in Network B may not have the privilege to access
Network A. The firewall gateway shown in Figure 11.30 is used to screen traffic going in and out of secure
Network A. Even if Network B has access permission to Network A, some intruder, for example one who has
access to the router in the path, may intercept the message. The contents of the message, as well as source and
destination identifications, can be monitored and manipulated, which are security breaches.
Security breaches can occur in the Internet and intranet environment in numerous ways. In most corporate
environments, security is limited to user identification and password. Even the password is not changed often
enough. This is the extent of authentication. Authorization is limited to the establishment of accounts, i.e., who can
log into an application on a host. Besides normal activities of breach, we have to protect against special situations,
such as when a disgruntled employee could embed virus programs in company programs and products.
11.5.3. Firewalls
The main purpose of a firewall is to protect a network from external attacks. It monitors and controls traffic into
and out of a secure network. It can be implemented in a router, or a gateway, or a special host. A firewall is
normally located at the gateway to a network, but it may also be implemented at host access points.
There are numerous benefits in implementing a firewall to a network. It reduces the risk of access to hosts from an
external network by filtering insecure services. It can provide controlled access to the network in that only
specified hosts or network segments could access some hosts. Since security protection from external threats is
centralized and transparent, it reduces the annoyance to internal users while controlling the external users. A
firewall could also be used to protect the privacy of a corporation. For example, services such as the utility finger,
which provides information about employees to outsiders, can be prevented from accessing the network.
When the security policy of a company is implemented in a firewall, it is a concatenation of a higher-level access
service policy, where a total service is filtered out, and lower-level filtering in the firewall itself. For example,
the dial-in service can be totally denied at the service policy level, and the firewall can filter out selected
services, such as the utility finger, which is used to obtain information on personnel.
Firewalls use packet filtering or application-level gateways as the two primary techniques of controlling undesired
traffic.
Packet Filters. Packet filtering is the ability to filter packets based on protocol-specific criteria. It is done at the OSI
data link, network, and transport layers. Packet filters are implemented in some commercial routers, called
screening routers or packet-filtering routers. We will use the generic term of packet-filtering routers here. Although
routers do not normally look at the transport layer, some vendors have implemented this additional feature to sell
them as firewall routers. The filtering is done on the following parameters: source IP address, destination IP address,
source TCP/UDP port, and destination TCP/UDP port. The filtering is implemented in each port of the router and can be
programmed independently.
Packet-filtering routers can either drop packets or redirect them to specific hosts for further screening, as shown in
Figure 11.31. Some of the packets never reach the local network as they are trashed. For example, all packets from
network segment a.b.c.0 are programmed to be rejected, as well as File Transfer Protocol (FTP) packets from
d.e.f.0:21 (note that Port 21 is a standard FTP port). The SMTP (email) and FTP packets are redirected to their
respective gateways for further screening. It may be observed from the figure that the firewall is asymmetric. All
incoming SMTP and FTP packets are parsed to check whether they should be dropped or forwarded. However,
outgoing SMTP and FTP packets have already been screened by the gateways and do not have to be checked by
the packet-filtering router.
Figure 11.31. Packet-Filtering Router (labels: packet-filtering router, secured network)
A packet-filtering firewall works well when the rules to be implemented are simple. However, the more rules we
introduce, the more difficult it is to implement. The rules have to be implemented in the right order or they may
produce adverse effects. Testing and debugging are also difficult in packet filtering [Chapman, 1992].
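The effect of rule order can be illustrated with a minimal sketch of first-match packet filtering. This is not taken from any router product: the Rule class, the filter_packet function, the gateway names, and the documentation address ranges standing in for the symbolic segments a.b.c.0 and d.e.f.0 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One filtering rule; fields left as None match any value."""
    action: str                       # "drop", "forward", or "redirect:<gateway>"
    src_net: str = "0.0.0.0/0"        # source network in CIDR form
    src_port: Optional[int] = None    # source TCP/UDP port
    dst_port: Optional[int] = None    # destination TCP/UDP port

    def matches(self, src_ip: str, src_port: int, dst_port: int) -> bool:
        return (ip_address(src_ip) in ip_network(self.src_net)
                and self.src_port in (None, src_port)
                and self.dst_port in (None, dst_port))


# Hypothetical rule table: 192.0.2.0/24 plays the role of a.b.c.0 and
# 198.51.100.0/24 the role of d.e.f.0 in the text.
RULES = [
    Rule("drop", src_net="192.0.2.0/24"),                  # reject the whole segment
    Rule("drop", src_net="198.51.100.0/24", src_port=21),  # reject FTP from this segment
    Rule("redirect:smtp-gateway", dst_port=25),            # SMTP goes to its gateway
    Rule("redirect:ftp-gateway", dst_port=21),             # FTP goes to its gateway
    Rule("forward"),                                       # everything else passes
]


def filter_packet(src_ip: str, src_port: int, dst_port: int, rules=RULES) -> str:
    """Return the action of the first rule that matches the packet."""
    for rule in rules:
        if rule.matches(src_ip, src_port, dst_port):
            return rule.action
    return "drop"  # nothing matched: deny by default


if __name__ == "__main__":
    print(filter_packet("192.0.2.7", 40000, 25))     # blocked segment
    print(filter_packet("203.0.113.5", 40000, 25))   # SMTP from elsewhere
    # Same packet, same rules, reversed order: the catch-all now wins.
    print(filter_packet("192.0.2.7", 40000, 25, rules=RULES[::-1]))
```

Because the first matching rule decides the outcome, reversing the table lets the catch-all forward rule admit a packet that the original order drops, which is the kind of adverse effect the text warns about.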
Application-Level Gateway. An application-level gateway is used to overcome some of the problems identified in
packet filtering. Figure 11.32 shows the application gateway architecture. Firewalls F1 and F2 forward data only if
they are going to or coming from the application gateway. Thus, the secured LAN is a gateway LAN. The application
gateway behaves differently for each application, and filtering is handled by the proxy services in the application
gateway. For example, for FTP service, the file is stored first in the application gateway and then forwarded. For
TELNET service, the application gateway verifies the authentication of the foreign host and its legitimacy to
communicate with the local host, and then makes the connection between the gateway and the local host. It keeps a
log of all transactions.
Figure 11.32. Application Gateway (labels: proxy services, application gateway)
Firewalls protect a secure site by checking addresses (such as IP address), transport parameters (such as FTP,
NNTP), and applications. However, how do we protect access from an external source based on a user who is
using a false identification? Moreover, how do we protect against an intruder manipulating the data while they are
traversing the network between the source and the destination? These concerns are addressed by secure
communication.
11.5.4. Cryptography
For secure communication, we need to ensure integrity protection and authentication validation. Integrity
protection makes sure that the information has not been tampered with as it traverses between the source and the
destination. Authentication validation validates the originator identification. In other words, when Ian receives a
message that identifies it as coming from Rita, is it really Rita who sent the message? These two important aspects
address the four security threats (modification of information, masquerade, message stream modification, and
disclosure) mentioned at the beginning of Section 11.5. Besides the actual message, control and protocol
handshakes need to be secure.
There are hardware solutions to authentication. However, they are not a complete solution, since the information,
including the user identification and password, could be intercepted and tampered with as it traverses from the
source to the destination.
The technology that is best suited to achieving secure communication is software-based. Its foundation lies in
cryptography. Hashing or message digest, and digital signature, which we will address soon, are built on top of it to
achieve integrity protection and source authentication.
Cryptographic Communication. Cryptography means secret (crypto) writing (graphy). It deals with techniques of
transmitting information, for example a letter, from a sender to a receiver without any intermediary being able to
decipher it. You may view this as the information (letter) being translated to a special language that only the sender
and receiver can interpret. Cryptography should also detect whether somebody was able to intercept the
information. Again extending our analogy, if the letter written in a secret language were to be mailed in a sealed
(we mean really sealed) envelope and somebody tampers with it, the receiver would detect it.
The basic model of cryptographic communication is shown in Figure 11.33. The input message, called plaintext, is
encrypted by the encryption module using a secret (encryption) key. The encrypted message is called ciphertext,
which traverses through an unsecure communication channel, the Internet for example. The ciphertext is
unintelligible information. At the receiving end, the decryption module deciphers the message with a decryption
key to retrieve the plaintext.
The first known example of cryptography is the Caesar cipher. In this scheme, each letter is replaced by another
letter, which is three letters later in the alphabet (i.e., key of 3). Thus, the plaintext, network management, will read
as qhwzrun pdqdjhphqw in ciphertext. Of course, the receiver knew ahead of time the secret key (3) for
successfully decrypting the message back to the plaintext network management by moving each letter back three
positions.
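The Caesar cipher is simple enough to sketch in a few lines. The function name caesar and the restriction to lowercase letters are choices made for this illustration only; other characters, such as the space, pass through unchanged.

```python
import string


def caesar(text: str, key: int) -> str:
    """Shift each lowercase letter `key` places later in the alphabet, wrapping at z."""
    alphabet = string.ascii_lowercase
    k = key % 26
    shifted = alphabet[k:] + alphabet[:k]
    return text.translate(str.maketrans(alphabet, shifted))


if __name__ == "__main__":
    ciphertext = caesar("network management", 3)
    print(ciphertext)                # qhwzrun pdqdjhphqw
    print(caesar(ciphertext, -3))    # network management
```

Decryption is the same operation with the negated key, which mirrors the receiver moving each letter back three positions.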
Secret Key Cryptography. The Caesar cipher was later enhanced by the makers of Ovaltine and distributed as
Captain Midnight Secret Decoder rings. Each letter is replaced by another letter n letters later in the alphabet (i.e.,
key of n). Of course, the sender and the receiver have to agree ahead of time on the secret key for successful
communication. The same key is used for encryption and decryption, and the scheme is called secret key cryptography.
The encryption and decryption modules can be implemented in either hardware or software.
It is not hard for an intruder to decode the above ciphertext. It would only take a maximum of 26 attempts to
decipher since there are 26 letters in the alphabet. Another encryption scheme, the monoalphabetic cipher, is to
replace each letter uniquely with another letter that is randomly chosen. Now, the maximum number of attempts for the
intruder to decipher has been increased to 26! (26! = 26 · 25 · 24 · ... · 1). However, it really does not take that
many attempts as there are patterns in a language.
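A short sketch makes the difference between the two key spaces concrete. The helper names shift and brute_force are hypothetical; the sketch simply tries every shift on the ciphertext from the Caesar example and computes the number of possible monoalphabetic keys.

```python
import math
import string


def shift(text: str, key: int) -> str:
    """Shift lowercase letters by `key` positions (negative keys shift back)."""
    alphabet = string.ascii_lowercase
    k = key % 26
    return text.translate(str.maketrans(alphabet, alphabet[k:] + alphabet[:k]))


def brute_force(ciphertext: str):
    """Return every candidate plaintext, one per possible shift key."""
    return [(key, shift(ciphertext, -key)) for key in range(26)]


if __name__ == "__main__":
    for key, candidate in brute_force("qhwzrun pdqdjhphqw"):
        print(key, candidate)           # the line for key 3 is readable English
    print(math.factorial(26))           # number of monoalphabetic keys, about 4 x 10^26
```

An intruder scanning the 26 candidate lines picks out the readable one at once. The monoalphabetic key space is far too large to enumerate this way, which is why attacks on it exploit letter-frequency patterns instead.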
Obviously, the key is the key (no pun intended) to the security of messages. Another aspect of the key is the
convenience of using it. We will illustrate our scenario with Ian and Rita (Ian for initiator and Rita for responder so
that it is easy to remember) as users at the two ends of a "secure" communication link. Ian and Rita could share a
key, their "secret key," for accomplishing secure communication. However, if Ian wants to communicate with Ted
(for third party), the two of them also need to share a secret key. Soon, Ian has to remember one secret key for each
person with whom he wants to communicate, which obviously is impractical. It is hard enough to remember your own
passwords, if you have several of them, and which systems they go with.
Two standard algorithms implement secret key cryptography. They are Data Encryption Standard (DES) and
International Data Encryption Algorithm (IDEA) [Kaufman et al., 1995]. They both deal with 64-bit message
blocks and create the same size ciphertext. DES uses a 56-bit key and IDEA uses a 128-bit key. DES is designed
for efficient hardware implementation and consequently has poor performance if implemented in software. In
contrast to that, IDEA functions efficiently in software implementation.
Both DES and IDEA are based on the same principle of encryption. The bits in the plaintext block are rearranged
using a predetermined algorithm and the secret key several times. While decrypting, the process is repeated in the
reverse order for DES and is a bit more complicated for IDEA.
A message that is longer than the block length is divided into 64-bit message blocks. There are several methods
for breaking up the message. One of the more popular ones is the cipher block chaining (CBC) method. We learned the
use of it in USM in SNMPv3 in Section 7.7. There, the message was broken up using CBC and then encrypted using
DES. Performing such an operation on the message, even on identical plaintext blocks, would result in dissimilar
ciphertext blocks.
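The chaining idea can be sketched without a real DES implementation. In the sketch below, the block cipher is a toy four-round Feistel construction built on SHA-256, used purely as a stand-in for DES; it is an assumption of this example and offers no real security. Padding is omitted, so the plaintext length must be a multiple of the 8-byte block size.

```python
import hashlib

BLOCK = 8  # 64-bit blocks, the block size of DES and IDEA


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(half: bytes, key: bytes, i: int) -> bytes:
    # Toy round function: truncated SHA-256 of key, round number, and half block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:len(half)]


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """XOR each plaintext block with the previous ciphertext block, then encrypt."""
    assert len(plaintext) % BLOCK == 0, "padding is not handled in this sketch"
    previous, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        previous = toy_encrypt_block(_xor(plaintext[i:i + BLOCK], previous), key)
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    previous, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)


if __name__ == "__main__":
    key, iv = b"demo-key", b"\x00" * BLOCK
    ct = cbc_encrypt(b"SAMEBLOK" * 2, key, iv)      # two identical plaintext blocks
    print(ct[:BLOCK].hex(), ct[BLOCK:].hex())       # the ciphertext blocks differ
    print(cbc_decrypt(ct, key, iv))
```

Because each block is mixed with the preceding ciphertext block before encryption, the two identical plaintext blocks in the usage example produce different ciphertext blocks.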
Public Key Cryptography. We observed that each user has to have a secret key for every user that he/she/it (a
program) wants to communicate with. Public key cryptography [Diffie and Hellman, 1976; Kaufman et al., 1995]
overcomes the difficulty of having too many keys for using cryptography. Secret key cryptography is symmetric in
that the same key is used for encryption and decryption, but public key cryptography is asymmetric with a public
key and a private key (not secret key; remember, secret key is symmetric and private key is not). In Figure 11.34,
the public key of Ian is the key that Rita, Ted, and everybody else (that Ian wants to communicate with) would
know and use to encrypt messages to Ian. The private key, which only Ian knows, is the key that Ian would use to
decrypt the messages. With this scheme, there is secure communication between Ian and his communicators on a
one-to-one basis. Rita's message to Ian can be read only by Ian and not by anyone else who has his public key,
since the public key cannot be used to decrypt the message.
We can compare the use of asymmetric public and private keys in cryptography to a mailbox (or a bank deposit
box) with two openings. There is a mail slot to drop the mail and a collection door to take out mail. Suppose it is a
private mailbox in a club and has restricted access to members only. All members can open the mail slot with a
public key given by the administration to drop their mail, possibly containing comments on a sensitive issue of the
club. Any member's mail cannot be accessed by other members since the public key only lets the members open
the mail slot to drop mail and not the collection door. The administrator with a private key can open the collection
door and access the mail of all the members. Of course, this asymmetric example has more to do with access than
cryptography. But you get the idea!
The Diffie-Hellman public key algorithm is the oldest public key algorithm. It is a hybrid of secret and public key.
The commonly used public key cryptography algorithm is RSA, named after its inventors, Rivest et al. [1978]. It
does both encryption and decryption as well as digital signatures. Both the message length and the key length are
variable. The commonly used key length is 512 bits. The block size of the plaintext, which is variable, should be
less than the key size. The ciphertext is always the length of the key. RSA is less efficient than either of the secret
key algorithms, DES or IDEA. Hence, in practice, RSA is used to first encrypt the secret key. The message is then
transmitted using one of the secret key algorithms.
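The practice of using RSA only to protect the secret key can be sketched as follows. Everything here is a simplification for illustration: the RSA parameters are the tiny textbook values built from the primes 61 and 53, and the secret key algorithm is replaced by a toy XOR keystream derived with SHA-256 as a stand-in for DES or IDEA. The function names are hypothetical.

```python
import hashlib
import secrets

# Textbook RSA with tiny primes p = 61 and q = 53; far too small for real use.
N, E, D = 3233, 17, 2753    # modulus, public exponent, private exponent


def rsa_encrypt(m: int, e: int = E, n: int = N) -> int:
    return pow(m, e, n)


def rsa_decrypt(c: int, d: int = D, n: int = N) -> int:
    return pow(c, d, n)


def secret_key_cipher(data: bytes, session_key: int) -> bytes:
    """Toy stand-in for DES/IDEA: XOR the data with a SHA-256-derived keystream.

    The same call both encrypts and decrypts, like any symmetric XOR stream.
    """
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def send(message: bytes):
    """Sender: pick a random secret key, wrap it with RSA, encrypt the message with it."""
    session_key = secrets.randbelow(N - 2) + 2
    return rsa_encrypt(session_key), secret_key_cipher(message, session_key)


def receive(wrapped_key: int, ciphertext: bytes) -> bytes:
    """Receiver: unwrap the secret key with the private key, then decrypt the message."""
    return secret_key_cipher(ciphertext, rsa_decrypt(wrapped_key))


if __name__ == "__main__":
    wrapped, ct = send(b"network management")
    print(wrapped, ct.hex())
    print(receive(wrapped, ct))
```

Only the short secret key goes through the slow public key operation; the bulk of the message is handled by the faster symmetric step.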
Message Digest. Any telecommunications engineer is familiar with the cyclic redundancy check (CRC) detection
of errors in digital transmission. This involves calculating a checksum based on the data in the frame or packet at
the sending end and transmitting it along with the data. The CRC, also known as checksum, is computed at the
receiving end and is matched against the received checksum to ensure that the packet is not corrupted. An
analogous principle is used in validating the integrity of the message. In order to ensure that the message has not
been tampered with between the sender and the receiver, a cryptographic CRC is added with the message. This is
derived using a cryptographic hash algorithm, called message digest (MD). There are several versions, one of the
most common being MD5. We covered the use of MD5 while discussing the authentication protocol in SNMPv3 in
Chapter 7. We will look at the use of it in digital signature in the next subsection.
There are different implementations of MD5. In particular, the md5 utility is used in FreeBSD 2.x (md5sum under
Linux). The utility takes as input a message of arbitrary length, producing output consisting of a 128-bit message
digest of the input. An example of md5 utility use is shown in Figure 11.35.
$ md5
The quick brown fox jumped over the lazy dog
^D
d8e8fca2dc0f896fd7cb4cb0031ba249
As we can see from the example, the message digest for the string that we entered was generated based on the data
received from standard input (typed at the terminal). The FreeBSD version also has a test mode that can be turned on
by specifying "-x" as a parameter, as shown in Figure 11.36.
$ md5 -x
MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72
MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0
MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b
MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789")
= d174ab98d277d9f5a5611c2c9f419d9f
MD5 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890")
= 57edf4a22be3c955ac49da2e2107b67a
A second algorithm used to obtain a hash or message digest is the Secure Hash Standard (SHS). This has been
proposed by the National Institute of Standards and Technology (NIST). It is similar to MD5, but can handle a
maximum message length of 2^64 bits, in contrast with MD5, which can handle input of unlimited length, processed
in 512-bit blocks. The SHS produces an output of 160 bits, whereas the MD5 output is 128 bits long.
Some significant features of the message digest are worth mentioning. First, there is a one-to-one relationship
between the input and the output messages. Thus, the input is uniquely mapped to an output digest. It is interesting
to observe that even a one-bit difference in a block of 512 bits could produce a message digest that looks vastly
different. In addition, the output messages are completely uncorrelated. Thus, any pattern in the input will not be
recognized at the output.
Another feature of message digest is that the output digest is of constant length for a given algorithm with chosen
parameters, irrespective of the input message length. In this respect, it is very similar to CRC in that CRC-32 is
exactly 32 bits long. We saw in SNMPv3 that the authKey generated by the MD5 algorithm is exactly 16 octets
long.
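Both features, the constant output length and the large change caused by a one-bit difference in the input, can be observed with the MD5 implementation in Python's standard hashlib module. This is a small sketch; the helper names md5_hex and bits_changed are made up for the example.

```python
import hashlib


def md5_hex(message: bytes) -> str:
    """Return the MD5 digest of the message as 32 hexadecimal characters."""
    return hashlib.md5(message).hexdigest()


def bits_changed(a: bytes, b: bytes) -> int:
    """Count how many of the 128 digest bits differ between two messages."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    return bin(da ^ db).count("1")


if __name__ == "__main__":
    # Digest length is 16 octets whatever the input length.
    for message in (b"", b"a", b"network management" * 1000):
        print(len(hashlib.md5(message).digest()), md5_hex(message))
    # "t" and "u" differ in a single bit, yet the digests differ in many bits.
    print(bits_changed(b"network management", b"network managemenu"))
```

The first two digests printed match the corresponding lines of the test-mode output in Figure 11.36.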
Lastly, the generation of a message digest is a one-way function. Given a message, we can generate a unique
message digest. However, given a message digest, there is no way the original message could be generated. Thus,
if a password were transmitted from a client to a server, this would protect against somebody eavesdropping and
deciphering the password. This could also be used for storing the password file in a host without any human being
able to decipher it.
We know that the generation of a message digest is a one-way function. We also know that no two messages could
produce identical message digests. Could these two properties together ensure that the message is not tampered with
in transit by an unauthorized person? The answer is no. This is because if the interceptor knows which algorithm is
being used, he or she could modify the message (assuming that he/she decrypts the message), generate a new message
digest, and send it along with the modified message. If Ian sent the message to Rita, and Ted modified the message
as per the above scenario while in transit, Rita would not know the difference. Additional protection is needed to
guard against such a threat, which is achieved by attaching a digital signature to the message.
Digital Signature. In public key cryptography, or even in secret key cryptography, if Rita receives a message
claiming that it is from Ian, there is no guarantee as to who sent the message. For example, somebody other than
Ian who knows Rita's public key could send a message identifying himself or herself as Ian. Rita could not be
absolutely sure who sent it. To overcome this problem, a digital signature can be used. Signed public-key
cryptographic communication is shown in Figure 11.37.
The digital signature works in the reverse direction from that of public key cryptography. Ian can create a digital
signature using his private key (marked "S" in parentheses in Figure 11.37) and Rita could validate it by reading it
using Ian's public key. The digital signature depends on the message and the key. Let us consider that Ian is
sending a message by email to Rita. A digital signature, which is a message digest, is generated using any hash
algorithm with the combined inputs of the plaintext message and the private key of Ian. The digital signature is
concatenated with the plaintext message and is encrypted using Rita's public key (marked "R" in parentheses). At
the receiving end, the incoming ciphertext message is decrypted by Rita using her private key. She then generates a
message digest with the combined input of the plaintext message and Ian's public key, and compares it with the
digital signature received. If they match, she concludes that the message has not been tampered with. Further, she
is assured that the message is from Ian, as she used Ian's public key to authenticate the source of the message.
Notice that only the originator can create the digital signature with his or her private key and others can look at it
with the originator's public key and validate it, but cannot create it. A real-world analogy to digital signature is
check writing. The bank can validate the signature as to its originality, but it is hard to duplicate a signature (at
least manually) of the person who signed the check.
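One common way of realizing such a signature is hash-then-sign: the sender computes a message digest and transforms it with the private key, and the receiver recovers the digest with the sender's public key and compares it with a digest computed locally. The sketch below follows that approach, which is a swapped-in simplification rather than a transcription of Figure 11.37. It uses the tiny textbook RSA parameters built from the primes 61 and 53, reduces the MD5 digest modulo the small modulus, and leaves out the outer encryption with the receiver's public key. All names are hypothetical.

```python
import hashlib

# Ian's toy RSA key pair (primes 61 and 53); illustration only.
N, E, D = 3233, 17, 2753    # modulus, public exponent, private exponent


def digest_as_int(message: bytes, n: int = N) -> int:
    """MD5 digest reduced modulo n so that it fits the toy modulus."""
    return int.from_bytes(hashlib.md5(message).digest(), "big") % n


def sign(message: bytes, d: int = D, n: int = N) -> int:
    """Originator: transform the message digest with the private key."""
    return pow(digest_as_int(message, n), d, n)


def verify(message: bytes, signature: int, e: int = E, n: int = N) -> bool:
    """Receiver: recover the digest with the public key and compare."""
    return pow(signature, e, n) == digest_as_int(message, n)


if __name__ == "__main__":
    order = b"order 10 routers from ABC"
    signature = sign(order)
    print(verify(order, signature))                          # True
    print(verify(b"order 99 routers from ABC", signature))   # tampering is normally detected
```

Anyone holding the public exponent can run verify, but producing a valid signature requires the private exponent, which is the asymmetry noted above. With a modulus this small, different messages can collide after the reduction, so a real system uses a large modulus and a standardized padding scheme.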
Digital signature is valuable in electronic commerce. Suppose Rita wants to place an order with company ABC for
buying their router product. She places the order over the Internet with her digital signature attached to it. The
digital signature using a public key protects both ABC and Rita regarding the validity of the order and who ordered
it. It is even better than using secret key cryptography, since in the latter case, Rita could change her mind and
allege that company ABC generated the order using the secret key that they have been using. In public key
cryptography, she could not do that.
Authentication is the verification of the user's identification, and authorization is the access privilege to the
information. On the Internet without security, the user's identification and password, which are used for
authentication, can easily be captured by an intruder snooping on the LAN or WAN. There are several secure
mechanisms for authentication, depending on complexity and sensitivity. Authorization to use the services could be
a simple read, write, read-write, or no-access for a particular service. The privilege of using the service could be
for an indefinite period, or a finite period, or just for one-time use.
There are two main classes of systems, which are of interest to us in the implementation of an authentication
scheme. The first is the client-server environment in which there is a request-response communication between the
client and the server. The client initiates a request for service to the server. The server responds with the results of
the service performed. The communication is essentially two-way communication. In this environment, besides
authentication (and of course, an integrity check), authorization also needs to be addressed.
The second class of service is a one-way communication environment, such as email or an e-commerce transaction.
The message transmitted by the source is received by the receiver after a considerable delay, sometimes days if an
intermediate server holds up the transaction for a long time. In such a case, both the authentication and an integrity
check need to be performed at the receiving end.
We will address client-server authentication systems in the next section and the one-way message authentication
and integrity protection system in Section 11.5.7.
We will consider four types of client-server environments and the implementation of authentication function in
each: host/user environment, a ticket-granting system, an authentication system, and authentication using
cryptographic function.
Host/User Authentication. We have the traditional host and user validation for authentication, both of which are not
very secure. They are also not convenient to use. Host authentication involves certain hosts being validated by the
server providing the service. The host names are administered by the server administrator. The server recognizes
the host by the host address. If Server S is authorized to serve a client Host C, then anybody who has an account in
C could access Server S. The server maintains the list of users associated with Host C and allows access to the
user. If John Smith is one of the users in C, and John wants to access the server from another Workstation W, the
workstation W has to be authenticated as a client of S. If not, John is out of luck. Further, his name has to be added
to the list of users in W to access S. To make the environment flexible, every client with every possible user is
added to the server, negating the secure access feature!
Let us consider user authentication, which is done by the user providing identification and a password. The main
problem with the password is that it is detected easily by eavesdropping, say using a network probe. To protect
against the threat of eavesdropping, the security is enhanced by encrypting the password before transmission.
Commercial systems are available that generate a one-time password associated with a password server that
validates it when presented by the service-providing host. The user uses a unique key each time to obtain the
password, such as in the ticket-granting system (next section).
Ticket-Granting System. We will explain the ticket-granting system with the most popular example of Kerberos,
which was a system developed by MIT as part of their Project Athena. Figure 11.38 shows the ticket-granting
system with Kerberos. Kerberos consists of an authentication server and a ticket-granting server. The user logs into
a client workstation and sends a login request to the authentication server. After verifying that the user is on the
access control list, the authentication server gives an encrypted ticket-granting ticket to the client. The client
workstation requests a password from the user, which it uses to decrypt the message from the authentication server.
The client then interacts with the ticket-granting server and obtains a service-granting ticket and a session key to
use the application server. The client workstation then requests service from the application server giving the
service-granting ticket and the session key. The application server, after validation of the ticket and the session key,
provides service to the user. Of course, this processing happens in the background. It is transparent to the user,
whose only interaction is with the client workstation requesting application service.
Figure 11.38. Ticket-Granting System (labels: Kerberos, ticket-granting server, application server/service)
Authentication Server System. An authentication server system, shown in Figure 11.39, is somewhat similar to the
ticket-granting system except that there is no ticket granted. No login identification and password pair is sent out of
the client workstation. The user authenticates to a central authentication server, which has jurisdiction over a
domain of servers. The central authentication server, after validation of the user, acts as a proxy agent to the client
and authenticates the user to the application server. This is transparent to the user, and the client proceeds to
communicate with the application server. This is the architecture of Novell LAN.
Figure 11.39. Authentication Server (labels: authentication server, application server/service)
Authentication Using Cryptographic Functions. Cryptographic authentication uses cryptographic functions. The
sender can encrypt an authentication request to the receiver, who decrypts the message to validate the identification
of the user. Algorithms and keys are used to encrypt and decrypt messages, which we will address now.
The one-way message transfer system is non-interactive. For example, if Rita receives an email from a person who
claims to be Ian, she needs to authenticate Ian as the originator of the message, as well as to ensure that nobody has
tampered with the message. This could also be the situation in the case where Ian sends a sell order from his
mobile station in his car. Ted could intercept the message and alter the number of shares or the price. We will treat
all these under the category of secure mail systems.
There are three secure mail systems: privacy-enhanced mail (PEM), pretty good privacy (PGP), and X.400-based
mail system [Kaufman et al., 1995]. All three schemes are variations of the signed public-key cryptographic
communication discussed in Section 11.5.4 and shown in Figure 11.37. We will describe PEM and PGP in this
section. X.400 is a set of specifications for an email system defined by the ITU Standards Committee and adopted
by OSI. It is a framework rather than an implementation-ready specification. We will also review SNMPv3 secure
communication that we covered in Chapter 7 as it bears a close resemblance to the message transfer security.
Privacy-Enhanced Mail (PEM). Privacy-enhanced mail (PEM) was developed by IETF, and specifications are
documented in RFC 1421-RFC 1424. It is intended to provide PEM using end-to-end cryptography between
originator and recipient processes [RFC 1421]. The PEM provides privacy enhancement services (what else!),
which are defined as (1) confidentiality, (2) authentication, (3) message integrity assurance, and (4) non-
repudiation of origin. The cryptographic key, called the data encryption key (DEK), could be either a secret key or
a public key based on the specific implementation and is thus flexible. However, the originating and terminating
ends must have common agreement (obviously!).
Figure 11.40 shows three PEM processes defined by IETF: MIC-CLEAR, MIC-ONLY, and ENCRYPTED, based
on message integrity and encryption scheme. Only the originating end is shown. In all three procedures, reverse
procedures are used to extract the message and validate the originator ID and message integrity. The differences
between the three procedures are dependent on the extent of cryptography used and message encoding. The
message integrity code (MIC) is generated as discussed in Section 11.5.4 on digital signature and included as part
of email in all three procedures.
The specification provides two types of keys: a data-encrypting key (DEK) and an interchange key (IK). The
DEK is a random number generated on a per message basis. The DEK is used to encrypt the message text and also
to generate an MIC, if needed. The IK, which is a long-range key agreed upon between the sender and the receiver,
is used to encrypt DEK for transmission within the message. The IK is either a public or a secret key based on the
type of cryptographic exchange used.
If an asymmetric public key is used to encrypt the message, then the sender cannot repudiate ownership of the
message. Legal evidence of message transactions is stored in the data, which are used in applications such as
e-commerce. Another common characteristic of these procedures is the first step of converting the user-supplied
plaintext to a canonical message text representation, defined as equivalent to the inter-SMTP representation of
message text. The final output in each procedure is used as the text portion of the email in the electronic mail
system.
Figure 11.40(a) shows the MIC-CLEAR procedure, which is the simplest of the three. The MIC generated is
concatenated with the SMTP text and is inserted as the text portion in the email.
In the MIC-ONLY procedure, shown in Figure 11.40(b), the SMTP text is encoded into a printable character set.
The printable character set consists of a limited set of characters that is assured to be present at all sites and thus
makes the intermediate sites transparent to the message. The MIC is concatenated with the encoded message and is
fed to the email system.
Figure 11.40(c) shows the ENCRYPTED procedure, which is the most sophisticated of the three. The SMTP text is
padded, if needed, and encrypted. A public key is the best choice here, because it guarantees the originator ID. The
encrypted message, encrypted MIC, and the DEK are all encoded in printable code to pass through the mail system as
ordinary text. They are concatenated and fed to the email system.
Pretty Good Privacy (PGP). Pretty good privacy (PGP) is a secure mail package developed by Phil Zimmermann
that is available in the public domain. Figure 11.41 shows the various modules in the PGP process at the
originating end. The reverse process occurs at the receiving end and is not shown in the figure. PGP is a package in
the sense that it does not reinvent the wheel. It defines a clever procedure that utilizes various available modules to
perform the functions needed to transmit a secure message, such as email.
The signature generation module uses MD5 to generate a hash code of the message and encrypts it with the
sender's private key using an RSA algorithm. Either IDEA or RSA is employed to generate the encrypted message.
IDEA is more efficient than RSA, but secret key maintenance is necessary in contrast to RSA's use of a public key.
The encrypted message is compressed using ZIP. The signature is concatenated with the encrypted message and
converted to ASCII format using the Radix-64 conversion module to make it compatible with the email system.
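The last two stages, compression and Radix-64 conversion, can be sketched with Python's standard zlib and base64 modules. This is only an illustration of those two stages under stated assumptions: encryption and RSA signing are left out, the "signature" in the usage example is a bare MD5 digest standing in for the signed hash, and the two-byte length prefix used to separate the signature from the body is a made-up framing, not the PGP message format.

```python
import base64
import hashlib
import zlib


def armor(signature: bytes, body: bytes) -> str:
    """Compress the body, prepend the signature, and convert to printable ASCII."""
    compressed = zlib.compress(body)
    # Hypothetical framing: 2-byte signature length, signature, compressed body.
    packet = len(signature).to_bytes(2, "big") + signature + compressed
    return base64.b64encode(packet).decode("ascii")


def dearmor(text: str):
    """Reverse of armor(): return (signature, body)."""
    packet = base64.b64decode(text)
    sig_len = int.from_bytes(packet[:2], "big")
    signature = packet[2:2 + sig_len]
    body = zlib.decompress(packet[2 + sig_len:])
    return signature, body


if __name__ == "__main__":
    message = b"Place an order for ten routers.\n" * 20
    placeholder_signature = hashlib.md5(message).digest()   # stands in for the signed hash
    mail_text = armor(placeholder_signature, message)
    print(len(message), len(mail_text))     # compression shrinks the repetitive body
    print(mail_text[:60])
    assert dearmor(mail_text) == (placeholder_signature, message)
```

The output contains only letters, digits, and a few punctuation characters, so it can pass through a mail system that handles ordinary text.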
PGP is similar to ENCRYPTED PEM with additional compression capability. The main difference between PGP
and PEM is how the public key is administered. In PGP, it is up to the owner. In PEM, it is formally done by a
certification authority (the Internet Policy Certification Authority (PCA) Registration Authority). In practice, PGP
is used more than PEM. Both PGP and PEM provide more than a secure mail service. We can send any message or
file.
SNMPv3 Security. We dealt with secure transmission in SNMPv3 in Chapter 7. Although an NMS-management
agent behaves like a client-server system, the security features are similar to the message transfer cryptography.
We will compare the processes studied in Section 7.7 to message transfer cryptography. Figure 11.42 shows a
conceptualized representation of Figure 7.13 for an outgoing message.
Either the authentication key or preferably a different encryption key is used to generate an encrypted scopedPDU
by the privacy module. This is similar (but not identical) to encryption of the message in PEM and PGP.
The USM module prepares tbe who Je. message with the encrypted scopedPDU and other parameters. The
authentication key and the wbole message ore used as inputs to generate HMAC. which i.s equivalent to !he
signature in PEM and PGP. The aulhentication module combines t!Je signature and the whole DJessage to output !he
autl~entic8led whole message. In an incoming message, the authenticati.on module is provided the whole message,
autlJentication key, and tiJe HMAC as input to validate the authentication.
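The outgoing and incoming roles of the authentication module can be illustrated with Python's standard hmac module. This is a minimal sketch under stated assumptions: the function names are hypothetical, SHA-1 with truncation to 12 octets is chosen for illustration, and the key localization and message-field handling of the actual USM procedure are omitted.

```python
import hashlib
import hmac

MAC_LENGTH = 12  # assumption for this sketch: keep the first 12 octets of the digest


def authenticate_outgoing(auth_key: bytes, whole_message: bytes) -> bytes:
    """Compute the HMAC over the whole message with the authentication key."""
    digest = hmac.new(auth_key, whole_message, hashlib.sha1).digest()
    return digest[:MAC_LENGTH]


def validate_incoming(auth_key: bytes, whole_message: bytes, received_mac: bytes) -> bool:
    """Recompute the HMAC and compare it with the received one in constant time."""
    expected = authenticate_outgoing(auth_key, whole_message)
    return hmac.compare_digest(expected, received_mac)
```

A receiver that holds the same authentication key accepts the message only if neither the message nor the HMAC has been altered in transit, which is the sense in which the HMAC plays the role of the signature in PEM and PGP.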
In the current Internet environment, we cannot leave the subject of security without mentioning the undesired and
unexpected virus attack on networks and hosts. A virus is usually a program that, when executed, causes harm by
making copies of itself and inserting them into other programs. It contaminates a network when an infected program
is imported from outside sources, either online or via disks.
The impact of virus infection manifests itself in many ways. Among the serious ones are preventing access to your
hard disk by infecting the boot track, compromising your processor (an outside source controlling your computer),
and flooding your network with extraneous traffic that prevents your hosts from using it.
Generally, viruses are recognized by patterns, and virus checkers do just that. Apart from common-sense
preventive measures, it is wise to have the latest virus checkers installed on all your hosts. They should be scheduled
to run periodically. They also check the inputs and outputs of the processor for possible virus infection.
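Recognition by pattern can be illustrated with a very small sketch. The signature names and byte patterns below are invented for the example and do not correspond to any real virus; a production checker uses far larger, regularly updated databases and more sophisticated matching.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern (not real virus signatures).
SIGNATURES: Dict[str, bytes] = {
    "demo-virus-a": b"\xde\xad\xbe\xef",
    "demo-virus-b": b"INFECT_ME",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)
```

Running such a scan periodically over stored files, and over data as it enters or leaves a host, corresponds to the two uses of a virus checker mentioned above.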
Accounting management is probably the least developed function of network management applications. We have
discussed the gathering of statistics using RMON probes in Chapter 8 and in Section 11.3.4. Accounting
management could also include the use of individual hosts, administrative segments, and external traffic.
Accounting of individual hosts is useful for identifying some hidden costs. For example, the library function in
universities and large corporations consumes significant resources and may need to be accounted for functionally.
This can be done by using the RMON statistics on hosts.
The cost of operations for an information management services department is based on the service that it provides
to the rest of the organization. For planning and budget purposes, this may need to be broken into administrative
group costs. The network needs to be configured so that all traffic generated by a department can be gathered from
monitoring segments dedicated to that department.
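As an illustration of breaking cost into administrative groups, the following sketch totals per-host octet counts by department and apportions a cost in proportion to traffic. The host addresses, the department mapping, and the proportional-cost rule are all assumptions made for the example; the octet counts are presumed to come from host statistics of the kind an RMON probe collects.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical mapping of host address to department.
HOST_DEPARTMENT: Dict[str, str] = {
    "10.1.1.5": "library",
    "10.1.1.6": "library",
    "10.1.2.9": "engineering",
}


def octets_by_department(
    host_octets: Iterable[Tuple[str, int]],
    host_department: Dict[str, str] = HOST_DEPARTMENT,
) -> Dict[str, int]:
    """Sum octet counts per department; unknown hosts go to 'unassigned'."""
    totals: Dict[str, int] = defaultdict(int)
    for host, octets in host_octets:
        totals[host_department.get(host, "unassigned")] += octets
    return dict(totals)


def cost_shares(totals: Dict[str, int], total_cost: float) -> Dict[str, float]:
    """Apportion total_cost to departments in proportion to their traffic."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {dept: 0.0 for dept in totals}
    return {dept: total_cost * octets / grand_total for dept, octets in totals.items()}
```

The "unassigned" bucket is a deliberate design choice: traffic from hosts that are not mapped to any department is exactly the kind of hidden cost that accounting of individual hosts is meant to expose.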
External traffic for an institution is handled by service providers. The tariff is negotiated with the service provider
based on the volume of traffic and traffic patterns, such as peak traffic and average traffic. Internal validation of the
service provider's billing is a good management practice.
We have elected to treat report management as a special category, although it is not assigned a special functionality
in the OSI classification. Reports for various application functions (configuration, fault, performance, security,
and accounting) could normally be addressed in those sections. The reasons for us to deal with reports as a special
category are the following. A well-run network operations center goes unnoticed. Attention is paid normally only
when there is a crisis or apparent poor service. It is important to generate, analyze, and distribute various reports to
the appropriate groups, even when the network is running smoothly.
We can classify such reports into three categories: (1) planning and management reports, (2) system reports, and
(3) user reports.
Planning and management reports keep the upper management apprised as to the status of network and system
operations. They are also helpful for planning purposes. Budgeting needs to be done for capital and operational
expenses. Table 11.1 lists some of the planning and management reports under different categories. Since the
information management services department's main product is service, it is important to keep the management
apprised of how the quality of service meets the SLA (more in Section 11.9). Reports in this category include
network availability, systems availability, problem reports, service response to problem reports, and customer
satisfaction. Trends in traffic should address traffic patterns and volume of traffic in the internal network, as well as
external traffic. Information technology is constantly evolving and hence management should be kept apprised of
upcoming technology and the plan for migration to new technology. Finally, for budgeting purposes, the cost of
operations by function, use, and personnel needs to be presented.
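Table 11.1 Planning and management reports (partial; only these entries survive)
  Quality of service/SLA: systems availability, problem reports, service response, customer satisfaction
  Cost of operations: use, personnel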
The day-to-day functioning of engineering and operations requires operation-oriented reports. Traffic, failure, and
performance are the important categories, as shown in Table 11.2. A pattern analysis of these reports will be
helpful in tuning the network for optimum results.
Table 11.2 System reports (partial; only these entries survive)
  Traffic: traffic load-external
  Failures: system failures
  Performance: network, servers, applications
Users are partners in network services and should be kept informed as to how well any SLA is being met. Some
service objectives are met by joint efforts of the users and the information management services department. Table
11.3 shows some typical user reports. The SLA normally includes network availability, system availability, traffic
load, and performance. In addition, users may require special reports. For example, the administration may want
reports on payroll or personnel.
Table 11.3 User reports (partial; only these entries survive)
  System availability
  Traffic load
  Performance
We discussed network and system management tools in the last chapter. In this chapter, we covered the application
tools and technology geared toward network and system management. For these to be successfully deployed in an
operational environment, we need to define a policy and preferably build that into the system, i.e., implement
policy management. For example, network operations center personnel may observe an alarm on the NMS, at
which time they need to know what action they should take. This depends on what component failed, severity or
criticality of the failure, when the failure happened, etc. In addition, they need to know who should be informed
and how, and that depends on when the failure occurred and what SLAs have been contracted with the user. We
illustrated this with an example of CBR in Section 11.4.3, where a policy restraint was used to increase the
bandwidth as opposed to reducing load in resolving a trouble ticket.
As we mentioned in Section 11.5.1 on security management, policy plays an equally important, if not greater, role
as the technical area. Without policy establishment and enforcement, security management is not of much use.
Our focus here is not on the administrative side of the subject, although it is important, but on the technical aspects
of policy implementation in network management. Figure 11.43 is the policy management architecture proposed by
[Lewis, 1996] for network management. It consists of a domain space of objects, a rule space consisting of rules, a
policy driver that controls the action to be performed, and an action space that implements the actions and attributes
of the network being controlled.
Figure 11.43 Policy management architecture (domain space, rule space, policy driver, action space)
The objects in the domain space are events such as alarms in fault management, packet loss in performance, and
authentication failure in security management. The objects have attributes. For example, attributes of alarms are
severity, type of device, location of device, etc. Attributes of packet loss can be the layer at which packets are lost,
the percentage loss, etc. Rules in the rule space define the possible actions that could be taken under various object
conditions. It is the same as in RBR, with if-then, condition-action. The policy driver is the control mechanism,
which is similar to the inference engine. Thus, the objects in the domain space and the set of rules in the rule space
are combined for a policy decision that is made by the policy driver. It is worth bringing out the distinction between a
rule and a policy here. In the operations center, a rule could be that all network failures should be reported to the
engineering group. A statement that this applies to the operations personnel at the network management desk
makes the rule a policy. Because the responsibility is assigned to specific individuals, failing to do so will be
blamed on the person on duty at that time. The action space executes the right-hand side of the rules by changing
the attributes of the network and/or executing an external action. In resolving the throughput problem using the
CBR technique in Section 11.4.3, we discussed the options regarding the actions to be taken. Whether network
load should be controlled or more bandwidth be allocated is a policy decision. This can be implemented as an RBR
in the policy rule space. An example of external action could be to page engineering for a severe network failure.
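The four parts of the architecture can be sketched as a small rule-based program. This is a minimal illustration, not the implementation proposed by Lewis: the event attributes, thresholds, rule names, and action names are all hypothetical, and the actions only return a description of what would be done.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Domain space: an event object is represented as a dictionary of attributes.
Event = Dict[str, object]


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]  # the "if" part
    action: str                         # the "then" part, a key into the action space


# Rule space: hypothetical condition-action rules.
RULES: List[Rule] = [
    Rule("severe-failure",
         lambda e: e["type"] == "alarm" and e["severity"] == "critical",
         "page_engineering"),
    Rule("throughput",
         lambda e: e["type"] == "packet_loss" and e["percent_loss"] > 5,
         "increase_bandwidth"),
]

# Action space: each action returns a description of the step it would take.
ACTIONS: Dict[str, Callable[[Event], str]] = {
    "page_engineering": lambda e: f"page engineering about {e['device']}",
    "increase_bandwidth": lambda e: f"allocate more bandwidth on {e['device']}",
}


def policy_driver(event: Event) -> List[str]:
    """Policy driver: fire every rule whose condition matches the event."""
    return [ACTIONS[rule.action](event) for rule in RULES if rule.condition(event)]
```

Choosing "increase_bandwidth" rather than a load-reduction action for the throughput rule is where the policy decision is encoded; replacing that one entry changes the policy without touching the driver.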
We have illustrated implementing service level management in Chapter 10 on TMN with operations systems. An
operations system, in general, does an exclusive or special-purpose function. With the availability of element
management and NMSs, it is time for the arrival of a generalized service level management. Service level
management is defined as the process of (1) identifying services and characteristics associated with them, (2)
negotiating an SLA, (3) deploying agents to monitor and control the performance of network, systems, and
application components, and (4) producing service level reports [Lewis, 1999]. Lewis compares the definition of
service level management to quality of service (QoS) management defined by the Object Management Group (OMG).
The characteristics associated with services are service parameters, service levels, component parameters, and
component-to-service mappings. A service parameter is an index into the performance of a service, for example,
the availability of a business application for a customer. The business application depends upon various underlying
components, for example, network devices, systems, and applications on the systems. Thus, there is a
one-to-many mapping between the service parameter and the underlying component parameters. The availability of
the business application in the SLA can be defined in terms of the availability of these underlying components. In this
case, the availability service parameter is a function of the availability component parameter.
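One simple form such a function could take is a product of component availabilities. The sketch below assumes that the service is up only when every component is up and that components fail independently; both are assumptions for illustration, and the component names and figures are hypothetical.

```python
from math import prod
from typing import Dict


def service_availability(component_availability: Dict[str, float]) -> float:
    """Availability of a service that needs every listed component to be up.

    Assumes independent component failures, so the availabilities multiply.
    """
    return prod(component_availability.values())


# Hypothetical component parameters for one business application.
components = {"router": 0.999, "server": 0.995, "application": 0.99}
print(f"service availability: {service_availability(components):.4f}")
```

Under these assumptions the service parameter is always lower than the weakest component parameter, which is worth keeping in mind when negotiating the availability figure in an SLA.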
An SLA is a contract between the service provider and the customer, specifying the services to be provided and the
quality of those services that the service provider promises to meet. The pricing for the service depends on the QoS
commitment.
The objective of service level management is to ensure customer satisfaction by meeting or exceeding the
commitments made in the SLA and to guide policy management. In addition, it provides input to the business
management system.
Summary
We have learned in this chapter how to apply all the knowledge we have gained in the book to practical situations.
We have dealt with the five categories of OSI application functions, namely configuration, fault, performance,
security, and accounting.
Configuration management involves, in addition to setting and resetting the parameters of network components,
provisioning of the network and inventory management. Operations systems perform the latter functions. Network
topology management is concerned with discovery and mapping of the network for operations that can be used to
monitor them from a centralized operations center.
Fault management consists of fault detection and fault isolation. Similarly, performance degradation involves
detection and isolation. We dealt with these subjects in a simplistic manner in the early part of the chapter and in a
more complex manner in the latter part. We discussed the emerging topic of correlation technology and the various
correlation techniques that have been implemented in systems. They correlate events or alarms, which arrive from
multiple sources, and determine the root cause of the problem. A knowledge base built upon heuristic experience,
as well as algorithmic procedures, is used in such systems, either for the selection of inputs or for reasoning.
While we addressed the issues of performance management, we discussed the performance metrics and the
important role of performance statistics in network management.
Security management played a small part in SNMP management, but plays an extremely sensitive and critical role
in overall network management. We have dealt with this in detail in this chapter. We covered the importance of
policies and procedures. We looked at the various means by which information can be accessed, tampered with, or
destroyed. These are done by unauthorized and malicious personnel. We also learned how to protect, if not
completely at least partially, against such attacks. In this context, we discussed various authentication and
authorization procedures. There are sophisticated cryptographic methods to transport information across unsecured
channels to ensure secure communication. We talked about secret and public keys in cryptography to accomplish
this. We briefly addressed the issue of how to protect our networks and systems against the growing menace of
virus attacks.
From a business management viewpoint, we discussed the methods of using the statistical data gathered from the
network to generate accounting applications. Reports play an essential role in the management of information
services. We described the three classes of reports: planning and management, system, and user. We gave
examples of the types of reports that are useful in each class.
Many of the network and service management decisions are policy based. We discussed how this could be built
into the system that would help personnel who are expected to implement those policies.
We brought this chapter to a conclusion by discussing service level management. Service level management helps
satisfy customer needs. A service level agreement between the customer and the service provider defines the needs
of the customer and the commitments of the service provider.
Exercises
1. You are asked to do a study of the use pattern of 24,000 workstations in an academic institution. Make the
following assumptions for your study:
You are pinging each station periodically. The message size in both directions is 128 bytes long. The
NMS you are using to do the study is on a 10-Mbps LAN, which functions with 30% efficiency. What
would be the frequency of your ping if you were not to exceed 5% overhead?
3. The autodiscovery in some NMSs is done by the network management system starting with an arp query to the
local router.
4. You are responsible for designing the autodiscovery module of an NMS. Outline the procedure and the software
tools that you would use.
5. Redraw Figure 11.4 and Figure 11.5 for VLAN based on IP address.
6. You are the manager of a NOC. Set up a procedure that would help your operators track the failure of a
workstation that is on a virtual LAN.
7. What MIB object would you monitor for measuring the collision rate on an Ethernet LAN?
8. Ethernet performance degrades when the collision ratio reaches 30-40%. Explain how you would use the 802.3
MIB (RFC 1398) to measure the collision ratio of an Ethernet LAN. We will define the collision ratio of the LAN as
(total number of collisions/number of packets offered to the LAN) measured on the Ethernet interface.
10. a. The trap alarm thresholds are set at two levels, rising and falling. Explain the reasoning behind
this.
b. Define all the RMON parameters to be set for generating and resetting alarms when the collision
rate on an Ethernet LAN exceeds 120,000 collisions per second and falls below 100,000
collisions per second. Use eventIndex values of 1 and 2 for event generation for the rising and
falling thresholds.
11. Download the MRTG tool and measure the following performance statistics on a subnetwork:
12. Review RFC 2064 and write a one- or two-page (maximum) report on the NeTraMet flow meter.
a. RBR rules
b. inference engine actions to accomplish the following:
Display a yellow alarm for a component that is one layer higher (i.e., one component ahead in its
path).
14. Write a pseudocode for MBR to detect failure of the components shown in Figure 11.11.
15. Describe three scenarios that require event correlation and explain clearly why each one needs it.
16. a. Describe (or select one from Exercise 12) a scenario that clearly requires event correlation.
b. Discuss how each method discussed (with the exception of the finite state machine model) would
approach the task.
c. Evaluate each method.
17. a. Derive the minimum number of symptoms required to uniquely identify n problems using
codebook correlation.
b. Draw a chart with the number of problems on the x-axis and the number of symptoms on the
y-axis.
18. The causality graph for a network is shown in Figure 11.44.
19. a. Assume that in a monoalphabetic cipher encryption scheme, both alphabet and digits (0-9) can
be used interchangeably. Suppose an intruder tries to decipher it knowing the algorithm, but not
the key. How many attempts would it take on the average to decipher the message?
b. If you are given a powerful computer with a nanosecond instruction period to decipher the
message, could you do it in your lifetime? How confident are you with your answer?
20. State three important differences in the characteristics between authentication and encryption algorithms.
21. a. You own the public key of Ian. What functions of secure email can you perform with that?
b. Is it safe for you to include your public key with your email address? Draw a comparison to
regular mail.
22. Using the md5 utility under FreeBSD 2.x (or md5sum under Linux), generate a message digest of a file provided by
your instructor.
23. Describe the procedure at the originating end when Ian wants to send a secure message using PEM
simultaneously to both Rita and Ted. He communicates with Rita using a secret key and with Ted using a public
key.