The HP OpenView Approach To Help Desk and Problem Management
HP OpenView
Author: Ken Wendle
Version: 2
This white paper refers to the HP OpenView ITSM 5.6 product.
Key Customer Benefits
Conclusion
Appendix 1: The IT Infrastructure Library (ITIL)
Appendix 2: The HP ITSM Reference Model
Introduction
On a visit to Wonderland, Alice is running frantically as if on a treadmill. She complains to the Cheshire Cat, hovering nearby, that she doesn't seem to be getting anywhere. The Cheshire Cat tells her that she has to run faster! Alice exclaims that she's already running as fast as she can! The Cheshire Cat coolly replies, "You have to run as fast as you can just to stay in one place. In order to get ahead, you have to run faster than as fast as you can!"
Sounds a bit like today's business environment, doesn't it? Alice could just as easily be your corporation or organization. The Cheshire Cat could just as easily be that marketplace out there. Just when we begin to think the pace of business can't get any faster, it does. We are rapidly approaching what Bill Gates calls "Business at the Speed of Thought." If one were to list the items making the most dramatic impact on the pace of business today, that list would likely include the drive to globalization, the Internet and e-commerce revolution, EDI, shifting demographics, and rapid technological advancement.
In today's morphing economy, the digital revolution is redefining the competitive forces in the economy with breathtaking speed. As these technologies increasingly converge, they are redefining the very core of business processes. Innovation becomes the key driver for success in this era of rapid transition. Speed, flexibility, the ability to absorb change and to see and respond to what's coming down the road - NOT react to what's already happened - are the hallmarks of success in the present chaotic business environment. One keen market observer's suggestion? "Invest heavily in Crystal Balls." In other words, to keep up with rapid technology and business changes, organizations must invest in organizational processes which enable them to look further down the road to recognize emerging technologies, assess the impact on their company, quickly adopt the new technology and absorb the changes it creates.
The goal is to move from fire-fighting mode to a reactive one, from reactive to proactive, and from proactive to preventative. The message is clear: be adept, adopt and adapt, or die. Change is not a choice. Growing and thriving by making change your ally is. Realize that what you do today is building your organization of tomorrow. Finally, consider this. If you think being a customer or user of this technology is tough, try being one of those whose job it is to support the technology once it's deployed. Now you begin to see another facet of the problem.
Do you believe that:
- Overall demand for technology support will increase or decrease within the next 12 months?
- Complexity of that support will increase or decrease within the next 12 months?
- Associated costs of that support will increase or decrease within the next 12 months?
Now ask yourself this: What are you doing about it? To not have a game plan - to not be in the process of preparing today for your Help Desk of Tomorrow with its higher demand and more complex, more expensive technology support - is like the person who leaps from the top of a 20-story building and declares at the 10th floor that everything is O.K. so far. Many corporations are ignoring the facts staring them in the face.
The HP OpenView IT Service Manager product suite consists of four products:
1. HP OpenView ITSM Asset Manager
2. HP OpenView ITSM Help Desk Manager
3. HP OpenView ITSM Configuration and Change Manager
4. HP OpenView ITSM Service Level Manager
The IT Infrastructure Resource Planning road focuses on Asset Utilization, Asset Optimization and eventually Asset Synchronization. The business objective is cost control, and the main focus is reducing the Total Cost of Ownership (TCO) of the IT environment. Organizations that take this road will focus on controlling their costs related to IT Assets. After a successful implementation of Asset Management, these organizations will seek ways to leverage that control by using asset information in operational processes such as the service desk. Another step could be the implementation of Change Management processes to optimize the Asset Management process and implement "just in time" principles for asset supply and asset inventories. The last stage of the Infrastructure Resource Planning road is the implementation of Service Level Management processes, which focus on relating the now controlled costs to their benefits to the organization.
enough to celebrate a milestone is often hard to come by. It's been observed by at least one industry observer that companies want the Help Desk of tomorrow with the budget of yesterday. Fortunately, top corporate management is discovering - and many more need to discover - the strategic value of their Help Desk. They are also realizing the need to better coordinate other parts of IT in support of their Help Desk. This takes the right people doing the right things with the right tools to enable them: the correct blend of People, Process and Technology. The Gartner Group, in an attempt to help support organizations better understand where they are and where they should be heading, identifies and describes a time-line, or phases, of support maturity.
[Figure: support maturity phases plotted against time, beginning with Firefighting. The successive phases are characterized as follows.]
- People: Operator, multiple help desks. Process: Trouble ticket tracking. Technology: Call tracking systems.
- People: Generalist, single point of contact. Process: Problem management/resolution. Technology: Midlevel problem mgt. tools.
- People: Resolution agents, self-help, CSD. Process: Knowledge, change, service-level mgt. Technology: Enterprise CSD systems.
- People: Valued brokers, CSO/centers of excellence. Process: Change, user mgt., value chain integration. Technology: Integrated, distributed object model.
to causing changes, i.e., those implemented changes which, in turn, created incidents. This may highlight communication issues, an ineffective change process, or the lack of a well-defined back-out plan.
Preventative: By fully integrating various disciplines, an IT organization can enter the Preventative phase of IT service management. It is necessary to identify and define IT services and the various components of which these services are comprised. It is further necessary to have SLAs with the customers identified as users of those services. Defining IT Infrastructure parameters (response and availability) and the time frames (service hours) is required so that if something is going to be changed, you know what to change, when it is to be changed, whom to notify, and the importance and impact of that change. This allows for the determination of dependencies, affected resources, etc. Defined services and their related SLAs also show how many people are impacted by any service outage, scheduled or unscheduled. Creating this Service Layer, as opposed to simply infrastructure, improves communication between IT and customers, helps determine escalation paths and helps automate the workflow process. It also improves the problem analysis and change control processes by providing a better category structure of business services, as opposed to simply CMDB items. Better root cause analysis is possible by drilling down to look only at the finite parts of the infrastructure making up the service.
Incident Management
It seems that every city has a stretch of highway where accidents occur on a regular basis. The highway patrol is usually the first on the scene, quickly followed by other emergency vehicles: ambulance, fire truck, tow truck, etc. The first order of business is to attend to the injured. Next is to get the traffic moving again. This is the essence of Incident Management. The over-riding goal of Incident Management is to get the customer back to a productive state as quickly as possible: to restore normal service operation (optimally as defined by the relevant SLA). To best achieve this, each incident is identified, recorded (logged), classified (categorized) and tracked until normal service is restored (reconciled). Should the incident become a problem (as described below), it should be managed properly through final resolution.
According to a study performed by Meta, the calls received by support organizations can be broken down into a so-called three layer cake (see figure below).
The bottom layer represents those calls which are repetitive in nature, i.e., are received and handled over and over, day after day, week after week. If graphed over time, the result would be a horizontal line. These calls are candidates for partial or complete automation: automatic password resetting, automatic application restart, FAQs for "how to" questions, etc. An organization which finds itself focussing on this layer is generally fire-fighting or reactive. Tools available in Help Desk Manager to assist in addressing these types of incidents include Check lists, which provide the correct questions to determine the correct answers. Also very helpful are Standard calls: frequently occurring calls which can be at least partially automated to make call entry faster and more consistent.
The middle layer consists of those calls which are repetitive, but only for a specific time. If graphed over time, one would see peaks and valleys as calls came in. These could be caused by a rollout of a new application or system, an upgrade, a quarterly closing cycle, etc. These calls are candidates for utilization of a so-called knowledge base, canned or historical. A well-coordinated training program also helps reduce this type of call. An organization focussing on this layer is generally at the reactive or proactive level, as change management disciplines come into play to help minimize the impact of change. Tools within Help Desk Manager to assist in addressing these incidents include a Full Text Search (FTS) or Context Sensitive search of call history, although even this may become somewhat burdensome as more and more calls are logged and indexed. Therefore, it is possible to create a Knowledge Pool and limit searching to that pool rather than searching universally. Case Based Reasoning (CBR) is also provided, giving the ability to launch a CBR tool and pass it description information in order to obtain access to predefined cases, or known situations and their solutions.
Task manager and escalation (progress monitor) are extremely valuable in order to correctly and efficiently dispatch incidents and problems to the correct resources.
The top layer consists of those one-time, isolated, or non-repetitive incidents. These are the incidents which seem to come out of left field and often blind-side an organization. Every support organization has stories of such incidents: the time someone dropped a server without backing it up, the thunderstorm which knocked out the LAN. If graphed, they would appear as random points, with little or no correlation to one another. Organizations focusing on this layer are in either the proactive or predictive phase. On the surface, it appears little can be done to prevent or minimize their occurrence. In reality, the implementation and integration of solid change management processes and service level management go a long way in minimizing the occurrence and/or impact of such incidents. This is perhaps one of the greatest strengths
and unique capabilities of the ITSM Help Desk Manager. This ability is also extremely valuable in addressing and minimizing incidents within the other two layers. In many IT organizations, there is a serious lack of support from Level 2 and 3 organizations (i.e. Systems Development and Project Management and Infrastructure Support), which dump new products, services and changes onto Level 1 without any formal accountability for the quality of their work. At best, this is poor organizational structure; at worst, it's chronic victimization.
Regardless of what layer an organization focuses on, the LOGGING of 100% of all calls received is VITAL. The justification for this is simple: for all intents and purposes, the calls which are not logged did not happen. The integrity of the help desk is severely damaged by a hit-or-miss attitude toward logging calls. Finally, you cannot create a report of all calls that were not logged. It becomes very difficult, if not impossible, to justify training, staffing, or equipment based upon a partial picture. If this becomes an issue, Help Desk agents may ask that inevitable question: "Do you want me to solve the problem or log the call?" The answer of course is "Both." If difficulties arise in logging calls, chances are the problem is with the tool(s) being utilized.
Incident management is also vital in terms of cost. Various studies have concluded that it is proportionally less costly to resolve calls on the first line than at level 2, less at level 2 than at level 3, and so on. To assist the front line agent, ITSM Help Desk Manager incorporates SMARTLINK, which may be used to launch a remote control or self-healing application.
Problem Management
Somewhere people are gathering information and analyzing that accident, what may have caused it and how it may relate to other accidents which occurred along that same stretch of highway. They analyze, among other things, traffic patterns, the time of day, and weather conditions at the time. From this analysis, they seek to determine the ROOT CAUSE of the accidents. This is the essence of Problem Management. Without Problem Management, incident management becomes a never-ending process of doing the same thing over and over and over. Problem Management is concerned with asking "why?" and determining the real underlying cause of the incident(s) in order to prevent future recurrence. Problem management deals with:
- Identifying and recording the problems encountered
- Performing severity analysis
- Providing for the proper support effort
- Investigating and diagnosing the problems
- Defining solutions and proposing requests for change (RFCs)
Problem management also addresses the issue of error control, following the progress of known errors until they are effectively eliminated by the successful implementation of a change to the infrastructure. (See Integration with other OpenView products below)
When the role of the local fire department was viewed as simply putting out fires, this was the criterion by which they were measured and for which they were rewarded. There was, therefore, no incentive to prevent fires. Needless to say, this focus was dangerous, destructive, and defied logic. Today, thankfully, fire departments see their greatest role as preventing fires, spending the vast majority of their time working to minimize the risks of fire and educating the public in how to "learn not to burn." Many support organizations still see themselves as fire fighters. Their value is perceived by how many calls they take and how many problems they solve. Problem Management squarely addresses this issue. The best customer service is resolving a problem before the customer is
even aware of it. It is the call not taken, the fire that never starts, that best serves the customer and the business.
Knowledge Management
As discussed within Incident Management, Knowledge Management becomes valuable in addressing issues within the various phases, but most notably within the proactive and into the preventative phases. Knowledge management at one time seemed to be limited to off-the-shelf "canned" knowledge and the historical knowledge of past calls or incidents. ("How has someone else solved this problem?" or "What did we do the last time?") While there is indeed value in these, a vast number of incidents and problems cannot be resolved with this knowledge alone. The definition of knowledge must be expanded. In fact, the GartnerGroup definition of KM is very expansive. They recommend it be applied to user programs, vendor products and services to clearly delineate KM from other information management approaches: KM is a discipline that promotes an integrated and collaborative approach to the creation, capture, organization, access and use of an enterprise's information assets. This includes databases, documents and, most importantly, the un-captured, tacit expertise and experience of individual workers.
The trend toward wanting to empower the inexperienced help-desk analyst or even end users to solve problems is gaining momentum. At the micro level, this can reduce the redundant effort of people having to solve the same problem over and over. Knowledge can be compared, in many ways, to water. When you need it, nothing else will serve the need as well. Knowledge can be stagnant, polluted, or frozen (ever try to learn something only to be thwarted in the effort?). It can also evaporate (ask any manager who's had a key staff member leave). For knowledge (and water) to be most effective, it needs to be clean and flowing and available to those who need it when they need it. Too often a Help Desk sees only the knowledge of how to solve problems as the most important.
However, information and knowledge of what infrastructure items are in place is also vital, as is knowledge of what changes, if any, took place over the weekend or last night. Industry consultants stress that evaluation of tools and processes for the service desk should include a coordinated approach to not only problem resolution, but also asset tracking, configuration management, and change management. In order to help facilitate steps toward better knowledge management, ITSM Help Desk Manager not only provides links to canned and historical knowledge, but expands this with integrations (discussed in more detail below), which provide greater information to assist the support function.
Configuration Management
In the industry today, there is still some confusion around the difference between Asset Management and Configuration Management. Asset Management focuses on complete life-cycle management of an IT asset, whereas Configuration Management focuses on the production life cycle of an IT asset. Configuration Management focuses on how IT Assets interconnect and relate to each other so that they deliver IT Infrastructure Services to the business units. In a very real sense, Configuration Management is a specialized discipline - a sub-set - of Asset Management. Configuration Management isn't concerned so much with the cost of an Asset, but more with its role in IT service provision and the quality of its performance. While Asset Management tools typically do support component-level configuration management, for instance having the ability to track the components of a desktop (memory, CPU etc.), they typically do not support the complex functionality of identifying exactly how IT Assets interact and deliver services to the business. From an IS Operations management perspective, Configuration Management provides the information needed to fully empower the Help Desk Management and Problem Resolution staff.
In short, configuration management focuses on IT Infrastructure that is in production. The definition according to ITIL is as follows: Configuration management is a discipline that gives IT Management precise control over IT assets by allowing IT Management to:
- Specify the versions of Configuration Items (CIs) in use and in existence on an IT infrastructure, and information on:
  - The status of these items (e.g. in live use, scheduled for live use, scheduled for upgrade etc.).
  - Who owns each item (the individual with prime responsibility for it).
  - The relationships between items (components, connections etc.).
- Maintain up-to-date records containing these pieces of information.
- Control changes to the CIs by ensuring changes are made only with the agreement of appropriate named authorities.
- Audit the IT infrastructure to ensure it contains the authorized CIs and only these CIs.
The items that may be brought under configuration management control include hardware, software, documentation, telecommunication services, computer center facilities and any others the organization wishes to control.
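The record keeping that this definition calls for can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ConfigurationItem`, its fields, the `audit` helper and the sample identifiers are all hypothetical and are not taken from the HP OpenView product or from ITIL itself.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """One CI record; field names are illustrative, not a product schema."""
    ci_id: str
    version: str
    status: str   # e.g. "in live use", "scheduled for upgrade"
    owner: str    # individual with prime responsibility for the item
    related_to: set[str] = field(default_factory=set)  # ids of connected CIs


def audit(authorized: dict[str, ConfigurationItem],
          found_ids: set[str]) -> dict[str, set[str]]:
    """Compare the ids found on the infrastructure with the authorized CIs."""
    known = set(authorized)
    return {
        "unauthorized": found_ids - known,  # present but never authorized
        "missing": known - found_ids,       # authorized but not found
    }


# Hypothetical sample data
cmdb = {
    "srv-01": ConfigurationItem("srv-01", "4.0", "in live use", "j.doe", {"rtr-07"}),
    "rtr-07": ConfigurationItem("rtr-07", "11.2", "scheduled for upgrade", "a.lee"),
}
result = audit(cmdb, {"srv-01", "pc-99"})
```

Here `result` reports `pc-99` as unauthorized and `rtr-07` as missing, which is the kind of finding an audit of "the authorized CIs and only these CIs" is meant to surface.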
Work Management
Work Manager provides a centralized management console for all work related to IT Management processes. Work Manager is used in the change management process to identify IT staff available to carry out tasks related to that process. Work orders have a classification, a status, a priority, a target setting, a progress indication and a scope. The classification could be, for example, Risk and Impact Analysis, Appointment or Approval. The scope could be a help desk ticket, a problem ticket or a change request. Configuration Items can also be linked to work orders.
Reporting
The reporting capability, lacking in many call tracking systems, has put many help desks on the defensive. Often it's much easier to put the information in than to get it out. Also, many help desks implemented their help desk software without much thought as to what they really needed to get out at the back end. In addition, there are insufficient processes in place to provide for consistent entry. The old adage "garbage in, garbage out" definitely applies here. The problem is that without solid reporting, it is difficult to justify updating these outdated systems. Without acquiring the update, management simply cannot get the reports they need. ITSM Help Desk recognizes that reporting is a vital capability. Since the nature of the beast is that many support organizations find themselves driving relentlessly day after day, effective reporting is necessary to review the path behind them. The needs of management, audit and review demand that reporting show efficiency (that things are being done right) and effectiveness (that the right things are being done) for incident analysis and management reports. By utilizing a powerful reporting tool, Business Objects, both standard and ad hoc reporting become a very straightforward process.
[Figure: reconciliation process. Diagram labels: External Repository (NNM, DTA, SMS, ...), Extractor, ASCII File, IRM Engine, Reconciliation Database, CMDB, Report (Overnight Changes, Unauthorized Changes).]
Network and System Management (NSM) tools such as HP OpenView Network Node Manager, HP OpenView Desktop Administrator and Microsoft SMS are capable of detecting hardware and software devices that are installed in the IT Infrastructure. The Configuration Management Database classification structure identifies to what degree the organization wishes to manage its environment. Based on the classification structure, an "extractor" for the NSM tools can be made. (Default extractors are NNM, DTA and SMS.) The extractor produces a list of detected configuration items that is uploaded via the IRM engine into the reconciliation database. This database can be compared with the configuration management database, and a list of changes is produced. In principle these changes are unauthorized, since the change management process did not model them. Based on the report, organizations can take action and identify potential hazards to IT Service provision. Based on the reconciliation database, organizations can also automatically verify whether changes have been implemented as planned.
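The comparison step described above can be illustrated with a minimal Python sketch. It assumes a made-up extractor file format (one `ci_id,type,version` row per detected CI); the actual ASCII file layout, the IRM engine interface and the CMDB schema are not described in this paper, so every name and value below is hypothetical.

```python
import csv
import io


def load_extract(text: str) -> dict[str, dict[str, str]]:
    """Parse a hypothetical extractor file: one 'ci_id,type,version' row per CI."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=["ci_id", "type", "version"])
    return {row["ci_id"]: {"type": row["type"], "version": row["version"]}
            for row in reader}


def reconcile(detected: dict[str, dict[str, str]],
              cmdb: dict[str, dict[str, str]]) -> list[tuple[str, str]]:
    """List CIs whose detected state differs from what the CMDB records."""
    changes = []
    for ci_id in sorted(detected.keys() | cmdb.keys()):
        if ci_id not in cmdb:
            changes.append((ci_id, "added"))      # detected, never registered
        elif ci_id not in detected:
            changes.append((ci_id, "removed"))    # registered, no longer detected
        elif detected[ci_id] != cmdb[ci_id]:
            changes.append((ci_id, "modified"))   # attributes differ
    return changes


# Hypothetical sample data
EXTRACT = "srv-01,server,NT4 SP5\npc-17,desktop,Win98\nhub-03,hub,1.0\n"
CMDB = {
    "srv-01": {"type": "server", "version": "NT4 SP4"},
    "pc-17": {"type": "desktop", "version": "Win98"},
    "rtr-02": {"type": "router", "version": "11.2"},
}
report = reconcile(load_extract(EXTRACT), CMDB)
```

With this sample data the report flags `hub-03` as added, `rtr-02` as removed and `srv-01` as modified. Whether each entry is truly unauthorized, or is a planned change that was implemented as expected, still has to be judged against the change records.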
The integration of Service Level Manager with Help Desk Manager has the following major functions:
- Logging of calls related directly to services
- More efficient use of limited resources
- Monitoring of services based upon defined service levels
Logging of calls related directly to services
Users of technology are not really interested in the components which create the services they use. Nor do they express problems they are having in those terms. A customer is much more likely to say "email is broken" than "I believe that the performance of one of the email servers has degraded." By linking calls, incidents and problems to Services, a bigger picture can be created. Also, business impact can be determined much more quickly.
More efficient use of limited resources
Service Level Agreements, by definition, have time-lines and priorities associated with them. By being able to immediately determine business impact, the correct amount of resources can be focussed on the issues.
Monitoring of services based upon defined service levels
Pre-defining service levels allows one very important benefit: it allows a support organization to become much more proactive. By setting thresholds comfortably within the limits of the SLA, a support organization can be notified with sufficient time to respond before the customer is even aware of a potential problem.
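The idea of a threshold set comfortably within the SLA limit can be sketched as follows. This is a minimal illustration, not the product's monitoring logic: the function name, the `warn_fraction` parameter and its default of 80 percent are assumptions, and the sketch assumes a metric where higher values are worse (for example, response time in seconds).

```python
def check_service_level(measured: float, sla_limit: float,
                        warn_fraction: float = 0.8) -> str:
    """Classify a measurement against an SLA limit (higher is worse).

    A warning is raised once the measurement reaches warn_fraction of the
    limit, leaving time to react before the SLA itself is breached.
    """
    if measured > sla_limit:
        return "breach"
    if measured >= warn_fraction * sla_limit:
        return "warning"
    return "ok"


# Hypothetical SLA: e-mail response time must stay within 2.0 seconds
statuses = [check_service_level(t, 2.0) for t in (1.0, 1.7, 2.5)]
```

In this example the three measurements are classified as ok, warning and breach respectively. The warning state is the one that gives the support organization its head start.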
The above figure shows that a particular CI, a Windows NT Server, is subject to a change. This particular CI is part of three IT services: the E-mail service, the HR Business Service and a Backup service. These services have different impact levels on the business and have different service hours. From the graph it is possible to identify that the best timeslot to implement the change is Tuesday. Risk and impact assessment becomes more sophisticated with the ability to view the IT services that could be affected as a result of the change.
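The reasoning behind choosing a timeslot can be sketched in Python. Because the figure itself is not reproduced here, the impact weights and service days below are invented purely for illustration; only the three service names come from the text, and the scoring rule (sum of the impact of every service in service on a given day) is an assumption rather than the product's method.

```python
# Hypothetical impact weights and service days for the three services
SERVICES = {
    "E-mail": {"impact": 3, "service_days": {"Mon", "Tue", "Wed", "Thu", "Fri"}},
    "HR Business Service": {"impact": 2, "service_days": {"Mon", "Wed", "Thu", "Fri"}},
    "Backup": {"impact": 1, "service_days": {"Mon", "Wed", "Fri"}},
}
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def exposure(services: dict, day: str) -> int:
    """Total business impact of the services that are in service on a day."""
    return sum(s["impact"] for s in services.values() if day in s["service_days"])


def best_change_day(services: dict, days: list[str]) -> str:
    """Pick the day with the lowest exposure (first one wins on a tie)."""
    return min(days, key=lambda day: exposure(services, day))


best = best_change_day(SERVICES, DAYS)
```

With these made-up values, Tuesday has the lowest exposure because only the E-mail service is in service that day. A real assessment would work at the level of service hours rather than whole days.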
Asset Manager
HP OpenView ITSM Asset Manager focuses on the complete life-cycle management of IT Assets. Asset Manager allows organizations to optimize their Asset utilization from a cost perspective. When integrated with HP OpenView ITSM Configuration and Change Manager, the Asset Manager and CMDB can share the same data. When changes are scheduled and new CIs need to be ordered because they are not in stock or not available within the IS Organization, Asset Manager can help change managers identify whether the needed assets are available elsewhere in the organization. If needed, Asset Manager can make sure that assets will be available and can optimize the purchasing process by focussing on cost efficiency, the best brand and the best supplier. Asset Manager allows change managers to reuse software licenses that may still be installed on disposed desktops. Asset Manager allows changes to be implemented more cost-effectively and focuses on decreasing the Total Cost of Ownership of the entire organization.
- Improved Customer (user) Productivity: less user disruption, shorter duration
- Lessons from Experience: trend analysis to prevent failures
- Improved Productivity of Support Staff: less disruption, less duplication of effort
- Higher Reputation for IT Infrastructure: increased quality ensures more stable infrastructure
- Greater Control of IT Services through Management Information
Correct use of automation can incrementally reduce the costs of the Help Desk. Depending upon support complexity, Gartner has predicted that through year-end 2001, annual reactive help desk costs for enterprises not using automated tools will be between $183 and $1,713 per supported PC. They further concluded in their research that problem resolution technologies will provide one of the best opportunities to improve quality of service levels, principally because they can ease the skills and staffing shortages within IS support organizations. This is because problem resolution technologies can help improve qualitative and quantitative metrics. Their conclusion is that through 2001, assuming a successful implementation, problem resolution technology will reduce support costs by as much as 16 percent.
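To give a feel for the scale of these figures, the short Python calculation below applies the 16 percent upper bound to the quoted per-PC cost range. Combining the two Gartner figures in this way is an assumption made only for illustration, as is the site size of 1,000 PCs; the function name is hypothetical.

```python
def savings_upper_bound(pcs: int, cost_per_pc: int, reduction_pct: int = 16) -> int:
    """Largest annual saving, in whole dollars, if costs drop by reduction_pct percent."""
    return pcs * cost_per_pc * reduction_pct // 100


PCS = 1000  # hypothetical number of supported PCs
low = savings_upper_bound(PCS, 183)    # low end of the quoted cost range
high = savings_upper_bound(PCS, 1713)  # high end of the quoted cost range
```

For this hypothetical site, annual help desk costs of $183,000 to $1,713,000 would shrink by at most $29,280 to $274,080 per year under these assumptions.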
Conclusion
Two forces are pulling the IT support function in opposite directions. On one side, cost pressures are challenging IT organizations to do more with less and to lower total cost of ownership. On the other, internal customers of IT services are demanding and expecting improved service quality. Of course, the art of IT support is to do both at the same time. A lack of a Service Culture is common in IT organizations. The Help Desk - perhaps more than any other function within an IT organization - has the greatest Customer Service awareness. It is therefore in the best position to help foster and model this attitude throughout the organization. It is not a coincidence that the Service Level Agreement trend was initiated in many companies by their Help Desk organization. Unfortunately, the Help Desk cannot effectively champion this message alone.
Correct Help Desk Management processes can help IS organizations build an environment that is better prepared to absorb and control the challenges they are facing. It is a best practice to understand and approach this discipline with People issues, Process issues and finally Technology issues in mind. Where ITIL can provide the guidance to implement the processes, HP OpenView delivers the technology, based on ITIL and the HP ITSM reference model, to support your processes. With the processes established and technology deployed to support those processes, it is now up to IS Organizations to create the customer-focussed culture within the organizations they serve. In making decisions which impact this critical area - especially in the area of enabling technology - it is important to keep in mind the Support Maturity Phases: fire-fighting, reactive, proactive, preventative. Decisions made today will impact, for good or ill, the ability to effectively and efficiently move from one phase to the next. Preparing for the help desk of tomorrow indeed begins today.
Perhaps another visit to Wonderland provides us with a final, yet important perspective: 'Cheshire Puss,' Alice asked, 'would you tell me, please, which way I ought to go from here?' 'That depends a good deal on where you want to get to,' said the Cat. 'I don't much care where--' said Alice. 'Then it doesn't matter which way you go,' said the Cat.
The HP OpenView IT Service Manager product has been developed around ITIL service management concepts. The product reflects and supports the integrated nature of the ITIL processes and the integrated product suite is unique in the industry. The ITIL guides are designed to be just that: guides. ITIL advice is not meant to be definitive, and should therefore be interpreted for every IT organization which desires to benefit from this body of work.
In the middle of these four disciplines are configuration management and change management as the connecting and controlling processes. Configuration and Change Management guard and control the production environment that produces and delivers IT services to the business. The interaction of configuration and change management with processes such as incident, problem and service level management is a critical success factor in implementing a working solution.
Figures 2 and 3. The HP IT Service Management Reference Model
For more information or a complete 30-page white paper, please visit www.hp.com
Acknowledgements
The development of the Hewlett-Packard OpenView Help Desk Management white paper was indeed a team effort, involving many people residing in different countries.
Special thanks to: Karel van der Poel, Michel N'Guettia, Susan Callahan and Mark McNamara
Also, a special acknowledgement to some great folks on the front lines: Bob Hustedt, Susan K. Smith, and Greg R. Smith
Finally, for their insights and support, much gratitude to some who truly lead the field: David Cannon, Gary Case and Char LaBounty
Ken Wendle