Assignment 3
INF 60007 Business Information Systems
IT Incident Management Process at ANZ Bank
Individual Assignment
• Due – Week 12
• Weight – 30%
• Length: 1500 words (max) and Process Diagram
• Please submit your presentation through TurnitIn as per directions provided on Canvas.
Introduction
IT services permeate all aspects of an organisation’s services including the provision of customer services and the running of the business itself. When these services are affected by failures and other occurrences, it is vital that processes and systems are in place to ensure that the impact to the business is minimised.
In ITIL terminology, an ‘incident’ is defined as an unplanned interruption to an IT service or reduction in the quality of an IT service or a failure of a configuration item that has not yet impacted an IT service (for example a high temperature warning in a Data Centre).
The purpose of this assignment is to define the Incident Management process within ANZ Information Technology (IT) function and to define the roles and responsibilities of key actors.
ASSIGNMENT TASK
1) Process Diagram (not included in the word count)
This assignment requires you to read the given Incident Management Procedures. From these procedures you are required to complete the partially completed swim lane process diagram.
Recommendation: Use the swim lane template (.ppt) provided on the third assignment page. This template includes the partially completed swim lane diagram – it will save you recreating the model.
2) Description of Incident Management Roles and Responsibilities (750 words)
You are also required to derive a description of the roles and responsibilities for the following actors:
• End User,
• First Line Analyst,
• Incident Manager, and
• Incident Response team.
This represents the written part of the assignment. You have up to 1000 words to complete this part of the assignment.
3) Critical Reflection on the Formalization of Organisational Processes (750 words)
Based on your reading of the Incident Management process model (swim lane), answer the following question: What is the motivation for organizations to formalize business processes for managing IT incidents?
Incident Management Procedures
As an integral part of the Incident Management process, the following procedures document the steps that specify how to achieve each activity in the processes.
This section has been broken down to the following stages of the Incident Management process.
• Incident logging
• Investigate and diagnose
• Resolution and recovery
• Review & close
• Tracking & monitoring
The numbering of these procedures corresponds to that in the Process Diagram on the previous page.
Incident logging
IM.101a: Initiate Contact to First Line Analyst
This step is where the customer first initiates contact with the First Line Analyst (normally the IT Service Desk)
The IT customers are defined in the IT Service Catalogue and the following table details methods by which they can contact the Service Desk as well as other mechanisms for identifying incidents.
Interface | Technology / Tools
Phone | IVR (Interactive Voice Response); ACD (Automatic Call Distribution)
Walk Up | Some customers (particularly within IT) may be able to visit the Service Desk via walk up
Email | Microsoft Outlook
Web Portal | Self-Help via HEAT SRM
The Service Desk is the primary contact point for reporting all incidents and for logging service requests. The business hours of the Service Desk are 8:00AM-5:30PM, and after hours are 5:30PM-8:00AM, Monday-Friday.
Proceed to IM.102: Capture Initial Details.
IM.102: Capture Initial Details
The First Line Analyst should first check whether this is a new Incident, or a customer requesting an update or providing more information on an existing incident. The First Line Analyst may also look up the customer's Incident history to see if the reported incident has been previously logged (or is an ongoing, persistent issue).
If the customer is contacting the First Line Analyst for an existing incident:
• Ask the customer for the incident reference number and
o Perform a search for the incident and provide an update to the customer, or
o Update the incident record with the information provided by the customer
If this is a new incident, then the First Line Analyst should log the call and will need to capture the customer’s details and any relevant information about the issue the customer is currently experiencing (using the appropriate template).
The information regarding the issue will facilitate the interrogation of the Knowledge base for any related Knowledge articles. Information that needs to be gathered will include:
Preloaded Information (Updateable if required)
• Incident ID **
• Customer name *
• Job Title
• Customer ID / Student ID (if applicable) **
• Employee ID (if applicable) **
• Contact Details (Primary and Alternate/Preferred) **
• Location **
• Organisation/Branch/Department **
• Date & Time Received **
• VIP Flag (if applicable) ***
Collected Information
• Service(s) Affected *
• Asset ID or name of Configuration Item (s) affected (Application or hardware experiencing the fault)
• Description of the fault – What is the incident, service request or change? What are the symptoms? What possible causes could there be? *
• Any Error Message*
• When occurred/How often*
• What, if any, actions have been taken so far (what has the customer done?) *
• Customer’s Availability*
Note: * Denotes mandatory field
** Denotes automatically generated field (to be validated)
*** Denotes VIPs
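The mandatory-field rule above can be illustrated with a minimal sketch. The field names, the `InitialDetails` class and the `missing_mandatory` helper are hypothetical and are not part of the HEAT tool; they only mirror the items marked * in the lists above.

```python
from dataclasses import dataclass

# Hypothetical field names; they loosely mirror the fields marked * above.
MANDATORY_FIELDS = (
    "customer_name",
    "services_affected",
    "fault_description",
    "error_message",
    "when_occurred",
    "actions_taken",
    "customer_availability",
)


@dataclass
class InitialDetails:
    customer_name: str = ""
    services_affected: str = ""
    fault_description: str = ""
    error_message: str = ""
    when_occurred: str = ""
    actions_taken: str = ""
    customer_availability: str = ""
    job_title: str = ""  # not marked mandatory in the list above
    asset_id: str = ""   # not marked mandatory in the list above


def missing_mandatory(details: InitialDetails) -> list[str]:
    """Return the names of mandatory fields that are still blank."""
    return [
        name for name in MANDATORY_FIELDS
        if not getattr(details, name).strip()
    ]
```

A First Line Analyst script could call `missing_mandatory` before saving the record and prompt the customer for whatever is still blank.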
Proceed to IM.102a: Is this a Service Request?
IM.102a: Is this a Service Request?
This step is required to make sure that the call is an incident, and not a service request. This is an important step, however it needs to be understood that the customer will not know or care about this clarification.
If the call is about a Service Request, utilise the Service Request process. (A Service Request, or Request, is defined as a formal request from a customer for something to be provided; it differs from an Incident, which is something that is broken. See definitions.) If the call is about an incident, progress to IM.103: Capture Incident Details.
IM.103: Capture Incident Details
More detail can now be added into the incident record. Information that needs to be gathered will include:
• Brief Description* – which acts as a “title” for the incident. (E.g. “Faulty workstation at Desk No. XX”)
• Information* – more detailed information about the incident. This may include error codes, symptoms, actions the client has already taken, when it happened and how often it is happening. This may include:
o The date and time the incident was actually received
o A description of the incident
o The Serial Number or base number of Configuration Item (s) affected (if applicable) *
o What, if any, actions have been taken so far (what has the Customer or First Line Analyst done?) *
N.B. This information could be gathered via scripts to ensure the First Line Analyst obtains the correct information for support staff to resolve the incident.
• Service(s)* – the service(s) that the incident is affecting
• Status* – on logging a new incident, the Status will be automatically set to New. For a complete list of incident status values see Appendix 1.
Note: * Denotes mandatory field
IM.103a: Incident Categorisation
The appropriate categorisation of incidents is vitally important, as it will help determine the exact type of incident being logged, where the incident should be escalated to (if the First Line Analyst cannot resolve it), and it will also enable appropriate workflows to be utilised within the HEAT toolset.
This will also be important later when looking at incident types / frequencies to establish trends for use in problem management, supplier management and other ITM activities.
Multi-level categorisations are used within HEAT to identify the categories that can be associated with an incident.
The capability to track chosen categories as they change throughout the lifecycle of an incident will also prove useful when looking for potential improvements.
The current categorisation scheme in HEAT is broken into operational categories and product categories as per below:
• Initial Operational Category – which is the first attempt to determine the cause of the incident
• Initial Product Category – which is the significant product impacted by the incident.
Proceed to IM.103b: Incident Prioritisation.
IM.103b: Incident Prioritisation
Another important aspect of logging every incident is to agree and allocate an appropriate prioritisation, as this will determine how the incident is handled both by support tools and support staff.
Prioritisation is determined by taking into account both the urgency of the incident (how quickly the business needs a resolution) and the level of business impact it is causing. The prioritisation scheme is defined in Appendix 1 – See Prioritisation Scheme.
The priority of an incident will determine the response and resolution timeframe service levels and how long before it needs escalation to resolver groups. For details on this timing please see IM.6.3
It will also determine the timeframe for hierarchical escalation and is used to determine whether the incident is a Major Incident.
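As an illustration of how urgency and impact combine into a priority, the following is a minimal sketch of a lookup matrix. The three levels and the priority values 1 to 5 are placeholders chosen for the example; the actual scheme is the one defined in Appendix 1, and the `prioritise` function is hypothetical.

```python
# Placeholder matrix only: the real values are defined in Appendix 1.
# Keys are (impact, urgency); 1 is the highest priority.
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("medium", "high"): 2,
    ("high", "low"): 3,
    ("medium", "medium"): 3,
    ("low", "high"): 3,
    ("medium", "low"): 4,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def prioritise(impact: str, urgency: str) -> int:
    """Look up the priority for a given business impact and urgency."""
    key = (impact.strip().lower(), urgency.strip().lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"Unknown impact/urgency combination: {key}")
    return PRIORITY_MATRIX[key]
```

Keeping the scheme in a single table, rather than in nested conditions, makes it easy to replace the placeholder values with the agreed ones.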
IM.103c: Is this a Major Incident?
The First Line Analyst must now determine whether the incident is a major incident or not. See the Major Incident Process document for details on what constitutes a Major Incident.
If it is thought to be a major incident, it should be escalated to the Major Incident Manager, who will decide if the Incident should be managed under the Major Incident procedure or continue to follow the normal Incident lifecycle.
If the Major Incident Manager determines the incident to be a major incident, progress to the Major Incident Management procedure.
If it is not a major incident proceed to IM.104: Initial Diagnosis.
If it can be assigned, proceed to IM.105: Escalate Incident to group.
Investigate and diagnose
IM.104: Initial Diagnosis
The aim of this step is for the First Line Analyst to analyse the incident and identify possible ways of resolving the incident as quickly as possible.
The First Line Analyst will investigate the incident using the Incident Matching capability within HEAT and any information available to them. This could include but not be limited to:
• Knowledge articles specific to the incident symptoms
• Any Known Error records in the Knowledge Database
• Information from any third party vendors
During the investigation the First Line Analyst will determine the best course of action to resolve the Incident within the agreed time limit without help from other support groups.
If not directly resolved on the phone, the analyst should inform the customer of their intentions, give the customer the Incident record number and attempt to find a resolution.
Proceed to IM.104a: Can it be resolved at first level within defined timeframe?
IM.104a: Can it be resolved at first level within defined timeframe?
Timeframes have been set (by priority) for the time the Service Desk can work on incidents before escalating to 2nd or 3rd level support. These timeframes are documented in Incident Timeframes.
If there is a Knowledge Article that drives the lifecycle of the incident, then the Incident will follow the predefined workflow. Information regarding the activities of the incident will be documented in the Knowledge article or Template.
The First Line Analyst will use HEAT and other management tools and processes to investigate and attempt to resolve the incident
• Run through standard diagnostics to assess incident, or consult documentation for common tasks
• Use the knowledge base to determine if a workaround or a solution has been documented
• Check the Incident and Problem lists for possible related events
• Check for recent changes that may have been implemented that may be the cause of this incident. The following change repositories are available to the First Line Analyst:
o HEAT IT Service Management tool
o Release Notes and Emails
• Attempt to re-create event
• Search any FAQ’s and knowledge bases (internal or vendor) for a resolution. The following knowledge repositories may be available:
o Remote Management Tool
o HEAT Knowledge Management System
o Internal Wiki Pages
o External Web Pages
o Procedures Manuals
If a relevant Knowledge Article or Known Error exists with specific instructions or a solution for the management or resolution of the incident, or the First Line Analyst has the required technical skills, the incident should be assigned to the individual First Line Analyst and the status changed to ‘In Progress’. This ensures that the Target Response Time (TRT1) clock is stopped. Proceed to IM.111: Resolve and Update Record.
If the First Line Analyst cannot resolve the incident within the timeframes, it will need to be escalated to the appropriate resolver group. Proceed to IM.105: Escalate Incident to group.
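The decision in this step can be sketched as a simple time check. The limits below are invented placeholders, since the real values are documented in Incident Timeframes, and the `next_step` function is a hypothetical helper rather than HEAT behaviour.

```python
from datetime import datetime, timedelta

# Placeholder limits only: the real values live in Incident Timeframes.
FIRST_LEVEL_LIMITS = {
    1: timedelta(minutes=15),
    2: timedelta(minutes=30),
    3: timedelta(hours=2),
    4: timedelta(hours=4),
}


def next_step(priority: int, logged_at: datetime, now: datetime,
              can_resolve: bool) -> str:
    """Return the procedure step the First Line Analyst should go to."""
    elapsed = now - logged_at
    if can_resolve and elapsed < FIRST_LEVEL_LIMITS[priority]:
        return "IM.111"  # resolve at first level
    return "IM.105"      # escalate to the appropriate resolver group
```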
IM.105: Escalate Incident to Team
Each group will be defined within HEAT as a resolver group, and each group will have a queue where records can be placed for action by that group.
Incidents can be escalated to groups in two ways:
• Auto assignment based on Division, Location, Service or Categorisation, where the HEAT system sends the Incident record directly to the resolver group queue; or
• Manually through escalation from the First Line Analyst or other group, where the resolver group (such as Desktop Support) is to be selected inside the Incident record.
If the resolver group is a 3rd party, the incident will be escalated to the resolver group that is responsible for that 3rd party.
Sufficient information must exist within the incident record, including:
• Customer's contact details and availability
• Information relating to the symptoms of the incident
• Information gathered during the initial diagnosis
• Information around any activities undertaken
The First Line Analyst will determine the appropriate resolver group to escalate the incident to and assign the incident to the relevant resolver group queue by selecting the team in the Support Group field in the HEAT IT Service Management tool.
The incident should be saved (The Status will change to ‘Assigned’).
Proceed to IM.106: Monitor Queue/ Receive Incident.
IM.106: Monitor Queue/ Receive Incident
Support Teams have the responsibility of monitoring their group queue during defined business hours. A member of the team will be the nominated Queue Monitor or it may be managed through a roster based system.
All Incidents escalated to the team must be reviewed to determine whether or not they have been correctly forwarded. To determine this, the Queue Monitor checks that the following Acceptance Criteria are met:
Acceptance Criteria
• The incident is within scope of the Service Catalogue
• The incident has been escalated to the support group with the right skills to resolve the Incident
• The assigned priority and category are in accordance with the defined guidelines
• All required information is documented in the Incident record
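The four Acceptance Criteria can be expressed as a simple checklist. This is only a sketch: the `EscalatedIncident` class, its field names and the `rejection_reasons` helper are hypothetical and stand in for whatever the Queue Monitor records in practice.

```python
from dataclasses import dataclass


@dataclass
class EscalatedIncident:
    in_service_catalogue_scope: bool
    group_has_right_skills: bool
    priority_and_category_follow_guidelines: bool
    required_information_documented: bool


# Hypothetical wording for the reason recorded when a criterion fails.
CRITERIA = {
    "in_service_catalogue_scope": "Outside the scope of the Service Catalogue",
    "group_has_right_skills": "Support group lacks the skills to resolve it",
    "priority_and_category_follow_guidelines": "Priority or category not per guidelines",
    "required_information_documented": "Required information missing from the record",
}


def rejection_reasons(incident: EscalatedIncident) -> list[str]:
    """Return one reason per failed criterion; an empty list means accept."""
    return [
        reason for field_name, reason in CRITERIA.items()
        if not getattr(incident, field_name)
    ]
```

An empty result corresponds to accepting the Incident; any reasons returned are the kind of information that should accompany an Incident sent back to the Service Desk.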
If the Incident is not within the scope of the Resolver Group then it is sent back to the Service Desk.
If the Incident is escalated to a third party vendor, stewardship of the Incident stays with the resolver group, which remains accountable. The third party becomes responsible for executing the resolution of the Incident.
Proceed to IM.106a: Escalated to correct group?
IM.106a: Escalated to Right Team?
The Incident Response Team will determine if they can proceed with attempting the resolution or, whether they need to involve other teams or third party suppliers in the investigation and diagnosis of the Incident.
Alternatively the Incident Response Team might want to involve a higher management level if the normal level of management or authority cannot facilitate the resolution.
If the Incident Response Team is the right group to have assignment of the Incident the queue manager will assign it to the appropriate team member.
• Proceed to IM.107: Assign Incident to specialist
If the Incident Response Team needs to escalate to another management level, they keep stewardship of the Incident until a decision has been made to assign it to other Incident Response Teams.
If the Response Team is not the right group to have assignment of the Incident they will need to send it back to the Service Desk for reassignment.
• Document the reasons why the Incident does not meet the acceptance criteria for the Response Team
• Proceed to IM.106b: Request Incident Re-escalation.
IM.106b: Request Incident Re-escalation
If an incorrect escalation has been made, the queue manager will need to send it back to the Service Desk for re-escalation. It is important that appropriate information is provided to the Service Desk as to why the incident has been sent back to them. Proceed to IM.105: Escalate Incident to Team
IM.107: Assign Incident to specialist
Once it has been determined that the correct Response Team has been escalated to, the queue manager will assign the incident to a specialist based on:
• The specialist's technical capability
• Workload of the team specialists
• Priority of the incident
Progress to IM.107a: Assigned to correct specialist?
IM.107a: Assigned to correct specialist?
The specialist will now determine if they can proceed with attempting the resolution or, whether they need someone else.
If the incident has been assigned to the wrong specialist:
• Document the reasons why the Incident should not be assigned to them and assign it back to the resolver group for the Queue monitor to reassign.
• Proceed to IM.109 Request Incident Re-assignment.
If the incident has been assigned to an appropriate specialist:
• Change the status of the Incident to ‘In Progress’ and save the Incident. This ensures that the Target Response Time (TRT1) clock is stopped.
• Proceed to IM.108: Investigation and Diagnosis
IM.109: Request Incident Re-assignment
If an incorrect assignment has been made, the specialist will need to send it back to the queue manager for re-assignment. It is important that appropriate information is provided to the Queue Manager as to why the incident has been sent back to them. Return to IM.107: Assign Incident to specialist
IM.108: Investigation and Diagnosis
The Resolver Group during investigation will determine the best course of action to resolve the Incident. During the Investigation, the resolver group may need assistance from other teams in the investigation; these teams could also include 3rd Party vendors. In all cases, the Incident record will reside with the Resolver Group.
When the Resolver Group specialist commences work on the Incident they must:
1. Notify the customer that work has commenced by setting the status to “In Progress”
2. Perform diagnostics using management tools and processes to investigate the incident:
• Check for other instances in the incident or problem database
• Attempt to re-create the fault in a production support or test environment
• Utilise online or hardcopy documentation
• Utilise vendor knowledge bases
• Peer support
• Use other available diagnostic tools
3. The Resolver should convey any anticipated delays directly to the customer and update the ticket with the communications detail.
4. Log progress of the resolution within the record to enable the Service Desk to provide the customer with updates.
5. Service Desk can request further details from the Resolver. If unable to receive the needed information the Service Desk can escalate the call to Management.
6. When the service can be restored the Resolver should proceed to IM.108a: Is Root Cause Known?
IM.108a: Is Root Cause Known?
If the root cause of the incident is not known (even if resolution is possible), a ticket will need to be raised in Problem Management for identification of the error. The Resolver will need to update the Incident ticket, including linking the two tickets.
The Resolver should advise the customer that a Problem ticket has been raised and any delay likely to occur to the customer. Update the ticket with the communications detail.
After raising the problem record the Resolver should continue working on the Incident Record and proceed to IM.108b: Is Change Required?
If no Problem ticket is required, proceed directly to IM.108b: Is Change Required?
IM.108b: Is Change Required?
Some resolutions can be effected without making any changes (such as resetting a PC), however some resolutions may require a change to be made (such as resetting a server).
During resolution of the incident, the Resolver must determine whether or not a change is required as part of the resolution. The Resolver should update the Incident ticket with the change details, including linking the two tickets. See the Change Management process scope for what constitutes a change.
If a change is required to resolve the incident, raise an RFC and utilise the Change Management process to effect the resolution. Then proceed to IM.111: Resolve and Update Record.
The Resolver should advise the customer that a Change ticket has been raised and any delay likely to occur to the resolution. Update the ticket with the communications detail.
If resolution can be effected without the need for a change, proceed to IM.110: Did the Vendor resolve the Incident?
Resolution and Recovery
IM.111: Resolve and Update Record
The resolver is responsible for ensuring that all information relating to the resolution of the incident is recorded. If the incident is resolved they are required to:
1. Change the Status to ‘Resolved’
2. Enter a detailed Description into the incident record including actions taken to resolve
3. Confirm the impacted Service (and Configuration Item (CI) if available)
4. Resolution Categorisation - The identified cause of the incident (e.g. Customer Error).
5. Resolution - Detailed solution/workaround (If applicable link to the Known Error or Knowledge Item)
6. Validate – The HEAT tool will send an email to the customer to validate resolution of the incident, however direct contact with the customer may be required
7. Part of the resolution procedure is for the Resolver to determine if there are any:
i. New scripts that could expedite the lifecycle of similar incidents
ii. Actions or resolution details that should be captured for future use via knowledge
iii. Any issues that were experienced with current knowledge articles that could be improved on
8. Proceed to IM.112: Communicate with Customer(s)
IM.110: Did the Vendor resolve the Incident?
This activity is actioned by the Resolver of the incident if an external vendor resolved the record. The Resolver (assignee who coordinated the external vendor) is responsible for performing this procedure.
IM.111: Resolve and Update of Record
The resolver is responsible for ensuring that all information relating to the resolution of the incident is recorded, and should follow the requirements defined in IM.111: Resolve and Update Record.
The Resolver is then responsible for communicating with the Customer to ensure that the incident is resolved to the customer’s satisfaction. (This may be in the form of an automated email or the resolver group liaising directly with the customer.)
• Proceed to IM.112: Communicate with Customer(s).
Further action may be required by the customer to confirm resolution. This action could be any number of things, such as:
• Retrying the task they were attempting when the incident occurred to validate resolution
• Taking further action where the customer is part responsible for the service (such as the Library Management System application)
If no action is required, proceed directly to IM.112: Communicate with Customer(s).
If action is required, this needs to be documented in the record and tracked. Proceed to IM.111: Resolve and Update Record.
Review & Close
IM.112: Communicate with Customer(s)
The Resolver will send an email to the Customer upon change of Status to ‘Resolved’. The clock will be stopped. The email will contain details of the incident and the resolution action(s) taken, among other information.
Note: IT intends to automate communication with the customer and closure via an automated email notification from the HEAT IT Service Management tool on ‘Resolution’ of the incident. This email will inform the customer that the Incident will be closed within 2 weeks, unless they contact the Service Desk to indicate otherwise.
If the ticket has been re-opened or it has breached the SLA, the Resolver may need to contact the customer directly to validate satisfaction with the resolution.
The HEAT tool will now monitor the incident and if no correspondence is received to the contrary, the resolution can be considered successful and the incident can be closed.
However, if the customer contacts the Service Desk to the contrary or the Service Desk determine that the resolution was not successful through other means, then it must be sent back for further work.
Proceed to IM.113: Close Authorised?
IM.113: Close Authorised?
If the incident is not resolved to the Customer’s satisfaction the First Line Analyst must:
1. Change the status of the incident back to ‘Assigned’
2. Clarify and document any new information provided by the customer
3. Determine who resolved the incident
a. If the Service Desk resolved the incident, return to IM.104: Initial Diagnosis.
b. If the Service Desk did not resolve the incident, move back to IM.105: Escalate Incident to Team to assign the incident to the queue of the team which resolved it.
If this contact is unconnected to the existing incident and relates to a new incident:
1. Update the existing incident record to reflect that a new incident is required
2. Progress to IM.114: Close Incident to finalise the existing incident
3. And then move to IM.103: Capture Incident Details to log a new Incident
If the incident is resolved to the Customer’s satisfaction, the HEAT system will automatically close the incident after 2 weeks. Proceed to IM.114: Close Incident.
If the customer is still not satisfied, after a number of attempts to resolve the incident, the Service Desk staff may escalate the incident to the Service Desk Supervisor.
The Incident record should be checked for the following qualitative parameters:
• Closure categorisation: should initial classification be rectified?
• Incident documentation: sufficient level of detail
• Ongoing or recurring problem: Link Incident record to Problem record
• Quality of the information in the record (Can everyone understand what occurred, how, and how it was dealt with?)
• Identify any learning actions. By reviewing the resolutions for learning actions the Service Desk staff may be able to resolve the incident next time. Where learning actions are identified, knowledge items and workarounds are to be documented and included in the Knowledge Base.
Note: IT intends to select a 2% sample of incidents to check the quality as per above.
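One way to draw the 2% quality sample is a simple random selection from the closed incidents. The sketch below is an assumption about how this could be done: the `qa_sample` function is hypothetical, and rounding up with a minimum of one incident is a design choice not stated in the procedure.

```python
import random
from typing import Optional


def qa_sample(incident_ids: list[str], percent: int = 2,
              seed: Optional[int] = None) -> list[str]:
    """Pick a random sample of incidents for the quality check."""
    if not incident_ids:
        return []
    # Integer ceiling of len * percent / 100, with at least one incident.
    size = max(1, (len(incident_ids) * percent + 99) // 100)
    # A seeded generator makes the selection repeatable for audit purposes.
    return random.Random(seed).sample(incident_ids, size)
```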
IM.114: Close Incident
The automated workflow engine will automatically close the incident after 2 weeks.
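The two-week rule can be sketched as a date comparison. The function name and the `customer_objected` flag are hypothetical, and the sketch assumes the two weeks are counted from the moment the status changed to ‘Resolved’.

```python
from datetime import datetime, timedelta

AUTO_CLOSE_AFTER = timedelta(weeks=2)


def due_for_auto_close(resolved_at: datetime, now: datetime,
                       customer_objected: bool = False) -> bool:
    """True once two weeks have passed since resolution with no objection."""
    if customer_objected:
        return False  # the incident goes back for further work instead
    return now - resolved_at >= AUTO_CLOSE_AFTER
```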
IT intends to send a customer satisfaction survey to the customer when the incident or service request is closed. This will consist of a link to the survey system in every closure email. The survey is intended to measure the customer experience and determine whether customers are happy with the service provided.
When this step is completed the process has been completed. End Process.
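The status changes described across the procedures can be summarised as a small state machine. This is a sketch under stated assumptions: it uses only the statuses named in the procedures (New, Assigned, In Progress, Resolved) plus an assumed ‘Closed’ status, the complete list is in Appendix 1, and the `change_status` helper is hypothetical.

```python
# Transitions inferred from the procedures; 'Closed' is an assumed name.
ALLOWED_TRANSITIONS = {
    "New": {"In Progress", "Assigned"},       # first level works it, or escalates
    "Assigned": {"Assigned", "In Progress"},  # re-escalated, or specialist accepts
    "In Progress": {"Resolved", "Assigned"},  # resolved, or escalated onward
    "Resolved": {"Assigned", "Closed"},       # customer not satisfied, or closed
    "Closed": set(),                          # end of the process
}


def change_status(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Cannot move incident from {current!r} to {new!r}")
    return new
```

Writing the transitions down in this form is also a quick way to check a completed swim lane diagram: every arrow that changes the status should correspond to an entry in the table.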
Process Design
[Figure: Incident Management Process swim lane diagram, continued on Pages 2 to 4. Swim lanes: Other Processes, End User, First Line Analyst, Incident Manager, Incident Response Team.]