Updated May 30th, 2022
This guide will help you understand incident management and how to deal with unexpected events to minimize potential losses.
One thing to remember: the problem is not that incidents occur, but how effectively you can handle and resolve them without significantly affecting your business.
What you will learn:
- Why you need incident management
- How to categorize incidents
- Incident management life cycle and its best practices
Let’s dive into understanding how incident management can help you deal with unexpected events and keep your business safe.
Table of Contents
- What is an incident?
- What is incident management?
- The need for incident management
- Incident management life cycle
- Best practices for incident management
What is an Incident?
In the information technology space, the ITIL (Information Technology Infrastructure Library) defines an incident as any unplanned event that could interrupt or reduce the quality of an IT service.
This includes events that may not disrupt a service completely but impact its quality, e.g. slow internet speed or viruses consuming processing power.
What is Incident Management?
Incident management is the process of identifying, managing, and analyzing such incidents to restore service operations to normal with minimum impact on the business.
This is much like the processes and tools you use in your own life to guard against possible misfortune, such as:
- Making sure your phone is plugged in, or keeping a separate alarm clock, so you are not late for work, or
- Installing a smoke detector in your apartment to catch a fire early and reduce the potential damage.
For the IT services in your business, this would include implementing firewalls and detection systems to protect and monitor your systems.
The Need for Incident Management
Incidents can disrupt your business operations, cause downtime, and even contribute to the loss of data and production.
Here are two examples:
- In 2010, the Stuxnet worm was found to have destroyed multiple centrifuges at Iran’s Natanz uranium enrichment facility. It was not a remote attack; it spread through infected USB drives. A simple case of unauthorized access led to a major political and national crisis with losses in the millions.
- A more recent example is the exploitation of the Print Spooler service in Windows systems, dubbed PrintNightmare. A combination of remote code execution and privilege escalation enabled an attacker to take control of the system.
Here is the deal…
Being part of the Incident Management team does not mean only acting when there is a fire to put out, but creating and refining preventative processes to reduce the chances of an incident.
From a printer not working to a service being completely down, incidents do not all carry the same impact. Each one needs to be categorized so that it can be resolved efficiently.
This is done by keeping multiple variables in mind:
- Impact: The effect of an incident on your business services or processes
- Priority: The importance of an incident. You can usually define it as Low, Medium, or High.
- Time period: The agreed response time and resolution time for the event. This is usually incorporated in the SLAs and defined for each phase of Incident Management.
- Urgency: How quickly the incident will start to affect your business significantly.
Usually, an ‘Impact’ and ‘Urgency’ matrix can help you assign a final level to an incident. A high-impact incident may have low urgency and vice versa, so the level for each combination needs to be defined by your organization.
An incident with high impact and high urgency is known as a Major Incident.
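As a rough illustration of such a matrix, the Python sketch below maps an impact level and an urgency level to a final priority. The three-level scale, the individual cell values, and the names `Level`, `PRIORITY_MATRIX`, `assign_priority`, and `is_major_incident` are all assumptions made for this example. Your organization would define its own matrix.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical matrix keyed by (impact, urgency).
# The cell values are illustrative only. In this sketch urgency is weighted
# slightly more heavily than impact, which is one possible choice.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): "Critical",
    (Level.HIGH, Level.MEDIUM): "High",
    (Level.HIGH, Level.LOW): "Medium",
    (Level.MEDIUM, Level.HIGH): "High",
    (Level.MEDIUM, Level.MEDIUM): "Medium",
    (Level.MEDIUM, Level.LOW): "Low",
    (Level.LOW, Level.HIGH): "High",
    (Level.LOW, Level.MEDIUM): "Low",
    (Level.LOW, Level.LOW): "Low",
}


def assign_priority(impact: Level, urgency: Level) -> str:
    """Look up the final priority for an impact/urgency pair."""
    return PRIORITY_MATRIX[(impact, urgency)]


def is_major_incident(impact: Level, urgency: Level) -> bool:
    """A Major Incident has both high impact and high urgency."""
    return impact is Level.HIGH and urgency is Level.HIGH
```

Keeping the matrix as a plain lookup table, rather than a formula, makes it easy to review and change each cell when your organization revisits its definitions.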
Incident Management Life Cycle
Many standards, such as ITIL, the NIST Computer Security Incident Handling Guide, and PCI DSS, define Incident Management processes, but broadly you can group their phases into three main stages:
- Pre-Incident is mostly administrative and focuses on detecting and identifying an incident.
- Incident Response actually mitigates and resolves the incident that has occurred.
- Post-Incident wraps up the process and usually focuses on generating detailed reports and lessons learned.
Let’s have a closer look at the various stages of an incident.
1. Pre-Incident
Identification & Logging
- Identification: This stage identifies that an incident has occurred. It is usually carried out with monitoring and detection systems in place. However, these systems do not guarantee that every incident will be detected in advance.
- Logging: After identifying an incident, you need to keep track of it throughout its lifetime until the incident is resolved. You can usually generate a ticket against the incident with information like the date and time and its impact.
Logging and documenting help keep track of previous incidents, which you can view later for various purposes like auditing, trend analysis, or forensics.
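To make the idea of a ticket concrete, here is a minimal sketch of an incident record that captures the date and time, the impact, and a running history of notes. The class name `IncidentTicket`, its fields, and the in-memory ID counter are hypothetical. A real ticketing system would store this in a database and add many more fields.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count

# Simple in-memory ID generator, used only for this sketch.
_ticket_ids = count(1)


def _now() -> datetime:
    """Current time in UTC, so timestamps are comparable across sites."""
    return datetime.now(timezone.utc)


@dataclass
class IncidentTicket:
    summary: str
    impact: str
    opened_at: datetime = field(default_factory=_now)
    ticket_id: int = field(default_factory=lambda: next(_ticket_ids))
    status: str = "open"
    history: list = field(default_factory=list)

    def log(self, note: str) -> None:
        """Append a timestamped note so the ticket keeps an audit trail."""
        self.history.append((_now(), note))

    def resolve(self, note: str) -> None:
        """Record how the incident was fixed and mark the ticket resolved."""
        self.log(note)
        self.status = "resolved"


# Example usage with made-up details.
ticket = IncidentTicket(summary="Office printer offline", impact="Low")
ticket.log("Reported by the front desk")
ticket.resolve("Replaced the network cable")
```

Because every note is stored with its own timestamp, the same record can later support auditing, trend analysis, or forensics.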
Classification & Prioritization
- Classification: This step is essential in resolving the issue and is usually done according to the requirements of your organization. For example, an incident can be categorized as hardware or software and further sub-categorized into printers, servers, etc.
Simplicity is key here; if you create too many categories and subcategories, it can quickly become unmanageable.
- Prioritization: This step assigns a level to the incident based on both its impact on your business and its urgency. An incident with low impact and high urgency has higher priority than an incident with high impact and low urgency.
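The classification step can be sketched as a small, two-level taxonomy with a keyword lookup. The categories, subcategories, and keywords below are invented for illustration, and simple substring matching is only a stand-in for whatever rules your own tooling uses.

```python
from typing import Tuple

# Hypothetical two-level taxonomy: category -> subcategory -> keywords.
# It is kept deliberately small, since too many categories become unmanageable.
TAXONOMY = {
    "hardware": {
        "printers": ["printer", "toner", "paper jam"],
        "servers": ["server", "rack", "disk failure"],
    },
    "software": {
        "email": ["email", "mailbox", "outlook"],
        "operating system": ["windows update", "blue screen", "kernel"],
    },
}


def classify(description: str) -> Tuple[str, str]:
    """Return (category, subcategory) for the first keyword found in the text.

    Falls back to ("uncategorized", "other") so a person can triage manually.
    """
    text = description.lower()
    for category, subcategories in TAXONOMY.items():
        for subcategory, keywords in subcategories.items():
            if any(keyword in text for keyword in keywords):
                return category, subcategory
    return "uncategorized", "other"
```

The explicit fallback matters: an incident that matches nothing should still be logged and routed to a human rather than silently dropped.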
2. Incident Response
Investigation & Diagnosis
First, you need to work out who needs to be involved in resolving the incident and perform an initial diagnosis to understand the problem. Can the IT team resolve the incident? Does executive management need to get involved?
Resolution & Recovery
Easier said than done, but this step is as simple as finding a solution to the incident and ensuring that your business services and operations resume as soon as possible.
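Whether services resumed soon enough is usually judged against the response and resolution times agreed in your SLAs. The sketch below computes those deadlines for a ticket and checks whether a resolution was late. The target durations and the names `SLA_TARGETS`, `sla_deadlines`, and `resolution_breached` are assumptions for this example, not values from any standard.

```python
from datetime import datetime, timedelta

# Hypothetical SLA targets per priority: (response time, resolution time).
SLA_TARGETS = {
    "High": (timedelta(minutes=15), timedelta(hours=4)),
    "Medium": (timedelta(hours=1), timedelta(hours=24)),
    "Low": (timedelta(hours=4), timedelta(days=3)),
}


def sla_deadlines(opened_at: datetime, priority: str):
    """Return (respond_by, resolve_by) for a ticket opened at opened_at."""
    response, resolution = SLA_TARGETS[priority]
    return opened_at + response, opened_at + resolution


def resolution_breached(opened_at: datetime, resolved_at: datetime, priority: str) -> bool:
    """True if the incident was resolved after its resolution deadline."""
    _, resolve_by = sla_deadlines(opened_at, priority)
    return resolved_at > resolve_by
```

For instance, with these made-up targets, a High priority ticket opened at 09:00 would need a first response by 09:15 and a resolution by 13:00 the same day.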
3. Post-Incident
After the incident has been successfully resolved, you can close the ticket. Next, you can generate reports to check whether it is a recurring incident. Finally, you can schedule meetings with the relevant members of your organization to review the lessons learned.
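One simple form such a report can take is a count of how often the same root cause appears across closed tickets. The records and field names below are made up for illustration. In practice the data would come from your ticketing system.

```python
from collections import Counter

# Hypothetical closed-ticket records, standing in for a ticketing system export.
closed_incidents = [
    {"id": 101, "category": "hardware/printers", "root_cause": "driver crash"},
    {"id": 102, "category": "software/email", "root_cause": "expired certificate"},
    {"id": 103, "category": "hardware/printers", "root_cause": "driver crash"},
    {"id": 104, "category": "hardware/servers", "root_cause": "disk full"},
    {"id": 105, "category": "software/email", "root_cause": "expired certificate"},
    {"id": 106, "category": "hardware/printers", "root_cause": "driver crash"},
]


def recurring_root_causes(incidents, min_count=2):
    """Return (category, root_cause, count) for causes seen at least min_count times.

    Results are ordered from most to least frequent.
    """
    counts = Counter((i["category"], i["root_cause"]) for i in incidents)
    return [
        (category, cause, n)
        for (category, cause), n in counts.most_common()
        if n >= min_count
    ]


report = recurring_root_causes(closed_incidents)
```

With the sample data above, the report lists the printer driver crash (three occurrences) and the expired email certificate (two occurrences), which are the candidates for a deeper review.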
Best Practices for Incident Management
- Define Incident Management procedures, policies, and protocols for communication during an incident. Also, define guidelines for detecting, assessing, documenting, reporting, and responding to an incident.
- Develop an Incident Response Checklist that can help guide an employee or customer in identifying an incident.
- Establish an Incident Response team with skilled members. You have to define roles and responsibilities for each member. The team should have representation from other departments as well.
- Create a process, in cooperation with the legal team, to inform involved or impacted parties.
- You can automate the classification and ongoing status of incidents to reduce the chances of errors and save time. Besides being efficient, this also helps you keep track of multiple active incidents.
- You should develop a training program to test your Incident Management plan and practice security procedures. You should also create an awareness campaign for your employees.
- An analysis of past incidents can help you identify any recurring events and narrow down any vulnerable areas of your organization. You could also establish a forensics capability (or engage third-party services) for the analysis and investigation of incidents.
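As one way to picture the automated status tracking mentioned above, the sketch below only allows an incident to move between statuses in a defined order and lists the incidents that are still active. The status names loosely follow the life cycle stages in this guide. They, along with `ALLOWED_TRANSITIONS`, `advance`, and `active_incidents`, are assumptions for this example.

```python
# Hypothetical status workflow; not taken from any standard.
ALLOWED_TRANSITIONS = {
    "identified": {"logged"},
    "logged": {"classified"},
    "classified": {"prioritized"},
    "prioritized": {"investigating"},
    "investigating": {"resolved"},
    "resolved": {"closed", "investigating"},  # reopen if the fix did not hold
    "closed": set(),
}


class InvalidTransition(ValueError):
    """Raised when a status change is not allowed by the workflow."""


def advance(current: str, new: str) -> str:
    """Return the new status if the move is allowed, otherwise raise."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"cannot move from {current!r} to {new!r}")
    return new


def active_incidents(statuses: dict) -> list:
    """List the IDs of incidents that are not yet closed, in sorted order."""
    return sorted(i for i, status in statuses.items() if status != "closed")
```

Rejecting out-of-order changes in code is what reduces manual errors: a ticket cannot be closed, for example, before it has been investigated and resolved.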
Want better incident management?
StandardFusion is end-to-end GRC software that you can use to develop an Incident Management plan centred on your organization’s information security and compliance requirements.
Contact our team and set up a demo to see how you can develop your own incident management plan for any scenario.
An Incident Management Plan ensures customer satisfaction through quick and efficient response, analysis, and logging of an incident. This makes it an essential tool for any service-based organization.
Do you have any other questions? Contact our team and we’ll be happy to help you.