Disaster recovery refers to the IT components of a business that must be safeguarded so that the business can continue operating when a disaster strikes. It is more a preventive plan, set in motion as the business is organized and its systems are implemented, than a series of actions followed once the disaster hits the company. Because most companies rely heavily on their IT systems, and because the collapse of those systems has ramifications beyond the company itself, disaster recovery has become a significant part of planning for today's organizations.
Disasters can be classified into two categories:
Natural disasters -- for example floods, hurricanes, or earthquakes, where mitigation measures taken ahead of time can help avoid or reduce data loss and IT downtime.
Man-made disasters -- such as terrorism, where surveillance and avoidance planning can likewise mitigate and reduce the foreseeable consequences.
Most large companies spend between 2% and 4% of their budget on disaster recovery planning, in the hope of avoiding the large losses they would suffer if their business ceased to function, even for a short period of time, because IT operations were down and data could not be accessed.
The basic principles of planning for recovering from a disaster
Disaster recovery planning (DRP) is a subset of business continuity planning (BCP), and the two are integrated when forming a plan for recovering from disaster. Essentially, three kinds of measures are involved. These are:
1. Preventative measures -- strategies that are planned and worked through to prevent an event from occurring
2. Detective measures -- controls that are put in place to detect potential disasters ahead of time
3. Corrective measures -- controls that are planned ahead of time to correct and restore the system (business continuity) after the disaster has occurred.
The entire plan is built around two metrics: the recovery point objective (RPO), which is the maximum amount of data loss, measured in time, that the business can tolerate, and the recovery time objective (RTO), which is the maximum time within which systems should be restored. Both must be matched to the IT systems and infrastructure themselves, and each part of the infrastructure is then given the specific plan most suitable for its recovery. Both RPO and RTO need to be aligned to a suitable budget and planned carefully within it, particularly since some solutions are more expensive than others. Some businesses may nevertheless have to resort to the more expensive solutions, depending on how heavily they rely on their IT systems.
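As a rough illustration of how RPO and RTO constrain a plan, the sketch below checks a backup schedule against a pair of targets. It assumes that worst-case data loss equals the interval between backups; the class name, function name, and all figures are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable time to restore service


def meets_objectives(backup_interval: timedelta,
                     estimated_restore_time: timedelta,
                     objectives: RecoveryObjectives) -> bool:
    """Return True if a backup schedule satisfies both RPO and RTO.

    Assumption: a failure just before the next backup loses everything
    written since the last one, so worst-case loss is the backup interval.
    """
    return (backup_interval <= objectives.rpo
            and estimated_restore_time <= objectives.rto)


# Hypothetical targets: lose at most 4 hours of data, be back within 8 hours.
targets = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=8))

# A nightly backup misses the RPO; an hourly one meets both targets.
nightly_ok = meets_objectives(timedelta(hours=24), timedelta(hours=6), targets)
hourly_ok = meets_objectives(timedelta(hours=1), timedelta(hours=6), targets)
print(nightly_ok, hourly_ok)
```

In practice the restore time would come from an actual recovery test rather than an estimate.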
The most common strategies for data protection include the following:
High availability systems, where data and systems are replicated at an off-site location that provides continuous access to both
Replication of data to an off-site system
Backups made to disk both on-site and off-site, or backups made directly to an off-site disk
Backups made to tape and sent at regular intervals to off-site locations
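The trade-off between these strategies and the budget can be sketched as a simple selection: pick the cheapest strategy that still meets the required RPO and RTO. Every number below is an illustrative placeholder, not a measured or quoted figure, and the `Strategy` class and `cheapest_fit` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rpo_hours: float     # worst-case data loss (placeholder value)
    rto_hours: float     # expected time to restore (placeholder value)
    monthly_cost: float  # placeholder value


# All figures are made up for illustration only.
STRATEGIES: List[Strategy] = [
    Strategy("high availability", 0.0, 0.1, 10000.0),
    Strategy("off-site replication", 0.25, 4.0, 4000.0),
    Strategy("disk backup", 24.0, 12.0, 1500.0),
    Strategy("tape backup", 24.0, 72.0, 500.0),
]


def cheapest_fit(strategies: List[Strategy], max_rpo_hours: float,
                 max_rto_hours: float, budget: float) -> Optional[Strategy]:
    """Return the cheapest strategy meeting RPO, RTO and budget, or None."""
    candidates = [
        s for s in strategies
        if s.rpo_hours <= max_rpo_hours
        and s.rto_hours <= max_rto_hours
        and s.monthly_cost <= budget
    ]
    return min(candidates, key=lambda s: s.monthly_cost, default=None)


choice = cheapest_fit(STRATEGIES, max_rpo_hours=1, max_rto_hours=8, budget=5000)
print(choice.name if choice else "no strategy fits; revisit targets or budget")
```

A result of `None` signals the situation described above, where the targets and the budget cannot both be satisfied and one of them has to give.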
Precautionary measures that organizations implement include the following:
Anti-virus software and other security systems
Backup generator and/or uninterruptible power supply to keep the system going in the case of a power failure
Local mirrors of systems and use of disk protection technology
Why disaster recovery is important
With some companies serving customers worldwide and operating in extremely competitive markets, it is crucial for these organizations to respond quickly. They not only have to keep their customers satisfied but also have to deliver timely, quality service. Customer data and the issues related to it must be easily and rapidly accessible, and for some businesses IT downtime can set off a chain of consequences in which other clients are affected in turn. Downtime would then have to be explained not just to one client but to a rapidly growing number of others. For these reasons, the organization has to do its utmost to prevent disaster or, if disaster occurs, to deal with it as speedily and efficiently as possible.
Another factor that demands continuous processing is the nature of the work itself. Some companies have clients who rely on them to resolve computer-related problems or to supply computer-based data, and the organization has to serve these clients on a regular basis. If the IT base is unavailable, the organization may lose one or more valuable customers, or even a chain of them.
A break in this work may set off a domino effect, in which other companies that also suffer losses trigger further damaging events, causing significant loss and reputational harm to the organization.
Since the organization's well-being and reputation rest on technology, it is important that the technological apparatus it is accustomed to using be set up ahead of time in case of disruption.
The processes and procedures for planning and implementing disaster recovery
There are various options. The first is relocating to an alternate vendor facility. Here the organization signs a contract with a vendor and pays a monthly fee of, for instance, $25-$35 per 'seat'. The vendor guarantees that the location will be available to the company during a disaster and provides a trailer at this alternative site with the required IT equipment, such as laser printers, PCs, phones, fax machines, copiers, switchboards, and LAN servers, as well as access to conference rooms.
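The per-seat pricing lends itself to a quick calculation. The sketch below uses the $25-$35 monthly range mentioned above; the seat count of 200 is a hypothetical figure, and the function name is introduced only for this example.

```python
def annual_seat_cost(seats: int, monthly_rate: float) -> float:
    """Yearly cost of reserving `seats` at `monthly_rate` dollars per seat."""
    return seats * monthly_rate * 12


seats = 200  # hypothetical number of employees needing a recovery seat
low = annual_seat_cost(seats, 25.0)
high = annual_seat_cost(seats, 35.0)
print(f"Annual cost for {seats} seats: ${low:,.0f} to ${high:,.0f}")
```

A figure like this can then be weighed against the cost of building and maintaining an internal alternate facility.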
The second option is relocating to an internal alternate facility at the company itself. The advantages are that the organization does not have to pay a monthly fee, the space is already available, and the facility can be prepared before a disaster so that all equipment is ready for use. The disadvantage, of course, is that a disaster such as a fire or earthquake may affect the whole business, including this internal site.
The third option is finding an external alternate site at the time of the disaster. Whilst this may be cheaper, whether it is workable depends on how heavily the organization relies on IT. For a business that depends on IT (for example, one that is in constant contact with clients through its IT systems or has to keep up minute by minute with Internet news), such a scheme may be ruinous, since precious time would be consumed in locating a vacant facility and in purchasing and setting up the needed equipment. Some of the equipment may have long lead times, and the organization must also recover its systems with software that is (hopefully) backed up and stored off-site.
The fourth option is having employees work from home. This is the simplest and least costly option for the organization. However, employees need more than laptops to function effectively; they need a complex set of technological devices. Furthermore, a competitive and complex company requires synchronous work and contact. For this last option to succeed, employees' security certificates must be current, and their Internet connections must be fast and reliable.
The roles individuals and management play in disaster recovery planning, testing, and implementation
The disaster recovery process begins with a plan, drawn up in advance, that defines rules and processes to keep the operation running smoothly. This means coordinating a planning group, performing risk assessments and audits, devising recovery strategies, and developing verification criteria.
Stakeholders of the company as well as key people of each division should be members of the team.
The role of these individuals is to identify possible risks and to make plans that mitigate both the risks and the organization's exposure to them. Further plans are made for restoring the operability of systems in the event that they have collapsed.
The IT disaster recovery planning process consists of seven steps, in which the team does the following:
1. Develops the business contingency planning policy and business process priorities
2. Conducts a risk assessment
3. Conducts the business impact analysis
4. Develops business continuity and recovery strategies
5. Develops business continuity plans
6. Conducts awareness, testing and training of the disaster recovery plan (DRP)
7. Conducts DRP maintenance and implementation.
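A planning team might track its progress through the steps above with something as simple as an ordered checklist. The following is a minimal sketch; the `DRP_STEPS` tuple and `next_step` function are hypothetical names, and the step wording is paraphrased from the list.

```python
from typing import Optional, Set

# The planning steps, in order, paraphrased from the list above.
DRP_STEPS = (
    "Develop the business contingency planning policy and business process priorities",
    "Conduct a risk assessment",
    "Conduct the business impact analysis",
    "Develop business continuity and recovery strategies",
    "Develop business continuity plans",
    "Conduct awareness, testing and training of the DRP",
    "Conduct DRP maintenance and implementation",
)


def next_step(completed: Set[int]) -> Optional[str]:
    """Return the first step (1-based) not yet completed, or None when done."""
    for number, description in enumerate(DRP_STEPS, start=1):
        if number not in completed:
            return f"{number}. {description}"
    return None


print(next_step({1, 2}))
```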
The importance of testing and ongoing maintenance
It is critical that the DRP be tested continuously and maintained on an ongoing basis, because organizations undergo continuous change and disasters are often unpredictable and unexpected. The organization does not want to be caught off guard, with the risk of losing valuable data and having to fold its business. Plans must therefore be kept up to date, and change management processes must be reviewed constantly so that the recovery plan is maintained. Sometimes plans may have to be modified or changed altogether…
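One simple way to keep testing from lapsing is to flag any plan whose last test is older than an agreed interval. The sketch below assumes a 180-day interval, which is an arbitrary choice for illustration rather than a recommendation from the text; the function name and dates are likewise hypothetical.

```python
from datetime import date, timedelta


def review_overdue(last_tested: date, today: date,
                   max_age: timedelta = timedelta(days=180)) -> bool:
    """Return True if the plan was last tested more than `max_age` ago."""
    return today - last_tested > max_age


# Hypothetical dates: a plan last tested on 1 January, checked on 1 December.
overdue = review_overdue(date(2024, 1, 1), date(2024, 12, 1))
print("Retest the DRP" if overdue else "DRP test is current")
```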