Ideas on Development of Disaster Recovery Procedures?

Subject: Ideas on Development of Disaster Recovery Procedures?
From: Geoff Hart <ghart -at- videotron -dot- ca>
To: TECHWR-L <techwr-l -at- lists -dot- techwr-l -dot- com>, tremblay -dot- lyse -at- tremblayprudly -dot- com
Date: Fri, 16 Jun 2006 09:33:30 -0400

Lyse Tremblay wondered: <<I am responsible for coordinating the development and creation of disaster recovery procedures for various technological areas for the business continuity project for which I am the Project Coordinator.>>

Here's an interesting exercise you should try: Ask each person involved in the project to tell you what they'd do if they came to work tomorrow and discovered a large, smoking hole in the ground where your workplace used to be. <g>

Other than making them grin, the real goal of this exercise is to determine how they would rebuild their part of your business from zero if they had to do so. It's a much more rigorous exercise than simply tallying up the software installed on their computers, and will also involve careful study of the work they actually do--which is often significantly different from what's written in their job description and from what they _think_ they do. A lot of daily tasks become so familiar we forget we're doing them.

This exercise will lead to obvious conclusions such as the need for offsite backup, but also to less-obvious things that only the frontline workers would think of. For example, a few years back I read about a laboratory that had developed a genetically unique line of mice that were used extensively in medical research. The lab was destroyed by a hurricane (if memory serves), and everyone was very happy about the offsite backup of their computers and of the procedures used to breed these mice--until one of the lab techs pointed out that they'd just lost 5+ years' worth of development work because nobody had thought to create an offsite "backup" of the mice.

<<Basically, I see these as being more actionable steps/procedures that can sometimes performed in parallel but most often performed sequentially by the various cross-functional responsible SME in each functional area.>>

There are also two aspects you must cover for each procedure: the preparation step, which comes first but isn't necessarily obvious, and the recovery step. For example, for computer data, the preparation step involves identifying every bit of data that must be backed up (including things like Windows registry data and Word's Normal.dot template, which nobody ever remembers to back up), then identifying a safe location for offsite backup of the data and setting up a procedure to ensure that those backups are made ***and verified***. (Verification is another step many people skip.) The recovery step involves clear instructions on how to actually recover the data, and it's arguably the easy part.
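
To make the "made and verified" point concrete, here is a minimal Python sketch of one way to check a backup: copy the files, then re-read every copy and compare checksums against the originals. The function names (back_up, verify_backup) and the directory layout are hypothetical, invented for illustration; a real procedure would also cover things a file copy can't capture (such as registry data) and would run the verification against the actual offsite location.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_files(source_dir: Path) -> List[Path]:
    """List every regular file under source_dir in a stable order."""
    return sorted(p for p in source_dir.rglob("*") if p.is_file())


def back_up(source_dir: Path, backup_dir: Path) -> None:
    """Copy every file under source_dir into backup_dir.

    Assumes backup_dir is not located inside source_dir.
    """
    for source in source_files(source_dir):
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)


def verify_backup(source_dir: Path, backup_dir: Path) -> List[str]:
    """Return the relative paths of files that are missing from the
    backup or whose contents differ from the original."""
    problems = []
    for source in source_files(source_dir):
        relative = source.relative_to(source_dir)
        target = backup_dir / relative
        if not target.is_file() or sha256_of(source) != sha256_of(target):
            problems.append(relative.as_posix())
    return problems
```

An empty list from verify_backup means every file was found and matched; anything else is a list of files to investigate before you rely on that backup.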

Note that procedures must not exist only on paper; they must "live". This means you need to find a way to make them part of each person's actual work so that they become as natural and predictable as the morning cup of coffee. Also note that these procedures must be tested to destruction. For non-critical instructions, it matters less if somebody screws up, because you can play around until you fix the problem. In a disaster, you don't have that luxury.

Also consider splurging on a really good editor. Disaster recovery is done by a bunch of people stressed out to the max, often under ridiculously tight deadlines. Instructions that are a bit turgid but still comprehensible under ordinary circumstances can pose impenetrable barriers in a crisis.

<<For example: database server fails and not only does the server needs to be recovered but the databases and data that resided on that server would also need to be recovered and synchronized.>>

This kind of documentation problem requires the SME to carefully list all the likely ways something could fail, and some of the unlikely ones too. Each failure mode may require a different solution.

In addition to figuring out how to recover, it pays to invest some time figuring out how to prevent the disaster in the first place. For example, I recall reading that industrial-strength databases (e.g., banking data) use transactions with commit and rollback, which basically means the original data (or a log of the intended change) is written somewhere safe, and the update does not proceed until the software confirms that this record was written successfully; only then does the software make the change, and if the change was not successful, it "rolls back" to the original situation using that record and notifies a human that there's a problem. (A two-phase commit extends the same idea to changes that span several systems: nothing is committed until every system confirms that it is ready.) But I've seen many small-scale databases programmed without this level of protection.
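
The commit-and-rollback behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the accounts table, its columns, and the transfer function are hypothetical, and the sketch shows an all-or-nothing transaction within a single database rather than a distributed two-phase commit across several systems.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, from_id: int, to_id: int, amount: int) -> bool:
    """Move `amount` between two rows of a hypothetical `accounts` table
    as one all-or-nothing transaction. Returns True if committed."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (from_id,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
    except ValueError as error:
        # Both updates have been undone; tell a human there's a problem.
        print(f"Transfer rolled back: {error}")
        return False
    return True
```

If the second update or the balance check fails, neither change is kept, so the data never ends up half-updated.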

<<My goal is to ensure standardized formatting and content of these procedures. I am researching through this group what would be the best way to write and present these in a standardized way?>>

The "best" way is one that your actual staff can use effectively in a crisis. Thus, spend some time with them to figure out how they would typically proceed, create something to support that use, then test it to ensure that it works. Where it doesn't work, find out why and revise accordingly.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --
Geoff Hart ghart -at- videotron -dot- ca
(try geoffhart -at- mac -dot- com if you don't get a reply)
www.geoff-hart.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
