Archive for Disaster Recovery

The Job of the Business Continuity Planner

Many professionals that I talk to seem to think that the Business Continuity Planner’s job is to ensure their company can recover from business interruption events.  Now, this may just be an argument in semantics or me simply splitting hairs, but I don’t quite see it that way.

In my way of thinking, the Business Continuity Planner’s job is to make sure that management is informed of risks, potential impacts resulting from those risks and the costs/benefits of options available to mitigate or respond to those risks, so that management can make informed and intelligent decisions about what mitigation and recovery strategies to invest in.  And, when those decisions are made, the Business Continuity Planner is responsible for helping manage and coordinate the implementation and testing of those solutions.  But, it is senior management’s job to ensure that the company can recover from business interruption events.

In my mind, the worst thing that can happen to a Business Continuity Planner is not that the company cannot recover from an incident, but that senior management is justified in saying, “But no one told me that this risk existed and these implications could occur”.  If the Business Continuity Planner can show that the risks were identified, the impacts were made clear and viable solutions were presented that management chose not to invest in, then the Business Continuity Planner has done his/her job.

We cannot force management to invest in business continuity or disaster recovery solutions, but we can let them know, with no uncertainty, what is potentially at risk should they not invest in, or under-invest in, business continuity and disaster recovery solutions.  Our jobs are to ensure that there are no surprises about what might occur and what the impacts might be should a business interruption event occur.

Prior to management making decisions to invest in solutions, the Business Continuity Planner’s job is to gather information, research risks and solutions, perform cost/benefits analysis and communicate our findings to the proper decision makers.  We are often research analysts and salespeople.  And, it is a difficult sale to make – asking management to invest capital from a limited available cache in our programs as opposed to other programs being pitched by other department managers.

The risks we must inform management about go beyond the risk of disasters; they also include the risk of being out of compliance with laws, contracts and industry standards.  And, we must be brutally honest about our abilities to respond and recover.  We do this by realistically conducting exercises and tests and reporting back the findings without a bias towards success.

Our jobs are to set expectations consistent with the risk environment and solutions in place today.  It is senior management’s job to decide what risks are acceptable and how much to invest in improving our solutions.  If they do not have all of the right information to make that decision, it is then that we have failed in our jobs.

Having Plans Even If You Don’t Plan to Recover

I once had my lead sales and marketing guy pull in a favor and get me a meeting with the president of a small specialty food processing company to discuss business continuity planning and the potential of us helping in the development of a program for this firm.

As soon as I walked into the conference room, this gentleman announced, “Joe, I really don’t know what there is for us to discuss, the fact of the matter is, we have this one location with a lot of expensive and unique equipment.  If a disaster takes us down, we simply go out of business.  There is no way, short of building a whole new factory for us to get up and running again.  And, quite frankly, that would just be too expensive and not practical.”

Now, of course, I talked to him about the value of having data backup and recovery plans for all of his computer resident data and infrastructure, but, he felt he had all of that in place and was confident with his IT recovery solutions.

So, instead of trying to convince him that he should have some sort of business continuity plans, I told him that even with the strategy of “shutting down and going out of business”, you want to make sure that you do that right – and, that that strategy also requires pre-planning, pre-provisioning, and exercising.

For example, there are things you need to do to go out of business properly:

  • You may still have accounts receivable to be collected.
  • You probably have accounts payable that need to be met.
  • You probably still owe your employees their last paychecks.
  • You have bank accounts and other financial matters that must be closed.
  • You might have salvaged equipment to be sold.
  • You might have legal obligations that need to be addressed.
  • For customers with unfulfilled orders, you might want to help them find another company that could help them.
  • And more.

You don’t just simply stop functioning as a business; there are things that must be done to dissolve the entity.  And, these things will require some people to be active and some tasks to be performed.

Your plan should include strategies for:

  • Getting your trusted advisors together;
  • Communicating with employees, suppliers and customers;
  • Addressing financial and legal matters;
  • And others.

I think we were both surprised that by the end of our meeting, we were shaking hands on a project to team up and document his business continuity – or, should I say – business cessation plans – which we now know as his Crisis Management Plan.

The moral of the story is, even if your strategy is not to invest in recovery solutions, which, in some cases might be the most prudent strategy, your firm still needs a Crisis Management Plan to see that strategy properly employed.

At the end of the day, we had another satisfied client.

Business Continuity Planning in 140 Characters or Less

As I have mentioned in some recent blogs, I am now immersed in the world of Twitter.  The challenge of tweeting is trying to get a message across in 140 characters or less.  This is especially difficult when much of your audience does not know your jargon and you need to spell out many of the words to make a coherent point.

At first, I tried to find famous quotes from others about planning or disasters or emergencies and response.  I found a few, many of which I had posted earlier in this article on “planning”.  But, after a while, I had to challenge myself to come up with some business continuity, emergency response, crisis management, and disaster recovery related tweets of my own.

In this blog, I am simply going to share those tweets that I have come up with so far – and, if I must say so myself, I think a few of them are pretty good for 140 characters or less, but I will let you be the judge of that.  I tweet a lot about current events and other topics; this blog only includes general quotes about the field in which we practice.  I hope you find one or two you like.

And, if you do like them – re-tweet them.  And, please feel free to follow me on Twitter, @jpflach

Joe Flach Tweets from Past Weeks:

Planning ahead is important; practicing ahead is vital. A script w/out rehearsals doesn’t prepare you for opening night.

Knowing how to respond before the disaster strikes saves precious time in figuring out how to respond after it strikes.

Disasters happen. Recoveries have to be orchestrated.

I believe in the power of prayer – except when it comes to business continuity, then I believe in the power of planning.

How you respond to a crisis may adversely impact your company more than the crisis itself. Add Communications & PR teams to your plans.

The disaster that impacts your company may also impact employee’s homes – make sure continuity plans include alternate workforce options.

There r heroes who rush into burning buildings to save ppl and heroes who improve fire prevention and evacuation plans. The latter is easier

If you are not worried about the impacts of a disaster on your company, then who in your company is?

The fear of failing a business continuity test results in masking many a program’s weakness and promoting a false sense of security

It takes 1 to plan, many to be prepared. Train, educate and exercise your programs.

There is no one right way to prepare for a disaster – but, not preparing for one is clearly the wrong way!

There is a thin line between being unprepared for a disaster and being negligent. Don’t put it to the test: be prepared.

When the fire alarm sounds, people do not reach for the “Fire Alarm Response Manual”. Same should be true when you “Declare” a disaster.

If disasters strike when least expected, then make sure you always expect one.

Risk mitigation programs do allow for calculated risks. That is why most cars have only 1 spare tire instead of 4.

Business Continuity Planning is not about preventing any loss following disaster; it is about limiting losses to a defined, acceptable level

The only “failed” emergency response test is one in which you do not discover ways to improve your program

The best way to handle a disaster is to stop it from happening. Create Disaster Prevention and Risk Mitigation Plans.

“The good Lord willing and the creek don’t rise” is a fun colloquial saying and does not a good business continuity plan make.

Continuity Plans are like backup parachutes – hardly ever needed but you don’t want to operate without one.

Many people experiencing a crisis simply freeze because they have not been conditioned how to respond. Break the ice and conduct training.

Incidents become disasters for those who are not prepared.

The only thing worse than having no emergency response plan is thinking you have one when you don’t. Be honest: don’t promote false security

There are more, but I think that is enough for now.  Did you find one you like?

Are RTO’s Stagnant? Should They Be?

In many business continuity programs, there are known and established Recovery Time Objectives (RTOs) for business processes and for IT applications.  More often than not, these RTOs are static, and the response and recovery programs are built around these numbers as they came out of a Business Impact Analysis (or were merely assigned based on an educated guess).

I just wonder if it is reasonable to assume that recovery priorities remain the same throughout time.  And I am not necessarily questioning whether they remain the same over time – but are they the same at different points in time throughout the business year or business cycle?

For example, is it reasonable to assume that our recovery priorities, or RTOs, might be different if the disaster occurred at month end or at year end as opposed to some other time in the year?  Might our recovery priorities be different if we are in the middle of launching a new campaign or product or service?

And, could the disaster itself influence our recovery priorities?  Could RTOs be different if we experience a data center only disaster versus a disaster that also impacted our workforce?  Could our recovery priorities be different for a single-site disaster versus a multiple site disaster?  Could we have different RTOs if we knew that the downtime was going to be hours or days versus weeks or months?

Now, I hate to over-complicate things in the planning process.  I am always warning folks to avoid paralysis by analysis in the planning process, but I think these are legitimate questions to pose to mature and solid programs that are looking to continue to improve and strengthen their recovery posture.

This is also why I think it is important for your recovery program to include a well thought out and implemented Crisis Management component that gets the right decision makers together and empowers and enables them to make changes to the recovery process as the situation, at that time, dictates.  So, maybe we maintain our single RTO, but we have the infrastructure in place that can accommodate changes in our recovery priority if and when needed.
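To make the idea concrete, here is a minimal sketch of a single baseline RTO per process that can be tightened when the business calendar calls for it.  The process names, hour values, context names and multipliers are all hypothetical placeholders; in a real program the baselines would come from your own BIA and the adjustments from your crisis management team.

```python
# Hypothetical baseline RTOs in hours; real values would come from a BIA.
BASELINE_RTO_HOURS = {
    "payroll": 72,
    "general_ledger": 48,
    "order_entry": 24,
}

# Hypothetical adjustment factors per business context.
# A factor below 1.0 tightens the RTO for that process.
CONTEXT_FACTORS = {
    "month_end": {"general_ledger": 0.25},
    "product_launch": {"order_entry": 0.5},
    "payroll_week": {"payroll": 0.25},
}


def effective_rto(process: str, active_contexts: list[str]) -> float:
    """Return the baseline RTO adjusted for every context currently in effect."""
    rto = float(BASELINE_RTO_HOURS[process])
    for context in active_contexts:
        rto *= CONTEXT_FACTORS.get(context, {}).get(process, 1.0)
    return rto


def recovery_order(active_contexts: list[str]) -> list[str]:
    """List processes from most urgent (shortest effective RTO) to least urgent."""
    return sorted(
        BASELINE_RTO_HOURS,
        key=lambda process: effective_rto(process, active_contexts),
    )


if __name__ == "__main__":
    print(recovery_order([]))
    print(recovery_order(["month_end"]))
```

With these made-up numbers, the general ledger moves ahead of order entry when the disaster strikes at month end.  The point is not the arithmetic but that the decision makers have a simple, agreed way to re-rank priorities at time of disaster.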

Just something for the experienced planners to think about and challenge their teams to consider in the maintenance and improvement process.

Having Plans vs Being Prepared – Avoid the Oops

I have recently posted a couple of blogs discussing the difference between Planning and having Plans.  In this blog, I want to explore the difference between Having a Plan and Being Prepared.

I have been in a number of environments where I thought the organization had great business continuity or disaster recovery plans – but, I did not believe that they were prepared to recover from a business interruption event.

Most plans rely on a number of “enablers” that have to be in place in order for the plan to be successfully executed.

First and foremost, the physical environment that the plan relies on has to be in place.  I have gone into a number of situations where the Executive Teams were convinced that their planning team had put great plans in place and I had to be the one to tell them that the plans were based on infrastructure not yet put into place.  “Yes, your plan is to recover applications in an alternate recovery site, which is a terrific plan … but you have not invested in or built out that site yet.”  Oops.

Secondly, the plan must be socialized and known by those who must manage to the plan.  I have seen some great plans sitting on shelves, known by only those few who wrote the plan – but all the people that would have to oversee and manage the execution of the plan had never read or been educated on the plan.  Oops.

And, third, in order to really be prepared, you must test, exercise and drill the plan.  It is through tests that you validate the correctness of the plan; through exercises that you discover ways to improve the plan; and through drills that you condition people on how to respond when executing the plan.  I have been in many environments where the plan may be understood by everyone, but never physically put into action to see if it will actually achieve the intended results.  Oops.

So there is much more to being prepared than to simply having a good plan.

Having just passed the anniversary of the D-Day invasion, perhaps that will serve as an example of what I am talking about.  There were relatively few people who actually “planned” the invasion, and only a few more who were educated on the plan.  But tens of thousands of others had to “prepare” for it in order for the plan to work.

It only takes a few people to create a good plan … but, it takes an entire organization to be prepared.

Don’t let your good plans fail because of an oops.

More Thoughts on Planning and Plans

Mike Tyson is quoted as saying, “Everyone has a plan, ‘till they get punched in the mouth.”

How well do your plans stand up to the punch in the mouth?

Field Marshal Helmuth von Moltke put it this way, in a more familiar quote, “No plan survives contact with the enemy.”

In our case, the enemy is the disaster or business interruption event we are planning for.

And Arthur C. Clarke had this observation, “All human plans [are] subject to ruthless revision by Nature, or Fate, or whatever one preferred to call the powers behind the Universe.”

The point is, whatever you had in mind when developing your business continuity, emergency response or disaster recovery plans, the event you will have to respond to will be nothing like what you envisioned.  Now, I know many of you are thinking, “That is why we do not plan for particular scenarios, we plan for the impacts of scenarios!”  But, I still say, you cannot plan without certain assumptions and certain biases about how the response will take place or how the crisis will unfold – and, I suggest, it won’t happen that way.

This is why I always like to look for evidence in a plan that you have provided the framework for decision makers to get together, make changes to the plan as needed, and, have the means to communicate these decisions to those who need to know this information.

I happen to believe in what Lester Robert Bittel had to say about planning, “Good plans shape good decisions.”  But, it is important to understand that not all decisions are made ahead of the event and the good plan must lay the foundation for at-time-of-disaster decisions to be made to adjust the plan based on how the enemy is responding.

Now, I happen to make a good living from helping organizations create, document and test crisis management, emergency response, business continuity and disaster recovery plans.  So, I would not dare under-emphasize the importance of planning – but, like some of the quotes I will share below – I think the value gained is in the planning process and not so much in the plans.

Dwight D. Eisenhower said it this way, in a quote that is often repeated, “In preparing for battle I have always found that plans are useless, but planning is indispensable.”

Dr. Gramme Edwards paraphrases it this way, “It’s not the plan that’s important, it’s the planning.”

Indeed!  It is in the planning process where we build out solutions, implement recovery capabilities and exercise our abilities to respond.  These are the real value and the enablers that will allow us to survive the business interruption event.  The written plan, with step-by-step instructions for how we operate, sometimes for weeks after the event, will hardly ever be referenced, and certainly not after the first 24 hours.  I do believe that those decisions we made before the event that provide action steps within the first few hours of an event can be valuable – but once decision makers get together and have the luxury of a little time to figure out where we currently stand, decisions made before the event occurred will have less value.

The capabilities we have in place because of the planning process will be the key to our survival.  How we utilize those capabilities will require flexibility based on the event itself.

Winston Churchill said, “Those who plan do better than those who do not plan even though they rarely stick to their plan.”

I think that is a much better way of saying what I mean!

I do run up against “pride of authorship” when I evaluate written plans – and I understand and completely empathize with that.  I am guilty of the same.

But, Publilius Syrus says, “It’s a bad plan that admits of no modification.”

I do believe in the power of planning.  And, I agree that planning is essential.

Although attributed to many different people, I think Tariq Siddique says it best and simplest, when he states, “If you are failing to plan, you are planning to fail.”  (This quote is often attributed to Benjamin Franklin, who may have said the same thing or something very similar.)

And, I couldn’t agree more with Sun Tzu in his The Art of War, when he suggests, “Plan for what is difficult while it is easy.”

This is why we must plan before the disaster.  Not only because we do not have the luxury of time to plan afterwards, but because the planning process is easier lacking the chaos and confusion that will accompany the disaster.

But, remember, it is the planning that is important and the resulting capabilities put in place during the planning process.  The plans themselves may not be what is needed to get you through the particular crisis you are responding to.

Hillel J. Einhorn states, “In complex situations, we may rely too heavily on planning and forecasting and underestimate the importance of random factors in the environment. That reliance can also lead to delusions of control.”

I think our plans need to allow for the flexibility to respond to these random factors.  And, yes, I do think some of us have “delusions of control” when it comes to assessing our state of readiness.

I want to end with two more thoughts on planning.  I have witnessed so many programs lacking progress because of their desire to create the perfect plan.

George Patton is quoted as saying, “A good plan today is better than a perfect plan tomorrow.”

I agree.

And, lastly, when exercising our plans and our recovery capabilities, I so often find planners who like to assign pass/fail grades to the tasks.  I like to rely on what Thomas Edison had to say about failures, “I have not failed.  I’ve just found 10,000 ways that won’t work.”

There, I think I have reached my quota of quotes.  If you made it all the way to the end of this blog – I applaud you.  Thanks.

If you have a favorite quote to share with us, please do so by adding a comment.

Disaster Recovery Planning vs. Disaster Recovery Plans

So often, when we are engaged to review existing business continuity and disaster recovery plans, we find volumes of “plans” with very important planning information but very little in the way of action plans for at-time-of-recovery activity.

By this, I mean, many “plans” include information discovered in the BIA and Risk Analyses.  There are tables and reports on what the impacts are for being down, what the requirements are in a recovery center, how many desks are needed in a recovery site, special equipment requirements, special forms, vital records listings and locations, what the critical applications are, RTOs, RPOs, vendor listings, employee listings, and on and on and on.

All of this information is CRITICAL INFORMATION for designing a recovery solution, but is of no real value at time of an incident.

At time of disaster, I need to know how to engage the plans and how to employ the capabilities that are provided – based on all that information listed above.

In my opinion, this information should be segregated.  When a business interruption event occurs, I do not care what the findings were in the BIA or RA – all I want to know is what is in place now, how do we get to it and what do we do when we get there.

I review many plans that pass the weight test but are so full of “noise” and so loaded with information that they become too bulky and are not usable as an action plan for what we do.

Sometimes it can be as simple as separating the two parts of the plan – many times, the “action plan” component is missing altogether.  This is sometimes especially true when a database software tool is used.  The database reports look so good and fill up so many pages, people think that that is the plan.  No, that is a collection of information needed to ensure we put the proper capability in place, but is not the action plan for how we employ that capability.

Practical, pragmatic, easy-to-use action plans are hard to come by, but they are what I am most interested in finding when asked to review an organization’s level of response preparedness.

Do not confuse a compilation of information gathered in the planning process with your disaster recovery plan.

Are We Prepared for the Next Disaster?

I found and listened to this NPR radio story titled, “Is the U.S. Prepared for the Next Disaster?”.  Even though this interview was conducted a year ago, I think the message is still valid and important.

I think the interviewee, Craig Fugate, does a good job of identifying a recurring problem in past disasters: a failure to engage the proper level of support through a formal request for assistance.  Although Mr. Fugate doesn’t use this term, I like to label these the “triggers to engage”.  One of the biggest problems with the response to Hurricane Katrina was that Federal authorities assumed the trigger to engage was a call from the local authorities, whereas the local authorities thought the trigger to engage was the event itself.  While Federal agencies were waiting to be asked for help, local agencies were sitting and waiting for the help to arrive.  Meanwhile, crucial time was slipping by and the losses and damages were escalating.

I was glad to learn that FEMA now self-engages not only when an incident occurs but also when the threat of incident rises.

I think this is an important lesson to learn and address in our own plans.  I think it is important to identify and practice those “triggers” for engaging certain components in our Emergency Response, Business Continuity and Disaster Recovery Programs.  What are the “triggers” for: putting vendors on alert; communicating with employees; mobilizing resources; alerting customers and other stakeholders; declaring a disaster; etc.?
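One way to make those “triggers” explicit is to write them down as a simple table that anyone can read and rehearse.  The sketch below is only an illustration: the three escalation levels and the level assigned to each action are assumptions of mine, not a standard, and your program would define its own.

```python
# Hypothetical escalation levels, ordered from least to most severe.
LEVELS = ["threat_rising", "incident_confirmed", "site_unusable"]

# Hypothetical trigger table: the lowest level at which each action is engaged.
TRIGGERS = {
    "put vendors on alert": "threat_rising",
    "communicate with employees": "threat_rising",
    "mobilize resources": "incident_confirmed",
    "alert customers and other stakeholders": "incident_confirmed",
    "declare a disaster": "site_unusable",
}


def actions_to_engage(current_level: str) -> list[str]:
    """Return every action whose trigger level has been reached or passed."""
    current_rank = LEVELS.index(current_level)
    return [
        action
        for action, trigger_level in TRIGGERS.items()
        if LEVELS.index(trigger_level) <= current_rank
    ]


if __name__ == "__main__":
    for level in LEVELS:
        print(level, "->", actions_to_engage(level))
```

Because each trigger is tied to the situation rather than to a phone call from someone else, no one sits waiting to be asked, which is exactly the gap described above.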

Also, Mr. Fugate notes that having a single entity in charge introduces a single-point-of-failure in the response process.  While I understand his point, I also think it is important to mention that when you have lots of links in your communication and control “chain” you have lots of opportunity for the chain to break.  If the mayor engages the governor who engages the president – well, there are lots of mis-engagements that can occur.  And, if one link in the chain breaks, all the links that follow are missed.

I agree with Mr. Fugate that we are better prepared today than we were in the past, but saying you are in better shape today than you were when you were grossly out of shape does not mean you are in good shape.  Unfortunately, I also believe that the further removed you are from the last significant event, the more likely you are to get back out of shape.  We are never more prepared to respond to a disaster than we are immediately after a disaster occurs.  Lessons learned are fresh in the mind; implementation guidelines and procedures are reviewed, refreshed and rehearsed.  But, as time goes by, we once again get complacent and start to slip back into our bad habits.  And, as soon as we start to believe we are in good shape, I start to get more worried.

In conclusion, I think this is a terrific interview with important messages that are worth listening to again.  I encourage you to think about and rehearse the “triggers” in your program and to identify potential weak links in your communications and engagement chains.  And, never allow yourself to believe we are prepared for the next disaster … continue to work on improving your level of preparedness.  After all … how do you think people would have responded to the question, “Is the U.S. Prepared for the Next Disaster?” on September 10, 2001?

Business Continuity Planning – Beyond the Doomsday Scenario

At the Continuity Insights Management Conference 2012 that I recently attended in Scottsdale, AZ, there was a lot of conversation around PS-Prep which bled into the discussion of “Why get certified” or, the more generic question of, “Why perform business continuity planning?”  An oft repeated answer to this question, echoed by business continuity planners around the world is, “Because without a plan you will not survive as a company.”

I think this is a disingenuous answer without any history to support it.  Where exactly is the evidence of this fact?  What historical data can you share with me, or the CEO you are trying to convince, that this is the case?  I am confident that you can dig up cases of small companies that did not survive a disaster, but where is that story about the big guy who did not survive the disaster?

The one and only case study I can think of off the top of my head is Enron, but that was a disaster of a different kind.

Look at BP and the horrific Gulf Coast disaster – they survived.  Did they have a plan in place for this?  Maybe … if so, most professionals would argue against its effectiveness.  Were they certified?  No.

Look at Cantor Fitzgerald, the one company most widely spoken about concerning the extent of their losses during the events of 9/11.  Survived.  With much loss and many significant challenges, but they are still in business.

We found this article that lists 8 Infamous Business Disasters – those companies all survived – albeit some under a new name and different business model, but they did survive.  Now, not all of these cases are the kinds of disasters we plan for, but I can’t find that one poster child event that proves the statement, “Without a business continuity plan, you will not stay in business.”

Now look, I am a business continuity planner.  I make a living out of helping companies put these programs in place.  I want … no, I NEED … CEOs and Boards of Directors to embrace the need for these plans and to invest in professionals like me to help put them in place.  But, I think we need a better sales pitch than the shallow threat that this is needed to survive a disaster.

I don’t think we need C-level executives to buy into this all or nothing proposition with business continuity planning.  No, I think that the message should be:  Business continuity plans will allow us to mitigate our losses should a disaster occur. The goal is to ensure the investment we make in our plans and solutions is justified by the potential losses that could occur considering the probability that an event happens.

The losses that could occur are measured by performing a Business Impact Analysis and the probability that an event happens is measured by a Risk Analysis.
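As a rough illustration of that message, the arithmetic can be as simple as comparing the yearly cost of a solution against the loss it is expected to avoid.  This is a minimal sketch under simplifying assumptions of my own: a single event type, a single annual probability, and dollar figures that are entirely made up for the example.

```python
def avoided_annual_loss(
    loss_without_plan: float,
    loss_with_plan: float,
    annual_probability: float,
) -> float:
    """Expected yearly loss avoided: (loss reduction per event) x (chance of event per year)."""
    return (loss_without_plan - loss_with_plan) * annual_probability


def is_investment_justified(
    loss_without_plan: float,
    loss_with_plan: float,
    annual_probability: float,
    annual_cost: float,
) -> bool:
    """True when the expected avoided loss is at least the yearly cost of the solution."""
    avoided = avoided_annual_loss(loss_without_plan, loss_with_plan, annual_probability)
    return avoided >= annual_cost


if __name__ == "__main__":
    # Hypothetical inputs: the loss figures would come from the BIA,
    # the probability from the Risk Analysis.
    print(avoided_annual_loss(10_000_000, 2_000_000, 0.05))
    print(is_investment_justified(10_000_000, 2_000_000, 0.05, 250_000))
```

In this made-up case, a solution that cuts an event’s loss from $10 million to $2 million, for an event with a 5% chance of occurring in any year, avoids roughly $400,000 a year on average, so a $250,000 annual spend is defensible and a $500,000 one is harder to justify on these numbers alone.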

We plan because it is a reasonable business practice to protect our assets and our stakeholders against losses that could impact the market value of our company not just if, but when, a business interruption event occurs.  If you want the answer to, “Why get certified”, check out this earlier blog we posted.

We need to sell business continuity planning using business terms that executives can understand and stop with the doomsday scenario selling technique.  At least, that’s the way I see it.

In the meantime, if you can share those stories with me that support the position companies will not survive without plans, I would love to read them.  Thanks.

Recovery Options Evaluation Criteria

When assessing recovery options for both technology and work area, we recommend creating an options comparison chart with the following evaluation criteria:

  • Costs  (1x, Recurring, ATOD)
  • Accessibility
  • Testability
  • Scalability
  • Flexibility
  • Compatibility
  • Lead Time to Implement
  • Proximity
  • Shared Risks
  • Solution Complexity
  • Burden of Obsolescence
  • Ability to Meet Recovery Objectives

These criteria can be evaluated and scored with a relative value, such as High, Neutral, Low, or with a tailored value for each criterion, such as Most Complex, Neutral, Least Complex, etc.

Simply as an example, the final “scorecard” might look something like this:

Evaluation Criteria                 | Internal Recovery Solution | Hosted Recovery Solution | Recovery Service Vendor
Costs                               |                            |                          |
    One Time Costs                  | $5.0 mil                   | $500k                    | Minimal
    Recurring Costs                 | Minimal                    | $5k/mo                   | $5k/mo
    At-Time-Of-Use Costs            | Minimal                    | Minimal                  | $15k + $2k/day
Accessibility                       | Very Accessible            | Mostly Accessible        | Limited Accessibility
Testability                         | Easy to Test               | Moderate                 | Must be Scheduled
Scalability                         | Limited                    | Moderate                 | Very Scalable
Flexibility                         | Limited                    | Moderate                 | Very Flexible
Compatibility                       | Very Compatible            | Very Compatible          | Somewhat Compatible
Lead Time to Implement              | Long                       | Moderate                 | Short
Proximity                           | Nearby                     | Options                  | Options
Shared Risks                        | Some                       | Few                      | Few
Solution Complexity                 | Moderate                   | Moderate                 | Complex
Burden of Obsolescence              | On Our Company             | On Our Company           | On Recovery Vendor
Ability to Meet Recovery Objectives | Fully                      | Fully                    | Mostly
Security                            | Secured                    | Somewhat Secured         | Somewhat Secured
Availability at Time of Need        | Total                      | Total                    | At Risk
Facility Management Requirement     | Internal                   | External                 | External

Each value may bear further explanation and justification, but the summary is nice to display in a side-by-side comparison format.
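If you want to go one step past the side-by-side view, the relative values can be turned into a rough weighted score.  The sketch below is only one possible approach: the 3/2/1 scale, the weights, the three criteria chosen and the High/Neutral/Low ratings assigned to each option are all hypothetical choices of mine for illustration, and every organization would set its own.

```python
# Hypothetical numeric scale for the relative values.
SCORE = {"High": 3, "Neutral": 2, "Low": 1}

# Hypothetical weights reflecting how much each criterion matters to the firm.
WEIGHTS = {"Accessibility": 3, "Testability": 2, "Scalability": 1}

# Hypothetical ratings for three of the criteria, one entry per option.
OPTIONS = {
    "Internal Recovery Solution": {
        "Accessibility": "High", "Testability": "High", "Scalability": "Low",
    },
    "Hosted Recovery Solution": {
        "Accessibility": "Neutral", "Testability": "Neutral", "Scalability": "Neutral",
    },
    "Recovery Service Vendor": {
        "Accessibility": "Low", "Testability": "Low", "Scalability": "High",
    },
}


def weighted_score(ratings: dict[str, str]) -> int:
    """Sum of (criterion weight x rating score) across all rated criteria."""
    return sum(WEIGHTS[criterion] * SCORE[rating] for criterion, rating in ratings.items())


def rank_options(options: dict[str, dict[str, str]]) -> list[str]:
    """Option names ordered from highest to lowest weighted score."""
    return sorted(options, key=lambda name: weighted_score(options[name]), reverse=True)


if __name__ == "__main__":
    for name in rank_options(OPTIONS):
        print(name, weighted_score(OPTIONS[name]))
```

A score like this should support the conversation with management, not replace it; change the weights and the ranking can change with them, which is a useful discussion in itself.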

If you want a more detailed explanation of any of the Evaluation Criteria or want to suggest others to include in a full analysis, please feel free to share your thoughts by entering a comment.