Tag Archive for BIA

The BIA Insult

So, I came across this quote the other day that someone was using in a presentation about the importance of conducting a Business Impact Analysis (BIA):

“A business continuity plan that is not predicated on or guided by the results of a business impact analysis (BIA) is, at best, guesswork, is incomplete, and may not function as it should during an actual recovery.”

Really?

I understand what they mean and I appreciate this message given to business continuity planners, but I would hesitate to say this in a board room.  It may not be wise to suggest to the CEO and other senior executives that they do not know their business well enough to tell you what is important to them and which business processes are necessary to keep their organization solvent.

I have long been of the opinion that we business continuity planners have become victims of our own methodology.  I think many of us have lost sight of the whys and wherefores of what we do and have become too caught up in the whats and hows.  And, I think, the BIA is a prime example of this.

Ultimately, why do we conduct a BIA?

I suggest that we perform a BIA to establish the objectives for our Business Continuity program.  We gather and analyze the impacts of a business interruption in terms of financial impacts, reputational impacts, operational impacts, legal and regulatory impacts and other impacts unique to our company or industry.  Armed with this measurable and intangible information, we can make an educated and informed decision about what business processes we need to continue – and, in what timeframe – to minimize our losses and keep the organization solvent following some sort of devastating business interruption event.

I like to break down the standard Business Continuity Methodology into the Strategic Planning Phases and the Tactical Planning Phases.  The Strategic Planning Phases consist of the Risk Analysis, Business Impact Analysis, Recovery Requirements Analysis and Cost Benefits Analysis of viable solutions.  The Strategic Planning part of the methodology helps us define “what” our business continuity plan should achieve.  The Tactical Planning Phases of the methodology define “how” we achieve our objectives.  This includes implementing the chosen solutions and documenting the policies, plans and procedures.

But, I don’t believe the Business Continuity Planner is always needed to define the Strategy.  I think, in some instances, the “strategy” can be given to us by the CEO, board or other executive management team members.

What if the CEO told you what business processes they want to continue, in what time frames?  Are you going to tell him/her that that would be creating a BCP based, at best, on guesswork?

I know that the methodologies say we MUST CONDUCT a BIA.  But I think that requirement is a little bit tangled up.  I think it is absolutely correct to say that, before you can successfully implement a viable and effective business continuity plan, you must establish your recovery time and recovery point objectives; you must identify and categorize your business processes in terms of criticality and importance to the sustainability of the organization and the ability to satisfy the corporate mission; you must know the dependencies and requirements that support those critical processes to ensure a complete and holistic recovery solution – but I am not sure a BIA is always what is needed to get these “strategic” parameters.

Yes, I have been in many a situation where the leadership team was not comfortable establishing these objectives without the support of information gathered and analyzed through an in-depth BIA.  I have also seen many a business continuity planning team chastised for spending months gathering and analyzing information, only to end up telling management teams what they already knew.  And, I have seen business continuity programs fail at the time of an event because they were predicated on findings from a BIA that were never verified and matched against management’s expectations, which were significantly different from what the information gathered suggested.

Now, I am not against BIAs.  I have made a nice living by conducting many a BIA over the past 20 years, and I do believe they are valuable and necessary tools – just not in every case.  I caution business continuity planners not to become so married to the methodology that you lose sight of what the objectives are for each methodology component.  If the objective of a BIA is to establish the continuity and recovery objectives of your business continuity program, and the executive team in your company knows and is willing to sign off on the recovery and continuity objectives that are given to you – do you really need to conduct the BIA?

In any case, I don’t think I would ever suggest that a business continuity plan not based on the findings from a BIA is guesswork, especially if the guesses are coming from the Executive Management Team.  I just know that if you came into my company and told me that a team of business continuity planning specialists is needed to identify what our critical processes are, I would be showing you to the front door.

Are RTOs Stagnant? Should They Be?

In many business continuity programs, there are known and established Recovery Time Objectives (RTO) for business processes and for IT applications.  More often than not, these RTOs are static and the response and recovery programs are built around these numbers as they came out of a Business Impact Analysis (or were merely assigned based on an educated guess).

I just wonder if it is reasonable to assume that recovery priorities remain the same throughout time.  And I am not necessarily questioning whether they remain the same over time – but are they the same at different points in time throughout the business year or business cycle?

For example, is it reasonable to assume that our recovery priorities, or RTOs, might be different if the disaster occurred at month end or at year end as opposed to some other time in the year?  Might our recovery priorities be different if we are in the middle of launching a new campaign or product or service?

And, could the disaster itself influence our recovery priorities?  Could RTOs be different if we experience a data center only disaster versus a disaster that also impacted our workforce?  Could our recovery priorities be different for a single-site disaster versus a multiple site disaster?  Could we have different RTOs if we knew that the downtime was going to be hours or days versus weeks or months?

Now, I hate to over-complicate things in the planning process.  I am always warning folks to avoid paralysis by analysis in the planning process, but I think these are legitimate questions to pose to mature and solid programs that are looking to continue to improve and strengthen their recovery posture.

This is also why I think it is important for your recovery program to include a well thought out and implemented Crisis Management component that gets the right decision makers together and empowers and enables them to make changes to the recovery process as the situation, at that time, dictates.  So, maybe we maintain our single RTO, but we have the infrastructure in place that can accommodate changes in our recovery priority if and when needed.
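To make the idea a little more concrete, here is a minimal sketch in Python of what a context-sensitive RTO lookup could look like.  Everything in it is an assumption for illustration: the process names, the hour values, the three-day month-end window and the function names are all hypothetical, and a real program would take these from its own BIA and from its Crisis Management team at the time of the event.

```python
import calendar
from datetime import date

# Hypothetical baseline RTOs (in hours); not taken from any real BIA.
BASELINE_RTO_HOURS = {"payroll": 72, "general_ledger": 48}

# Hypothetical tighter RTOs that apply only during the month-end window.
MONTH_END_RTO_HOURS = {"general_ledger": 8}


def is_month_end_window(day: date, window_days: int = 3) -> bool:
    """True if `day` falls within the last `window_days` days of its month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day > last_day - window_days


def effective_rto_hours(process: str, event_date: date) -> int:
    """Return the month-end override if one applies, else the baseline RTO."""
    if is_month_end_window(event_date) and process in MONTH_END_RTO_HOURS:
        return MONTH_END_RTO_HOURS[process]
    return BASELINE_RTO_HOURS[process]


print(effective_rto_hours("general_ledger", date(2024, 1, 15)))
print(effective_rto_hours("general_ledger", date(2024, 1, 30)))
```

The baseline table stays the single published RTO; the override table is simply one way to record ahead of time the adjustments decision makers might otherwise have to improvise.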

Just something for the experienced planners to think about and challenge their teams to consider in the maintenance and improvement process.

The Adjusted Recovery Confidence Factor – Repeat Blog

Over the past few weeks I have actually had a few people ask me to send them the link to an earlier blog I posted about an Adjusted Recovery Confidence Factor.  Since there actually seems to be some interest in this idea – and, since I am really busy working on client deliverables – I have decided to take a blog short cut today and simply redirect you to an article we posted a few months back.

The Adjusted Recovery Confidence Factor

We had much less blog site traffic when this was originally posted, so maybe it’s not a bad idea to put it out there again.  Any thoughts?  We would love to receive more comments on our postings.

Thanks.  And now, back to work.

The Recovery Requirements Analysis

I have been in more than a few BIAs or business continuity planning sessions when it is like pulling teeth trying to get business managers to identify the applications and/or other requirements and resources they need to minimally perform their mission critical business processes.

This is especially true when working with financial traders.  First, they under-estimate their need for tools and resources, believing that as long as they have a phone, they can conduct trades.  But then the list of requirements grows and grows.

A typical requirements analysis session with traders might go like this:

ME:  What do you minimally need in an alternate site to conduct your business?

TRADER:  I don’t need anything.  Just give me a phone and I can trade anywhere.

ME:  So, all you need is a phone?  You can trade with just a phone.

TRADER:  That’s right.  Just a phone.  Well, I also need my data feed.  But, just a phone and a data feed.

ME:  Nothing else?

TRADER:  Well, I need a phone, my data feed and the blotter system.  Just a phone, data feed … oh, and I need trade tickets.  Just a phone, my data feed, the blotter system and trade tickets.  That’s all I need.  Oh, and I need my directory of phone numbers… and a recorded phone line.

Does this sound familiar?

Here, check out this link to a secretly videotaped, recovery requirements session I conducted with one business manager:

Recovery Requirements Analysis Video.

Okay, so I am being funny.  But, if you have done this for as long as I have, I am sure you shared in the laugh.  I have used this routine in a few public speaking sessions I have done on business continuity planning.  It is always a good trick for getting my point across and keeping the audience awake.

And maybe, just maybe, I am being a Jerk!

Establishing RTOs

I think there is a common mistake that we, as business continuity planners, make when working with our business partners to determine RTOs for processes and applications that support them.

I think we do a good job in using the findings from our Business Impact Analyses (BIA) to help identify the Most Critical, Critical and Essential business processes (or whatever labels you happen to use) to ensure that these processes are what we recover first, but, I think when we work with these areas to define Recovery Time Objectives (RTO) we do not properly establish the post-disaster performance objectives.  I think that most of us allow our business partners to establish their RTOs based on the assumption that they will be operating at or close to business as usual.

Sure, we instruct them to try to establish the minimum requirements and consider workarounds and such … but, to achieve what end?  How many of us first ask senior management if there will be any changes to our management objectives following a serious business interruption event?  Will revenue or income targets be adjusted?  How much additional cost and expense can we incur?  Will response or service targets be adjusted?  Margin targets adjusted?  ROI?  ROE?  Or, will any other management metrics be adjusted because we are in crisis mode of operations?

Although this goes against my overall philosophy of trying to simplify things, I think it would be beneficial to establish three modes of operation when establishing RTOs with our business partners.

  1. Survival Mode
  2. Sustain Mode
  3. Business as Usual Mode

The goal of Survival Mode operations is simply to keep the company solvent.  Forget trying to be profitable; forget growth targets; forget avoiding all penalties, fines and service interruptions – what, minimally, does the company need to do to not jeopardize the solvency of the firm?

The goal of Sustain Mode operations is to satisfy the commitments we have today with our current customer base.  What do we need to do to keep our current customer base satisfied and meet the regulatory and contractual obligations we already have in place?

And the goal of Business as Usual is … well, just what the words say.

I think if we could get senior management to define the management objectives for each mode of operation and how long the company can operate in each mode, the RTOs we establish will be much more realistic.
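As a rough illustration of what senior management would be signing off on, here is a minimal sketch in Python.  The three mode names and their goals come from the description above; the day limits, the class name and the function name are purely hypothetical placeholders that each company would have to set for itself.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingMode:
    name: str
    goal: str
    max_days: Optional[int]  # how long the company can stay in this mode; None = no limit


# The day limits below are hypothetical; management would define the real ones.
MODES = (
    OperatingMode("Survival", "Keep the company solvent", 5),
    OperatingMode("Sustain", "Meet existing customer, regulatory and contractual commitments", 30),
    OperatingMode("Business as Usual", "Normal targets apply", None),
)


def mode_on_day(days_since_event: int, modes: Sequence[OperatingMode] = MODES) -> OperatingMode:
    """Return the mode the company is expected to be operating in on a given day."""
    elapsed = 0
    for mode in modes:
        if mode.max_days is None:
            return mode
        elapsed += mode.max_days
        if days_since_event < elapsed:
            return mode
    return modes[-1]


for day in (0, 10, 60):
    print(day, mode_on_day(day).name)
```

With a table like this agreed up front, a business partner can answer "how long can you be down?" separately for Survival, Sustain and Business as Usual instead of assuming normal operations.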

I work in many environments testing their RTO capabilities where, when short time frames are missed, the exercise is reported as a failure, yet the business areas ultimately say they could have lived with the delays.  I think our RTOs, in general, are much tighter than they need to be if we think about Survival first, then Sustain and then BAU.

I know, I know, I know … for those of you cursing me out: yes, there are some truly crucial business processes that legitimately have very short RTOs (or require immediate failover with no downtime), but I think that pool of requirements is much smaller than many of our programs suggest.

So, yes, I think we do a good job focusing on Most Critical business processes, but I don’t think we establish the right mindset in gathering the requirements to support them after a disaster.

I welcome all comments to the contrary or, heavens forbid, in support of this concept.

Business Continuity Planning – Beyond the Doomsday Scenario

At the Continuity Insights Management Conference 2012 that I recently attended in Scottsdale, AZ, there was a lot of conversation around PS-Prep which bled into the discussion of “Why get certified” or, the more generic question of, “Why perform business continuity planning?”  An oft repeated answer to this question, echoed by business continuity planners around the world is, “Because without a plan you will not survive as a company.”

I think this is a disingenuous answer without any history to support it.  Where exactly is the evidence of this fact?  What historical data can you share with me, or the CEO you are trying to convince, that this is the case?  I am confident that you can dig up cases of small companies that did not survive a disaster, but where is that story about the big guy who did not survive the disaster?

The one and only case study I can think of off the top of my head is Enron, but that was a disaster of a different kind.

Look at BP and the horrific Gulf Coast disaster – they survived.  Did they have a plan in place for this?  Maybe … if so, most professionals would argue against its effectiveness.  Were they certified?  No.

Look at Cantor Fitzgerald, the one company most widely spoken about concerning the extent of their losses during the events of 9/11.  Survived.  With much loss and many significant challenges, but they are still in business.

We found this article that lists 8 Infamous Business Disasters – those companies all survived – albeit some under a new name and different business model, but they did survive.  Now, not all of these cases are the kinds of disasters we plan for, but I can’t find that one poster child event that proves the statement, “Without a business continuity plan, you will not stay in business.”

Now look, I am a business continuity planner.  I make a living out of helping companies put these programs in place.  I want … no, I NEED … CEOs and Boards of Directors to embrace the need for these plans and to invest in professionals like me to help put them in place.  But, I think we need a better sales pitch than the shallow threat that this is needed to survive a disaster.

I don’t think we need C-level executives to buy into this all or nothing proposition with business continuity planning.  No, I think that the message should be:  Business continuity plans will allow us to mitigate our losses should a disaster occur. The goal is to ensure the investment we make in our plans and solutions is justified by the potential losses that could occur considering the probability that an event happens.

The losses that could occur are measured by performing a Business Impact Analysis, and the probability that an event happens is measured by a Risk Analysis.
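One simple way to express that justification is as an expected-loss comparison.  The sketch below, in Python, is only an illustration under stated assumptions: it treats a single scenario, multiplies the BIA impact by the Risk Analysis probability to get an expected yearly loss, and uses made-up dollar figures and function names.  Real programs would weigh many scenarios and intangible impacts that do not reduce to one number.

```python
def expected_annual_loss(impact_per_event: float, annual_probability: float) -> float:
    """Expected yearly loss: BIA impact weighted by Risk Analysis probability."""
    return impact_per_event * annual_probability


def investment_is_justified(
    annual_plan_cost: float,
    impact_without_plan: float,
    impact_with_plan: float,
    annual_probability: float,
) -> bool:
    """True if the expected loss avoided each year covers the yearly cost of the plan."""
    avoided = expected_annual_loss(impact_without_plan, annual_probability) - expected_annual_loss(
        impact_with_plan, annual_probability
    )
    return avoided >= annual_plan_cost


# Hypothetical figures: a 1-in-20-year event, $5M loss unmitigated, $1M with the plan.
print(investment_is_justified(150_000, 5_000_000, 1_000_000, 0.05))
print(investment_is_justified(250_000, 5_000_000, 1_000_000, 0.05))
```

The point of the exercise is the conversation it enables with executives: the plan is sold on the losses it mitigates relative to its cost, not on survival.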

We plan because it is a reasonable business practice to protect our assets and our stakeholders against losses that could impact the market value of our company not just if, but when, a business interruption event occurs.  If you want the answer to, “Why get certified”, check out this earlier blog we posted.

We need to sell business continuity planning using business terms that executives can understand and stop with the doomsday scenario selling technique.  At least, that’s the way I see it.

In the meantime, if you can share stories with me that support the position that companies will not survive without plans, I would love to read them.  Thanks.

Critical Data: Don’t Overlook the Hardcopy

I know we like to think we now work in a paperless society, but the fact is, we do not.  There are still plenty of industries and processes that rely on hardcopy documentation for historical records and in support of daily operations.  Business Continuity and Disaster Recovery programs often overlook these vital records as they focus on technology and electronic media – I caution you not to fall into this same trap.

I know this to be true, especially in airlines, medical and educational organizations, as well as in some financial services and other industries.

For example:

Airlines are required to maintain and have access to all mechanical and maintenance records for each and every aircraft that they fly.  In many instances maintenance initiatives issued by various agencies are printed and given to the mechanics and engineers who then make handwritten notations and sign off on the printed form.  These printed forms, with their notations, become the official record of the maintenance activity in compliance with the initiative.  Should this physical, hardcopy record be destroyed or lost, the plane (or an entire fleet of planes) will have to be grounded until the maintenance check is performed once again and a new record created.  Some airlines maintain these records in a single location and do not scan or digitally record the information (keeping costs down, you know).  Should the facility housing these documents go up in smoke, it could take months or longer to recreate the audit trail for those planes – which, by law, must be grounded until proof that all the maintenance initiatives have been completed.

Many medical offices maintain a slew of forms and doctor reports in handwritten form.  Just notice all the filing cabinets up and down the halls in your doctor’s office.  These records are seldom scanned or stored electronically and are susceptible to numerous risks and threats.  The same is true for school records and other information gathered in handwritten forms.

Financial services firms and brokerages still house plenty of hardcopy documents in the form of payment instructions and customer documentation that could cause plenty of financial exposure and compliance irregularities if lost or destroyed.

For those of you who think that we operate in a paperless society, just take a look around and count the number of filing cabinets still in use.  What do you suppose is kept in all this space?  And, what would be the cost or impact to the organization if they were permanently destroyed?

Now, I am not saying this is true in every environment.  Certainly there are many, many offices and industries that truly have no exposure to hardcopy documentation and information.  I am just suggesting that your risk analyses, impact analyses and recovery requirements analyses should not overlook this potentially critical information base and should include consideration of this potentially risky business practice.

Backing up or electronically scanning and storing hardcopy documentation, especially historical documentation, may be something your organization needs to look into.  There are plenty of vendors that can help you achieve this end.

Recovery Time Objectives: The Bigger Picture

A few of you didn’t take kindly to a blog I wrote a while back that suggested some of us business continuity planners have fallen victims to our own methodology.  Well, get ready to be offended once again.

This time, I want to take a look at the Business Impact Analysis (BIA) process and how we establish Recovery Time Objectives (RTO) – be they for business functions or software and applications.

In this case, I think we have fallen victims to our questionnaires.  Now, of course, some questionnaires are much more detailed and better than others, but I think they all suffer from the same problem: we do not put our questions in perspective of the bigger picture.  Ultimately, these questionnaires come down to the question of, “How long can we go without … doing something, or running something?”  Like I said, some questionnaires do a pretty good job of also gathering the justification for the ultimate answer, but…

I think the savvy business manager is the one who everyone else thinks is a pain in the asking.  The savvy business manager will stop short of answering these questions until he or she knows what the corporate position is on business targets during a crisis.  I would resist answering these questions until I knew what the Executive Team’s expectations were for my department.

In other words, I would want to know: during a crisis…

  • Are our revenue targets adjusted?
  • Are profit targets adjusted?
  • Are margin targets adjusted?
  • Or, whatever business metrics I am measured against – are they adjusted?

I think most BIAs start and end with middle management answering individual BIA questionnaires, when, in fact, they should start with Executive Management establishing a Crisis Management Business Plan that sets the acceptable business targets to be achieved during a crisis.  Armed with that information, middle management has a more realistic shot at providing valid answers to our questionnaire.  Right now, every business manager is making their own assumptions about what Senior Management is expecting, and these are likely not consistent across the board.

Furthermore, I think most planners simply accept the BIA answers provided with little push back.  Look, I’ve been a planner for a long time – I know exactly how easy it is to be so excited just to get any answers back that you do not dare challenge the results.  But, how often have you seen situations where business managers say they cannot be down for more than 4 hrs and yet close the entire office for a day or more during a snow storm?  Or, there is a function performed by 3 staff members and at time of crisis they say they need all three to be up and running in 4 hours – you mean none of these people ever take a vacation?  Again, it goes back to the original problem – it all depends what they think they need to achieve during a crisis.

Now before you jump down my throat – I do get that during a crisis you may not be functioning the same as normal.  You may be doing some things manually, requiring more labor.  I am just suggesting that sometimes we need to push back a little and have the managers support their answers and make sure they have thought things through logically.

Now on the opposite side of the spectrum, I was working for Comdisco during the World Trade Center bombing in 1993 and I worked very closely with two financial services firms recovering from that event.  On the Monday following the bombing – the first business day following the event – these companies experienced a call and transaction volume almost 10 times their normal volume!  So they, in fact, had some functions in which they really needed more than 100% of the workforce recovered.  I think, as planners, we may need to also push back on some departments to make sure they have taken into consideration the possible changes in work flow and volumes, given the fact that they had a disaster.  Insurance companies are just one example of organizations in which the disaster itself could be a catalyst for increased work activity.

It just seems to me that sometimes, and I don’t mean everyone does this, but sometimes, the BIA really simply becomes a Business Impact information gathering tool and we forget to do that “A” part – we forget to analyze the answers provided.

So, in summary, I think we can sometimes help the process along if we first get Senior Management to establish adjusted business targets for operations during crisis before asking middle management how long they can be down; and, I think we could do a better job challenging some of the answers we get back to our, sometimes, ambiguous questions. 

Okay, there you go, now let me have it and tell me why I’m wrong.

The Recovery Time Objective Debate Continues

The Recovery Time Objective debate continues over on a LinkedIn discussion board.  Really folks, I don’t know what is so hard to comprehend here!  I think some people are just trying to be difficult as a means to show they are smarter than everyone else.  Me, personally, I prefer the KISS method – Keep It Simply Simple (I know it is usually said another way, but I wanted to avoid labeling people).

Simply put, the RTO measures the time objective for moving from Point A to Point B, where Point A equals the moment when a business process (or technology resource, if used for IT Disaster Recovery purposes) stops functioning and Point B equals the point when the business process (or, you know) must start functioning again to avoid jeopardizing the solvency of the organization.

It is an OBJECTIVE – that word is part of the acronym – why is it so hard to comprehend?

Yes, yes, yes, the event that interrupts the process or service will definitely influence when the recovery process starts, or what recovery tactic you decide to take – but the OBJECTIVE remains the same.  Fine, fine, fine, so you have an emergency response team that is responsible for assessing the damages and determining whether or not to declare a disaster, but the OBJECTIVE remains the same and the clock is ticking.

Hopefully, your proven recovery capability is less than your recovery objective.  In that case, the Recovery Time Objective minus the Proven Time to Recover equals the time your Emergency Response Team has to gather, evaluate the situation, and declare the disaster in order to ensure your RTO is met.

RTO – PTtR = Maximum Time to Declare

Your Emergency Response Team needs to be aware of all of these factors while performing their response tasks.

You do not decide the RTO or the PTtR at time of disaster – it is too late.

The RTOs are established in the BIA process.  The PTtRs are established through a series of tests and exercises.
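The arithmetic above is simple enough to sketch in a few lines of Python.  The 24-hour RTO and 18-hour PTtR below are hypothetical figures chosen only for illustration, and the function name is my own; the one thing taken from the formula is the subtraction itself.

```python
from datetime import timedelta


def max_time_to_declare(rto: timedelta, proven_time_to_recover: timedelta) -> timedelta:
    """Return RTO - PTtR, the window the response team has to declare a disaster.

    A negative result means the proven capability cannot meet the objective
    even if the disaster is declared immediately.
    """
    return rto - proven_time_to_recover


# Hypothetical figures for illustration only.
rto = timedelta(hours=24)   # objective established in the BIA
pttr = timedelta(hours=18)  # best result demonstrated in tests and exercises

window = max_time_to_declare(rto, pttr)
print(f"Maximum time to declare: {window}")
```

If the result comes out negative, no amount of fast decision making at the time of the event will save the RTO; either the objective or the recovery capability has to change beforehand.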

I do not disagree with most of what people are arguing in the discussion thread – I just disagree with the words they are using in the argument.  You are overcomplicating the point and mixing apples with oranges.  Sometimes I think it would be better to just throw out the common terms in use today and come up with new terms at each company that do not have a preconceived notion of what they mean.  Then define the new terms the way you want to use them so everyone in that organization has a common understanding.  That may be throwing out the baby with the bath water, but it might stop me from pulling out what little hair I have remaining while reading this agonizing discussion thread.

The Business Continuity Planner’s Job

Although this concept may prove frustrating for the business continuity planning professional, I suggest that our primary job is not to make sure the enterprise can recover critical processes in a timely manner following a business interruption crisis, but, rather, our primary job is to identify the risks and threats that could cause a business interruption event, the resulting impacts to the organization should those threats be realized and the options (and costs) of addressing these threats.  Now, there may be a subtle difference in the two sides of that statement and, you may need to re-read that sentence a couple of times to fully understand what I am suggesting, but I often see business continuity planners get frustrated because they cannot appreciate the difference.

I believe that our first job as business continuity planning professionals is to provide senior management with the data and information that allows them to make an informed and intelligent decision on what to do.  If senior management, armed with this information, decides to accept the risks and potential impacts – and signs off on that strategy – so be it.  Every organization has its own risk-accepting, or risk-averse, personality and may make polar opposite decisions when faced with the same risk and impact profile.

The worst thing that can happen to a business continuity planning professional, proving we did not do our job, is if a situation occurs and senior management is justified in saying, “No one ever told me …

… that a disaster in our data center would take us out of business for months”, or

… that a fire in our call center in Anytown would take down all our customer service capability”, or

… that our primary distribution center was located in a flood plain”, or

…  

If we are in a position to say, “No, we told you, but you elected not to invest the funds necessary to mitigate the risk or position us to recover from it”, then, although we may still be the scapegoat, we can feel satisfied we did our job.

Now, once we inform management of the risks, potential impacts and various options for addressing the situations, our job then becomes to implement, document, test and exercise the strategies and solutions they have approved.  Hopefully, we can influence management to take the course we, as professionals, believe they should follow.  If not, then, rather than just complain that management doesn’t understand, we either need to gather more information to influence a different choice or, do our best to implement and document the strategies management elects to employ.

It can be frustrating working for an organization that is willing to accept risks and bet against the chance that a business interruption event will occur, but our job is primarily to make sure they are making these decisions based on all the facts and understanding of what their decisions could mean should a disaster occur.