|EXECUTIVE SUMMARY |
DISASTERS ARE UNPREDICTABLE, and one disruption may cause others, so a firm should test its preparedness plan to make sure it will do what it’s supposed to: locate the firm’s people, obtain equipment and support, access job-file and system backups and put staff to work in an alternate location.
A TEST OF ITS CONTINUITY PLAN is good for finding and fixing a firm’s problems before issuing plan documentation; after making operational changes involving staff, equipment or location; after coordinating with a landlord and/or tenants association; and in conjunction with local police and emergency management organizations.
AN INCIDENT MANAGEMENT TEAM will assign priority to functions, ranking their importance from highest to lowest and determining the resources the firm must have to be able to restore operations within the desired time frame, such as within 24 to 72 hours of the disaster event, within 30 days and indefinitely.
AN ENTITY ALSO HAS TO ESTABLISH the relative importance of the following areas: physical facilities, communications (internal and external), computer and data processing and staffing requirements.
A FIRM DOESN’T HAVE TO SIMULATE a full-blown companywide disaster to exercise a recovery plan. The expense of lost productivity would be substantial, and it’s difficult to systematically identify the weaknesses when you test every component simultaneously.
THE FIRM SHOULD ANALYZE each trial-run exercise’s results and change the recovery plan to incorporate what is learned. Then it should redo the exercise to determine whether the improvements produce the desired results.
|ED McCARTHY is a freelance writer in Warwick, Rhode Island, who specializes in finance and technology. His e-mail address is email@example.com. |
You learn about it on the morning news: Overnight, a fire at the restaurant next door spread to your CPA firm’s building. The offices escaped the flames, but water and smoke damage will make them unusable for several weeks. Your firm is prepared: It has a disaster recovery plan that will allow it to contact the employees, clients and vendors important to its operations; it has off-site data backup; and it has equipped its staff members to work from home offices. You call the information technology (IT) manager, and she activates the backup computer center. You notify the other managers, who contact their people.
As you’re surveying the damaged office, your cell phone rings. The IT manager reports that employee requests for help with their remote connections are swamping her staff. The heavy volume is slowing efforts to get systems back online—and there’s another problem: You changed vendors for several key computer programs two months ago. The data transfer in the office went smoothly, but there are glitches with the backup data. “We can resolve the problems,” the IT manager says, “but I’ll need to shut down the staff support lines for two days.” On top of that, it’s tax season.
Disasters are unpredictable by definition, and one disruption may cause others, so it’s wise to test your firm’s preparedness plan regularly to make sure it will do what it’s supposed to: locate the firm’s people, obtain equipment and support, use job-file and system backups to access data and put staff to work in an alternate location (for more information, see “Before the Deluge—and After,” JofA, Apr.03, page 57). Here’s how to test and refine a disaster recovery plan.
Respondents to a survey of more than 600 business continuity and disaster recovery professionals in New England reported only 16% of companies tested their continuity plans more than twice a year and 10% tested their plans less than once a year.
Source: Portal Publishing Ltd., www.envoyworldwide.com, 2004.
If your CPA firm hasn’t walked through a simulated scenario to “stress test” a disaster recovery plan, you won’t uncover its weaknesses until you’re in the midst of a crisis, says Philip Jan Rothstein, president of Rothstein Associates Inc., a Brookfield, Connecticut, business-continuity-consulting company. In fact, your firm doesn’t have a plan if it’s untested, he says. “Participants should view the tryout as one part of the recovery process.” Exercise the plan at several stages to find and fix problems, particularly
Before issuing final documentation.
After making operational changes involving staff, equipment or location.
After coordinating logistics with your landlord and tenants association.
After consulting with local police and emergency management organizations.
It’s neither necessary nor practical to simulate a full-blown disaster to exercise a recovery plan, says Rothstein. The expense of lost productivity would be substantial, and it’s difficult to systematically identify the weaknesses when you test every component simultaneously. Instead, the managing partner should gather the people who would be needed in a recovery to do a tabletop exercise based on a mock scenario (see “Elements of Scenario Testing,” below).
|Elements of Scenario Testing |
|Disaster-plan scenarios should take into account
Purposes and objectives of the test.
Type of test.
The plan sections or components being tested.
Duration of the test.
Constraints and assumptions.
Main events of the test.
|Scenarios should identify and describe
The type of disaster that has occurred.
Extent of damage or disruption to the facility and area.
What recovery capabilities are available.
What personnel and equipment are available.
Status of backup or recovery resources.
Time of the event.
|Scenarios must test
Temporary operating procedures.
Backup and recovery procedures.
|Source: Disaster Recovery Testing: Exercising Your Contingency Plan, edited by Philip Jan Rothstein, Rothstein Associates Inc., 1994. |
Once you’ve decided to try such an exercise, you should aim to keep the session brief. “Have the participants sit down for an hour,” Rothstein says. “Keep the atmosphere unstructured and casual. Give them the scenario, the circumstances and the critical considerations. Assign someone to take notes or tape the discussion to transcribe later. Then ask them, ‘What do we do now?’”
Participants frequently spot potential problems as they consider their departments’ planned responses to the crisis, Rothstein says. For example, a manager might mention that under the proposed scenario she would need to be at a particular location at a specific time to complete a required task; other participants notice the plan calls for her to be across town at the same time and point out the discrepancy. As a result the team assigns someone else to handle those recovery responsibilities. “Participants can see where the difficulties are before they spend lots of time and money,” Rothstein says.
Sid Edelstein, CPA, managing director of operations at CGSolutions, the business-technology-consulting arm of Cornick, Garber & Sandler LLP, New York City, says a firmwide test of a disaster recovery plan is preferable to testing in increments. Much depends on the firm’s size—a total test is easier to do at a small firm, for example. Edelstein agrees that “to be effective, a business continuity plan must be a living document that accurately reflects a company or firm’s total logistical objectives, recovery procedures and resource requirements.”
WHAT TO TEST
Each firm has a hierarchy of operating factors critical to its recovery. Your “incident management” team will assign priority to functions, ranking their importance from highest to lowest and determining the resources the firm must have to restore operations within the desired time frame—for example
Critical: Full recovery required within 24 hours of disaster (communications, for example).
Urgent: Full recovery required within 72 hours (such as access to files for work in progress and billing).
Important: Full recovery required within 30 days (human resources records, for example).
Other: Recovery not required or may exceed 30 days from disaster occurrence (such as client tax files more than seven years old).
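For firms that keep their recovery priorities in a spreadsheet or script, the four tiers above can be expressed as a simple classification rule. The following is a minimal sketch, not a prescribed method: the function names and the sample recovery targets are hypothetical, and each firm's incident management team would supply its own.

```python
from enum import Enum


class Tier(Enum):
    CRITICAL = "Critical"    # full recovery within 24 hours
    URGENT = "Urgent"        # full recovery within 72 hours
    IMPORTANT = "Important"  # full recovery within 30 days
    OTHER = "Other"          # recovery not required or may exceed 30 days


def classify(recovery_hours):
    """Map a target recovery time in hours to a tier.

    None means no recovery is required.
    """
    if recovery_hours is None:
        return Tier.OTHER
    if recovery_hours <= 24:
        return Tier.CRITICAL
    if recovery_hours <= 72:
        return Tier.URGENT
    if recovery_hours <= 30 * 24:
        return Tier.IMPORTANT
    return Tier.OTHER


def by_priority(targets):
    """List function names from highest to lowest recovery priority."""
    order = list(Tier)
    return sorted(targets, key=lambda name: order.index(classify(targets[name])))


# Hypothetical targets for illustration only.
targets = {
    "human resources records": 30 * 24,
    "billing": 72,
    "communications": 24,
    "client tax files older than seven years": None,
    "work-in-progress files": 72,
}

for name in by_priority(targets):
    print(f"{classify(targets[name]).value}: {name}")
```

Keeping the thresholds in one place makes it easier to revisit them after each exercise, when a test shows that a function needs to come back sooner or can wait longer than the team first assumed.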
Your organization also has to establish the relative importance of the following areas:
Physical facilities. Where will your employees work if the office is unavailable?
Communications (internal and external). How will staff members locate and talk to one another?
Computer and data processing requirements. Do you have laptops and off-site backup systems? Where is equipment stored? How do you access the backup?
Staffing requirements. Whose work do you need first? Can the firm provide payroll and benefits continuity?
Stanley Weiner, CPA, CFE, Cornick, Garber & Sandler, says other areas to test for potential problems are
Backup tapes. Have a procedure for testing and restoring backup tapes to ensure the firm can locate and read them when they’re needed.
The hot site. Don’t take a hot site (a technology-equipped emergency office) for granted. Run test programs and update hardware and software as necessary.
Notification procedures. An incident management team needs to test its notification procedures for critical staff as well as equipment.
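One way to make the backup test concrete is to restore a sample of files to a scratch folder and compare them with the originals. The sketch below assumes both copies are reachable as ordinary folders; the function names are hypothetical, and a tape or vendor-hosted backup would first need its own restore step, which is not shown.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(original_dir, restored_dir):
    """Return (relative path, problem) pairs for files that are missing
    from the restored copy or whose contents differ from the original."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source) != sha256_of(restored):
            problems.append((relative.as_posix(), "differs"))
    return problems
```

An empty result means every original file was found in the restored copy with identical contents. Anything else is a finding to record in the exercise notes.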
|Checklist for Managing E-Data |
The more time and effort a firm puts into its disaster recovery plan, the faster it will recover from a catastrophe. Test document- and data-recovery policies and procedures every three months. When your firm updates its plan, consider
Backup scheduling and testing. Are you using the most efficient data backup methods? Do you have a standard protocol for testing your backups to verify success? Is it documented so an alternate backup operator knows what to back up and when?
Backup retention period. Do you have a formal policy for backup retention? Have you sought legal advice to verify the acceptable retention period for sensitive client data?
Off-site backup retention. Do you store certain backup media at an off-site location? Are there appropriate security measures to protect your off-site media from theft or damage?
Recovery. Does your policy stipulate the acceptable recovery period? Can you recover hardware and data in that amount of time? If you have the means to use it, do you have spare hardware (drives, cables, power supplies or even entire desktop or server machines) that can be used to bring you back up to speed quickly? Is your configuration documented well enough to ease recovery efforts?
Source: Update Your Disaster Recovery Plan, John D. McCall, MCP, Boomer Bulletin, Boomer Consulting Inc., www.boomer.com, 2004.
GET FIRMWIDE INPUT
Have a formal team responsible for creating, updating and executing the recovery plan, Rothstein advises. The team should encompass members from all departments, including management. They or their designees will conduct successive tests, first evaluating a single business process, department or even a specific function within a department. Participants examine clearly delineated components to understand what works and what doesn’t work as recovery unfolds. If a team spots weaknesses in one process, its job is to think through how it could affect other functions.
“That approach serves several purposes,” Rothstein says. “One is to identify the failures before you get too far along in the planning. The second is to get people familiar and comfortable with the exercise. Third, it warns you that something is not going to be effective in a larger context. If the recovery process fails in a department with five people, for example, what’s going to happen when a plan fails in a department with 500 people?”
Selecting the person or team to audit the exercise results requires careful consideration because the reports can cause organizational friction. Using an independent third party such as a business continuity or insurance consultant can reduce these problems and bring a fresh perspective to the exercise. Consultants also have an understanding of the industry’s best practices, Rothstein says. They may have worked through hundreds of exercises in their careers, and that gives them an objective perspective on a company’s tests and results. (See “Disaster Preparedness Resources,” below.)
|Disaster Preparedness Resources
Firms that want to develop a disaster-readiness plan can learn more about the process from the following resources.
Business Continuity Institute
PO Box 4474, Worcester WR6 5YA
Phone: +44 (0)870 603 8783;
+44 1886 833555
Falls Church, Virginia
Phone: (703) 538-1792
Online Disaster Recovery Bookstore
|Business continuity resources |
ARMA International: The Association for Information Management Professionals, www.arma.org/resources/disaster_recovery.cfm.
The Association of Contingency Planners, www.acp-international.com.
Business Resumption Planning by Edward Devlin, Cole Emerson and Leo Wrobel (CRC Press, 1997), www.crcpress.com.
The Business Survival Newsletter, www.rothstein.com.
Contingency Planning & Management, Witter Publishing Corp., www.contingencyplanning.com.
Disaster Recovery Journal, St. Louis, www.drj.com.
Federal Emergency Management Agency, www.fema.gov.
Loans from the U.S. Small Business Administration, www.sba.gov.
|Source: “Managing Effective Disaster Recovery” by Stanley Weiner, the CPA Journal, www.cpaj.com , December 2001. |
IT'S A PLAN
The key to successful testing is to monitor each exercise’s results and change the recovery plan to incorporate what you learn. After altering the plan based on trial-run insights, redo the exercise to determine whether the improvements produced the desired results. The exercises aren’t about passing or failing; rather, they identify the weak links in the recovery process so those functions can be improved before documenting and issuing the plan throughout your organization. At that point, to ensure everyone understands the policies and procedures, hold formal training sessions on your contingency plans. Retest every three months or whenever something significant changes. Every two months, ensure that the clients’ phone list and vendor contracts are current. Business continuity planning applies equally to firms, companies and clients, so use your investment in your own organization to offer better service to theirs.
CASE STUDY 1:
LASSUS WHERLEY & ASSOCIATES PC
As part of her firm’s disaster recovery planning, Clare Wherley, CPA, CFP, CEO of Lassus Wherley & Associates PC in New Providence, New Jersey, identifies two categories of threats to her wealth management firm’s operations. Disruptive situations involve the “partial loss or incapacitation of personnel, computer capabilities, communications or facilities.” Disasters involve “the total loss of any single resource or all of them,” she says.
The firm, which has 23 employees, developed detailed emergency response procedures that it calls the “24/7 Initiative.” The plan identifies critical client-focused operations (investment records, tax information and communications records) and internal operations (financial records). The firm tests its “24/7 Initiative” twice each year by deliberately taking down systems on normal workdays. One recent technology recovery exercise assumed that a fire in the office kitchen had spread to the room that housed the network servers and destroyed those computers. Although some of the office workstations still functioned, they had no data or programs available to them. Several members of the staff went off-site to the backup network and attempted to work through a variety of tasks, including
Information retrieval for clients regarding their investment accounts and tax returns.
Trade transaction requests from clients’ securities custodians.
Another recent recovery exercise focused on communications among staff and with clients. The scenario assumed the firm’s principals were out of town and a disaster disrupted the organization’s telephone service for up to six hours. Staff followed procedures from the company’s emergency response handbook that covered
A communications chain diagram (who calls whom).
Step-by-step guides for contacting office and critical personnel through phones, e-mail and instant messaging.
A special voice-mail account for emergency announcements and reporting employee status and whereabouts.
Access to a password-protected Web site for employees that lists all action plans, guides and employee contact information.
Use of a backup phone line designed for employee access from outside to keep the main phone line open for external communications.
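A communications chain of the “who calls whom” kind can also be checked mechanically before an exercise. The sketch below is illustrative only: the roles and the chain are hypothetical and are not taken from the firm’s handbook. It walks the chain from the first caller and reports anyone on the staff list whom no one is assigned to call.

```python
from collections import deque

# Hypothetical chain: each person calls the people listed next to their name.
CALL_CHAIN = {
    "managing partner": ["IT manager", "office manager"],
    "IT manager": ["help desk lead"],
    "office manager": ["tax team lead", "audit team lead"],
}


def calling_order(chain, start):
    """Return (caller, callee) pairs in the order calls are placed,
    working outward from the first caller one level at a time."""
    calls = []
    reached = {start}
    queue = deque([start])
    while queue:
        caller = queue.popleft()
        for callee in chain.get(caller, []):
            if callee in reached:
                continue  # already contacted by someone else
            reached.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


def unreachable(chain, start, staff):
    """Return staff members the chain never reaches, sorted by name."""
    reached = {start} | {callee for _, callee in calling_order(chain, start)}
    return sorted(set(staff) - reached)
```

Running `unreachable` against the current staff roster after each hire or departure is a quick way to catch a gap in the chain before a tabletop session does.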
Do such exercises improve the firm’s recovery plan? Wherley points to one unexpected result as proof of their value: The firm had asked several clients to participate in a comparable test. One client’s role was to send an e-mail to his contact person at Lassus Wherley, which he did. However, his contact was out of the office and no other staff members had the password to access her e-mail, so the message went undetected. “Nobody anticipated that problem,” Wherley says. “Now we’re developing a method for resolving the employee’s confidentiality with the firm’s need to access e-mail if required.”
CASE STUDY 2:
TEXAS COMPTROLLER OF PUBLIC ACCOUNTS
Imagine the chaos if a state’s treasury department was forced to halt operations after a disaster interrupted business. The state’s cash balances would begin to dwindle as revenue collections stopped, and payments to taxpayers, state employees and vendors would cease.
That’s the challenge facing Joan Light, business continuity and recovery coordinator for the Texas Comptroller of Public Accounts (TCPA) in Austin. Light oversees disaster preparedness for a large organization: The TCPA employs approximately 2,400 employees and has offices throughout Texas and in major cities across the United States. Her department manages the agency’s business continuity plans, and it has tested each facet of the multiple recovery plans at least once a year since 1993.
The TCPA frequently performs scenario-based exercises. Light’s staff members
Assemble a recovery team from the departments involved with the exercise.
Typically set aside two to four hours as they work through a detailed outline of the events.
Know scenario details such as the date and time of day when the disaster occurred, the extent of the damage and the impact on their facilities and staff.
One scenario assumes a storm damages downtown Austin. “We tell participants the tornado struck our building and several of our other locations in town,” Light says. “We also assume torrential rains that accompany tornadoes cause the Colorado River, which runs through the middle of town, to overflow and flood our buildings south of the river. Once we define the depth of damage and resource loss, the participants determine what they would have to put together for a recovery.”
Have the exercises improved the TCPA’s recovery plans? Light is firmly convinced they have. “Each scenario has brought us some knowledge of contingencies that we needed to build into our planning process, including those for key people who are lost. You can build a plan and you can update that plan, but you cannot ingrain in your staff how to recover without practicing frequently,” she says.
CASE STUDY 3:
PRICEWATERHOUSECOOPERS LLP
Although PricewaterhouseCoopers LLP (PwC), the U.S. firm of the eponymous worldwide organization, has its corporate headquarters in New York City, the accounting firm has about 100 offices around the country. In addition, 90% to 95% of the firm’s technology is centralized in the company’s data center in Tampa, Florida.
The firm tests its disaster recovery plans for both the IT data center and local offices. Rick Ancona, Tampa-based chief technology officer, says PwC uses a “modular” approach to test its IT recovery plan in conjunction with a “warm site” (also called a hot site, a technology-equipped emergency office) it has contracted. One exercise redirects the firm’s network traffic to the warm site and tests components such as messaging and groupware services.
Because the IT staff manages multiple applications and hardware configurations, the recovery plan exercises must account for numerous ongoing changes to the system. “You must have a robust process for managing change,” Ancona says. “When a change occurs with any of our components—even regular upgrades—we schedule that component for retesting.”
|PRACTICAL TIPS TO REMEMBER |
Have a formal team responsible for creating, updating and executing the recovery plan.
Gather the firm’s key people for a recovery to do a “tabletop” exercise based on a mock scenario.
Keep each session brief and the atmosphere unstructured and casual to elicit spontaneous input.
Give participants a scenario, the circumstances and the critical considerations; then ask, “What do we do now?”
Assign someone to take notes or tape the discussion to transcribe lessons learned later.
After altering the recovery plan based on trial-run insights, redo the exercise to determine whether the improvements produce the desired results.
At that point hold formal training sessions for all staff.
Retest every three months or whenever something significant changes. Test every two months to ensure that client phone lists and vendor contracts are current.
After the 9/11 attacks, PwC’s senior management called for a comprehensive crisis management plan that would address disaster planning for all the firm’s offices. That directive led to the creation of a multidisciplinary team that convenes virtually to help regional offices manage and recover from unexpected events. The team is managed by the global security department and includes staff from human resources, legal, risk management, travel and meetings, and internal and external communications located around the country. Team members and offices are linked by a variety of communications tools so members can communicate in a crisis.
“We’ve conducted five disaster drills since 2002,” says Stephen Malloy, security operations manager. “We try to imagine a scenario that would affect a particular office. For example, for our first drill we imagined an earthquake hitting San Francisco. We tried to think through the potential impact on our people in that office, on the infrastructure of the office itself and how we would manage and deal with those events.”
The drills do not require participation from the entire office staff—only the local management and the infrastructure team participate. Malloy reports that drills are very realistic for the participants.
“During the recent East Coast power outage, one of our colleagues mentioned she wasn’t sure if it was a real event or a drill,” he says. “I took that to mean our drills are very lifelike. I think the lessons paid off when we had the blackout.”
The Scottish poet Robert Burns reminded us the best-laid plans of mice and men often go awry. Untested disaster recovery plans face a particular risk of encountering unexpected problems because they are activated during a crisis. By exercising your contingency plan before a disaster actually strikes, you reduce the likelihood of one problem compounding another.
Disaster Area Practice Guide. Available only as a free download through the CPA2Biz Web site ( https://www.cpa2biz.com/ResourceCenters/Tax/Individual/
Disaster Recovery: A Guide to Financial Issues. This is a resource for disaster survivors developed by the AICPA, the National Endowment for Financial Education and the American Red Cross. It can be downloaded at www.redcross.org/services.
To order bound copies, visit www.cpa2biz.com/store , call 888-777-7077 or fax 800-362-5066 and refer to product no. 017231JA.
Management of an Accounting Practice Handbook (loose-leaf), chapter 214, “Coping with Physical Disaster” (# 090407JA). Online version, e-MAP (# MAP-XXJA).
Emergency Business Planning: Are You Prepared for Disaster? (# 731163JA).
The AICPA Web site’s Resources for Disaster Recovery section has links to disaster preparedness agencies ( www.aicpa.org ). Among the resources it links to are the FEMA Emergency Management Guide for Business and Industry, the Institute for Business and Safety and the U.S. Small Business Administration Disaster Assistance.