When crafting Business Continuity and Disaster Recovery Plans (BCP, DRP), you need to account for the roles and capabilities of local Public Safety organizations (Police, Fire Department, Ambulance, et al.). In working with a Client on their DR plan, and following the post mortem of the recent disruption of The Planet’s data center in Houston, TX, I’ve found some interesting food for thought related to the roles of Public Safety in DR planning.
Whether you chalk it up to “the best laid plans of mice and men,” Murphy, or what have you, the event at The Planet is a lesson in realizing that DR plans have critical dependencies on processes and procedures that are completely external to the enterprise. Public Safety organizations have their own response procedures for emergencies, and in some cases those can clobber your DR plan unless you account for them when creating it. Let’s take a closer look at the recent event at The Planet in Houston.
The event started with an explosion and fire in the electrical equipment within the data center. Three walls of the electrical equipment room were blown from their original position, and the under-floor cabling on the first floor was destroyed. No one was injured. The fire destroyed the electrical gear where the utility service enters the building as well as the transfer switch and distribution panel that serves the first floor of the data center. While some of the data center’s customers were impacted by the core event, there was plenty of server space on the second floor that could continue to operate. However, the local Fire Department ordered all power to the building shut down for safety reasons, and would not allow operation of the generators. So here we have an example of a robust data center, with good contingency provisions and the capacity to serve customers, but the business is shut down because of the impact of the Fire Department’s approach to the incident.
This is not the only example of this sort of occurrence. In fact, Alabanza’s data center in Baltimore, MD was shut down by the local Fire Department twice: first in 2003 because of a generator fire, and then in 2004 because of a nearby underground fire.
In another example, late last year a vehicle collided with a utility transformer, bringing down power to a Rackspace data center in Dallas, TX. Generators kicked in as engineered and the chillers were cycled back up. The facility then switched to the secondary utility feed. At that point, however, the utility power was shut down in order to give safe access to the crash victim. This cycling of power and interruption of the chillers necessitated the shutdown of production servers to maintain temperature in the facility.
My point is that while we include Public Safety in our Emergency Response Plans and BC/DR Plans, we sometimes don’t go deeply enough in examining the processes and procedures these organizations use in responding to incidents. Our BC/DR plans can go awry when the way local Public Safety handles an incident differs from the way the organization itself had planned for it. We can benefit from the food for thought in these recent, high profile data center outages, and remember to go deep when it comes to understanding how local Public Safety procedures affect our DR planning.