Retail colocation data centers, whose core business is mission critical facility operations, commonly have very mature operations frameworks. Methods of Procedure (MOPs), Standard Operating Procedures (SOPs), and similar documents are well developed and jealously governed.
A typical enterprise data center often does not have a mature operations framework. Certifiable frameworks give way to tribal knowledge and seat-of-the-pants processes.
Here are a few areas that we typically find need shoring up when we’re contracted to improve operations consistency.
1. Labeling and Identification
There is no such thing as too many labels in a data center. Even a relatively small data center can hold hundreds of servers and switches and thousands of patch cords. If one of these fails, it is hard to direct staff to the right one unless every item is clearly identified. Even if the item is identified in a record, without a physical label there is little chance of staff finding the faulty equipment and repairing it promptly.
Labeling can reduce the cost and time of replacing or repairing failed equipment, offering your users higher uptime and better service.
For patch panels and switches, consider using thermal transfer labels; these can take the high temperatures that are present with constantly running equipment. For cables, you might need to use sleeve-style heat shrink labels; these don’t peel or drop off once they’ve been shrunk in place.
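To make this concrete, here is a minimal sketch of how label text could be generated from a record so that the printed label and the inventory entry always match. The rack/unit/port naming scheme, the Port class, and the cable_labels function are our own assumptions for illustration, not an industry standard; adapt them to whatever convention your site already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """One end of a patch cord (hypothetical naming scheme)."""
    rack: str   # rack identifier, e.g. "R12"
    unit: int   # rack unit (U) position of the device or patch panel
    port: int   # port number on that device or patch panel

    def label(self) -> str:
        # Zero-pad numbers so labels sort and read consistently.
        return f"{self.rack}-U{self.unit:02d}-P{self.port:02d}"


def cable_labels(a: Port, b: Port) -> tuple:
    """Return the text for the two ends of one cable.

    Each end shows its own location first, then the far end,
    so staff can trace the cord from either side.
    """
    return (f"{a.label()} > {b.label()}", f"{b.label()} > {a.label()}")


def find_duplicates(labels: list) -> set:
    """Return any label text that appears more than once in a record."""
    seen, dupes = set(), set()
    for text in labels:
        if text in seen:
            dupes.add(text)
        seen.add(text)
    return dupes
```

Generating labels from the record, rather than typing them by hand, keeps the physical label and the documentation from drifting apart, and the duplicate check catches two items being assigned the same identifier before the labels are printed.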
2. Create and Stick to Procedures
It’s a tedious task and not everyone’s favorite activity, but creating and implementing operational procedures is essential to approaching “mission critical” operational capability. Every element and process for setup, breakdown, maintenance and emergencies needs to be documented carefully. The procedures need to be tested, and all staff need to be trained on them.
The idea is that a new member of staff could pick up the document and use it to perform the tasks assigned to their role successfully and consistently. Data center skills are in short supply. Those of us in the industry are in demand and could be poached by a competitor or lost through attrition. Your data center is likely to experience some staff turnover, and this can pose a catastrophic risk to operations when processes are maintained by tribal knowledge. Create that set of procedures and enforce compliance.
3. Re-think Physical Security
Data centers are the treasure chest containing the crown jewels of the organization. They need to be protected.
Enterprise data centers are often located in mixed-use buildings, or share space with parts of the organization that are not involved in mission critical operations. In these circumstances, physical security is like Swiss cheese, with loose access control measures and perimeters exposed to general personnel and even the public.
Here we have to get serious about access privileges, access control, and reducing the exposure of the perimeter spaces of the mission critical areas.
4. Energy Efficiency
Energy costs are a primary driver of OPEX in the enterprise data center. Enterprises have typically been behind the curve in reducing their energy costs for a number of basic reasons including lack of recognition of the problem, lack of a clear path to savings, lack of expertise in how to achieve higher efficiency, and organizational budget dynamics when it comes to paying the energy bills.
Engaging a qualified data center energy consultant (for example, a Data Center Energy Practitioner, or DCEP) with the CFO’s support can be an effective way to chart and navigate the journey toward energy cost savings.
5. Risk Assessment and Business Impact Analysis (BIA)
Investment in operations improvements can be a challenge, especially for the enterprise data center. Which projects should we lobby for? Which are the most important? Which give the strongest ROI?
A data center consultant with security and business continuity or disaster recovery (BC/DR) credentials can help through a Risk Assessment or Business Impact Analysis engagement. These reveal the likelihood and corresponding impact of risks to the data center, as well as the level of pain that will ensue and how long it will hurt, should there be a prolonged outage. This sort of information can enable an informed data-driven decision with justification for investment in the projects.
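As a rough illustration of how such an assessment can turn into a ranked project list, the sketch below scores each risk as likelihood multiplied by impact and sorts the results. The 1 to 5 scales, the Risk class, and the rank_risks function are assumptions made for this example; a real Risk Assessment or BIA engagement will use its own scales and will also weigh factors such as outage duration and recovery cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One entry in a simple risk register (illustrative only)."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    def __post_init__(self):
        # Reject values outside the assumed 1-5 scales.
        for field_name in ("likelihood", "impact"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be 1-5, got {value}")

    @property
    def score(self) -> int:
        # Classic risk-matrix score: likelihood times impact.
        return self.likelihood * self.impact


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    # Hypothetical entries; the numbers are made up for the example.
    register = [
        Risk("Unlabeled patch cords", likelihood=4, impact=2),
        Risk("Single utility feed", likelihood=2, impact=5),
        Risk("Shared-corridor access to server room", likelihood=3, impact=4),
    ]
    for risk in rank_risks(register):
        print(f"{risk.score:>2}  {risk.name}")
```

Even a simple ranking like this gives the conversation about investment a shared starting point: the projects that address the highest-scoring risks are the first candidates to lobby for.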
Mission critical operations are tough to achieve, especially for enterprise data centers whose core business lies in other competencies. Consider our tips for data center improvement.
Are you experiencing issues like this? What is your greatest struggle?
Let us know in the comments below.