Peter M. Curtis

Maintaining Mission Critical Systems in a 24/7 Environment


is defined as the mean time between failures divided by the mean time between failures plus the mean time to repair:

\[ \text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}} \]

      High reliability means that there is a high probability of good performance in a given time interval. High availability is a function of failure frequency and repair times and is a more accurate indication of data center performance.
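      As a rough illustration of the ratio defined above, the following Python sketch computes availability from a mean time between failures and a mean time to repair, then converts it into expected downtime per year. The function names and the figures (9,990 hours between failures, 10 hours to repair) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
HOURS_PER_YEAR = 8760  # non-leap year, used only for the downtime estimate


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return MTBF / (MTBF + MTTR), the ratio described in the text."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float) -> float:
    """Convert an availability fraction into expected hours of downtime per year."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical example values, not taken from the book.
a = availability(mtbf_hours=9990.0, mttr_hours=10.0)
print(f"availability = {a:.4f}")
print(f"expected downtime = {annual_downtime_hours(a):.2f} hours/year")
```

      Note that halving the repair time improves this figure about as much as doubling the time between failures, which is why availability reflects both failure frequency and repair times.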

      The most common response to these trends is reactive: that is, spending time and resources to repair the offender. If a utility goes down, install a generator. If a ground‐fault trips critical loads, redesign the distribution system. If a lightning strike burns power supplies, install a new lightning protection system. Such measures certainly make sense, as they address real risks in the data center. However, strategic planning can identify internal risks and provide a prioritized plan for reliability improvements. Planning and careful implementation will minimize disruptions while making the business case to fund these projects.

      As technological advances find their way onto the data center’s raised floor, the facility will be impacted in unexpected ways. As equipment footprints shrink, free floor area is populated with more hardware. However, because the smaller equipment rejects the same amount of heat, power and cooling densities grow dramatically, and floor space for cooling equipment increases. The large footprint now required for reliable power without planned downtime – e.g., switchgear, generators, UPS modules, and batteries – also affects the planning and maintenance of data center facilities. Over the last two decades, the cost of the facility relative to the computer hardware it houses has not grown proportionately. Budget priorities that favor computer hardware over facilities improvement can lead to insufficient performance. The best way to ensure a balanced allocation of capital is to prepare a business analysis that includes costs associated with the risk of downtime. This cost depends on the consequences of an unplanned service outage in that facility and the probability that an outage will occur.
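      One simple way to express the cost of downtime risk in such a business analysis is as the probability of an outage multiplied by its consequence. The Python sketch below assumes a single outage scenario per year and uses hypothetical dollar figures and function names; a real analysis would likely consider several scenarios and a longer time horizon.

```python
def expected_outage_cost(annual_probability: float, consequence_cost: float) -> float:
    """Expected yearly cost of an outage: probability times consequence."""
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return annual_probability * consequence_cost


def upgrade_is_justified(
    baseline_probability: float,
    improved_probability: float,
    consequence_cost: float,
    annualized_upgrade_cost: float,
) -> bool:
    """True when the reduction in expected outage cost exceeds the upgrade cost."""
    saved = expected_outage_cost(baseline_probability, consequence_cost) \
        - expected_outage_cost(improved_probability, consequence_cost)
    return saved > annualized_upgrade_cost


# Hypothetical figures: a 5% yearly outage chance cut to 1% by a facility upgrade.
print(upgrade_is_justified(
    baseline_probability=0.05,
    improved_probability=0.01,
    consequence_cost=2_000_000.0,
    annualized_upgrade_cost=50_000.0,
))
```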

      Proper data center design and operations will protect the investment and minimize downtime. Early in the planning process, an array of experienced professionals must review all the factors that affect operations. This is no time to be a “jack of all trades, master of none.” The following basic steps are critical to designing and developing a successful mission critical data center:

       Basic Steps to Building a Critical Data Center

       Determine the needs of the client and the required reliability of the mission critical data center.

       Develop the configuration for the hardware.

       Calculate the air, water, and power requirements.

       Determine your total space requirements and expected future space requirements.

       Validate the specific site: Be sure that the site is well located, away from areas prone to natural disasters, and that electric, telecommunications, and water utilities can provide the high level of reliability your company requires.

       Develop a layout after all parties agree.

       Design the mission critical infrastructure to (N+1), (N+2), or higher redundancy level, depending on the risk profile and reliability requirements.

       Once a design is agreed upon, prepare a budgetary estimate for the project. Ensure that a sufficient contingency is included in the budget at this point, because many unknowns and price escalations remain.

       Have a competent consulting engineer prepare specifications and bid packages for equipment purchases and construction contracts. Use only vendors that are familiar with and experienced in the mission critical industry.

       After bids are opened, select and interview vendors and contractors. Take time to carefully choose the right vendors. Make sure you see their work, ask many questions, verify references, and be sure that everybody is on the same page.

       Update the project budget with proper adjustments from the received vendor and contractor bids. Contingency can be reduced now that base costs are known, but it should not be eliminated: unforeseen construction issues, last‐minute design changes, or owner changes will cause costs to increase.
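      To give a feel for the redundancy step above, the following Python sketch estimates how often at least N of N+x identical units are working. It is a simplified model, not a design method from the book: it assumes every unit has the same availability and that failures are independent, which ignores common-cause failures such as a shared utility feed. The function name and the unit availability of 0.99 are hypothetical.

```python
from math import comb


def redundant_system_availability(
    n_required: int, n_spare: int, unit_availability: float
) -> float:
    """Probability that at least n_required of (n_required + n_spare) units work.

    Assumes identical, independent units (a binomial model).
    """
    if n_required < 1 or n_spare < 0:
        raise ValueError("n_required must be >= 1 and n_spare must be >= 0")
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    total = n_required + n_spare
    a = unit_availability
    # Sum the probabilities of exactly k working units for k = N .. N+x.
    return sum(
        comb(total, k) * a**k * (1.0 - a) ** (total - k)
        for k in range(n_required, total + 1)
    )


# Hypothetical comparison: four required UPS modules at 0.99 availability each.
for spares in (0, 1, 2):
    result = redundant_system_availability(4, spares, 0.99)
    print(f"N+{spares}: {result:.6f}")
```

      Under these assumptions each added spare raises the computed availability, with diminishing returns, which is one reason the choice between (N+1) and (N+2) is tied to the risk profile rather than made by default.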

      3.7.1 Data Center Certification

      LEED certification involves categories such as sustainable sites, water efficiency, energy & atmosphere, materials & resources, indoor environmental quality, renewable power, innovation & design, and regional priority. Points are assigned across these categories on a 100‐point scale with 10 bonus points, and there are four levels of certification: certified, silver, gold, and platinum. The most points can be earned in the Energy & Atmosphere category. They are given for building designs that track building performance, manage refrigerants to reduce CFCs, and use renewable energy.
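      The mapping from points to certification level can be sketched in a few lines of Python. The text above gives only the scale and the four level names; the point thresholds used here (40, 50, 60, and 80) are not stated in this section and are taken from the published LEED scorecard, so they should be checked against the rating system version that applies to a given project. The function name is hypothetical.

```python
from typing import Optional

# (minimum total points, level), highest first. Thresholds are an assumption
# drawn from the published LEED scorecard, not from this section.
LEED_LEVELS = [
    (80, "platinum"),
    (60, "gold"),
    (50, "silver"),
    (40, "certified"),
]


def leed_level(base_points: int, bonus_points: int = 0) -> Optional[str]:
    """Return the certification level for a score, or None if below 'certified'."""
    if not 0 <= base_points <= 100:
        raise ValueError("base points must be between 0 and 100")
    if not 0 <= bonus_points <= 10:
        raise ValueError("bonus points must be between 0 and 10")
    total = base_points + bonus_points
    for minimum, level in LEED_LEVELS:
        if total >= minimum:
            return level
    return None


# Hypothetical score: 75 base points plus 6 bonus points.
print(leed_level(75, 6))
```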

      Many data centers have attained LEED certification in an effort to curb energy consumption and reduce greenhouse gas (GHG) emissions. One of the first data centers in the world to achieve LEED Platinum was the Citigroup Data Center in Frankfurt, Germany, in April 2009. Features implemented to achieve certification included fresh‐air free cooling, reverse osmosis water treatment in the cooling towers to reduce water use, a vegetated roof, a vegetated green wall irrigated with harvested rainwater, extensive server virtualization, and a data center layout that reduces required cabling by 250 km. New data center designs should adopt and expand on these practices in order to achieve LEED certification.

      Another internationally used rating system similar to LEED is the Building Research Establishment Environmental Assessment Method (BREEAM), which is based in the United Kingdom. As you can see from its title, BREEAM is an environmental assessment method for