Peter M. Curtis

Maintaining Mission Critical Systems in a 24/7 Environment



documentation, and testing and commissioning. It is the responsibility of employees at all levels of a hierarchy to communicate and develop best practices that will strengthen their business.

      Below is a list of questions to ask yourself about the reliability and resiliency of the mission critical infrastructure you support. Your answers should shed light on areas where you can improve your operations.

      Needs Analysis/Risk Assessment

      1 How much does each minute, hour, or day of operational downtime cost your company if a specific facility is lost?

      2 Have you determined your recovery time objectives for each of your business processes?

      3 Does your financial institution conduct comprehensive business impact analyses (BIA) and risk assessments?

      4 Have you considered disruption scenarios and the likelihood of disruption affecting information services, technology, personnel, facilities, and service providers in your risk assessments?

      5 Have your disruption scenarios included both internal and external sources, such as natural events (e.g., fires, floods, severe weather), technical events (e.g., communication failures, power outages, equipment and software failures), and malicious activity (e.g., network security attacks, fraud, terrorism)?

      6 Does this BIA identify and prioritize business functions and state the maximum allowable downtime for critical business functions?

      7 Does the BIA estimate data loss and transaction backlog that may result from critical business function downtime?

      8 Have you prepared a list of “critical facilities” to include any location where a critical operation is performed, including all work area environments such as branch backroom operations facilities, headquarters, or data centers?

      9 Have you classified each critical facility using a critical facility ranking/rating system such as the Tier I, II, III, and IV rating categories?

      10 Has a condition assessment been performed on each critical facility?

      11 Has a facility risk assessment been conducted for each of your key critical facilities?

      12 Do you know the critical, essential, and discretionary loads in each critical facility?

      13 Must you comply with the regulatory requirements and guidelines discussed in this chapter?

      14 Are any internal corporate risk and compliance policies applicable?

      15 Have you identified business continuity requirements and expectations?

      16 Has a gap analysis been performed between the capabilities of each company facility and the recovery time objectives of the business processes residing in that facility?

      17 Based on the gap analysis, have you determined the infrastructure needs for your critical facilities?

      18 Have you considered fault tolerance and maintainability in your facility infrastructure requirements?

      19 Given your new design requirements, have you applied reliability modeling to optimize a cost‐effective solution?

      20 Have you planned for rapid recovery and timely resumption of critical operations following a wide‐scale disruption?

      21 Following the loss of accessibility of staff in at least one major operating location, how will you recover in a timely manner and resume critical operations?

      22 Are you highly confident, through ongoing use or robust testing, that critical internal and external continuity arrangements are effective and compatible?

      23 Have you identified clearing and settlement activities in support of critical financial markets?

      24 Do you employ and maintain sufficient geographically dispersed resources to meet recovery and resumption activities?

      25 Is your organization sure that there is diversity in the labor pool of the primary and backup sites, such that a wide‐scale event would not simultaneously affect the labor pool of both sites?

      26 Do you routinely use or test recovery and resumption arrangements?

      27 Are you familiar with National Fire Protection Association (NFPA) 1600, Standard on Disaster/Emergency Management and Business Continuity Programs? It provides a standardized basis for disaster/emergency management planning and business continuity programs in the private and public sectors by establishing common program elements, techniques, and processes.
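      The gap analysis raised in the questions above (facility capability versus recovery time objective, weighted by the cost of downtime) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the process names, facility names, recovery hours, and hourly costs are all hypothetical, and the linear cost model (shortfall hours multiplied by hourly downtime cost) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    name: str
    facility: str                   # facility where the process resides
    rto_hours: float                # recovery time objective for the process
    downtime_cost_per_hour: float   # estimated cost of one hour of downtime


def gap_analysis(processes, facility_recovery_hours):
    """List processes whose facility cannot recover within the process RTO.

    facility_recovery_hours maps a facility name to the hours that facility
    is currently expected to need to restore service (an assumed input).
    """
    gaps = []
    for p in processes:
        shortfall = facility_recovery_hours[p.facility] - p.rto_hours
        if shortfall > 0:
            gaps.append({
                "process": p.name,
                "facility": p.facility,
                "shortfall_hours": shortfall,
                # Assumed linear model: cost of the hours beyond the RTO.
                "exposure": shortfall * p.downtime_cost_per_hour,
            })
    # Largest financial exposure first, to help prioritize investment.
    return sorted(gaps, key=lambda g: g["exposure"], reverse=True)


# Hypothetical example data, for illustration only.
processes = [
    BusinessProcess("payments", "DC-East", 2.0, 50_000.0),
    BusinessProcess("reporting", "DC-East", 24.0, 1_000.0),
    BusinessProcess("call center", "Branch-12", 4.0, 5_000.0),
]
facility_recovery_hours = {"DC-East": 6.0, "Branch-12": 12.0}

gaps = gap_analysis(processes, facility_recovery_hours)
for g in gaps:
    print(g["process"], g["facility"], g["shortfall_hours"], g["exposure"])
```

      Ranking the gaps by exposure is one possible way to decide which facilities need infrastructure investment first; a real assessment would also weigh regulatory requirements and the likelihood of each disruption scenario.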

      “In a world where everything is connected, anything can be disrupted.”

      Brad Smith

      Our nation's antiquated energy infrastructure and dependence on cheap fuel present significant risks to our security. The transportation sector alone keeps the economy moving and accounts for about two‐thirds of all U.S. oil consumption. Electricity, on the other hand, plays a uniquely important role in the operations of all industries and public services. The loss of electricity for any length of time compromises the data and communication networks and the digital electrical loads that provide physical and operational security for all mission critical infrastructures. Without oil and electricity, our economy and security come to a complete halt.

      Figure: U.S. primary energy sources.

      (Courtesy of eia.gov.) Source: U.S. Energy Information Administration, U.S. Primary Energy Sources.

      Mission critical facilities may take a similar path, albeit on a smaller scale. By introducing renewable energy sources on‐site to complement conventional energy, facilities may reduce their dependency on the grid and improve their resiliency in the event of an outage.

      Computer hackers pose a significant threat to our information technology systems. There have been instances where hackers have gained access to electric power plants and possibly triggered major power interruptions. These events demonstrate how vulnerable and fragile our critical infrastructure really is. The electric grid is not the only area we need to be concerned with; the government, military,