

Determining the Operations of Redundant Capacity Components for a Data Center Facility

The higher a data center's availability requirement, the higher the capital and operational cost of building and managing the facility. Its design and operational profile must therefore reflect the business needs and risk tolerance of the organization building and relying on the data center. A 24-7 data center can be implemented successfully either by operating all equipment under normal circumstances or by operating only the N equipment (the units needed to carry the load, with the redundant units idle), depending on the business needs and the Tier level desired.

This article explores the operations of redundant capacity components in a Tier III and Tier IV data center and then provides some insight beyond Tier.

Striking a Balance

Tier helps ensure the level of investment in a data center meets the business needs of the specific business entity while minimizing overdesign and overinvestment. Within a specific Tier level, there may be data centers that just meet the minimum requirements while others may have significant enhancements. These differences meet the different business needs of data centers within a given Tier level.

Tier III requires that critical capacity components can be isolated for repair, replacement or upgrade without impacting the critical load. Tier III can therefore be achieved either with all units, including the redundant capacity components, operational under normal circumstances or with the redundant capacity components idle. The requirement in both cases is that the systems must be able to isolate a given unit and let the redundant units pick up its share without impacting the critical load.
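The isolation requirement above can be expressed as a simple capacity check. The following is a minimal sketch, not a Tier assessment method: the function name, the unit capacities and the load figure are hypothetical, and it only compares rated capacity against load, ignoring distribution paths and controls.

```python
def can_isolate_any_unit(unit_capacities_kw, critical_load_kw):
    """Return True if the remaining units still cover the critical load
    when any single unit is taken out of service for planned work."""
    total_kw = sum(unit_capacities_kw)
    return all(total_kw - cap >= critical_load_kw for cap in unit_capacities_kw)


# Hypothetical figures: a 900 kW critical load needs two 500 kW units (N = 2).
print(can_isolate_any_unit([500, 500, 500], 900))  # True: N+1 installed
print(can_isolate_any_unit([500, 500], 900))       # False: only N installed
```

The check says nothing about whether the redundant unit is running or idle beforehand, which matches the point that either operating mode can satisfy Tier III.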

The added complication is that the engine generator system must be sized to accommodate the design. If the redundant unit is offline under normal circumstances, the engine generator system will see a higher load during the interval when the redundant unit has been started but the primary unit has not yet been isolated. Remember that Tier III considers planned activities, not faults or failures. Additionally, Tier III requirements cannot be met by a redundant unit that is on site but not installed.
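A rough arithmetic sketch of that sizing concern follows. All figures and names are hypothetical, and the model assumes each running unit draws a fixed electrical load regardless of how the critical load is shared, which is a simplification.

```python
def generator_demand_kw(units_running, unit_draw_kw, other_load_kw):
    """Electrical demand on the engine generator system for a given
    number of running capacity units plus all other connected load."""
    return units_running * unit_draw_kw + other_load_kw


n_units = 2            # hypothetical N: units needed to carry the load
unit_draw_kw = 120.0   # hypothetical electrical draw per running unit
other_load_kw = 800.0  # hypothetical IT and other facility load

# Normal operation with the redundant unit idle: only N units run.
normal_kw = generator_demand_kw(n_units, unit_draw_kw, other_load_kw)

# Planned maintenance: the redundant unit is started before the primary
# unit is isolated, so N+1 units run for a short overlap.
transition_kw = generator_demand_kw(n_units + 1, unit_draw_kw, other_load_kw)

print(f"normal: {normal_kw:.0f} kW, transition peak: {transition_kw:.0f} kW")
```

Under these assumptions the generator system has to be sized for the transition peak rather than the normal running load.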

For Tier IV, the systems must be capable of autonomously responding to any single fault or failure without impacting the critical load. This typically means the redundant capacity components must be active to pick up the load when a primary unit fails.

Beyond the Tier requirements, some efficiency decisions can affect a data center's ability to support uninterrupted operations. Every efficiency decision must therefore be weighed against its possible impact on the uninterrupted operations of the data center in question. However, because business needs and data centers differ, no hard and fast rule or recommendation can be given on how to operate.

The Role of the Operating Profile

Let’s consider an example of a bank operating a Tier III data center. The bank’s data center operations manager may decide to run all units under normal circumstances and control on set-point. When a unit fails, the bank then doesn't rely on a switching operation (unit turning on) to carry the load. This is because the energy savings achieved by idling the redundant units may be significantly overshadowed by the costs of an impact to the critical IT processes.

However, a disaster recovery site may opt for all energy savings measures if the site is not intended to support operations except during a failure of a primary center or during planned activities in the primary center.

Thus the operating profile of a data center must support the business needs of the organization including considering its tolerance to risk. With that in mind, an operational profile of operating all units under normal circumstances or idling the redundant units can be acceptable depending on the business needs of the data center organization.
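One way to frame the bank and disaster recovery examples is as a comparison of expected costs. The sketch below is illustrative only: the function, the failure rate, the switchover failure probability and the cost figures are all hypothetical assumptions, and a real assessment would involve far more than a single expected-value calculation.

```python
def idling_is_worthwhile(annual_energy_savings, unit_failures_per_year,
                         switchover_failure_probability, cost_of_impact):
    """Compare the energy saved by idling redundant units with the expected
    yearly cost of a switching operation failing to carry the load."""
    expected_impact_cost = (unit_failures_per_year
                            * switchover_failure_probability
                            * cost_of_impact)
    return annual_energy_savings > expected_impact_cost


# Hypothetical bank: an impact to critical IT processes is very expensive.
print(idling_is_worthwhile(50_000, 0.5, 0.05, 5_000_000))

# Hypothetical disaster recovery site: an impact is far less costly.
print(idling_is_worthwhile(50_000, 0.5, 0.05, 200_000))
```

With these made-up inputs the bank would keep all units running while the disaster recovery site would idle its redundant units, which is the same conclusion the examples reach qualitatively.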

Relying on equipment that is on site but not installed as the only form of redundancy is not recommended for a 24-7 data center operation. Most systems cannot tolerate operating at an N-1 capacity profile long enough for a unit to be removed and another installed. Additionally, installing a unit in a critical system without properly commissioning both the installation and the overall system introduces additional risks. Some equipment gives little notice before failure, and the skilled personnel required to remove the failed unit and install the backup may not be available when needed.

Review Requirements

Keeping all components fully operational raises energy efficiency questions for data center managers, and those questions bear directly on how redundant capacity components are operated. Because the variables differ in every situation, it is critical to review the specific data center and the client's requirements before making any recommendation.

IBM’s data center consultants can ensure that the design and operational profile of a data center considers the business needs and risk tolerance of the client. To learn more about how IBM can help you progress on your journey to greater data center efficiency, contact your IBM representative or visit the following website:

Syed Ahsan Baqi is an experienced engineering consultant.
