

Recent Trends in High Availability Show Soaring Cost of Downtime

HA reliability survey

The Human Factor

The good news: server OS reliability has continued to improve across the board. In 2016, it’s apparent that hardware reliability has improved over the years too. The not-so-good news: this year it became obvious that one of the biggest reasons enterprises face downtime is human error.

“That is the No. 1 reason,” DiDio says. “It’s the issue that has the most negative impact on reliability, availability and downtime for server hardware and OS platforms.”

Figure 4 shows how human error compares with other issues, such as flaws in the server OS, an overworked IT department and the popularity of bring-your-own-device technology.

ITIC broadly defines human error to include technology and business decisions. Failure to upgrade servers when an organization needs to accommodate more data or falling behind with security updates are examples of human error. But it’s also human error when an enterprise doesn’t allocate enough funds for equipment purchases or doesn’t devise a plan to address remote access issues.

The IT field is in a dynamic period, with virtualization, the move to the cloud and the IoT all coming into play, but people still play a pivotal role.

Human errors are likely also tied to security flaws. “Organizations are not doing enough to secure their networks in many instances,” DiDio says. With data security breaches hitting companies more frequently and headlines such as “How My Mom Got Hacked” in The New York Times spreading the news on ransomware, it may seem strange that organizations aren’t fortifying their security measures more. But as DiDio explains, organizations perceive security hacks as inevitable. “We are becoming numb to it,” she says.

Security awareness is present among many IT administrators, but there’s a bifurcation between them and C-level executives. Imagine this scenario: When executives examine where to spend capital expenditure money for a year, IT administrators recommend doing vulnerability testing or hiring extra security administrators. When the executives ask, “Well, have we had any problems?” there may be no incidents to report. So, executives move on to what seem like more pressing concerns and instruct the IT administrators to check back in three months.

“It’s an accident waiting to happen,” DiDio says. “That’s like saying, ‘I’m going on a road trip and I know a couple of my tires are bald and my brakes are worn out.’ ”

HA Trend Roundup

The ITIC survey reveals the following trends for enterprises that want to stay competitive:

  • No tolerance for downtime. Many enterprises require a minimum of 99.99 percent availability and recognize that employees expect to be constantly connected.
  • Data security breaches are the biggest threat to network reliability.
  • The cost of downtime per hour is skyrocketing.
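
To put the 99.99 percent figure in perspective, the availability target can be converted into a yearly downtime budget. The short Python sketch below is illustrative arithmetic only and is not part of the ITIC survey; the function name and the assumption of a 365-day year are ours. Under those assumptions, 99.99 percent availability leaves roughly 52.6 minutes of downtime per year.

```python
# Illustrative arithmetic only; the function name and the 365-day year are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable fraction of the year, converted to minutes.
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common availability targets side by side.
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability: about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the budget by a factor of ten, which helps explain why enterprises with a 99.99 percent requirement have so little tolerance for outages.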


