Mission-Critical Servers: Protecting the Nest

March 23, 2017 by Ben Blomberg

We’ve discussed in previous posts how the growth of cloud computing, mobile transactions, and social networking is changing the computing landscape and how organizations will meet the demands of this ever-changing environment. At the core of this growth, mission-critical servers allow enterprises to work across multiple channels and devices to support their employees, customers, and e-commerce. As demand for and dependency on these servers increase, so too does the need for guaranteed uptime.

To give you an idea of just how vital it is to keep these mission-critical systems running, a 2016 survey from the Ponemon Institute revealed that the average cost of an unplanned data center outage was nearly $9,000 per minute! The survey indicated that 25 percent of companies cite UPS (uninterruptible power supply) system failure as the leading cause of an unplanned outage, followed by cybercrime and accidental/human error, each accounting for 22 percent of outages. When you compare these numbers with the 2013 survey, a couple of important points emerge:

  • Accidental/human error held steady at 22 percent, indicating no progress has been made in mitigating failures caused by personnel.
  • Ensuring all equipment - generators, HVAC, and CRAC units - is inspected (or replaced) regularly involves a human element, so it’s possible that some cases of UPS system failure weren’t reported as human error.

According to an article by Quality Power Solutions, a UPS systems and generator provider, the biggest cause of UPS system failure is the operator. Furthermore, we discussed in a previous post how accidental deletion was the main reason for SQL Server recovery requests among DBAs and developers, so the fact that human error is one of the top factors in data center outages is not surprising.
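Since accidental deletion keeps surfacing as a driver of recovery work, one simple guardrail is to make destructive database operations default to a dry run. Here’s a minimal Python sketch of that pattern, assuming a DB-API connection with pyodbc-style “?” placeholders; the orders table and created_at column are hypothetical names used only for illustration.

```python
def delete_stale_orders(conn, cutoff_date, dry_run=True):
    """Delete orders older than cutoff_date, defaulting to a harmless dry run.

    `conn` is assumed to be a DB-API connection (e.g., pyodbc to SQL Server).
    The `orders` table and `created_at` column are illustrative names.
    """
    cursor = conn.cursor()
    if dry_run:
        # Preview the blast radius instead of deleting anything.
        cursor.execute(
            "SELECT COUNT(*) FROM orders WHERE created_at < ?", (cutoff_date,)
        )
        count = cursor.fetchone()[0]
        print(f"Dry run: {count} rows would be deleted. "
              "Re-run with dry_run=False to proceed.")
        return

    cursor.execute("DELETE FROM orders WHERE created_at < ?", (cutoff_date,))
    # With autocommit off (the DB-API default), the delete is still inside
    # an open transaction here and could be rolled back before commit.
    print(f"Deleting {cursor.rowcount} rows.")
    conn.commit()
```

Forcing the person (or script) to opt in to the real deletion is a small change, but it turns the most common “oops” into a harmless row count on the screen.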

Cyber-attacks were the fastest-growing cause of data center outages, rising from 18 percent in 2013 to 22 percent in 2016, and they will continue to be a major challenge for data centers in the future. DDoS attacks have been slightly on the rise, but more importantly, they’re becoming more damaging. For example, last October, Dyn, a cloud-based infrastructure company that provides Domain Name System (DNS) services, was hit by the largest DDoS attack known to date. Major websites on the East Coast were inaccessible for hours. It was later discovered that the attack came from a botnet built out of a large number of compromised IoT devices – namely cameras, residential gateways, and baby monitors.

This kind of attack should make enterprises consider the safety and security of their collaboration tools, such as chat or video conferencing software, in addition to the mobile devices used. An unprotected app on a device could be compromised by malicious software and used to breach your network. Worse yet, it could unknowingly access your mission-critical servers and turn them into “zombie servers.” Even the employees themselves could be a risk. A 2015 survey indicated that 34 percent of the U.S. workforce is freelance. Contractors, consultants, and project-based workers should be trained on security practices and be given access only to the information they need. Once their engagement or project is complete, that access can be shut off, as sketched below.
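To make that “shut off access” step concrete, here’s a minimal Python sketch of time-boxed access grants, where every grant carries a built-in expiry so revocation doesn’t depend on someone remembering to do it. The in-memory store and all function and resource names are hypothetical; a real deployment would tie this to your identity provider or directory service.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory grant store, keyed by (user, resource).
access_grants = {}

def grant_access(user: str, resource: str, days: int) -> None:
    """Grant access to a resource with a hard expiry date."""
    expires = datetime.now(timezone.utc) + timedelta(days=days)
    access_grants[(user, resource)] = expires

def has_access(user: str, resource: str) -> bool:
    """Check access, automatically revoking any grant that has expired."""
    expires = access_grants.get((user, resource))
    if expires is None:
        return False
    if datetime.now(timezone.utc) >= expires:
        del access_grants[(user, resource)]  # engagement over: revoke
        return False
    return True

# A 90-day project engagement: access lapses on its own,
# even if nobody remembers to shut it off manually.
grant_access("contractor_jane", "billing-db", days=90)
print(has_access("contractor_jane", "billing-db"))  # True (within window)
print(has_access("contractor_jane", "payroll-db"))  # False (never granted)
```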

It’s often the simplest of issues that cause mission-critical servers to be brought offline.  And as long as humans still play a major role in the management of systems, there’s always going to be a chance something happens.  Enterprises need to consider that the tools, software, and devices that their own employees use, could easily become compromised. By implementing standard processes and security procedures, routine maintenance checks of equipment and hardware, adequate training, and experienced staff, you’re able to prevent the worst case-scenario from happening.