To understand hyper-converged storage or hyper-converged systems, you have to start with the initial situation. In many company data centres, not only in Germany but throughout the world, different IT systems run side by side, and keeping them operational costs those responsible a great deal of effort. Different servers, storage devices and individual computers run under many different operating systems, network protocols and storage solutions, particularly in small and medium-sized companies, so connecting these systems and getting them to communicate becomes increasingly difficult over time.
The concept of converged infrastructure was developed to help companies in their ongoing struggle with all these different solutions. Converged infrastructure is based on out-of-the-box products that put computers, servers, storage devices and networks on a common basis. The customer uses only the pre-configured hardware and software packages from a single manufacturer and its partners. Most are pre-configured, hardware-based solutions offered by leading vendors such as EMC, Cisco or VMware, which control and manage the entire company data centre, including the storage capacity. Since many customers have already purchased products from these manufacturers in the past, many of the solutions already in place in the company can continue to be operated under the new platform. However, connecting existing infrastructure from other providers to a converged infrastructure is often difficult, if not impossible, because the concept does not provide for it, so that sometimes the existing equipment has to be abandoned completely. This is the price you pay as a customer for curbing uncontrolled IT growth in your company.
Virtualised data management under a common software interface
Hyper-converged systems, on the other hand, are entirely software-based and built consistently on virtualisation. The entire IT system is managed and controlled through a common software interface. This software can thus be used to control servers, networks and the attached storage devices, the so-called hyper-converged storage. The components are interlinked even more closely than in converged infrastructure. Hyper-converged systems and storage have further advantages: they provide speed optimisation, data compression for transport over the network, and data de-duplication across the WAN (Wide Area Network). This limits the data flow, which can be very beneficial for remote backups or data recovery.
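To make the effect of de-duplication and compression on WAN traffic more tangible, here is a minimal Python sketch. It is an illustration only, not the method of any particular vendor: the fixed block size, the use of SHA-256 and zlib, and the function names prepare_transfer and rebuild are all assumptions made for this example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk data differently


def split_blocks(data, size=BLOCK_SIZE):
    """Cut the payload into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def prepare_transfer(data, known_hashes):
    """Work out what has to cross the WAN to send `data` to a remote site.

    known_hashes -- hashes of blocks the remote site already holds
    Returns (recipe, new_blocks):
      recipe     -- ordered list of block hashes needed to rebuild the data
      new_blocks -- hash -> compressed block, only for blocks the remote lacks
    """
    recipe = []
    new_blocks = {}
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        recipe.append(digest)
        # De-duplication: each distinct block is sent at most once.
        if digest not in known_hashes and digest not in new_blocks:
            # Compression: shrink the block before it is transported.
            new_blocks[digest] = zlib.compress(block)
    return recipe, new_blocks


def rebuild(recipe, store):
    """Reassemble the original bytes from the recipe and the remote block store."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


if __name__ == "__main__":
    payload = (b"A" * BLOCK_SIZE) * 10 + (b"B" * BLOCK_SIZE) * 5
    recipe, new_blocks = prepare_transfer(payload, known_hashes=set())
    sent = sum(len(chunk) for chunk in new_blocks.values())
    print("payload bytes:", len(payload), "block bytes sent:", sent)
```

In this sketch, a second backup of unchanged data would transfer only the short list of hashes, because the remote site already holds every block. That is the reason de-duplication pays off for repeated remote backups.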
Because hyper-converged systems are not only software-based but also build on the benefits of virtualisation, almost all manufacturers use the technology of the market leader, VMware, for their software. This is especially evident in hyper-converged storage. One of the best-known providers is Nutanix. Like those of its competitors Pivot3 and SimpliVity, its solution is based on the virtualisation interfaces provided by VMware. Only one provider currently includes its own virtualisation model in its software: Scale Computing, with an open-source solution based on Linux.
Like converged infrastructure, a hyper-converged system is completely tailored to one manufacturer. It is interesting to note, however, that many hardware vendors make it possible to combine their own products with a partner's software. For example, specialised partners offer EMC, Cisco and even VMware hardware systems as out-of-the-box bundles together with SimpliVity's appliance software.
Long announced and long expected, VMware's own hyper-converged storage solution was finally launched last year. With VSAN, the virtualisation technology leader extends its server virtualisation solution, vSphere ESXi, with the ability to organise and manage storage. In a VSAN system, applications or files held in virtual machines are stored in a so-called common datastore. A VSAN system consists of up to 32 connected host computers, each with up to seven hard drives and an SSD, which is used for buffering data flows.
What does hyper-convergence mean for data recovery?
The key advantage of the concept, the simple management of storage under one common interface, is also the biggest obstacle to successful data recovery. As the failure of an almost brand-new VMware VSAN system in the Netherlands last year showed, hyper-converged storage is not immune to problems.
In this specific Kroll Ontrack data recovery case, the Virtual SAN comprised a total of three host computers, with five hard drives and one SSD each. The failure of the single SSD drive resulted in the complete failure of the system and the loss of four large virtual machines. Only by means of newly developed software tools was it at all possible to read and extract the structure and dependencies from the VSAN datastores.
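Why a single SSD can matter so much can be illustrated with a deliberately simplified Python model. It assumes that the hard drives in a host are only reachable through the SSD that buffers their data, so a failed SSD takes that host's drives out of the datastore. The Host class and cluster_report function are hypothetical names for this sketch, not part of any VMware interface, and the model does not reproduce the actual failure described above, in which the whole system went down.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    hdd_count: int = 5    # five hard drives per host, as in the case described
    ssd_ok: bool = True   # state of the host's single buffering SSD

    def usable_hdds(self):
        # Assumption: without its SSD, none of the host's hard drives
        # can contribute to the shared datastore.
        return self.hdd_count if self.ssd_ok else 0


def cluster_report(hosts):
    """Summarise how many hard drives remain usable across the cluster."""
    return {
        "total_hdds": sum(h.hdd_count for h in hosts),
        "usable_hdds": sum(h.usable_hdds() for h in hosts),
        "hosts_offline": [h.name for h in hosts if not h.ssd_ok],
    }


if __name__ == "__main__":
    hosts = [Host("host-1"), Host("host-2"), Host("host-3")]
    hosts[1].ssd_ok = False  # simulate the failure of one SSD
    print(cluster_report(hosts))
```

Under this assumption, one failed SSD removes a third of the hard drives from a three-host cluster at once. How much data becomes unreadable in a real system depends on how the virtual machines' data is distributed across the hosts.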
For professional data recovery engineers, hyper-converged storage and hyper-converged systems pose an additional problem: the new “parent” layer of the hyper-converged storage has to be dealt with in addition to the “normal” recovery of the embedded structures of the data packed into the virtual machines. This extra layer adds complexity, almost exponentially, on top of the already complex set of layers in virtual file systems such as VMFS.
Making sense of this additional layer just to gain access to it requires considerable know-how about the internal structure of the system in use. Since even more manufacturers are going to enter this market and each product will have its own proprietary data structures, data recovery engineers will have to keep their knowledge of this field constantly up to date and familiarise themselves with each new solution.