How Difficult is NAS Server Recovery?

Tuesday, January 24, 2017 by Michael Nuncic

So, you think recovering a NAS server is an easy task? Well, think again. NAS servers may be a lower-cost, RAID-based solution aimed at small to medium-sized companies, but technically they are as complex and as difficult to recover as some high-end storage systems. Most NAS systems today offer features similar to those of high-end SANs, such as deduplication, virtualization support and iSCSI targets, so the data is stored in several layers. Each of these layers must be recovered and reconstructed before the “real” data files can finally be reached.

NAS Failure - what happened?

This was the case when a customer recently experienced a failure in their QNAP NAS. The RAID 6 based system consisted of 24 hard drives with a capacity of 6 terabytes each. It had been set up with two iSCSI LUNs containing more than 40 and 60 terabytes of data. Each LUN was presented as an iSCSI target to one Windows Server 2012 R2 system and formatted as an NTFS partition. Additionally, the deduplication feature of Windows Server 2012 R2 was used on both LUNs. When the customer, the IT department of a large real estate group, noticed sluggish behavior from the NAS, the decision was made to perform a hard reboot. Unfortunately, the result of this decision was that both LUNs became inaccessible, even though their drive letters were still shown. And far worse for the customer, there was no backup available at all!
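To put these numbers in perspective, here is a rough capacity calculation. It is only a sketch and rests on assumptions the article does not confirm: that all 24 drives formed a single RAID 6 group with no hot spares, and that capacities are decimal terabytes. The function name is hypothetical, and the LUN sizes of 64 TB and 44 TB are the figures found later during the recovery.

```python
def raid6_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of one RAID 6 group.

    RAID 6 stores two drives' worth of parity, so usable space is
    (n - 2) * drive size. Assumes a single group and no hot spares.
    """
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_size_tb


usable_tb = raid6_usable_tb(24, 6)   # 22 data drives * 6 TB
lun_total_tb = 64 + 44               # the two LUN sizes named in the article
headroom_tb = usable_tb - lun_total_tb

print(f"usable: {usable_tb} TB, LUNs: {lun_total_tb} TB, headroom: {headroom_tb} TB")
```

Under these assumptions, the two LUNs occupied most of the array's usable space.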

Data Recovery Begins

That was when the customer contacted Ontrack. The initial analysis revealed that six layers of data structure had to be handled one at a time. A QNAP NAS runs on Linux, so the specialists first had to reconstruct the QNAP NAS RAID 6 layer to get to the Linux Ext4 filesystem. In this Ext4 filesystem, fragments of the missing LUNs were located and rebuilt in order to access the 64 TB and 44 TB of raw LUN data. Each fragment was about 1 TB in size, so more than 100 big data “iSCSI pieces” had to be handled, structured, and put together to rebuild the two iSCSI LUN files.
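The fragment arithmetic and the basic idea of joining pieces in order can be sketched as follows. This is an illustration only, not Ontrack's tooling: the function names are hypothetical, and it assumes each fragment's position within its LUN is already known, which in a real recovery is the hard part.

```python
import math
from pathlib import Path


def fragment_count(lun_size_tb: float, fragment_size_tb: float = 1.0) -> int:
    """Number of fragments needed to hold a LUN of the given size."""
    return math.ceil(lun_size_tb / fragment_size_tb)


def reassemble(fragments, out_path, chunk_size=1024 * 1024):
    """Concatenate fragments into one LUN image file.

    `fragments` is a list of (index, path) pairs; the index is the
    fragment's assumed position within the LUN.
    """
    with open(out_path, "wb") as out:
        for _, path in sorted(fragments, key=lambda f: f[0]):
            with open(path, "rb") as src:
                # Copy in chunks so large fragments never sit fully in memory.
                while chunk := src.read(chunk_size):
                    out.write(chunk)


# About 1 TB per fragment, as described above.
total_fragments = fragment_count(64) + fragment_count(44)
print(total_fragments)
```

With 1 TB fragments, the 64 TB and 44 TB LUNs come to 108 pieces in total, which matches the "more than 100" in the text.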

These iSCSI fragments were originally managed by the QNAP system and combined on the fly, so that the Windows Server system saw two complete LUNs. After the iSCSI files were copied to temporary SSD storage, the data recovery experts managed to combine these iSCSI LUN files into one single NTFS volume. Since deduplication was also enabled in Windows Server on this system, the engineers then had to work on this sixth and final layer to find the affected data and create a usable NTFS volume. This NTFS volume was then copied from the temporary SSD storage onto a newly purchased RAID storage system that the customer could easily attach to their network in order to access their data again.

Final Analysis

Even with highly specialized recovery tools, recovering and copying this much data to the customer's new backup storage took several weeks to complete. The final analysis clearly showed that the NAS server was not set up correctly. Even though this NAS system had many advanced features, not all of them necessarily needed to be used. Small to medium servers like this are definitely not as failure-proof as high-end servers, so using the deduplication feature may not have been such a good idea; it ultimately made the data recovery more complex than necessary. And since hard drives have become more affordable in the last few years, it may have been better to buy additional drives than to make the data structure unnecessarily complex.