The logical data structure on the disk is managed by the file system. FAT (File Allocation Table) used to be a very popular file system, known from DOS. Nowadays it has been replaced almost completely by the much more advanced NTFS. Both are used mostly in Microsoft operating systems. Open source operating systems mainly rely on the extended file system, or ext (currently in its fourth version, ext4), and on the less popular Reiser File System (ReiserFS), whose development is closely watched, perhaps because its author, Hans Reiser, is serving a prison sentence for the murder of his wife. However, that’s enough gossip for now; let’s get to the point.
Regardless of who manufactured your hard drive, the logical data operations (writing and reading) are managed by the file system used by your operating system. A newly purchased disk is first divided into one or more logical areas called partitions; on a classic PC disk, information about them is kept in the partition table inside the Master Boot Record (MBR), the very first sector of the disk. Each partition must then be given a file system, otherwise the operating system will not be able to use it. This process is called “formatting”. It is worth mentioning that one of the partitions contains the information needed to start the operating system. Its first sector, the boot sector, holds on FAT and NTFS partitions a structure called the BIOS Parameter Block (BPB), which describes the logical layout of the partition. A normal user often pays no attention to these details, even though one frequently comes into contact with them when installing a new operating system or resizing a partition. Damage to the boot sector will not remove your data permanently, but the following message might give you a heart attack 😉
No bootable device — insert boot disk and press any key
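To make the role of the MBR a little more tangible, here is a minimal Python sketch that reads the partition table from a 512-byte first sector. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, followed by the 0x55 0xAA signature); the function names and the sample values are made up for illustration, and the sector is built in memory rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446      # the table starts after the boot code
ENTRY_SIZE = 16                   # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"      # last two bytes of a valid MBR
ENTRY_FORMAT = "<B3xB3xII"        # status, (CHS skipped), type, (CHS skipped), LBA start, sector count


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes long")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("boot signature missing: not a valid MBR")
    partitions = []
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, lba_start, sectors = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype == 0:            # type 0 means the slot is unused
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions


def build_sample_mbr() -> bytes:
    """Hypothetical sample: one bootable partition of type 0x07 (used by NTFS)."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sector, PARTITION_TABLE_OFFSET, 0x80, 0x07, 2048, 204800)
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


sample_partitions = parse_mbr(build_sample_mbr())
print(sample_partitions)
```

If the signature check fails, the firmware has nothing to boot from, which is roughly the situation behind the message quoted above.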
Where are the files?
Contrary to what one might think, the operating system does not necessarily save a file in consecutive sectors. Why? Files are constantly created, extended and deleted, so the file system places new data wherever free space happens to be available. The sectors that make up a single cluster lie next to each other, but the clusters that together hold a complete file can end up scattered all over the disk; their distribution is determined by the file system.
Classic FAT began its career in MS-DOS. It is based on allocation tables that record how clusters are distributed on the partition; this is how the system turns the surface of the disk(s) into a logical file structure. Every file is described in a directory entry that stores, among other things, the number of the file’s first cluster. From this number the system calculates the physical address of the cluster’s sectors, i.e. the cylinder (track), head and sector number. The FAT entry for that cluster holds the number of the next cluster containing the following part of the file, and so on, until the last cluster is located (it is marked by an end-of-chain tag).
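The chain-following idea described above can be sketched in a few lines of Python. This is a toy model, not a driver: the table is a plain list, the end-of-chain threshold is the one used by FAT16 (values of 0xFFF8 and above), and the names `cluster_chain` and `cluster_to_sector` as well as the sample numbers are assumptions made for illustration.

```python
FAT16_END_OF_CHAIN = 0xFFF8   # FAT16: any value >= 0xFFF8 marks the last cluster
FIRST_DATA_CLUSTER = 2        # entries 0 and 1 of the table are reserved


def cluster_chain(fat, first_cluster):
    """Follow the allocation table from a file's first cluster to its last."""
    chain = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("corrupted table: the chain loops back on itself")
        if not FIRST_DATA_CLUSTER <= cluster < len(fat):
            raise ValueError(f"corrupted table: cluster {cluster} is out of range")
        seen.add(cluster)
        chain.append(cluster)
        next_cluster = fat[cluster]
        if next_cluster >= FAT16_END_OF_CHAIN:
            return chain          # end-of-chain tag reached
        cluster = next_cluster


def cluster_to_sector(cluster, data_start_sector, sectors_per_cluster):
    """Translate a cluster number into the number of its first sector."""
    return data_start_sector + (cluster - FIRST_DATA_CLUSTER) * sectors_per_cluster


# Toy table: a file that starts in cluster 2 and continues in 5, 7 and 3.
toy_fat = [0xFFF8, 0xFFFF, 5, 0xFFFF, 0, 7, 0, 3]
print(cluster_chain(toy_fat, 2))   # [2, 5, 7, 3]
```

Note how the clusters of this toy file are out of order on the disk; the table is the only thing that tells us how to put them back together, which is exactly why a damaged table makes recovery so much harder.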
The much more modern NTFS works differently: information about every file and its placement on the partition is stored in the Master File Table (MFT). A partial backup copy of the table (the MFT mirror) is kept in a different place than the original MFT, so the two are less prone to simultaneous failure. NTFS also supports journaling: a separate log records changes to the file system. When you save data, the intended changes are first recorded in the log and only then applied on the drive. This method significantly reduces the risk of ending up with a damaged file system when a write is interrupted, which, unfortunately, is a more common drawback of FAT. As you can see in the image above, NTFS significantly increases the amount of metadata, that is, information that is not directly related to our files but is necessary for the operation of the system. Metadata makes it possible to trace fairly accurately what has been happening on the computer. We’ll take a look at this in our next post.
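The "log first, drive second" idea behind journaling can be illustrated with a deliberately tiny Python model. It has nothing to do with the real on-disk format of the NTFS log; the class `ToyJournaledStore` and its `crash_before_apply` switch are invented purely to show why a pending log record lets the system finish an interrupted operation.

```python
class ToyJournaledStore:
    """A toy model of write-ahead journaling, not an NTFS implementation."""

    def __init__(self):
        self.journal = []   # records describing changes not yet applied
        self.disk = {}      # the "real" state of the drive

    def write(self, name, data, crash_before_apply=False):
        record = {"name": name, "data": data}
        self.journal.append(record)      # step 1: note the intended change
        if crash_before_apply:
            return                       # simulate a power cut at the worst moment
        self._apply(record)              # step 2: carry the change out

    def _apply(self, record):
        self.disk[record["name"]] = record["data"]
        self.journal.remove(record)      # the change is complete, drop the record

    def recover(self):
        """After a restart, replay whatever is still waiting in the journal."""
        for record in list(self.journal):
            self._apply(record)


store = ToyJournaledStore()
store.write("report.txt", b"draft", crash_before_apply=True)
print("report.txt" in store.disk)   # False: the change never reached the drive
store.recover()
print(store.disk["report.txt"])     # b'draft'
```

A file system without a journal has no such list of unfinished business, so after a crash it can only scan its structures and guess what went wrong.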
File systems vs file size and capacity of the media
The differences between file systems also show in the maximum sizes of the supported media and files. For example, the oldest variant, FAT12, supported volumes of up to 32 MB, whereas FAT32 can handle up to 16 TB of capacity, with a maximum file size of 4 GB minus 1 byte. For comparison, NTFS is theoretically able to handle both files and media of 16 EB minus 1 KB (1 EB is about a thousand petabytes)! In practice, Windows 8 and Windows Server 2012 support a maximum file size of 256 TB minus 64 KB.
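These odd-looking limits fall out of simple arithmetic on field widths. The short Python sketch below reproduces them under a few stated assumptions: units are binary (1 KB = 1,024 bytes), the 16 TB figure for FAT32 assumes a 32-bit sector count with 4 KB sectors, and the Windows figure assumes at most 2^32 − 1 clusters of 64 KB each. The variable names are just labels chosen for this example.

```python
KB = 1024
GB = 1024 ** 3
TB = 1024 ** 4
EB = 1024 ** 6

# FAT32 stores a file's size in a 32-bit field, so the largest value is 2^32 - 1 bytes.
fat32_max_file = 2 ** 32 - 1

# Assumption: a 32-bit sector count combined with 4 KB sectors.
fat32_max_volume = 2 ** 32 * 4 * KB

# NTFS in theory: 64-bit sizes, i.e. 2^64 bytes minus 1 KB.
ntfs_theoretical_max = 2 ** 64 - 1 * KB

# Assumption: Windows 8 / Server 2012 address at most 2^32 - 1 clusters of 64 KB.
ntfs_practical_max = (2 ** 32 - 1) * 64 * KB

print(fat32_max_file == 4 * GB - 1)              # True
print(fat32_max_volume == 16 * TB)               # True
print(ntfs_theoretical_max == 16 * EB - 1 * KB)  # True
print(ntfs_practical_max == 256 * TB - 64 * KB)  # True
```

With smaller sectors or clusters the same formulas give smaller limits, which is why the numbers quoted for one file system can differ from source to source.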
As discussed in this and previous posts, a damaged medium therefore holds, besides our stored files, a large amount of additional information that must be read and interpreted correctly in order to recover the data effectively. In this context, data recovery resembles a treasure hunt: you search for fragments of maps and clues in order to figure out where the treasure was buried.
Believe me, it’s an exciting adventure!
We have now happily arrived at a place where we can answer the question: how is data recovered? In my next post, we will endeavour to answer this question.
Remember to share your comments and feedback!