Monday, April 6, 2015

Recovering image data from JPEG file fragments

This interesting paper was recently published by SPIE, the international society for optics and photonics.

"A new technique for recovering fragmented data files can retrieve elements of a JPEG compressed image even when the file's header is unavailable."

"The most basic task of any file system is to manage and organize data in a storage volume. Each file in the store is allocated a list of blocks (the basic unit of access), and when we access a file, the system retrieves the data in sequence from the list. Correspondingly, removing the relevant entry deletes the item (note that in most systems, deleting simply means that data is overwritten over time by newly saved files, rather than actually removed). Figure 1 depicts the layout of files distributed across blocks of a storage medium, and provides a simplified view of the data structure used to track allocated blocks. When the information for a volume is corrupt or missing, it is only possible to recover files by analyzing the fragmented raw data, a process known as file carving."
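To make the idea of file carving concrete, here is a minimal sketch of the simplest header-based approach, which is not the technique the paper proposes. It scans raw bytes for the JPEG start-of-image signature (FF D8 FF) and the next end-of-image marker (FF D9). The function name carve_jpegs is hypothetical, and the sketch assumes each file is stored contiguously with its header intact, which is exactly the assumption the paper sets out to remove. It will also be fooled by embedded thumbnails, which carry their own end-of-image marker.

```python
SOI = b"\xff\xd8\xff"  # start-of-image marker plus the 0xFF that opens the next marker
EOI = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw: bytes) -> list[tuple[int, int]]:
    """Return (start, end) byte offsets of candidate contiguous JPEG files.

    Naive signature search: assumes no fragmentation and an intact header.
    """
    spans = []
    pos = 0
    while True:
        start = raw.find(SOI, pos)
        if start == -1:
            break
        end = raw.find(EOI, start + len(SOI))
        if end == -1:
            break  # start found but no terminator: truncated or fragmented
        end += len(EOI)
        spans.append((start, end))
        pos = end
    return spans
```

A carver built this way finds nothing at all in a fragment whose first blocks were overwritten, because the signature it searches for lives in the header.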

"Today, file-carving tools play an important role in digital forensic investigations, where analyzing deleted files and salvaging data from damaged and faulty media are common procedures. However, when files are encoded and compressed (as with most multimedia files), recovery is dependent on the availability of the file header, which includes all the decoding parameters. If a file is partially intact with its header deleted, common carving techniques cannot recover any data. Here, we describe an algorithm [1] that advances the latest developments in JPEG carving by introducing the ability to recover file fragments when the associated header is missing.

There are two main challenges associated with file carving. The first is the inability to lay out data blocks contiguously on the storage, as in the case of files b and f in Figure 1. Repeated execution of file operations, such as addition, deletion, and modification of files, over time leads to fragmentation of available free storage space. As a result, the newly generated files need to be broken into several parts to fit into the available unallocated blocks. Figure 1 illustrates this phenomenon, where files a and c are separated into two pieces. Even the new solid-state drives (which use integrated circuits, rather than disks, to store data) are susceptible to this phenomenon as they are designed to emulate the interface characteristics of hard disk drives.

The second challenge to successful file carving is that interpreting binary-formatted data requires the use of decoders, without which a block of data will reveal little or no information about the content of a file.

There has been significant recent progress in carving JPEG files, which are the most widely adopted still image compression standard today and commonly the subject of forensic investigations [2–6]. However, techniques for JPEG recovery still assume that a file header is always present. Without the header, the usual techniques cannot recover any data, even though the rest of the file may be intact. For example, in Figure 1, the fragments of files d, e, and g were overwritten by files c and f. Recovering an arbitrary chunk of compressed image data without the matching encoding metadata essentially requires reconstructing a new file header, which at least requires knowledge of entropy coding parameters and quantization tables, methods used during compression for downsampling the color information and image dimensions."
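The decoding parameters listed above map onto specific marker segments in a JPEG header: DQT (quantization tables), DHT (Huffman tables for entropy coding), and SOF (image dimensions and per-component sampling factors). The sketch below walks those segments in an intact file to show what a headerless fragment is missing. It is an illustration only, not the paper's recovery algorithm; the function name read_header_info and the returned dictionary keys are hypothetical, and the code assumes a Huffman-coded baseline or progressive file with no fill bytes between segments.

```python
import struct


def read_header_info(data: bytes) -> dict:
    """Collect the decoding parameters stored in a JPEG header (sketch)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("no SOI marker: the header is missing")
    info = {"dqt_segments": 0, "dht_segments": 0,
            "width": None, "height": None, "sampling": []}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:              # DQT: quantization tables
            info["dqt_segments"] += 1
        elif marker == 0xC4:            # DHT: Huffman (entropy coding) tables
            info["dht_segments"] += 1
        elif marker in (0xC0, 0xC2):    # SOF0 / SOF2: frame header
            _precision, height, width, ncomp = struct.unpack(">BHHB", payload[:6])
            info["height"], info["width"] = height, width
            for i in range(ncomp):
                _cid, hv, _tq = payload[6 + 3 * i:9 + 3 * i]
                # High nibble: horizontal factor, low nibble: vertical factor.
                info["sampling"].append((hv >> 4, hv & 0x0F))
        elif marker == 0xDA:            # SOS: compressed scan data follows
            break
        pos += 2 + length
    return info
```

Called on a fragment that starts partway through the scan data, this function raises immediately, which mirrors the situation the paper describes: every value it would have returned has to be estimated or reconstructed by other means.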

The rest of this paper can be read on SPIE's web site.
