Understanding Deleted Files, Unallocated Space, and Their Impact on E-Discovery



As e-discovery has become a major focus of modern litigation, it is important for lawyers without strong technology backgrounds to familiarize themselves with at least the basic concepts of computer storage. This blog post addresses some of the key points regarding deleted files and unallocated space, and how these concepts come into play in e-discovery.

Unallocated space, also referred to as “free space,” is the area on a hard drive where new files can be stored. Conversely, allocated space is the area on a hard drive where files already reside. Think of “allocated” space as already filled with data that must not be overwritten by newer data, while “unallocated” space is available to store new data, even though it may still contain old data that the new data would overwrite. While this may sound simple enough, fully understanding the properties of unallocated space requires understanding how files are stored on a computer.

Computer files are stored as binary code (1s and 0s). A computer’s operating system reads a file by processing this series of 1s and 0s. When a user saves a file on a hard drive, it is stored using a file system that tracks the physical location of files in allocated space. These physical storage locations are called “sectors.” A sector traditionally holds 512 bytes of data. Depending on the type of encoding used, one character (e.g., the letter “C”) can take between one and four bytes, meaning that a 512-byte sector will typically hold between 128 and 512 characters. Sequential groups of sectors (usually four or eight) are called “clusters.” Any physical storage device has a finite number of sectors, making them a scarce resource (even though hard drives often have hundreds of millions of clusters).
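For readers who want to check the arithmetic, the figures above can be reproduced with a few lines of Python. This is only an illustrative sketch: the 1 TB drive size is an assumption chosen for the example, not a figure from any particular device.

```python
SECTOR_BYTES = 512  # sector size used in the text

# A character takes between one and four bytes, depending on the encoding.
max_chars_per_sector = SECTOR_BYTES // 1   # one byte per character
min_chars_per_sector = SECTOR_BYTES // 4   # four bytes per character

# A cluster is a group of four or eight sectors.
cluster_bytes = {sectors: sectors * SECTOR_BYTES for sectors in (4, 8)}

# Assumption for illustration: a 1 TB drive using eight-sector clusters.
DRIVE_BYTES = 10**12
clusters_on_drive = DRIVE_BYTES // cluster_bytes[8]

print(min_chars_per_sector, max_chars_per_sector)  # 128 512
print(cluster_bytes)                               # {4: 2048, 8: 4096}
print(clusters_on_drive)                           # 244140625
```

Under these assumptions, the drive holds roughly 244 million clusters, which is consistent with the “hundreds of millions” mentioned above.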


Since a hard drive has a limited amount of space, the file system tracks which clusters are in use and which are not. The file system does this by labeling each cluster with a “1” or a “0.” A “1” means that the space inside the cluster is being used to store all or part of a file (depending on the size of the file). If the file is deleted, the file system relabels its clusters as “0.” The change in label does not cause the system to overwrite the data. Rather, it signals to the file system that the space is available to store a new file. When a new file is stored, it overwrites the data on unallocated clusters, and the file system labels those clusters with a “1.”
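The labeling scheme can be sketched as a toy model in Python. This is a deliberately simplified illustration, not how any real file system is implemented: the class name ToyDisk, its methods, and the four-byte cluster size are all hypothetical choices made for the example.

```python
CLUSTER_SIZE = 4  # bytes per cluster; tiny on purpose so the output is readable


class ToyDisk:
    """Hypothetical toy model of a disk with one 1/0 label per cluster."""

    def __init__(self, cluster_count):
        self.data = [b""] * cluster_count   # raw contents of each cluster
        self.labels = [0] * cluster_count   # 1 = allocated, 0 = unallocated
        self.files = {}                     # file name -> cluster numbers

    def save(self, name, content):
        # Split the file into cluster-sized pieces.
        chunks = [content[i:i + CLUSTER_SIZE]
                  for i in range(0, len(content), CLUSTER_SIZE)]
        free = [n for n, label in enumerate(self.labels) if label == 0]
        if len(free) < len(chunks):
            raise OSError("not enough unallocated clusters")
        used = free[:len(chunks)]
        for n, chunk in zip(used, chunks):
            self.data[n] = chunk   # overwrites whatever old data was there
            self.labels[n] = 1     # cluster is now allocated
        self.files[name] = used

    def delete(self, name):
        # Deleting only flips the labels; the cluster contents are untouched.
        for n in self.files.pop(name):
            self.labels[n] = 0


disk = ToyDisk(cluster_count=4)
disk.save("essay.doc", b"DRAFT ONE")
labels_while_saved = list(disk.labels)
disk.delete("essay.doc")

print(labels_while_saved)  # [1, 1, 1, 0]
print(disk.labels)         # [0, 0, 0, 0]
print(disk.data)           # [b'DRAF', b'T ON', b'E', b'']
```

After the deletion every label is back to “0,” yet the old contents are still sitting in the clusters, which is the point the paragraph above makes.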

For example, assume that a user wants to delete a Word document, which we will call essay.doc, and that essay.doc is stored in multiple clusters. The file system would label the clusters associated with essay.doc as “1,” since they are being used to store the contents of essay.doc. If the user deletes essay.doc, the file system will label all the clusters where essay.doc was stored as “0.” Because these clusters are now labeled “0,” the file system knows that they are part of the unallocated storage space and available to store the contents of another file.

At this stage, essay.doc is considered deleted by the file system but is still “recoverable,” because some or all of the information that comprised essay.doc has not yet been overwritten. When the clusters that hold the contents of essay.doc are reused, however, the contents of the new file replace the contents of essay.doc. At that point, the overwritten content is considered “unrecoverable,” even though markers showing that the overwritten file once existed may remain. Through this process, the file system will eventually reuse all of the clusters from essay.doc for other files, overwriting the content of the original essay.doc and making it unrecoverable.
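The difference between “recoverable” and “unrecoverable” content can be illustrated with a short Python sketch. Everything here is a simplifying assumption made for the example: the four-character clusters, the sample text, and the rule that a new file takes the first free clusters. In particular, the sketch treats a “0” label as proof that a cluster has not been reused, which a real forensic examination could not assume.

```python
# Contents of essay.doc, four characters per cluster (illustrative values).
clusters = ["The ", "quic", "k fo", "x   "]
labels = [1, 1, 1, 1]            # all four clusters are allocated
essay_clusters = [0, 1, 2, 3]    # where essay.doc is stored

# The user deletes essay.doc: only the labels change.
for n in essay_clusters:
    labels[n] = 0

# Everything is still on the disk, so the whole file is recoverable.
recovered_before = "".join(clusters[n] for n in essay_clusters)

# A new two-cluster file is saved into the first unallocated clusters.
new_file = ["memo", "!   "]
free = [n for n, label in enumerate(labels) if label == 0]
for n, chunk in zip(free, new_file):
    clusters[n] = chunk          # the old content here is now gone
    labels[n] = 1

# Only the clusters that were not reused still hold parts of essay.doc.
surviving = [n for n in essay_clusters if labels[n] == 0]
recovered_after = "".join(clusters[n] for n in surviving)

print(repr(recovered_before))  # 'The quick fox   '
print(repr(recovered_after))   # 'k fox   '
```

In this sketch the first half of essay.doc becomes unrecoverable as soon as the new file is saved, while the second half remains in unallocated space until those clusters are reused as well.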


Here is why a lawyer’s technical understanding of how digital information is stored makes a difference in the context of e-discovery. Under Federal Rule of Civil Procedure 26(f)(3)(C), parties may provide their views and make proposals for a discovery plan on “any issues about disclosure or discovery of electronically stored information[.]” Thus, a party has the chance at the discovery conference to attempt to include or exclude unallocated space from litigation holds or the definition of electronically stored information (ESI) for the matter. This is a critical aspect of any discovery plan. Often, parties fail to address the issue of unallocated space at this juncture, creating uncertainty, potentially unnecessary preservation costs, and the risk of being sanctioned for failure to preserve information contained in unallocated space.

In Genger v. TR Investors, No. 592, 2010 (Del. Supr. July 18, 2011), for example, a Delaware court defined unallocated space as a “reservoir of data.” Unallocated space, however, cannot be properly viewed as an accessible storage area or “data reservoir” that can be intentionally managed by a computer user. Rather, unallocated space is a pool of storage resources that the file system “intends” to reuse as needed and may not contain any coherent data at all. However, under the definition of unallocated space the Genger Court adopted, significant amounts of data residing in unallocated space were required to be preserved and collected. The Court found that one of the parties failed to preserve unallocated space on the devices at issue, which, combined with other actions, constituted sufficient grounds to impose substantial sanctions. While the Court also found that a status quo order does not necessarily encompass the preservation of unallocated space, the best practice is to establish expressly, early on, with the court and the parties whether unallocated space is to be preserved and searched.

Thus, by allowing the Genger Court to accept a misleading definition of unallocated space, counsel opened the door to burdensome e-discovery obligations and sanctions that could have been avoided by making a more technically informed argument. Therefore, it is important for counsel to understand unallocated space from a technical perspective and tailor their e-discovery strategy accordingly.

Edward Stroz, co-president of Stroz Friedberg, was co-author of this article.

  • jose sandoval

    The question that follows is exactly when to include unallocated space in a litigation hold and when not to include it. Deleted files may still be available, or they may have been overwritten and thus be unrecoverable. This uncertainty can lead to “potentially unnecessary preservation costs.”