e-Discovery is the identification, collection, review, and production of information stored in electronic format, “ESI” or “Electronically Stored Information”, (e.g. emails and their attachments, text messages, spreadsheets, presentations, memos, reports, and computer-generated information, such as link files, registry entries, log files, configuration data, and metadata). See Electronic Discovery Reference Model (EDRM) graphic below. e-Discovery is also referred to as “eDiscovery”, “Electronic Discovery”, “Digital Discovery”, “ED”, “EDD”, “EED”, “Electronic Digital Discovery”, “Electronic Disclosure”, “Electronic Document Discovery”, and “Electronic Evidence Discovery”.
Seldom as private attorneys, or as a Litigation Support firm, are we involved at this level. Corporate attorneys should be deeply involved in what is stored, how it is stored, where it is stored, who has access, and how it is archived. Barring this level of involvement, the corporate attorney should be in control of litigation hold policies and the systems that implement this hold at a minimum. If a Corporation rarely finds itself in litigation and has no governmental requirements, policies will usually suffice along with proper archiving procedures.
The Identification Phase is used to identify potential sources of relevant data. These sources may include Mainframes, Unix Systems, Linux Servers, Windows Servers, Novell Servers, SQL Databases, Cloud Computing, LAN, WAN, MAN, SAN, NAS, Online Backups, Offline Backups, Near-Line Storage, Intranet Sites, Source Code, Instant Messaging, Document Management, SMS/Text Messages, and the list goes on. Determining the location of potentially discoverable data is necessary in order to issue an effective legal hold. Executives, key players, IT Management, Records Management, and potential custodians will need to be interviewed to identify how and where relevant data may be stored, any retention policies, any data that is not “reasonably accessible”, and any in-house tools, documentation, or policies that are available to assist in the identification process.
In the early phases of a legal dispute, the scope of data subject to preservation may be uncertain. The nature of the dispute and the individuals involved may change as litigation progresses. Change must be anticipated and procedures should be in place for preserving any newly identified information. The very short time frames typically imposed by litigation are not likely to be met without the complete support of Management. The Identification Phase requires a diligent investigation and analytical thinking, traits commonly found in a Computer Forensic Investigator. Expert consultation as early as possible in the process is critical to minimize any spoliation, intentional or not, of evidence and to ensure a productive execution of the Rules of Civil Procedure.
Rule 26(f) of the Federal Rules of Civil Procedure, which governs the planning conference, was modified to provide for a discussion of the issues related to electronic discovery, privilege assertion, and preservation. What data is reasonably accessible? What is opposing counsel really asking for and how will that affect my case? What is the inherent risk of a forensic image versus a targeted collection? How will we deal with compressed, encrypted, password-protected, or damaged data sets? What exactly is responsive? With the sheer volume and dynamic nature of digital evidence, dealing with e-Discovery is costly and often overwhelming.
Preservation and Collection
The Preservation and Collection Phases tend to overlap and the terms are often used interchangeably. The failure to promptly isolate and protect potentially relevant data (Preservation) in ways that are legally defensible, reasonable, and verifiable can lead to claims of spoliation and potential sanctions. Certain items should be addressed immediately even prior to completing the Identification phase. For instance:
- A broad legal hold notice should be released to employees.
- Backup tapes, or disks, should be removed from normal rotation schedules.
- All deletions and archiving processes should cease, at least temporarily.
- Processes that perform defragmentation or wiping of data should cease, at least until the affected data has been collected forensically.
- Reallocation of custodian computers should cease until those computers have been collected forensically.
- Any current IT projects should be evaluated as to their potential impact to preservation.
Collection is the acquisition of potentially relevant electronically stored information as defined in the identification phase of the electronic discovery process. Collections may take the form of forensic (bit-by-bit) imaging, a targeted collection process, or both. This is just one of the topics that should be discussed in the planning conference. If you are the requestor, you will want forensic imaging when possible. If you are the responder, your preference may lean toward a targeted collection, depending on the circumstances of your case.
Early Case Assessment
The Early Case Assessment Phase, also known as Early Data Assessment or pre-processing, is not clearly defined within the Electronic Discovery Reference Model above; however, this phase falls distinctly between Collection and Processing. This phase is imperative in determining litigation exposure, reducing overall e-Discovery costs, and thinning out the overall reviewable dataset. During Early Case Assessment, several steps may occur. For instance:
- Data staging by custodian, source, subject, etc.
- Data normalization
- Scanned documents may be OCR’ed
- Create privilege alert keywords
- deNISTing, removing known non-relevant data
- Initial deDuplication
- Extract potentially relevant data from backup sources
- Pre-filtering, or culling, by keywords, dates, file types, etc.
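Several of the steps above (deNISTing, initial deDuplication, and pre-filtering by date and file type) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular e-Discovery tool works: the `Item` record, the `cull` function, and the `known_hashes` set are hypothetical names, and in practice the known-file hashes would typically come from a reference set such as the NIST National Software Reference Library rather than being built by hand. Keyword culling is left out to keep the sketch short.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """Hypothetical record for one collected file."""
    name: str
    content: bytes
    modified: date


def cull(items, known_hashes, start, end, extensions):
    """Return items that survive deNISTing, date/type filtering, and deDuplication."""
    allowed = tuple(ext.lower() for ext in extensions)
    seen = set()   # hashes of items already kept
    kept = []
    for item in items:
        digest = hashlib.sha1(item.content).hexdigest()
        if digest in known_hashes:                      # deNIST: known system/application file
            continue
        if not (start <= item.modified <= end):         # outside the agreed date range
            continue
        if not item.name.lower().endswith(allowed):     # file type not of interest
            continue
        if digest in seen:                              # exact duplicate of a kept item
            continue
        seen.add(digest)
        kept.append(item)
    return kept
```

DeDuplication is applied last here so that a copy falling outside the date range cannot suppress an in-range copy with identical content; that ordering is a design choice of this sketch.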
Following the Early Case Assessment (Early Data Assessment), it often becomes necessary to process the data before it can be reviewed for relevance. Some primary goals of the Processing Phase are to apply additional filtering rules, deDupe across the entire dataset, tag data with potential privilege alerts, perform OCR conversion on image files such as TIFF and PDF, identify encrypted and other non-readable files, identify foreign language files, assign parent/child relationships, extract metadata, build a full-text index, and perform email threading, email analytics, and email domain identification.
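Two of these goals, deDuping email across the dataset and email domain identification, can be illustrated with Python's standard `email` package. The sketch below is an assumption-laden example rather than an industry-standard method: the choice of header fields in `FINGERPRINT_FIELDS`, the whitespace and case normalization, and the function names are all hypothetical, and real processing tools differ on exactly which fields they hash.

```python
import hashlib
from email import message_from_string
from email.utils import getaddresses

# Assumed set of header fields that define "the same email"; tools differ on this.
FINGERPRINT_FIELDS = ("From", "To", "Cc", "Subject", "Date")


def _squash(text):
    """Collapse runs of whitespace so line folding does not change the hash."""
    return " ".join(text.split())


def email_fingerprint(raw):
    """Hash selected headers plus the body text of one raw message."""
    msg = message_from_string(raw)
    digest = hashlib.sha256()
    for field in FINGERPRINT_FIELDS:
        value = _squash(msg.get(field) or "").lower()
        digest.update(f"{field}:{value}\n".encode("utf-8"))
    # Leaf parts only; payloads are left undecoded in this sketch.
    bodies = [p.get_payload() for p in msg.walk() if not p.is_multipart()]
    digest.update(_squash(" ".join(bodies)).encode("utf-8"))
    return digest.hexdigest()


def email_domains(raw):
    """Return the sorted, lower-cased domains found in From, To, and Cc."""
    msg = message_from_string(raw)
    values = []
    for field in ("From", "To", "Cc"):
        values.extend(msg.get_all(field, []))
    return sorted({addr.rsplit("@", 1)[1].lower()
                   for _name, addr in getaddresses(values) if "@" in addr})
```

Two copies of a message that differ only in transport headers (for example, an extra `Received` line) produce the same fingerprint under these assumptions, so one of them can be suppressed before review.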
Information will arrive at the Processing Phase in various data formats. All data will need to be normalized for the Review Phase, for instance:
- eMail will have to be extracted from PST, NSF, EDB, and other eMail databases.
- Loose files will have to be extracted from ZIP, RAR, 7z, and other compression containers.
- Legacy file formats may need to be converted to allow for additional processing.
- Cataloging, categorizing, and itemization of all extracted eMails, attachments, and loose files will occur.
- eMails, eMail headers, loose files, and metadata will be hashed for deDuping and validation.
- All data will be identified for exception handling.
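As a rough illustration of the normalization steps above, the following Python sketch extracts loose files from a ZIP container, catalogs each one with its parent container and a hash for validation, and routes unreadable entries to an exception list. It assumes the container is already in memory, handles only ZIP (not PST, NSF, RAR, or 7z), and uses hypothetical names (`extract_zip`, the catalog dictionary keys) that do not correspond to any specific processing tool.

```python
import hashlib
import io
import zipfile


def extract_zip(container_name, data):
    """Return (catalog, exceptions) for one ZIP container held in memory."""
    catalog, exceptions = [], []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        exceptions.append({"item": container_name, "reason": "corrupt or not a ZIP"})
        return catalog, exceptions
    with archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if info.flag_bits & 0x1:  # bit 0 of the flags marks an encrypted entry
                exceptions.append({"item": info.filename, "parent": container_name,
                                   "reason": "encrypted"})
                continue
            try:
                content = archive.read(info)
            except zipfile.BadZipFile:  # e.g. a CRC mismatch on a damaged entry
                exceptions.append({"item": info.filename, "parent": container_name,
                                   "reason": "damaged entry"})
                continue
            catalog.append({
                "parent": container_name,   # preserves the parent/child relationship
                "name": info.filename,
                "size": len(content),
                "sha256": hashlib.sha256(content).hexdigest(),
            })
    return catalog, exceptions
```

Keeping the exception list separate from the catalog means nothing is silently dropped: every item is either cataloged with a hash or reported with a reason it could not be processed.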
Traditionally, the Processing Phase and the Review Phase, below, are where most of the e-Discovery costs are incurred. The Early Case Assessment (Early Data Assessment) Phase greatly reduces this cost exposure.
The Review Phase is a critical component of most litigation. During Review, responsive documents are identified and organized for production to opposing parties. Documents that may be privileged, confidential, or proprietary, or that may be covered by a protective order or a Rule 11 agreement, are also identified, tagged, and logged. In the Review Phase, counsel will gain a clearer understanding of the factual issues of the case, and legal strategies start to solidify. Of course, there are two sides to the Review Phase: the first is the review of documents for production to opposing parties, and the second is the review of the opposing party’s production to you.
Electronic discovery, with its enormous volume of data, can seem daunting. The good news is that (a) the same technology that created this vast volume of information can be used to quickly review and identify responsive and vital documents, and (b) the efforts put forth in the Early Case Assessment Phase (Early Data Assessment) save not only on Processing costs but also on Review time and expense.
Some items to look for in a Review platform may include:
- Online review with 24/7 access
- Secured and encrypted access
- Full audit trail
- Data Mapping
- Case Dashboards
- Bulk Tagging
- Fuzzy Searching
- Tagged Searching
- Boolean Searching
- Synonym Searching
- Proximity Searching
- Metadata and Attribute Searching
- Search for Encrypted, Protected, Corrupted, and other non-readable formats
- Search for OCR Conversion failures
- Search any available field with full wildcard support
- Multiple Production Options
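To make one of the search features above concrete, here is a minimal Python sketch of proximity searching: it reports whether two terms occur within a given number of words of each other. The function name `proximity_match` and its parameters are hypothetical, and the word-splitting rule is a simplifying assumption; review platforms use their own query syntax and tokenization.

```python
import re


def proximity_match(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b appear within max_distance words of each other."""
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Distance is the difference in word positions, in either order.
    return any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b)


sentence = "The contract was signed before the merger closed"
print(proximity_match(sentence, "contract", "merger", 5))  # True
print(proximity_match(sentence, "contract", "merger", 3))  # False
```

In this sentence "contract" and "merger" are five word positions apart, so the search matches at a distance of 5 but not at a distance of 3.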
While the Analysis Phase appears after Review, it is really deployed throughout the e-Discovery process. As e-Discovery tools have matured, sophisticated analytic methods have been put to use, as shown above, in addition to the use of Computer Forensics within the e-Discovery process. The Analysis of the datasets, both produced and received, will provide counsel with all the facts of the case allowing for informed decisions. The Analysis should produce the how, when, where, who, and why of everything that occurred. When it doesn’t, Computer Forensics can be employed to find the missing information.
With the unprecedented increase in the amount of Electronically Stored Information in the corporate community, there has been a corresponding increase in the focus on costs, how data is collected, and finally produced in civil litigation and government investigations. Because of the complexity, the potential costs, and the risks associated with producing Electronically Stored Information, the topic has been addressed in a growing number of articles, white papers and judicial opinions. As a measure of the significance of the topic, production of Electronically Stored Information is addressed directly in the Federal Rules of Civil Procedure, which were amended effective December 1, 2006. For example, Rule 26(f) sets an expectation that the method and format by which Electronically Stored Information is to be produced should be considered and negotiated by the parties early in the discovery process.
Electronically Stored Information may be produced:
- In native format.
- In printed form.
- In image form, such as TIFF or PDF.
- Static images can be Bates Labeled.
- Native files can have Bates data added to their filenames.
- A load file can accompany any of the above forms for Summation, Concordance, Relativity, EDRM XML, and others.
The options above can be combined, such as producing both native files and TIFF images, with OCR text for the TIFF images, accompanied by a load file. Privilege logs, audit logs, chain of custody, and other reports can be generated to prove proper handling of responsive documents.
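The idea of adding Bates data to native filenames and pairing the production with a load file can be sketched as follows. Everything specific here is an assumption made for illustration: the prefix-plus-padded-number format, the `BATES_original` naming pattern, and the plain comma-separated layout are hypothetical and do not reproduce the native load file formats of Summation, Concordance, or Relativity. The sketch also assigns one number per document, whereas image productions are commonly numbered per page.

```python
import csv
import io
from pathlib import PurePath


def bates_number(prefix, sequence, width=7):
    """Format a Bates number such as ABC0000001 (format is an assumption)."""
    return f"{prefix}{sequence:0{width}d}"


def bates_native_name(original_name, bates):
    """Prepend the Bates number to a native file's name, keeping its extension."""
    path = PurePath(original_name)
    return f"{bates}_{path.stem}{path.suffix}"


def build_load_file(original_names, prefix, start=1):
    """Return a simple CSV cross-reference of Bates numbers to file names."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["BegBates", "OriginalName", "ProducedName"])
    for offset, name in enumerate(original_names):
        bates = bates_number(prefix, start + offset)
        writer.writerow([bates, name, bates_native_name(name, bates)])
    return out.getvalue()


print(build_load_file(["budget.xlsx", "memo.docx"], "ABC"))
```

Keeping the original name alongside the produced name in the cross-reference lets either side trace a produced file back to the item that was collected.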
The presentation of Electronically Stored Information can be a challenge for attorneys and paralegals, especially when the data has no legible form. Technology has advanced greatly over the last decade, making it easier to present exhibits in near-paper form. Deposition and trial exhibits may be information in paper, near-paper, near-native, or native format. The exhibits may be in boxes or organized electronically as digital exhibits for use in PowerPoint, Keynote, Sanction, or TrialDirector.