Harbinger’s back-end systems are powered by tools rated in the Leader Quadrant of the highly respected Gartner eDiscovery Report. Using the proper eDiscovery tools and setting the matter up for effective searches, efficient tagging, and clear, accurate analytics is vital. Processing data, whether for eDiscovery or for investigation, requires forensic precision: the processing engine must index every item in detail. The Nuix engine used by Harbinger draws on its forensic background and accesses data at the binary level, so it does not rely on manufacturer drivers to access, process, and index unstructured data. Accessing data this way makes it possible to recover deleted data, read corrupted data, and process more reliably, all of which yield more accurate results.
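As a rough illustration of what working at the binary level can mean, the Python sketch below identifies a file's type from its leading bytes rather than from its extension or an installed application. It is not the Nuix implementation; the `SIGNATURES` table and the `sniff_type` function are hypothetical names introduced here, and the table covers only a handful of well-known signatures.

```python
# Hypothetical signature table: a few well-known leading byte sequences.
SIGNATURES = [
    (b"PK\x03\x04", "zip container"),
    (b"%PDF-", "PDF document"),
    (b"!BDN", "Outlook PST/OST"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file"),
]


def sniff_type(data: bytes) -> str:
    """Return a label based on the first bytes of the data, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


# A renamed or extension-less file is still recognized by its content.
sample = b"%PDF-1.7 rest of the file"
label = sniff_type(sample)
```

A real processing engine goes far beyond this, but the principle is the same: the bytes themselves, not the file name or a vendor driver, determine how an item is handled.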
Six Layers of Unstructured Data according to Nuix:
Text and HTML: Most indexing engines can easily cope with these formats because they are essentially just plain text.
Examples: text files, log files, web pages, Twitter posts
Documents: These formats contain text or HTML, metadata, formatting and relatively small amounts of embedded content. The common formats are well publicized and easy for indexing engines to understand.
Examples: word processor documents, spreadsheets, presentations
Containers: These are structures of varying complexity designed to embed large numbers of items, with accompanying metadata. Some indexing engines don’t extract content embedded within these files or don’t deal well with the complexities of the formats.
Examples: folders, compressed (zip) files, disk images, single-user email databases such as PST, OST, NSF and mbox files
Complex Containers: Multiple-user email databases contain even deeper levels of embedding and more complex metadata. They can reach many terabytes in size and contain millions of embedded items.
Examples: Microsoft Exchange, IBM Lotus Domino, Novell GroupWise, large file systems that contain embedded containers and complex containers
Massive, Complex Containers: Enterprise-scale systems wrap proprietary containers around each file or email message they store. These systems are so complex they require database lookups to locate text, metadata and attachments, which are typically stored separately. While they provide native searching capabilities, these are almost never designed with the needs of eDiscovery, investigation, or information governance in mind.
Examples: email archives, content management systems, Microsoft SharePoint
Compliance Storage Systems: Many organizations that face retention regulations have invested in compliance storage systems to ensure data, once stored, cannot be tampered with. These “write once, read many” (WORM) storage repositories further obfuscate content by adding another layer of proprietary wrappers around the data stored in them.
Examples: EMC Centera, NetApp SnapLock
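The layers above share one requirement: an indexing engine has to keep descending until it reaches the items embedded inside each container. The Python sketch below shows that idea for the simplest case only, zip files nested inside zip files, using the standard `zipfile` module. It is a minimal sketch under that assumption, not how Nuix or any other engine is implemented; `walk_container` and `make_zip` are hypothetical helper names, and the sample data is made up for the demonstration.

```python
import io
import zipfile


def walk_container(data: bytes, path: str = ""):
    """Yield (path, payload) for every leaf item, descending into nested zips."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            item_path = f"{path}/{info.filename}" if path else info.filename
            if zipfile.is_zipfile(io.BytesIO(payload)):
                # The embedded item is itself a container: recurse into it.
                yield from walk_container(payload, item_path)
            else:
                yield item_path, payload


def make_zip(members: dict) -> bytes:
    """Build an in-memory zip from {name: content} (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Made-up sample: a zip that embeds another zip.
inner = make_zip({"memo.txt": b"quarterly forecast"})
outer = make_zip({"readme.txt": b"top level", "attachments.zip": inner})
items = dict(walk_container(outer))
```

An engine that stopped at the first level would index `attachments.zip` as a single opaque item and miss `memo.txt` entirely, which is the gap described for containers above. Email databases, archives, and compliance stores need format-specific parsers in place of `zipfile`, but the recursive structure is the same.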