The whole world is in economic turmoil, perhaps the worst global depression ever recorded. Tens of thousands of legal jobs have recently been lost. We will start seeing an onslaught of litigation because of all the money lost, the contracts that went south, and the lawsuits brought by all the workers who were laid off as a direct result of the securitization meltdown of our financial system. Many of these lawsuits will be brought by thousands of Defined Benefit Plans and other state and municipal governmental agencies who were sold junk derivatives and collateralized debt obligations derived from the subprime-credit rating agency fiasco. A bona fide lawsuit feeding frenzy is on the horizon. The courts are going to be the only remedy for all of these financial participants because you cannot dump these toxic assets from your balance sheet. Even filing bankruptcy will not shed them, because the holders of these securities contracts can go after whatever liquid financial instruments remain in your portfolio[1]. Lawsuits are going to be the only option for many involved.
For these reasons, everyone is going to be harder pressed than ever before to rein in the costs of litigation. And electronic discovery (“e-Discovery”) is one of the most expensive components of litigation for the parties today. KPMG estimates that first-level document review accounts for anywhere between 58% and 90% of total litigation costs. (See “Cutting to the Document Review Chase,” American Bar Association Newsletter, Business Law Today, Vol. 18, No. 2, Nov.-Dec. 2008).
But it doesn’t have to be this expensive, and it shouldn’t be. Litigants are asking their counsel to investigate the feasibility of incorporating their own enterprise content management (ECM) solutions into their e-Discovery obligations during litigation. This is essentially what is meant by native document review and production.
Why are we processing ESI anyway?
To understand how using my client’s ECM solutions will lower e-Discovery costs, let’s look at the whys and wherefores of collecting electronically stored information (ESI) for review. We will then examine what happens to that data under the current EDRM model. And we will close our review by looking at how using an ECM system to facilitate both native document reviews and native document productions can be achieved in a secured environment and can eliminate a significant portion of the out-of-pocket costs associated with e-Discovery.
There is nothing more ubiquitous than electronic files. Massive quantities, terabytes and petabytes, of ESI are being collected for the review process. It is a large and costly proposition.
A document review can be conducted on the native documents themselves, either through Internet-hosted environments or with standalone document review software applications. Otherwise, all the data that is collected will be “processed” for loading into review software or a database for the document review. The client incurs significant costs during this processing of their data.
Processing involves taking structured native data (Word documents, emails, spreadsheets) and all the unstructured content (faxes and other paper documents that have been digitized), extracting the data, metadata and properties, and reproducing them as separate image and text files that can be loaded into a litigation support database. These databases are used for their ability to assign Bates numbering and for a certain amount of foldering and tagging functionality for review purposes. However, the litigation support databases were not originally designed to house massive amounts of electronic files. Enterprise-class modifications are being incorporated to improve their functionality. Even so, they can become very slow and cumbersome to navigate, tag and sort through once large datasets are loaded, and there are other search restrictions that may make the databases unfeasible. You are also limited to conducting straight Boolean, proximity and nested searches. The review software that processes the data allows you to use clustering analysis technology, but this technology is also available in tools that do not require the data to be processed.
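For readers unfamiliar with the search types named above, here is a minimal sketch in Python of Boolean, proximity and nested searches over plain text. It is an illustration only, not how any particular litigation support database works; the function names and the three sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def boolean_and(index, *terms):
    """Documents that contain every term."""
    sets = [set(index.get(t.lower(), {})) for t in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(index, *terms):
    """Documents that contain at least one term."""
    result = set()
    for t in terms:
        result |= set(index.get(t.lower(), {}))
    return result


def near(index, a, b, distance):
    """Documents where a and b occur within `distance` words of each other."""
    hits = set()
    for doc_id in boolean_and(index, a, b):
        positions_a = index[a.lower()][doc_id]
        positions_b = index[b.lower()][doc_id]
        if any(abs(x - y) <= distance for x in positions_a for y in positions_b):
            hits.add(doc_id)
    return hits


# Hypothetical sample collection, for illustration only.
docs = {
    "memo.txt": "The swap agreement was signed before the subprime losses were disclosed.",
    "email.txt": "Please review the repurchase agreement today.",
    "notes.txt": "Subprime exposure is discussed at length; the final page mentions a swap.",
}
index = build_index(docs)

# Boolean: both terms anywhere in the document.
print(boolean_and(index, "swap", "subprime"))
# Proximity: both terms within six words of each other.
print(near(index, "swap", "subprime", 6))
# Nested: (swap OR repurchase) AND agreement.
print(boolean_or(index, "swap", "repurchase") & boolean_and(index, "agreement"))
```

The point of the sketch is that these searches need only the text and word positions, which can be read from the native files themselves.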
The “processing” method fails because you are essentially stripping the native file of its properties. Who can look at you straight-faced and tell you that is a best practices model? Using OCR to recreate data and metadata after extraction has documented error rates of up to 50% during document reviews. (See id., “Cutting to the Document Review Chase,” American Bar Association Newsletter, Business Law Today, Vol. 18, No. 2, Nov.-Dec. 2008). And it is expensive. Processing costs range on average between $1,000 and $1,700 per gigabyte.
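To put the per-gigabyte figures in perspective, here is a rough back-of-the-envelope sketch in Python. The two rates are the averages quoted above; the 250 GB collection size is a hypothetical example, not a figure from any actual matter.

```python
# Average processing rates quoted above, in dollars per gigabyte.
LOW_RATE_PER_GB = 1_000
HIGH_RATE_PER_GB = 1_700


def processing_cost_range(collection_gb):
    """Return the (low, high) estimated processing cost in dollars."""
    if collection_gb < 0:
        raise ValueError("collection size cannot be negative")
    return collection_gb * LOW_RATE_PER_GB, collection_gb * HIGH_RATE_PER_GB


# Hypothetical collection size, chosen only for illustration.
low, high = processing_cost_range(250)
print(f"Estimated processing cost: ${low:,.0f} to ${high:,.0f}")
```

At these rates, a 250 GB collection would cost between $250,000 and $425,000 to process before a single document is reviewed.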
Because our courts are mandating that we demonstrate a legally defensible and repeatable methodology for producing e-Discovery, I recommend, at the very least, considering all the native review tools available in the market today that will facilitate first pass document reviews. For instance, there is a relatively new indexing tool that is being ignored: Microsoft’s Indexing Server (MSIDXS), the Search Server Express 2008, which provides enterprise-class search and indexing capabilities for free. If you were to migrate data to the MSIDXS and crawl the data, you would have a much more reliable search experience that searches across Exchange platforms and supports Lotus Notes. The MSIDXS supports the SQL query language, which is well known and makes the Indexing Server very easy to use. The MSIDXS will index all data from the files, folders and web sites.
The entire section of the EDRM model on processing a document collection needs to be reexamined. It is a wasteful use of resources and produces inaccurate results. Any system that demonstrates error rates of up to 50% in the review process must be reexamined.
A system that lets a litigant migrate data from an existing ECM system into a review platform, and so reduce the overall out-of-pocket costs of document review and production, should be looked into. Adopting ECM platforms, MSIDXS, SharePoint, or other hosted cloud repositories or extranets for productions will bring document productions into the 21st century.
Recommendation for Native Reviews
I recommend native reviews that do not process data for the simple reasons of cost and defensibility. If the content is already structured, it can and should be indexed. We need to use the myriad of software applications and utilities available to cull down the amount of e-Discovery required for document reviews and productions in a constructive way, on data that has not been compromised through the OCR process. The technologies are quite promising. We now have options for robust review tools that offer clustering analysis and data mining, and provide for efficient high-level filtering and foldering of data.
ECM provides a controlled data environment. Autonomy has just released a repository for SharePoint that looks promising, and Kroll has introduced a brand new review platform that is used specifically in conjunction with SharePoint. There is no doubt that using ECM with e-Discovery will reduce the overall out-of-pocket costs of litigations today. But we need these tools to be able to migrate the original data without processing it.
The Answers
What are the answers? No two document productions are going to be alike. And not every litigation will require a full-blown ECM solution or analytical review tool when the circumstances of the case would allow a simple swap of a CD-ROM to suffice. But understand that the courts clearly want cooperation between the parties during the discovery process today. Mancia v. Mayflower Textile Services Co., 253 F.R.D. 354 (D. Md. 2008). So, I am using this downtime to familiarize myself with the various document review tools and production platforms available. I’m exploring the cloud solutions. If I can recommend incorporating my client’s ECM platform with a document review or production, I will. You might find I have a very happy client who was able to save a substantial amount of money in their discovery costs.
Julie Wade is a Litigation Support Analyst/PM Consultant in Houston, Texas. You can email Julie at acedparalegal@yahoo.com.
[1] See Sections 555 (securities contracts), 556 (commodity or forward contracts), 559 (repurchase agreements) and 560 (swap agreements) of the Bankruptcy Code.

2 responses so far ↓
Belinda Runkle // February 18, 2009 at 5:25 pm
While I agree with the spirit of your message, I think there’s a whole world of complexity that one must be prepared to defend when trying to leverage in-house IT assets for meeting e-discovery needs. Whether you are using in-house resources, buying something off the shelf, or leveraging third-party providers for e-discovery processing and review, your best bet is to perform some due diligence: test your solution against a well-known set of test data that accurately reflects what you could be ordered to search and produce. That is how you get to feel “warm and fuzzy” that you didn’t just “rob Peter to pay Paul,” i.e., save money on proactive measures only to end up collecting, processing, and searching an enterprise-super-size set of data at the wave of a court order.
Case in point, I will pick on your suggestion regarding use of SQL2008’s indexing service to index and search all your enterprise data. Don’t take it personally. All search technologies have their caveats; none are perfect.
SQL2008’s file type indexing capabilities are inherited via iFilters, which are basically desktop plug-ins that train iFilter-compliant applications like SQL2008 or SharePoint on how to index a specific file type. SQL2008’s base installation doesn’t ship with many preinstalled iFilters. Your version may handle Word docs and HTML by default, but not PDF or compressed file formats like Zips. SQL2008 doesn’t even ship with Office 2007 iFilters out of the box. It’s up to your IT guy to configure your SQL2008 installation and scour the web for the proper iFilters to enable indexing of all the file formats you may be required to search and produce.
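One way to check for this kind of gap before relying on an index is to compare the file types in a collection against the file types that actually have a filter installed. The Python sketch below does only that comparison; it does not query any Microsoft product. The function name, the sample file names, and the set of supported extensions are hypothetical, and the real list of installed iFilters would have to come from your own server.

```python
from collections import Counter
from pathlib import Path


def unsupported_extensions(file_names, indexed_extensions):
    """Count the files whose extension has no installed filter."""
    supported = {ext.lower() for ext in indexed_extensions}
    counts = Counter(Path(name).suffix.lower() or "(none)" for name in file_names)
    return {ext: n for ext, n in counts.items() if ext not in supported}


# Hypothetical collection and filter list, for illustration only.
collected = [
    "contract.doc",
    "index.html",
    "exhibit.PDF",
    "backup.zip",
    "budget.xlsx",
    "README",
]
installed_filters = {".doc", ".html"}

gaps = unsupported_extensions(collected, installed_filters)
for ext, count in sorted(gaps.items()):
    print(f"{ext}: {count} file(s) would not be indexed")
```

Any extension that shows up in the report is a file type that a search would silently miss until a matching iFilter is installed and the data is crawled again.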
The problem with the Microsoft iFilter approach is that it’s really up to the third-party software vendor to create the iFilters for their proprietary file formats. Microsoft has covered most of their major file formats, but there are plenty of other software vendors that haven’t made the effort here. Why? Because iFilters is Microsoft’s thing, and it’s not the most efficient way to index and search documents to begin with.
Since the iFilter is responsible for the most important part of any indexing technology (character mapping and word parsing), you may also want to worry about how those iFilters were coded and tested. Given the specific issues in dealing with non-English data such as Arabic or Japanese, your confidence in searching against non-Microsoft data types in double-byte foreign languages is basically at the mercy of the iFilter authors. Those iFilter authors may have nothing to do with the actual software product team that built the file format.
Long-winded comment, but suffice it to say that e-discovery is inherently more complex and risky than implementing an IT or ECM solution for non-legal enterprise needs.
Hope this helps!
Belinda Runkle // February 18, 2009 at 9:31 pm
I wanted to clarify my comment above. I referenced this as SQL2008, whereas you mentioned Search Server Express 2008. Unfortunately, my information about iFilters is still the same.
You can check TechNet to see which file types are supported out of the box. Beyond that, you will need to install any required MS software as well as add any third-party iFilters to handle non-MS file formats. TechNet also spells out the default foreign language handling that MS’s iFilters support.
http://technet.microsoft.com/en-us/library/cc280343.aspx#section9