Avoiding Non-Technical Pitfalls in Predictive Coding

Predictive coding is an immensely powerful tool for reducing the costs of review, bringing the parties to the merits of a case more quickly and efficiently, and ultimately advancing the search for truth and the administration of justice.

Many pitfalls in a predictive coding project apply regardless of the engine or software that you prefer to use. It is important to select a vendor with experience in managing such projects so that they can advise you on best practices and ‘bird-dog’ potential issues before they happen.

Haystack Information Discovery is happy to share some of these lessons:

  • Respect the Learning Curve – In my experience as a litigation attorney, review projects share some common tendencies. At first we tend to think everything is interesting or “responsive,” only to realize after the 25th occurrence that what first appeared to be an interesting document is really background noise. We will reverse and upset our own calls at the beginning of a review project, only later settling into a consistent interpretation of the review protocol. This is perfectly normal, and going into a review expecting that it will not happen will only lead to disappointment. We call it “discovery” because we don’t know what we’re going to find in the evidence, even if it is our own client’s. As a practical matter, the rates of overturning our own calls and the computer’s calls are going to be higher at the beginning.
  • Amplification – Predictive coding is only as good as the least consistent reviewer on your team. If someone on your review team is not ‘with the program,’ their opinion is going to be amplified 1,000 times across the body of documents. Yes, we implement the safeguards of statistical validation and quality control, but the fewer rounds of quality control that are necessary, the less work for everyone.
  • Reviewing is a Team Sport – Group dissonance about whether documents are responsive, privileged, or related to a particular issue will directly affect any categorization system’s ability to code consistently. The best quarterback in the world is going to make mistakes if two coaches are yelling different instructions in his helmet radio. Every member of the team is training the machine, so frequent communication is important. As a practical matter, reviewers should be very responsible about communicating their opinions and resolving dissonance. Keeping a team environment where mistakes can be discussed and resolved without shame is going to result in the best work product. Most importantly, it is going to result in a rate of overturn that stabilizes more rapidly and therefore in fewer rounds to complete. This is why some predictive coding experts are strong advocates of simply having one highly informed reviewer do the entire predictive coding phase.
  • Keep Careful Track of the Process and Reasoning – Being in litigation, we should understand that the most expensive discovery happens when the evidence is in the greatest state of entropy at the outset. Therefore, it should be no surprise that defending our own work will be more expensive in time and effort if we don’t keep accurate records of what happened and why. The process of our discovery work, why we made certain calls, and why those calls were reasonable should be duly recorded so that it can be efficiently reported to one’s superiors, opposing counsel, or even the tribunal itself. This is what is meant by ‘defensibility’ – a term that is bandied about rather woodenly. Defensibility can mean different things in different contexts, but what it certainly means is that at the end of the day it must be possible to make a strong argument for why a review was conducted in a reasonable way, backed by evidence that it was indeed done in the way that you chose. Then your side will be able to make an FRCP Rule 26(g) declaration certifying that “to the best of the person’s knowledge, information, and belief formed after a reasonable inquiry (A) with respect to a disclosure, [the production] is complete and correct as of the time it is made…”
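
For readers who want to see what tracking a “rate of overturn” across rounds might look like in practice, here is a minimal sketch in Python. It is only an illustration: the ReviewRound record, the first_stable_round helper, the 2% tolerance, and every number in the sample data are hypothetical assumptions made up for this example, not features of any particular predictive coding engine and not a legal or statistical standard.

```python
from dataclasses import dataclass


@dataclass
class ReviewRound:
    """One hypothetical round of quality control in a predictive coding review."""
    number: int
    calls_checked: int     # coding calls re-examined during quality control
    calls_overturned: int  # calls that were reversed on second look


def overturn_rate(rnd: ReviewRound) -> float:
    """Fraction of checked calls that were overturned in this round."""
    if rnd.calls_checked == 0:
        raise ValueError("a round must check at least one call")
    return rnd.calls_overturned / rnd.calls_checked


def first_stable_round(rounds, tolerance=0.02):
    """Return the number of the first round whose overturn rate differs from
    the previous round's rate by no more than `tolerance`, or None.

    The tolerance is an illustrative assumption chosen for this sketch.
    """
    for prev, cur in zip(rounds, rounds[1:]):
        if abs(overturn_rate(cur) - overturn_rate(prev)) <= tolerance:
            return cur.number
    return None


# Invented sample data: overturn rates start high and then settle.
rounds = [
    ReviewRound(1, 500, 120),
    ReviewRound(2, 500, 70),
    ReviewRound(3, 500, 40),
    ReviewRound(4, 500, 35),
]

for rnd in rounds:
    print(f"round {rnd.number}: overturn rate {overturn_rate(rnd):.1%}")
print("first stable round:", first_stable_round(rounds))
```

Keeping a simple per-round record like this alongside notes on why calls were changed is one way to support the record-keeping described above; what counts as “stable” in a real matter is a judgment for the review team and its statistical advisers.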