Separating hype from substance in eDiscovery AI is a daunting task, especially when every solution claims to be the holy grail that cures all your electronic discovery ailments. In reality, there is a spectrum of tools practitioners can use in their Legal Tech toolbox to accelerate time to insight, amplify their decisions, or make the eDiscovery process less of a pain in the rear. Understanding what each tool does, and when it can help, is the first step in becoming an effective tech-enabled legal professional.
Read on to uncover the top flavors of artificial intelligence in eDiscovery today and an explanation of practical ways to leverage each to accelerate time to evidence, control cost, and amplify your legal insights.
Social Network Analysis, sometimes also called a communication web, identifies who is talking to whom within a data set and can uncover communication patterns around individuals or topics being discussed.
This type of analytic is extremely helpful in identifying key people discussing certain topics in a data set and can be used to prioritize, refine or add custodians in a litigation or investigation. If you have a known bad actor or custodian of interest, this type of analytic can help you identify who else was in communication with that person or on a matter of interest.
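To make the idea concrete, here is a minimal sketch of the "who else was talking to this person" question. It assumes the sender and recipient of each message have already been extracted from the metadata. The names, the `messages` list, and the `contacts_of` helper are all hypothetical, and a real platform would work from a much richer communication graph.

```python
from collections import Counter

# Hypothetical (sender, recipient) pairs pulled from message metadata.
messages = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "alice"),
    ("dave", "erin"),
    ("carol", "alice"),
    ("alice", "bob"),
]

def contacts_of(person, messages):
    """Count messages exchanged between `person` and each counterpart."""
    counts = Counter()
    for sender, recipient in messages:
        if sender == person:
            counts[recipient] += 1
        elif recipient == person:
            counts[sender] += 1
    return counts

# Rank counterparts of the custodian of interest by message volume.
ranked = contacts_of("alice", messages).most_common()
print(ranked)
```

Custodians who never appear in the ranked list for any person of interest (here, "dave" and "erin") are the sort of candidates a team might consider deprioritizing.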
I have seen cases where social network analysis was used to prioritize custodians for large-scale reviews, outright eliminate custodians from scope who were not in communication with the key people of interest, or uncover a previously unknown subject that ought to be included in scope. In one matter, I took known subjects from a government investigation at one bank (which ultimately resulted in a billion-dollar fine) and identified who they were communicating with in a completely different organization. This ultimately led to a small, proactive internal investigation that cleared the second organization completely from scrutiny by the regulator.
This type of analytic, powered by AI technology and specifically natural language processing, is all about finding the feels. What does that mean? Well, the algorithm identifies several types of emotion present in a given textual sample, including:
These sentiments can help uncover documents or communication with heightened emotions or identify context between actors communicating in a matter.
In a case, sentiment analysis helps you build the context for a given conversation and the nuances of the relationship between the actors who are communicating. A statement like “I hate this company” would be elevated in a review about a whistleblower as compared to “Tuesday at noon works for me,” because of the linguistic choice of the author. The former is highly polarized, while the latter is more neutral.
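The contrast between those two statements can be illustrated with a toy polarity score. This is only a sketch: production sentiment analysis relies on trained NLP models, whereas the example below swaps in a tiny hand-written word list (`LEXICON`, an assumption made purely for illustration) to show how emotionally charged text could be ranked ahead of neutral text.

```python
import re

# Toy lexicon (an assumption for illustration): word -> polarity weight.
LEXICON = {"hate": -3, "angry": -2, "worried": -1, "great": 2, "love": 3}

def polarity(text):
    """Sum the lexicon weights of the words in `text`; 0 means neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

docs = ["Tuesday at noon works for me", "I hate this company"]

# Surface the most emotionally charged documents first, positive or negative.
ranked = sorted(docs, key=lambda d: abs(polarity(d)), reverse=True)
for doc in ranked:
    print(polarity(doc), doc)
```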
Emotionally charged language can be highly impactful in a variety of litigation and investigation contexts. From identifying language that involves pressure or coercion for a fraud case to emotional language indicating compliance or employment issues, this context building analysis helps surface potentially impactful communication quickly.
This type of AI-powered analytic automatically groups similar concepts together in an interactive visualization powered by unsupervised machine learning. Reveal | Brainspace has an interactive clustering visualization that allows legal practitioners to quickly explore topics of interest across data sets from the tens of thousands to large data sets in the tens of millions and more. By zooming in on topics of interest within unstructured data, lawyers and legal technologists alike can isolate important information quickly and ignore the rest.
This technology is incredibly impactful throughout a matter, from the earliest days when you are trying to identify key issues for your fact-driven case posture, to determining how you prioritize a review, and even bulk identification of non-relevant information. For a government investigation, this means you can identify the key information to develop your posture with a regulator early on, without having to cross your fingers that you happen to find the relevant info early in a more linear review. Some litigators also use tools like this to determine their settlement posture and overall scope of a review before they even have their meet and confer.
In an area of law where humans are the biggest cost factor, this sort of technology allows legal practitioners to determine which data is important enough to justify the cost of human eyes on a document. Does the data contain the word “fraud”? Maybe that should be batched out early for review. Does it contain “fantasy football”? Maybe that can be eliminated or bulk tagged non-responsive. Concept clustering is also capable of making connections that may not be apparent on the face of a document, which can inform the overall matter and review.
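For readers who want a feel for how unsupervised grouping works, here is a deliberately simple sketch. It is not the algorithm behind any particular product. It assumes a greedy, single-pass grouping based on word overlap (Jaccard similarity), and the `STOPWORDS` set, the `threshold` value, and the sample documents are all made up for illustration.

```python
import re

# Small stopword list (an assumption; real systems use far larger ones).
STOPWORDS = {"the", "a", "to", "for", "of", "and", "my", "is", "in", "on", "this"}

def tokens(text):
    """Lowercased set of words in `text`, minus stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.2):
    """Assign each doc to the most similar existing cluster, or start a new one."""
    clusters = []  # each cluster holds its accumulated terms and member docs
    for doc in docs:
        toks = tokens(doc)
        best = max(clusters, key=lambda c: jaccard(toks, c["terms"]), default=None)
        if best is not None and jaccard(toks, best["terms"]) >= threshold:
            best["docs"].append(doc)
            best["terms"] |= toks
        else:
            clusters.append({"terms": set(toks), "docs": [doc]})
    return [c["docs"] for c in clusters]

docs = [
    "Fantasy football draft picks this week",
    "Revenue numbers restated to hide the fraud",
    "Who won the fantasy football league",
    "Auditors asked about the restated revenue",
]
groups = cluster(docs)
for group in groups:
    print(group)
```

No one told the code which topics to look for. The fantasy football chatter and the revenue discussion separate on their own, which is the property that makes bulk tagging or early batching possible.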
Concept search is an analytic process that allows legal practitioners to search by an idea, theme or example material. Concept search differs from concept clustering in that it is guided by an image, document or string of text that a legal practitioner uses to search for other conceptually similar material. As with concept clustering, one of the most powerful impacts of this tech is in making connections that may not be readily apparent to a human reviewer. In the ubiquitous Enron data set, this means that a legal practitioner could start from a document about “fraud” and quickly uncover the term “Raptor” (which was a code word for fraudulent behavior).
In either an investigation or litigation this type of search can be incredibly impactful in uncovering unexpected relevant ESI during the document review stage. If the custodians used keywords or euphemisms that are not immediately apparent, you can uncover them with concept search. Additionally, if you have run a more traditional keyword search and come up with little or no evidence, that might be an indication that the relevant actors have used obfuscating language that concept search can assist in highlighting.
Concept searching is also extremely helpful if you are early on in a matter and not exactly certain about the specific language the organization or people of interest used, but have a general idea of the type of concept you want to search.
The AI model library and the ability to share models make using AI in eDiscovery as simple as using your Netflix queue. Legal practitioners can select a pre-generated AI model specific to the type of matter they are facing and pre-build the AI for their case with the click of a button. This pre-built model supercharges the entire review by using insights from similar matters to prioritize likely relevant data, without you having to train the model yourself. These out-of-the-box, pre-trained algorithms have been battle-tested by clients and our data scientists, so you get a leg up in your next matter.
The applications of this tech in the legal industry are as varied as the movie choices on Netflix, and we keep adding to the marketplace! From models that help identify potentially privileged material to explicit language that might be of interest to an employment litigator or compliance team, to models optimized to identify PII or PHI relevant in a privacy or breach incident, the models' uses are as varied as a legal practitioner's needs.
In eDiscovery, as in many areas of life, context is king in understanding what the heck someone actually means. In human communication, and increasingly in shorter-format SMS and messaging, there is often more to the meaning of a message or sentence than just the words themselves. At the intersection of linguistics, statistics, and mathematics, NLP helps the system build context around the communication using word order.
NLP allows eDiscovery technology to move beyond a “bag of words” approach to a more refined one that can uncover concepts and context based on how humans actually communicate. Think of NLP as a technology that lets an eDiscovery platform understand language the way the lawyers using it do. The result is better insights and more informed decision making. And the NLP-powered models are not just smarter at the outset; they also learn more quickly than older tech from the specific context reviewers apply when coding the data, so they get smarter faster.
Literally any legal matter can benefit from the improved context and information retrieval that NLP-powered technology provides. Think of the above picture, “man-eating chicken.” If you were searching for a human male eating fried chicken, NLP-powered technology would not return a picture of a 6-foot-tall chicken chasing a man. NLP assists in conceptual search and active learning, which can accelerate time to evidence for any flavor of litigation or investigation.
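The difference between a “bag of words” and an order-aware view can be shown in a few lines. This is only an illustration of why word order matters, not a description of how any NLP engine is built. The two sample sentences and helper names are made up.

```python
from collections import Counter

def bag_of_words(text):
    """Word counts only; word order is thrown away."""
    return Counter(text.lower().split())

def bigrams(text):
    """Adjacent word pairs, which preserve local word order."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

a = "man eats chicken"
b = "chicken eats man"

# Identical as bags of words, different once order is considered.
print(bag_of_words(a) == bag_of_words(b))
print(bigrams(a) == bigrams(b))
```

A system that only counts words cannot tell these two sentences apart, while even the simplest order-aware feature can.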
There is a veritable alphabet soup when it comes to this category. Sometimes called Technology-Assisted Review (TAR), continuous active learning, or predictive coding, this application of AI is simply tech that learns based on what the lawyers and technologists using it do. Inputs by reviewers are used to build an algorithm that prioritizes and categorizes data as relevant or not relevant to a specific issue based on the feedback (behavior) of the team reviewing the data.
Early iterations took a more static approach, with a small set of data being used to train the algorithm over a series of iterations until the model stabilized. Today, the technology has evolved beyond needing these seed sets and static training rounds, and can actively learn and refine the algorithm and the categorizations in real time. This means the system continually gets “smarter” and makes better recommendations. And the reviewers who get better recommendations are able to review more efficiently and potentially review fewer documents in a matter before uncovering all the relevant data.
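The feedback loop described here can be sketched as follows. It is a simplified illustration under stated assumptions, not the algorithm used by any product. The model is just a table of additive word weights, the `reviewer` function stands in for a human coding decision, and the documents are invented. The loop starts with no seed set, surfaces the highest-scoring unreviewed document, and updates the weights after every coding call.

```python
import re

def tokens(text):
    """Lowercased set of words in `text`."""
    return set(re.findall(r"[a-z]+", text.lower()))

def review_loop(docs, reviewer):
    """Serve docs one at a time, re-ranking after every reviewer decision."""
    weights = {}            # word -> learned relevance weight (starts empty)
    remaining = list(docs)
    reviewed = []
    while remaining:
        # Surface the unreviewed doc the current model scores highest.
        doc = max(remaining, key=lambda d: sum(weights.get(t, 0.0) for t in tokens(d)))
        remaining.remove(doc)
        relevant = reviewer(doc)
        reviewed.append((doc, relevant))
        # Learn immediately from the reviewer's call.
        delta = 1.0 if relevant else -1.0
        for t in tokens(doc):
            weights[t] = weights.get(t, 0.0) + delta
    return reviewed

docs = [
    "fantasy football picks",
    "restated revenue hides fraud",
    "football tickets for sale",
    "auditors question restated revenue",
]

# Hypothetical stand-in for a human reviewer's relevance decision.
def reviewer(doc):
    return "revenue" in doc

order = [doc for doc, _ in review_loop(docs, reviewer)]
print(order)
```

After a single "relevant" call on the fraud document, the auditors document jumps ahead of the football tickets email even though it sits later in the original order.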
As with NLP, this technology can be impactful on any type of case. And while earlier iterations required quite a bit of lift on the part of people using it and a massive amount of data to start being beneficial, the modern iteration can be impactful on cases of any size without having to even change your workflow. The AI of active learning can run in the background and simply surface increasingly relevant documents in a review.
There are more advanced workflows that can use active learning to eliminate large volumes of data from human review and/or prioritize what is reviewed and by whom, but even a linear approach to the review process can benefit from the improved insights provided by an active learning powered eDiscovery AI platform.
As in actual construction, with legal technology it is easy to think that everything is a nail when all you have is a hammer. Finding a suite of solutions or an all-in-one solution with some or all of these flavors of AI allows you to be nimble in tackling all your eDiscovery challenges. Some cases may be perfect for leveraging concept clustering, while others may benefit from some combination of the above. The key is to start using all the eDiscovery AI tools out there to help you and your case team uncover evidence faster than the competition and at a fraction of the cost of brute-forcing with just human review. Using this spectrum of analytics supercharges you and your law firm or legal team and will help you spend less time and money to uncover the evidence you need to build your case or conduct your investigation.