
Getting Greater Value from eDiscovery Data with Generative AI

George Socha
April 24, 2025

5 min read

On April 10, 2025, I facilitated a KnowledgeBridge session, “Getting Greater Value from eDiscovery Data with Generative AI”, at Consero’s Corporate Litigation & Investigations Forum in Litchfield Park, Arizona. This was my first time attending a Consero event – to say nothing of a KnowledgeBridge – so I had no idea what to expect.

Our KnowledgeBridge was a group session, with a little over 25 people participating. It was structured as a facilitated discussion. We had a topic, how to get greater value out of eDiscovery data using GenAI, which served as the starting point for our discussion. My job, as I understood it, was to get a discussion going, keep it moving, make sure people had the opportunity to ask whatever questions came to mind, get their questions answered, and end on time! Based on feedback received, we met those goals.

Our session took place partway through the second day of the forum. I hoped that would give me the luxury of getting a sense of where forum participants stood on our topic.

It did.

Based on questions I heard and the responses they received, I got the sense that a significant number of the forum participants:

  • Already had experience using GenAI in their personal lives;
  • To date had not used GenAI in the work setting or had used it only in a limited and controlled fashion;
  • Wanted to learn more about what GenAI was, what could be done to ensure it could be used safely and effectively, and, to a lesser extent, how it might be used in the eDiscovery context.

With those as my new marching orders, we began our session.

I won’t try to recap the entire discussion. Here, however, are some of the highlights along with thoughts of my own.

Their experiences using GenAI in their personal lives

Many of the people participating in the KnowledgeBridge have used some form of GenAI in their personal lives. Probably the most used implementation, no surprise, was ChatGPT. They reported good success using the technology, finding that GenAI could assist them in many of the ways we all have been hearing about: brainstorming, planning an activity, preparing the first draft of a message or document, and so on.

Why they have not used GenAI more in the work setting

It was a different situation in the work setting – a response I routinely hear. In their professional lives they are far more cautious about whether they use GenAI at all and, if they do opt to use it, far more cautious about how. The reasons they cited also were the ones I routinely hear and ones with which I sympathize. Here are some of the top reasons, although the phrasing is mine:

“I need to better understand how GenAI works before I can be comfortable using it at work.”

“How can I know whether my data is safe – not accessible by people who should not have access to it and certainly not shared with the world?”

“How can I know whether the results are accurate, reliable, and verifiable? We’ve all heard the tales of woe about attorneys who relied to their detriment on GenAI hallucinations and none of us wants to repeat that mistake.”

“I don’t know how much using GenAI is going to cost. Litigation costs – and especially eDiscovery costs – already are too unpredictable. I don’t need unpredictable GenAI costs compounding that problem.”

How they might use GenAI safely and effectively at work

The session was not just a time to voice concerns. It also was an opportunity to discuss how those concerns might be addressed. Here are some of the points we covered:

How does GenAI work?

We discussed some of the basics of what GenAI is and how it functions. We talked about pure GenAI tools versus tools that combine GenAI and other capabilities, ones that have adopted, for example, Retrieval-Augmented Generation.
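
To make the Retrieval-Augmented Generation piece a bit more concrete, here is a minimal sketch, in Python, of how the pattern works. The helper names (embed, llm_complete) and the similarity math are illustrative stand-ins of my own, not any vendor’s actual API; the point is simply that the model is asked to answer from retrieved documents rather than from whatever it absorbed during training.

```python
# Minimal, illustrative RAG sketch. embed() and llm_complete() are hypothetical
# placeholders for whatever embedding model and generative model a platform
# actually uses; no particular vendor API is assumed.

from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def embed(text: str) -> list[float]:
    """Placeholder: return a vector representation of the text."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Placeholder: send the prompt to a generative model and return its reply."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def answer_with_rag(question: str, corpus: list[Document], top_k: int = 3) -> str:
    # 1. Retrieve: rank documents by similarity to the question.
    q_vec = embed(question)
    ranked = sorted(corpus, key=lambda d: cosine(embed(d.text), q_vec), reverse=True)
    context = ranked[:top_k]

    # 2. Augment: put only the retrieved text in front of the model, and tell it
    #    to answer from that text alone.
    sources = "\n\n".join(f"[{d.doc_id}]\n{d.text}" for d in context)
    prompt = (
        "Answer the question using only the sources below. "
        "Cite the source IDs you relied on. If the sources do not contain "
        f"the answer, say so.\n\nSources:\n{sources}\n\nQuestion: {question}"
    )

    # 3. Generate: the model's output is grounded in the retrieved material.
    return llm_complete(prompt)
```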

There now is a wealth of information available on GenAI. For general information, prudent use of the internet can take you a long way. If vendors you use offer GenAI capabilities, they might be a good source of information. If you are evaluating vendors, perhaps as part of a selection process, each of the vendors you are considering ought to be eager to share information with you.

Colleagues can be a great resource. Sometimes you just need to step down the literal or figurative hall, as you might have a co-worker with deep expertise in this area.

Various levels of educational content are available, everything from explanatory sections on websites to formal programs. One client, wanting to better understand AI and what it might offer him, even earned a master’s degree in data science.  

Is my data safe?

We agreed that we all shared the same fundamental data safety concerns: we want our data under our control, accessible only by those who ought to have access to it, and unavailable to all others.

How to get there might not always be easy, but we discussed a few of the steps you can take. If you are using – or considering using – a vendor’s software, ask the vendor what they do to protect your data. And find out what information they can give you to back up the statements they make. Ask for references, if that makes sense, and then make sure you follow up with the references.

If references are not available and even if they are, try to find out who else has used the technology and, if you can, contact some of those people. Bear in mind, of course, that a reference ought to have good things to say, and that a former user of a tool might paint a picture that is far more negative than warranted.

If you have internal data security personnel or someone similar, seek their guidance. You may well need their sign-off anyway, so start with them early in the process if you can.

Start with the basics, and go from there: Where is my data stored? Who has, or can have, access to it? Under what circumstances? What is done to monitor access? What steps are taken if issues arise? Does my data go anywhere else? If it does, what safety and security measures are in place?

Are the results I get accurate, reliable, and verifiable?

Find out what measures are taken to ensure accuracy and reliability. You want to know what is done to prevent GenAI from hallucinating, of course. But you also should find out what data the GenAI technology has access to, which of that data it uses, and how it uses that data.

Test the system. Give it instructions whose results you already know, as well as instructions you can verify independently from anything the GenAI system does. See what happens when you ask for results when you know the data it is using does not contain an answer to what you are looking for.
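
If it helps to picture what that testing might look like in practice, here is a small sketch of a known-answer test plan. The ask() wrapper, the questions, and the expected answers are all hypothetical; substitute whichever Q&A feature you are evaluating and facts you can verify on your own.

```python
# Sketch of a known-answer test plan for a GenAI Q&A feature. ask() is a
# hypothetical wrapper around the system under test; the questions and
# expected answers below are illustrative only.

def ask(question: str) -> str:
    """Placeholder: submit a question to the system under test and return its answer."""
    raise NotImplementedError

test_cases = [
    # Questions whose answers you already know from the record.
    {"question": "What is the effective date of the 2021 supply agreement?",
     "expect_contains": "March 1, 2021"},
    # Questions you can verify independently of the GenAI system.
    {"question": "Who signed the engagement letter?",
     "expect_contains": "Jane Doe"},
    # A question the loaded data cannot answer: the right response is a
    # refusal, not a confident invention.
    {"question": "What was discussed at the 2030 board meeting?",
     "expect_contains": "cannot find"},
]

for case in test_cases:
    answer = ask(case["question"])
    passed = case["expect_contains"].lower() in answer.lower()
    print(f"{'PASS' if passed else 'FAIL'}: {case['question']}")
```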

Determine how you can verify results within the system. Does it show you what documents or other sources it got its answer from? Does it show you what information in a document it used?

Find out whether the system can track the prompts, instructions, or queries you gave to it, the responses you got back, and the information it used to generate those responses. You might not want to turn this type of capability on, but you certainly want to know whether it is there and, if it is, how it works. While tracking is important for tasks such as verification, it also is necessary to allow you to return to work later and to collaborate with colleagues.
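
As one illustration of what such tracking might capture, here is a sketch of an audit record for a single GenAI interaction. It is not any platform’s actual schema; the field names are assumptions chosen to show the kinds of information worth keeping.

```python
# Illustrative sketch of what an audit record for a GenAI interaction might
# capture. This is not any particular platform's schema; field names are
# assumptions chosen for readability.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class GenAIAuditRecord:
    user: str                  # who ran the prompt
    prompt: str                # the exact instruction or question submitted
    response: str              # the answer that came back
    source_doc_ids: list[str]  # documents the system says it relied on
    model: str                 # which model or version produced the answer
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# A log of such records lets you return to earlier work, share it with
# colleagues, and verify later how a given answer was produced.
audit_log: list[GenAIAuditRecord] = []
```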

How they might use GenAI with eDiscovery data

We discussed some of the more mature options available today for using GenAI with eDiscovery data: what can be done and how that can be useful. (Maturity, remember, is a relative term, especially given that GenAI has only been available for commercial use for a brief period.) We also discussed some of what I consider emerging options.

Question and answer

We discussed one of the earlier – and likely most successful to date – forms of GenAI to become available within eDiscovery platforms: implementations that allow users to query their data much like they would ask questions of an interviewee.

If done properly, those implementations should check all the boxes from above: They would use data loaded into the eDiscovery platform. They should not use data from outside the platform. They should tell you if they cannot find an answer to your question in the data. They should show what documents or other data sources they used to formulate the response they gave to you. They should show you the exact text or other content they used. They should have the ability to track the work performed.

A great benefit of Q&A implementations is their adaptability. You can start using them as soon as there is data for them to access. If you are working on repeat litigation, that means you may be able to turn to these systems for assistance even as you first learn of a potential dispute, prepare an initial complaint, or put together the first answer. You can turn to them for insights into factual content as you prepare motion papers, interview witnesses, draft discovery requests and responses – really, at any point in the life of a lawsuit (or an investigation) where you need to learn more about the data you have loaded into your eDiscovery platform.

Early Q&A systems were (I think) all-or-nothing propositions. You had to ask your questions of the entire dataset loaded into the platform. Now at least some of the platforms let you specify which documents are used, letting you, for example, use one or more of the eDiscovery platform’s search capabilities to home in on a specific set of documents. Some platforms also give you ways to search within a single document, a capability that can be especially useful when you are trying to locate one key bit of information in, say, a 153-page document.
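
Conceptually, that scoping is a filter placed in front of the same question-and-answer pipeline. Here is a brief sketch that reuses the hypothetical Document and answer_with_rag() helpers from the earlier example; the select() function stands in for whatever search or filtering capability a given platform actually provides.

```python
# Sketch: scoping a question to a subset of the corpus. select() stands in
# for whatever search or filtering the platform provides (custodian, date
# range, saved search, a single document); it is not a real API.

def select(corpus: list[Document], predicate) -> list[Document]:
    """Narrow the corpus to the documents that satisfy the predicate."""
    return [d for d in corpus if predicate(d)]

# Example (illustrative names): ask only within documents that mention a
# particular project, or within one long document you are trying to pin down.
# scoped = select(corpus, lambda d: "Project Falcon" in d.text)
# answer = answer_with_rag("Who approved the budget increase?", scoped)
```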

Summarization

Another function that is becoming available is the ability to have GenAI prepare a summary of a document.
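
Because long documents often exceed what a model can read in one pass, summarization features frequently work chunk by chunk behind the scenes. Here is a sketch of that approach, again using the hypothetical llm_complete() helper from the earlier example; actual platforms will differ in how they split documents and stitch partial summaries together.

```python
# Sketch of chunked summarization for a long document, assuming the same
# hypothetical llm_complete() helper as above. Chunk sizes, prompts, and the
# way partial summaries are combined will vary by platform.

def summarize_document(text: str, chunk_chars: int = 8000) -> str:
    # Split the document into pieces small enough for the model to read.
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

    # Summarize each piece on its own.
    partials = [
        llm_complete(f"Summarize the following excerpt in a few sentences:\n\n{chunk}")
        for chunk in chunks
    ]

    # Then summarize the summaries into one overview of the whole document.
    combined = "\n\n".join(partials)
    return llm_complete(f"Combine these partial summaries into one concise summary:\n\n{combined}")
```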

First-pass document review

One of the need-to-do steps in most lawsuits is the identification and production of documents in response to a set of document requests. This often is not high on outside counsel’s list of what they would like to do, and certainly it does not rank highly on in-house counsel’s list of how they would prefer to spend their organization’s money.

Virtually every provider of an eDiscovery platform is working on delivering a viable GenAI-based approach to making first-pass review work better. The reviews I have heard of the options available so far have been mixed – but these are still early days.

Privilege review

Up there with first-pass review is privilege review, except that the latter is more complex, more challenging, and harder to address successfully. In my opinion, we all are still in the exploratory stages of figuring out how to use GenAI to tackle priv review and do so in a way that makes it worthwhile.
