Each week on ACEDSBlogLive, I will be chatting with a leader in eDiscovery or related areas. For the September 25, 2020 session, I got together with Aaron Crews. Aaron is with Littler, the world’s largest employment firm, where he serves as the firm’s first Chief Data Analytics Officer. Aaron and I covered a lot of ground. We discussed what is and what is not artificial intelligence; looked at how to crawl, walk, and then run with AI in litigation; and tried gazing into our murky crystal ball to figure out where AI will take litigators in the future.
Recorded live on October 16, 2020 | Transcription below
Note: This content has been edited and condensed for clarity.
George Socha:
Good morning and welcome. As Joy Heath Rush, CEO of ILTA, likes to say, good morning, good afternoon, and good evening, because I don’t know where you’re watching from or what your time is right now.
Welcome to #ACEDSBlogLive. I am George Socha, Senior Vice President at Reveal. Each Friday morning at 11 am Eastern, I host an episode of #ACEDSBlogLive, where I get a chance now to chat with luminaries in eDiscovery and related areas.
In a little bit I’ll bring on this week’s luminary, but first a couple of things including a logistical note for everyone. In this platform there is the ability to post chats. I will be monitoring the chats throughout, interrupting as appropriate to introduce questions or thoughts, so please make it a lively chat so we can have a good discussion. Next, before I introduce this week’s guest, I’d like to take a moment to pause to remember someone who was one of the quiet geniuses and pioneers in our industry, David Kittrell, who recently passed away.
David, if I understand correctly, wrote the very first eDiscovery review tool, probably in the late 1980s or the early 1990s. He continued from then on to make contributions, significant contributions, to the eDiscovery industry for decades. I first worked with David, starting in 2003. We were working on a project together for a joint client. I recall being at a meeting out in Seattle in a law firm conference room, where three of the walls of a decent sized conference room were floor to ceiling whiteboards. David took markers and covered those white boards from the floor as high up as he could reach with formulas, equations, snippets of code, outlining the structure we ended up using for that and other matters. The best way to describe what he put together then, in today’s terminology, would be an AI-based data clustering and predictive analytics platform, the likes of which did not hit the marketplace for years to come.
David and I launched two startups, neither of which unfortunately went very far. It would have been very interesting if either had because of what he was trying to architect, his use of servlets and AWS functionality, and a few other things. He was unique amongst eDiscovery consultants as far as I know in that he worked as a consultant with two of the largest technology companies out there who are direct competitors and yet they’ve both had him working. Pretty amazing. He hated to speak at conferences or webinars. I roped him into a few, and when I did manage to pull him on, he invariably was the panelist who left the audience with the most to ponder. He is of course missed by many.
Now, I want to bring you someone else who will give you something to ponder. Aaron Crews is joining us this week as today’s guest. Aaron is the Chief Data Analytics Officer at Littler. Aaron, you’re going to have to put up with me talking a little bit longer before we get to you. He is one of a small and rarefied group, one of the folks I think of as being part of the eDiscovery trifecta. Aaron has worked in-house as an eDiscovery attorney, notably at Walmart, where he was Senior Associate General Counsel and Head of eDiscovery. He has worked at an eDiscovery provider, where he was General Counsel and VP of Strategy at Text IQ. And both before and after those, he was at and is again at Littler. Earlier on he was a shareholder and eDiscovery Counsel, and now, as I said, he is their Chief Data Analytics Officer. And I’m sure I don’t even need to explain who and what Littler is.
Today’s topic and the thing we want to talk about is artificial intelligence and the use of it in the practice of law and the use of it in litigation, and there are three areas we want to go into. I’ll lay them out and then we will be up and running. First Aaron is going to help us figure out what in the world we mean when we talk about AI in this context and help dispel some of the myths. Second, we will look at where decision systems can really be valuable and third and probably most helpful for many, we will go into some concrete examples he can talk about of the use of AI in this context. But Aaron, I’ve talked long enough, let’s get you started. Throw up and dispel some myths about AI for us. What in the world are we talking about?
Aaron Crews:
Happy to, and thanks for having me. I’m not sure the term luminary applies to me, but I appreciate being tacked on. When it comes to AI, especially in this market, there’s a lot of snake oil, smoke, whatever you want to call it, and I think that that’s true generally. A couple of years ago, you saw blockchain attached to everything under the sun, and everything was being built on blockchain. AI has probably had a longer marketing tail than blockchain has. When people use the term AI, they really kind of lump a bunch of things in there together. But for me, I think of artificial intelligence as essentially decision systems, things that mimic human decision-making in one way or another. There are a whole number of examples of them that people run into all the time, the systems that help serve up – this is something I talk about all the time when I’m asked about this – the systems on Netflix that constantly recommend that I should watch My Little Pony. That system is an artificial intelligence tool, it’s essentially a cost training prediction tool for all intents and purposes. It’s basically the same functionality your spam filter is using, it’s the same functionality that’s recommending products to you on Amazon or whatever e-marketplace you are shopping on.
And then there are a lot of things that really are not kind of AI, they’re kind of hard-coded. Essentially they’re like functions and math; “if you put in input x, you get output y” kind of stuff. And then there are a lot of pieces in between that, that we can kind of talk about. But if you really think about what this is, for all intents and purposes, it’s statistics, it’s statistical models being run at various levels of depth. You have these algorithms that are calculating, based on what their inputs are, they’re just calculating an output. At the end of the day if you really think about that, at least for me, what it does is reframe how these things should be used. And so one of the things I talk about all the time, is that if you and your organization are not really good at analytics, you’re not ready to really heavily pursue AI. It’s kind of a crawl, walk, run sort of scenario. And the walk is being strong in the analytic spaces and being able to already leverage and make sense of data in support of business functions, and then when you move to AI, what you’re really doing is you’re automating that data analysis in one way or another to augment human decision making.
George Socha:
I’d like to come back to the crawl in a moment, but I got a couple of yes/no, thumbs up/thumbs down questions. AI or not AI in this context.
Aaron Crews:
Sure.
George Socha:
Email threading?
Aaron Crews:
Not AI.
George Socha:
Why not?
Aaron Crews:
Because it’s essentially blunt force tooling, all you’re doing is you’re taking all the messages that are aligned to one another and you’re just writing a rule. Not AI.
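The "just writing a rule" point can be made concrete with a minimal sketch. It assumes each message carries a hypothetical `id` and an optional `in_reply_to` field; real threading tools rely on more headers and on text comparison, so this is only an illustration of the rule-driven idea, not how any particular product works.

```python
from collections import defaultdict

def thread_messages(messages):
    """Group messages into threads by following reply links to a root.

    `messages` is a list of dicts with hypothetical keys 'id' and
    'in_reply_to' (None for a message that starts a thread).
    """
    parent = {m["id"]: m.get("in_reply_to") for m in messages}

    def root_of(msg_id):
        seen = set()
        # Walk up the reply chain; stop at a root, an unknown parent, or a cycle.
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    threads = defaultdict(list)
    for m in messages:
        threads[root_of(m["id"])].append(m["id"])
    return dict(threads)
```

Nothing here is learned from data: the same fixed rule produces the same grouping every time, which is the distinction being drawn in the answer above.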
George Socha:
Okay. Concept searching?
Aaron Crews:
Not AI because it’s human driven. You’re punching in whatever the search terms are and then it’s essentially tying those together.
George Socha:
How about concept clustering?
Aaron Crews:
Closer, because it’s the machine doing it on its own right. Closer probably, because it’s an automated first run by the system. You don’t have human intervention, you essentially hit go and the algorithm stratifies your data into chunks based on essentially how it sees the various pieces tying together. Usually that’s a system, it’s either clustering or vectors or some combination thereof, and so that is going to be significantly closer, in my view.
George Socha:
Predictive coding, “TAR”, or whatever else you want to call it?
Aaron Crews:
Yes, particularly the active learning piece. Lots of human intervention there. In fact, if you really think about the continuous active learning version of predictive coding, essentially it’s the same as teaching a system to recognize a cat in a YouTube video. Through the process of humans going through and saying “Yes” and “No” in terms of what’s relevant and what’s not, or what’s on point and what’s not, you’re essentially annotating. What you’re doing is you’re teaching the algorithm to refine whatever its base level was and improve its throughput. So there, yes, that’s a quintessential example of something that I would call machine learning or AI or decision system or whatever title you want to put on that. I don’t like the term “artificial intelligence”, only because I think it’s like cloud. Cloud is anti-marketing speak for other guys’ computers; AI is marketing speak for machine learning or decision systems or whatever else you want to throw at it.
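The annotate-and-refine loop described here can be sketched in a few lines. This is a toy under stated assumptions: `WordScorer` is a hypothetical perceptron-style word-weight model standing in for whatever classifier a real review platform uses, and `reviewer` is a stand-in for the human saying "Yes" or "No".

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

class WordScorer:
    """Toy relevance model: one weight per word, updated perceptron-style."""

    def __init__(self):
        self.weights = Counter()

    def score(self, text):
        return sum(self.weights[w] for w in tokenize(text))

    def learn(self, text, relevant):
        # Only adjust weights when the current prediction disagrees with the human.
        predicted = self.score(text) > 0
        if predicted != relevant:
            step = 1 if relevant else -1
            for w in tokenize(text):
                self.weights[w] += step

def review_loop(documents, reviewer, batch_size=1):
    """Repeatedly serve the highest-scoring unreviewed documents to a reviewer.

    `documents` maps doc_id -> text; `reviewer` is a callable returning
    True (relevant) or False for a document's text.
    """
    model = WordScorer()
    unreviewed = dict(documents)
    decisions = {}
    while unreviewed:
        ranked = sorted(unreviewed, key=lambda d: model.score(unreviewed[d]), reverse=True)
        for doc_id in ranked[:batch_size]:
            text = unreviewed.pop(doc_id)
            decisions[doc_id] = reviewer(text)    # the human annotation
            model.learn(text, decisions[doc_id])  # the model refines itself
    return model, decisions
```

Each human decision feeds straight back into the ranking, so documents resembling those already marked relevant rise to the top of the review queue.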
George Socha:
A rather snarky definition of AI that I ran across a while back goes something like this – “artificial intelligence is the stuff we either wish or fear computers could do that they don’t today. And as soon as they’re able to do something it’s no longer artificial intelligence, it’s something else”.
Aaron Crews:
Yeah, I like that a lot for a couple of reasons. The one that I say all the time, is these systems are neither artificial nor intelligent. They’re computers running sometimes very complex, statistical models but at the end of the day they’re just running statistics and they’re not intelligent in any way. They’re literally spitting out an output that they have “learned” by their inputs but there’s no intelligence behind them at all.
George Socha:
Okay, so one more set of examples: model libraries or any other type of structure that’s built on the same sort of predictive analytic tools that predictive coding is built on.
Aaron Crews:
I would argue yes, largely because again most of that is an automated function. The way I think about a lot of these things is, what we’re doing is we’re creating sort of elastic models that allow machines to do at least part of the heavy lifting of what humans used to do in terms of trying to calculate or trying to divide data or trying to understand and synthesize what information is saying. It’s kind of the winnowing function. If most of that is automated and if it’s automated in such a way that is continually improving through that process, it probably falls in the bucket, and if it’s really people either hard-coding things in or if it’s essentially just a log of the decisions that people made, then it’s not.
George Socha:
So you mentioned “winnowing function”, what is that?
Aaron Crews:
I believe it’s an old farming term.
George Socha:
Back to what you said earlier about “crawl, walk, run”, I think probably very few of us who handle lawsuits are ready to run with this artificial intelligence yet. A few of us can walk. Most of us, crawling is all we can aspire to. For the people who handle lawsuits who aspire to crawl, where do they start, what do they do, and how?
Aaron Crews:
I really think honestly the answer to that is, get comfortable with data, and when I say get comfortable with data, I mean data beyond like e-mail and text messages. I think the eDiscovery world in particular has been focused on and fascinated on what I would call unstructured communication data for a very long time, for obvious reasons. It’s not an irrational focus or place to look, but the world has kind of moved on in a way. The way I’d like to describe this – and Danny Regard at IDS is the person I bounce this kind of stuff off of a lot and Hunter McMahon over there as well, we kind of talk about these things over cocktails from time to time – and the way I describe this is,
data reinforces data. If you have data, particularly data that’s being generated by systems that are not designed to tell a story, things like log files and all kinds of stuff, it doesn’t lie.
You can take that data and you can begin to analyze it to figure out what the sequence of events were, what happened. A lot of times if you overlay data on top of data, on top of data, on top of data, you can begin to get a really clear picture of exactly what went on, who did what, when. The communication data that we have always focused on in the eDiscovery space, really is the icing, so to speak. It’s the window dressing, it’s the drapes, it’s whatever you want to call it. It’s not the fundamental structure.
George Socha:
It sounds like things like marketing data, manufacturing data, information about component parts suppliers, information about customer complaints – all of those types of things – if you can get that content, throw it into that giant AI hopper, turn the crank, and something good comes out of the end in theory?
Aaron Crews:
You don’t even have to throw it into an AI hopper. I can give you even a more concrete example than that. We’re the largest labor employment law firm in the world and I only say that because it reinforces the fact that that’s all we do. We do labor employment, we do all aspects of it. Even stuff that people don’t immediately think is necessarily labor employment, like trade secrets and unfair competition law, particularly litigating those kinds of cases, and trade secret cases and wage and hour class actions are really great examples of places where you can use analytics, whether that’s just people crunching numbers, looking at data, or automated systems that actually start to really take deep looks at this data. You can start to map out what people did and where they went. You can take things like wage and hour class action, where the allegation is essentially, “I worked eleventy billion hours, I never got a lunch, I never got a break, I was scheduled so tightly that my 17-hour days and I was only supposed to work eight and you only paid me for six and so you owe me 35 gagillion dollars”. Invariably that’s how those things are always pled. Today, it is a rare organization where people walk in and touch nothing that leaves electronic footprints or a digital cloud around them. You can collect those electronic footprints, that digital cloud, and start to analyze it, to really say which parts of this complaint might be true and which parts are obvious nonsense. Just start thinking about it. Think about, not in a COVID world, but actually we could do a COVID world next, but let’s think about pre-COVID world. You had your car and your phone, both probably have some level of GPS data on them, so I can tell you where you went and what you were doing. You probably parked your car in some version of a parking garage or a parking lot that you may have had a badge swiped into. 
You probably had a badge swipe into a door to get into a building, but those doors and parking lots have logs. You probably logged onto a computer and multiple systems when you did that. Well those all have log-in logs and then you are active inside of what I call core systems, the things that you really need to touch in order to do your job, and if the allegation is, “I was working like crazy and whatever”, I can get a real clear picture of what your day looked like. I might not be able to tell you exactly what you were doing, but I can tell you if you were active. I can tell you when you started and when you stopped and I can tell you where you were. And that is really, really helpful.
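The idea of layering badge, parking, and login logs to see when someone started and stopped can be sketched as follows. The source names and timestamp format are assumptions for illustration; real exports would need per-system parsing and time zone handling.

```python
from datetime import datetime

def daily_activity(sources):
    """Merge events from several logs into first/last activity per day.

    `sources` maps a hypothetical source name (e.g. 'badge', 'login')
    to a list of ISO 8601 timestamp strings for one employee.
    """
    events = sorted(
        (datetime.fromisoformat(ts), name)
        for name, stamps in sources.items()
        for ts in stamps
    )
    days = {}
    for when, name in events:
        day = days.setdefault(
            when.date(), {"first": when, "last": when, "sources": set()}
        )
        day["last"] = when  # events are sorted, so the latest one wins
        day["sources"].add(name)
    return days

def hours_active(day):
    """Span between first and last recorded event, in hours."""
    return (day["last"] - day["first"]).total_seconds() / 3600
```

The span between first and last event is only an outer bound on the working day; it shows when activity started and stopped, not what the person was doing in between, which matches the limitation noted above.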
George Socha:
You’ve got great data sources available.
Aaron Crews:
Yep.
George Socha:
Where does the AI part fit in with this? Well, I can take all that data and I could start going through it manually and figure all of that out eventually.
Aaron Crews:
Yeah, but you can begin to write programs that analyze the various tranches of data in certain ways to let you see timelines, for instance, what did people do when, just blocking out specific events. You can write those such that you can dump data in and it’s going to spit data out. Then what you can also begin to do is write programs that tie those various analyses together and begin to automatically generate something that is super useful and layered, layered analysis that lawyers in particular can then use to make strategic decisions about what goes on in those cases. What does the liability potentially look like? How bad is this thing, is it good or is it not? We’ve had cases where the allegation is, “Our client owes 70 million dollars in damages,” and when we look at the GPS data and all the other things, what we find out is, “You know what? They’re actually pretty clean.” I think that 70 million dollar case, I think the actual liability was something like 30 or 40 thousand bucks, which is a radical transit.
George Socha:
Very different. It sounds as if you then want to start to get your hands on these sources of data and put at least some of that into some form of AI system as early as you can on the life of the matter.
Aaron Crews:
Yeah, this makes me less popular than I am already, which is actually a hard thing to believe. But I am constantly ringing the bell that you’ve got to get the data early and you’ve got to get to us early. The sooner you can start laying these data sets into something that allows you to synthesize and understand them, the better. If you really think about the kinds of AI that we’re working with today and are going to be working with in the very near future, particularly as spatial computing really comes into its own, these are essentially human decision-aiding functions, right? It’s like having a little librarian next to you that can essentially synthesize large amounts of information. People don’t do two things well: they don’t work for really long periods of time and they’re not able to rapidly digest large amounts of information. But the thing people do really well, particularly lawyers, is take synthesized information, look at that, and go, “Ok, now I should do this with that strategically or tactically,” and that’s really the value proposition here for all of us.
George Socha:
People keep posing questions about the use of artificial intelligence in the world of litigation or any form of technology as a John Henry versus the steam drill debate. Sounds like you are much more in the camp I am in, which is, what we’re really talking about is creating a super powerful exoskeleton that litigators can strap on, making them stronger, faster etc. than they ever could otherwise be.
Aaron Crews:
Yeah, I think that that’s right, I mean, I probably should have said something. I fall into a camp, I don’t believe in the idea of a super intelligent artificial intelligence. I don’t believe machines ever become intelligent. I look at what this fundamentally is, which is deep statistical analysis, and I look at that and say how do you make the leap to something that’s the cognitive equivalent of a dog, let alone like a human? The idea that Hal is coming or the Terminator or any of the Hollywood tropes that sit around this, I don’t ever see that happening. I’m often wrong but never in doubt, so I may very well be off on this but that’s my personal view of the world. If you think about it like that, then the explanation that you just laid out makes all the sense in the world. What these things do is they allow us to take low level work and automate that so that we as lawyers, as paralegals, as whatever, whoever is involved in the litigation process, we’re not doing that low level work, we’re doing the things where we really add value. And the places where legal professionals really add value is in that thought process and the strategy and the analysis of, “Here are a bunch of facts and here is what the law says and those tie together in this way and therefore risk looks like X or our ability to resolve this thing looks like Y” – that is really the game. These are little levers, right? If you think about it in the physics sense, these are levers that allow you to move larger rocks than you would be able to move if you were just using your own muscle power.
George Socha:
So I’m going to finish off with this question. I’m going to ask you to, as we all do, look through cataract eyes into a murky crystal ball and give me a prediction for what all of this is going to look like, AI, in litigation five years from now. And no one’s going to hold you to this.
Aaron Crews:
I was going to say I will preface my answer with the thing I always say when people ask me to predict the future, which is, those that live by crystal balls are destined to eat ground glass. But, with that kind of asterisk on it, what I actually see coming is a world where eDiscovery professionals are more important than they are today because we can understand the data, we can help wrangle that data, and we can work directly between lawyers and the teams who are building and using the systems to analyze that data in ways that lots of other people can’t; the technology underpinnings are there. And here’s the important part. I think over the next five to even seven years, you are actually going to see a lot of what we think of as early case assessment and early culling infused into native business systems for all kinds of reasons. I think compliance is going to be the real driver on it, in lots of ways, around privacy and information governance and then also internal investigations and that kind of stuff. That’s going to be the driver that causes major market software companies to build in that core culling, search, collection, preservation. All of those functionalities are going to sit in a transparent, defensible way inside of systems, so that a lot of what we do today and think of as the eDiscovery process will be a core automated feature of those functions. And what that means is that your decisioning early on, about what you’re going to get and what you’re going to take and what you’re going to look at, is the more important part. It’s not actually the nuts and bolts anymore, making sure that stuff didn’t disappear or whatever, it’s more the, “What are we looking at and how does that drive outcome?” And so it’s the fusion of technology with more traditional lawyering that is going to be the thing that eDiscovery professionals probably lead the vanguard on.
George Socha:
Allowing the litigators and trial lawyers to get to where they really should have been all along with this finding and telling the most persuasive story with the data that’s available?
Aaron Crews:
Yeah, it’s a weird concept, right? Getting back to lawyering. Who would’ve thought?
George Socha:
I know, who’d have thought? Well, Aaron, thank you very much for taking time from your day to spend time with us here. For anybody who missed it, Aaron Crews is the Chief Data Analytics Officer at Littler, which, as Aaron was kind enough to point out to us, is the largest employment law firm in the world. Once again, Aaron, thank you very much.
Aaron Crews:
Thanks for having me.