Artificial Lawyers

Can statistical and machine learning methods replace lawyers? A host of entrepreneurs think so. Text mining and predictive model products are available now to predict case staffing requirements and perform automated document discovery, and natural language algorithms conduct legal research and case review. In 2017, a predictive algorithm from the startup Case Crunch processed the details of over 700 previously decided cases, and correctly predicted the actual outcome in 87% of them. Now Case Crunch is forging a market with businesses to assist in predicting the outcome of litigation and in selecting the best lawyer for a case. Prospects are sufficiently bright that the company split up last year, with two of its principals going after similar markets with a new company, CourtQuant. One potential spur to the growth of legal artificial intelligence (AI) is private equity firms that add fuel to litigation by financing high-stakes cases – these firms will be avid consumers of any analytical methods that help them establish the right odds.

The cost of litigating a car accident can exceed $100,000. Undertaking any activity in the legal system is expensive, and getting more so, because the main input is costly expert labor – lawyers, judges, paralegals. Other industries use expert labor, of course, but it typically yields R&D, the fruits of which exceed the investment manyfold. By contrast, a highly technical legal brief and accompanying courtroom argument might be somewhat repurposed over time, but the highly expert (and expensive) labor must be paid for mostly by the current litigants. As society becomes more prosperous and wages rise, tasks that rely exclusively on labor (as opposed to capital and intellectual property) become comparatively more expensive over time.

How far can artificial intelligence and machine learning go in taking on legal tasks? Although the law is highly complex, it is rule-based, so that’s a plus. Deep learning networks are capable of extremely complex models, so you would think that, since there are rules at the foundation of the law, they would be discoverable. And, in fact, machine learning has been used for several years in legal discovery – the process of sharing information among the parties to a legal proceeding. Discovery typically involves lots of documents, and in some corporate litigation this can run tens or hundreds of thousands of pages. Legal experts familiar with the case can review and label a sample of documents, and a machine learning algorithm can be trained on a mix of labeled relevant and non-relevant documents. Text mining can be used to sort documents into relevant ones, irrelevant ones, and gray area ones that require human review.
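The discovery workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's product: it assumes a simple naive Bayes word-count model, and the sample documents, the function names (`train`, `prob_relevant`, `triage`) and the 0.3 and 0.7 thresholds are all hypothetical choices made for the example.

```python
import math
from collections import Counter

LABELS = ("relevant", "irrelevant")

def tokenize(text):
    # Crude whitespace tokenizer; real systems use far richer text features.
    return text.lower().split()

def train(labeled_docs):
    """Count words per label from (text, label) pairs reviewed by experts."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["relevant"]) | set(word_counts["irrelevant"])
    return word_counts, doc_counts, vocab

def prob_relevant(model, text):
    """Naive Bayes estimate of P(relevant | text), with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        log_score[label] = score
    diff = log_score["irrelevant"] - log_score["relevant"]
    diff = max(min(diff, 700.0), -700.0)  # keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(diff))

def triage(model, text, low=0.3, high=0.7):
    """Sort a document into relevant, irrelevant, or a gray area."""
    p = prob_relevant(model, text)
    if p >= high:
        return "relevant"
    if p <= low:
        return "irrelevant"
    return "human review"

# Hypothetical expert-labeled sample (invented for illustration).
labeled = [
    ("insurance policy sold without disclosure of fees", "relevant"),
    ("customer complaint about insurance sales call", "relevant"),
    ("policy premium refund requested by customer", "relevant"),
    ("office holiday party catering menu", "irrelevant"),
    ("parking garage closed for maintenance", "irrelevant"),
    ("lunch menu and party schedule", "irrelevant"),
]
model = train(labeled)

for doc in (
    "complaint about insurance policy fees",
    "holiday party menu",
    "customer asked about lunch menu",
):
    print(doc, "->", triage(model, doc))
```

The two thresholds control how much lands in the gray area: widening the gap between them sends more documents to human reviewers, while narrowing it trusts the model with more of the sorting.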

Case Crunch, a company founded by a team from Oxford University, has taken machine learning into the realm of predicting the outcome of cases that are in the legal system. Case Crunch made the news when its algorithms correctly predicted the outcomes of 87% of 775 cases in a recent competition. The cases were all from a narrow swath of consumer financial regulatory decisions in the UK, and all related to sales of a specific type of insurance. Now Case Crunch is forging a market with businesses, and with investors who fund litigation, to assist in predicting the outcome of litigation and in selecting the best lawyer for a case.
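To get a feel for how much sampling uncertainty attaches to an 87% hit rate on 775 cases, here is a rough sketch of a 95% Wilson score interval. It rests on assumptions not stated in the source: that the cases can be treated as independent trials, and that 87% corresponds to about 674 correct predictions (the exact count is not given). The function name `wilson_interval` is just an illustrative helper.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

n_cases = 775
correct = round(0.87 * n_cases)  # assumed count; the source reports only 87%
low, high = wilson_interval(correct, n_cases)
print(f"95% interval for accuracy: {low:.3f} to {high:.3f}")
```

The interval only describes uncertainty within this narrow family of insurance complaints; it says nothing about how the algorithm would perform on other kinds of cases.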

As for replacing lawyers in general, the challenge is formidable. One main stumbling block is the heterogeneity of legal issues and cases, and the comparative scarcity of data. A thousand cases may be sufficient to classify consumer complaints about sales tactics used to sell a specific product, but the overall legal landscape is diverse and complex, and the data needed to train models is hard to access and wrangle. For comparison, in a single day of driving, an autonomous vehicle will make thousands of discrete decisions, informed by an organized flow of dozens of terabytes of data, and the results are added to similar results from other such vehicles, which have now logged tens of millions of miles of driving.

A bigger stumbling block may be the nature of the legal process. Not all parties are seeking efficiency. If you think you are going to win your case and force the other party to pay all legal costs, you are not motivated by efficiency. A deep-pocket party may use ballooning legal costs as a tactic to deter litigation or force a favorable settlement, and so may be seeking not efficiency but the reverse.

But while a general substitution of AI for lawyers is a long way off, a quick look at litigation statistics does show some potential opportunities where there are numerous cases that are all similar. Consumer bankruptcy petitions might be one such example. These have ranged around a million per year recently, and are relatively homogeneous. Another might be specific product liability lawsuits that are not handled as a class action – thousands of lawsuits have been filed against a variety of manufacturers of a number of different pelvic repair products.

Some might argue, on principle, that human judgment should not be removed from the legal system, and that no machine can instill sufficient confidence in participants that they will accept the outcomes. On the flip side, note that there is already considerable variety in legal outcomes that saps confidence in the system. Police, prosecutors, defense attorneys, judges and juries can interact in ways that produce strikingly different results in similar circumstances. Mary Winkler and Gaile Owens were both convicted of killing their husbands, who abused them. Winkler was sentenced to two months; Owens received the death penalty.

Daniel Kahneman, in Thinking, Fast and Slow, describes an experiment in which judges were asked to throw a pair of loaded dice (they came up either 3 or 9), then given a shoplifting case to review and pass sentence. Judges who saw a 9 averaged sentences of 8 months, while judges who saw a 3 averaged sentences of 5 months. (This phenomenon of numerical estimates or decisions being influenced by irrelevant numbers is called anchoring.)
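One way to put a number on the effect is the anchoring index that Kahneman uses: the difference between the average responses divided by the difference between the anchors. Applied to the figures above, and treating the dice total as the anchor, the arithmetic works out as follows:

```latex
\[
\text{anchoring index}
  = \frac{\text{sentence}_{\text{high}} - \text{sentence}_{\text{low}}}
         {\text{anchor}_{\text{high}} - \text{anchor}_{\text{low}}}
  = \frac{8 - 5}{9 - 3}
  = 0.5
\]
```

An index of 0 would mean the dice had no influence and an index of 1 would mean the judges simply adopted the anchor, so 0.5 (50%) is a substantial effect for a number that has nothing to do with the case.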

So, as far as producing reliable, fair outcomes, AI does not have that high a bar to surpass. In fact, Case Crunch’s algorithm that did so well in predicting actual outcomes outperformed over 100 lawyers who participated in the same challenge (87% versus 62%).