On The Importance Of Teaching Dissent To Legal Large Language Models

Machine learning from legal precedent requires curating a dataset composed of court decisions, judicial analysis, and legal briefs in a particular field, which is then used to train an algorithm to apply the essence of these decisions to a real-world scenario. This process must include dissenting opinions, minority views, and asymmetrical rulings to achieve near-human legal reasoning and just outcomes. 

The use of machine learning continues to extend the capabilities of AI systems in the legal field. Training data is the cornerstone of usable machine learning results. Unfortunately, when it comes to judicial decisions, the AI is at times fed only the majority opinions and not the dissenting views (or is ill-prepared to handle both). We should neither want nor tolerate AI legal reasoning that is shaped so one-sidedly.

Make sure to read the full paper titled Significance Of Dissenting Court Opinions For AI Machine Learning In The Law by Dr. Lance B. Eliot at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3998250

(Source: Mapendo 2022)

When AI researchers and developers conceive of legal large language models that are expected to produce legal outcomes, it is crucial to include conflicting data and dissenting opinions. The author argues for a balanced, comprehensive training dataset inclusive of both judicial majority and minority views. Current court opinions tend to highlight the outcome, or the views of the majority, and neglect close examination of dissenting opinions and minority views. This can result in unjust outcomes, missed legal nuances, or bland judicial arguments. His main argument centers on a simple observation: justice is fundamentally born through a process of cognitive complexity. In other words, a straightforward ruling with unanimous views contributes little to learning or evolving a given area of the law; considering trade-offs, and reflecting on and carefully weighing different ideas and values against each other, does.

This open-source legal large language model with an integrated external knowledge base exemplifies two key considerations representative of the status quo: (1) training data is compiled by crawling and scraping legally relevant information and key judicial text that extends beyond a specific area and is not limited to supporting views; (2) because the training data is compiled at scale and holistically, it can be argued that majority views are likely to be overrepresented in the model's input, considering that minority views often receive less attention, discussion, or reflection beyond the initial period after a legal decision. In addition, there might be complex circumstances in which a judge is split on a specific legal outcome. These often quiet moments of legal reasoning rooted in cognitive complexity hardly ever make it into a written majority or minority opinion, and are therefore unlikely to be used for training purposes.
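One way to counteract that overrepresentation is to up-weight dissents when sampling training examples. The sketch below is a minimal illustration with invented records and field names (`kind`, `dissent_boost`), not the pipeline of any actual model:

```python
import random

# Hypothetical corpus records; the field names are illustrative only.
corpus = [
    {"text": "Majority opinion A ...", "kind": "majority"},
    {"text": "Majority opinion B ...", "kind": "majority"},
    {"text": "Majority opinion C ...", "kind": "majority"},
    {"text": "Dissenting opinion D ...", "kind": "dissent"},
]

def sampling_weights(docs, dissent_boost=3.0):
    """Give dissents extra weight so they are not drowned out by majority text."""
    return [dissent_boost if d["kind"] == "dissent" else 1.0 for d in docs]

weights = sampling_weights(corpus)
random.seed(0)  # deterministic for illustration
batch = random.choices(corpus, weights=weights, k=8)
dissent_share = sum(d["kind"] == "dissent" for d in batch) / len(batch)
```

The boost factor is a tuning knob: too low and dissents vanish into the majority signal, too high and the model overlearns minority reasoning as settled law.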

Another interesting consideration is access to dissenting opinions and minority views. While this type of judicial writing may be publicly available at the highest levels, a dissenting view in a less prominent case at a lower level might not enjoy the same access. Gatekeepers such as WestLaw restrict the audience for these documents and their interpretations. Arguments for a fair-learning exemption for large language models arise in various corners of the legal profession and are currently being litigated by the trailblazers of the AI boom. 

A recent and insightful essay written by Seán Fobbes cautions against excitement when it comes to legal large language models and their capability to produce legally and ethically accurate, as well as just, outcomes. From my cursory review, it will require much more fine-tuning and quality review than a mere assurance of dissenting opinions and minority views can provide. Food for thought that I shall devour in a follow-up post.

Forecasting Legal Outcomes With Generative AI

Imagine a futuristic society where lawsuits are adjudicated within minutes. Accurately predicting the outcome of a legal action would change the way we adhere to rules and regulations. 

Lawyers are steeped in making predictions. A closely studied area of the law, known as Legal Judgment Prediction (LJP), entails using computer models to aid in making legal-oriented predictions. These capabilities will be fueled and amplified by the advent of AI in the law.

Make sure to read the full paper titled Legal Judgment Predictions and AI by Dr. Lance B. Eliot at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3954615

We are in Mega-City One in the year 2099 AD. The judiciary and law enforcement are one unit. Legal violations, disputes, and infringements of social norms are policed by street judges with a mandate to summarily arrest, convict, sentence, and execute criminals. Of course, this is the plot of Judge Dredd, but the technology of the year 2023 AD is already on its way to making this dystopian vision a reality. 

Forecasting the legal outcome of a proceeding is a matter of data analytics, access to information, and the absence of process-disrupting events. In our current time, this is a job for counsel and legal professionals. As representatives of the courts, lawyers are experts in reading a situation and introducing some predictability to it by adopting a clear legal strategy. Ambiguity and human error, however, make this process hardly repeatable – let alone reliable for future legal action. 

Recent developments in the field of computer science, specifically around large language models (LLMs), natural language processing (NLP), retrieval-augmented generation (RAG), and reinforcement learning from human feedback (RLHF), have introduced technical capabilities that increase the quality of forecasting legal outcomes. These can be summarized under the umbrella of generative artificial intelligence (genAI). Cross-functional efforts between computer science and legal academia coined this area of study “Legal Judgment Prediction” (LJP).
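To make the RAG piece concrete: before generating a forecast, the system retrieves the precedents most similar to the case at hand and feeds them to the model as context. Here is a deliberately tiny sketch of that retrieval step, using bag-of-words overlap as a stand-in for a real embedding model; the case names and holdings are invented:

```python
from collections import Counter

# Toy precedent store; cases and holdings are invented for illustration.
precedents = {
    "Case A v. B": "contract breach damages awarded for late delivery",
    "Case C v. D": "negligence claim dismissed for lack of duty of care",
    "Case E v. F": "contract rescinded due to fraudulent misrepresentation",
}

def score(query, doc):
    """Bag-of-words overlap; a crude stand-in for embedding similarity."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query, k=2):
    """Return the k most similar precedents to ground a generated prediction."""
    ranked = sorted(precedents, key=lambda name: score(query, precedents[name]),
                    reverse=True)
    return ranked[:k]

context = retrieve("breach of contract over delivery terms")
```

In a production LJP system the scoring function would be a learned embedding and the store would hold thousands of opinions, but the shape of the pipeline, retrieve then generate, is the same.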

The litigation analytics platform “Pre/Dicta” exemplifies the progress of LJP by achieving a prediction accuracy of 86%. In other words, the platform can forecast the decision of a judge in nearly 9 out of 10 cases. As impressive as this result is, the author points out that sentient behavior is a far-fetched reality for current technologies, which are largely based on statistical models with access to vast amounts of data. The quality of the data, the methods used to train the model, and the application determine the accuracy and quality of the prediction. Moreover, the author makes a case for incorporating forecasting milestones and focusing on those, rather than attempting to predict the final result of a judicial proceeding, which depends heavily on factors that are challenging to quantify in statistical models. For example, research from 2011 established the “Hungry Judge Effect”: a judge's rulings tend to be conservative when made before a meal (or on an empty stomach near the end of a court session), whereas the same case would see a more favorable verdict if the decision took place after the judge's hunger had been satisfied and mental fatigue mitigated. 
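Scoring forecasts per milestone rather than only on the final verdict is easy to operationalize. The sketch below uses invented predictions and outcomes (none of these figures come from Pre/Dicta) to show how per-milestone accuracy would be computed:

```python
# Invented forecasts and outcomes, purely to illustrate milestone-level scoring.
records = [
    {"milestone": "motion_to_dismiss", "predicted": "denied",    "actual": "denied"},
    {"milestone": "motion_to_dismiss", "predicted": "granted",   "actual": "denied"},
    {"milestone": "summary_judgment",  "predicted": "granted",   "actual": "granted"},
    {"milestone": "final_verdict",     "predicted": "plaintiff", "actual": "defendant"},
]

def accuracy_by_milestone(rows):
    """Fraction of correct forecasts, grouped by procedural milestone."""
    totals, hits = {}, {}
    for r in rows:
        m = r["milestone"]
        totals[m] = totals.get(m, 0) + 1
        hits[m] = hits.get(m, 0) + (r["predicted"] == r["actual"])
    return {m: hits[m] / totals[m] for m in totals}

scores = accuracy_by_milestone(records)
```

Breaking accuracy out this way makes it visible where a model is actually strong: a system might call motions reliably while the final verdict, hostage to hungry judges and settlement dynamics, remains a coin flip.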

Other factors that pose pitfalls for achieving near 100% prediction accuracy include the semantic alignment on “legal outcome”. In other words, what specifically is forecasted? The verdict of the district judge? The verdict of a district judge that will be challenged on appeal? Or perhaps the verdict and the sentencing procedure? Or something completely adjacent to the actual court proceedings? It might seem pedantic, but clarity around “what success looks like” is paramount when it comes to legal forecasting.  

While Mega-City One might still be a futuristic vision, our current technology is inching closer and closer to a “Minority Report” type of scenario in which powerful technologies, sentient or not, churn through vast amounts of intelligence and behavioral data to forecast and supplement human decision-making. The two real questions for us as a human collective beyond borders will be: (1) how much control are we willing to delegate to machines? and (2) how do we rectify injustices once we lose control over the judiciary? 

Machine Learning from Legal Precedent

When training a machine learning (ML) model with court decisions and judicial opinions, the results of these rulings form the training data needed to optimize the algorithm that determines an outcome. As lawyers, we take the result of these rulings as final. In some cases, however, the law requires change when rulings become antiquated or conflict with a shift in regulations. This cursory report explores the level of detail needed when training an ML model with court decisions and judicial opinions.


Much of the time, attorneys know that the law is relatively stable and predictable. This makes things easier for all concerned. At the same time, attorneys also know and anticipate that cases will be overturned. What would happen if we trained AI but failed to point out that rulings are at times overruled? That’s the mess that some using machine learning are starting to appreciate.

Make sure to read the full paper titled Overturned Legal Rulings Are Pivotal In Using Machine Learning And The Law by Dr. Lance B. Eliot at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3998249

(Source: Mapendo 2022)

Fail, fail fast, and learn from the failure. That could be an accurate summary of a computational system. In law, judicial principles demand a less frequent pace of change. Under the common law principle of stare decisis, courts are held to precedent, which can be either vertical or horizontal. Vertical stare decisis means that lower courts are bound by the rulings of higher courts, whereas horizontal stare decisis means that an appellate court decision can become a guiding ruling, but only for similar or related cases at the same level. In essence, stare decisis is meant to instill respect for prior rulings to ensure legal consistency and predictability. 

In contrast, the judicial process would grind to a halt if prior decisions could never be overturned, or if judges were unable to deviate and interpret a case outside the dogma of stare decisis. Needless to say, overturning precedent is the exception rather than the rule. According to a case study of 25,544 rulings of the Supreme Court of the United States from 1789 to 2020, the court overturned itself in only about 145 instances, or roughly 0.57%. While this number might be considered marginal, it does have a trickle-down effect on future court rulings at lower levels. 

A high-level description of current ML training procedures could include curating a dataset composed of court decisions, analyses, and legal briefs in a particular field, which is then used to train an algorithm to apply the essence of these decisions to a real-world scenario. On its face, one could argue for excluding overturned, outdated, or dissenting rulings. This becomes increasingly difficult for legal precedent that is no longer fully applicable yet still recognized by parts of the judiciary. Exclusion, however, would lead to a patchwork of curated data that would be neither robust nor capable of high-quality legal reasoning. Without considering erroneous or overturned decisions, a judge or an ML system could not develop a signal for pattern recognition and sufficiently adjudicate cases. On the other hand, mindlessly training an ML model on everything available could lead the algorithm to amplify erroneous aspects while ranking current precedents lower in a controversial case. 
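A middle path between blanket exclusion and mindless inclusion is to keep overturned rulings in the corpus but down-weight them. The sketch below uses invented case records and an assumed `overruled_by` field; the discount factor is a made-up tuning parameter:

```python
# Invented case records; 'overruled_by' marks superseded precedent.
cases = [
    {"name": "Old v. Rule",  "year": 1950, "overruled_by": "New v. Rule"},
    {"name": "New v. Rule",  "year": 1995, "overruled_by": None},
    {"name": "Side v. Case", "year": 2010, "overruled_by": None},
]

def training_weight(case, overruled_discount=0.2):
    """Keep overturned rulings in the corpus, but down-weight them so the
    model can learn the pattern of reversal without amplifying stale law."""
    return overruled_discount if case["overruled_by"] else 1.0

weighted = [(c["name"], training_weight(c)) for c in cases]
```

The model then still sees the overruled decision and, crucially, the link to what replaced it, which is exactly the reversal signal the paragraph above argues a legal reasoning system needs.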

This paper offers a number of insightful takeaways for anyone building an ML legal reasoning model. Most notably, there is a need for active curation of legal precedent that includes overturned, historic content. Court decisions and judicial opinions must be analyzed for the intellectual footprint that explains the rationale of each decision. Once this rationale is identified, it must be parsed against possible conflicts and dissent to create a robust and just system.