Machine learning from legal precedent requires curating a dataset composed of court decisions, judicial analysis, and legal briefs in a particular field, which is then used to train an algorithm to apply the essence of these decisions to a real-world scenario. To approach near-human legal reasoning and just outcomes, this dataset must include dissenting opinions, minority views, and asymmetrical rulings.
The use of machine learning is continuing to extend the capabilities of AI systems in the legal field. Training data is the cornerstone for producing useable machine learning results. Unfortunately, when it comes to judicial decisions, at times the AI is only being fed the majority opinions and not given the dissenting views (or, ill-prepared to handle both). We shouldn’t want and nor tolerate AI legal reasoning that is shaped so one-sidedly.
Make sure to read the full paper titled Significance Of Dissenting Court Opinions For AI Machine Learning In The Law by Dr. Lance B. Eliot at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3998250
(Source: Mapendo 2022)
When AI researchers and developers conceive of legal large language models that are expected to produce legal outcomes, it is crucial to include conflicting data or dissenting opinions. The author argues for a balanced, comprehensive training dataset that includes both judicial majority and minority views. Current court opinions tend to highlight the outcome, or the views of the majority, and neglect close examination of dissenting opinions and minority views. This can result in unjust outcomes, missed nuances of a legal case, or bland judicial arguments. Eliot's main argument centers on a simple observation: justice is fundamentally born through a process of cognitive complexity. In other words, a straightforward ruling with unanimous views contributes little to learning or evolving an area of the law, whereas considering trade-offs and carefully weighing different ideas and values against each other contributes a great deal.
This open-source legal large language model with an integrated external knowledge base exemplifies two key considerations representative of the status quo. (1) Training data is compiled by crawling and scraping legally relevant information and key judicial texts; it extends beyond a single specialty area and is not limited to supporting views. (2) Because the training data is compiled at scale and holistically, it can be argued that majority views stand to be overrepresented in the model input, since minority views often receive less attention, discussion, or reflection beyond the initial period following a decision. In addition, there might be complex circumstances in which a judge is torn on a specific legal outcome. These often quiet moments of legal reasoning rooted in cognitive complexity hardly ever make it into a written majority or minority opinion, and are therefore unlikely to be used for training purposes.
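To make the overrepresentation concern concrete, here is a minimal sketch of how one might check the mix of opinion types in a training corpus and derive sampling weights that even it out. Everything in it is an assumption for illustration: the `opinion_type` field, its labels, the toy records, and the helper names `opinion_shares` and `balancing_weights` are hypothetical, and inverse-frequency weighting is a generic technique, not a method proposed in Eliot's paper.

```python
from collections import Counter

# Hypothetical toy corpus; the "opinion_type" field and its labels are assumptions.
corpus = [
    {"case": "A v. B", "opinion_type": "majority"},
    {"case": "A v. B", "opinion_type": "dissent"},
    {"case": "C v. D", "opinion_type": "majority"},
    {"case": "E v. F", "opinion_type": "majority"},
    {"case": "E v. F", "opinion_type": "concurrence"},
    {"case": "G v. H", "opinion_type": "majority"},
]


def opinion_shares(records):
    """Return the fraction of records belonging to each opinion type."""
    counts = Counter(r["opinion_type"] for r in records)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


def balancing_weights(records):
    """Return one sampling weight per record (inverse-frequency weighting).

    Each opinion type ends up with the same total weight, and all
    weights together sum to 1.
    """
    counts = Counter(r["opinion_type"] for r in records)
    n_types = len(counts)
    return [1.0 / (n_types * counts[r["opinion_type"]]) for r in records]


shares = opinion_shares(corpus)
weights = balancing_weights(corpus)

for kind, share in shares.items():
    print(f"{kind}: {share:.0%} of corpus")
```

In this toy corpus the majority opinions dominate the raw counts, while the weights give each dissent or concurrence more influence per document during sampling. Reweighting of this kind cannot recover reasoning that was never written down, so it addresses only the second consideration above.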
Another interesting consideration is access to dissenting opinions and minority views. While this type of judicial writing may be publicly available at the highest court levels, a dissenting view in a less prominent case at a lower level might not be as accessible. Gatekeepers such as Westlaw restrict access to these documents and their interpretations. Arguments for a fair learning exemption for large language models are arising in various corners of the legal profession, and the question is currently being litigated by the trailblazers of the AI boom.
A recent and insightful essay written by Seán Fobbes cautions against excitement over legal large language models and their ability to produce legally and ethically accurate, as well as just, outcomes. From my cursory review, achieving this will require much more fine-tuning and quality review than merely ensuring that dissenting opinions and minority views are included. Food for thought that I shall devour in a follow-up post.