When lawyers think about legal writing, they tend to focus on their submissions to courts. Some of my work shows that writing quality matters from trial courts on up. Lawyers aren’t the only court actors who care about their legal writing, though. Lawrence Baum and others (including Judge Posner) have looked at judicial writings and judicial audiences with an eye towards judges’ goals when writing opinions. Ultimately, most judges are looking to provide clear answers and characterizations or clarifications of law. In a judicial hierarchy, judicial opinions may matter on appeal, but they may also matter to judges’ peers, to law professors, and to legal practitioners. Alongside this work on judges’ audiences, other scholarship on judicial behavior has examined why authoring judges may care about their written output.
If we acknowledge that judges’ writings matter to the authoring judges themselves, and that the incentive structure for good writing may vary by court level, then we may at the very least assume that Supreme Court Justices care about the quality of their opinion writing to the extent that they want the best final output possible (the characterization of “best” may vary by justice). The purpose of this article is to put Supreme Court opinions from last term under a microscope, examining the quality of their writing along with some of their content (only majority opinions were examined).
Good Legal Writing
While the objectives of judges and lawyers differ, both depend on writing clarity: lawyers to persuade, and judges to provide the parties and the public with clear output. Clear writing begins with basic building blocks. A simple premise is that longer sentences are harder to follow. Moving beyond this elementary understanding, readability measures were developed to provide more refined metrics of reading ease. Readability algorithms have been around for well over half a century.
The Automated Readability Index (ARI), a commonly used readability measure, combines characters per word and words per sentence (specifically, ARI = 4.71 × (characters/words) + 0.5 × (words/sentences) − 21.43) to approximate the U.S. grade level needed to read a passage. Below are the justices’ orderings according to this Index based on their opinions from this past term.

This measure creates a basic comparison between the writings of the justices. Although in isolation it does not mean that Gorsuch’s writings were of the highest quality or Sotomayor’s of the lowest, it gives an initial sense of where the justices’ writing falls on a scale from easiest to most difficult to follow.
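Since the analyses here were run with quanteda in R (noted at the end of this post), below is a minimal sketch of how a per-justice ARI ranking like this might be produced. The opinions_df data frame, with its opinion_text and justice columns, is a hypothetical stand-in for the underlying opinion data rather than the exact pipeline used.

```r
library(quanteda)            # corpus and tokens handling
library(quanteda.textstats)  # readability statistics

# Hypothetical input: one row per majority opinion, with the opinion's
# text and the authoring justice's name.
corp <- corpus(opinions_df, text_field = "opinion_text")

# ARI = 4.71*(characters/words) + 0.5*(words/sentences) - 21.43;
# textstat_readability() computes this when measure = "ARI".
ari <- textstat_readability(corp, measure = "ARI")

# Average ARI by authoring justice and order from easiest to hardest.
ari$justice <- docvars(corp, "justice")
by_justice <- aggregate(ARI ~ justice, data = ari, FUN = mean)
by_justice[order(by_justice$ARI), ]
```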
It may not be a judge’s goal to write to an extremely low grade level, even if that would make opinions easier to read. Most likely there is a sweet spot judicial writers seek, somewhere between simplicity and complex prose. Nonetheless, lower ARI levels tend to equate to easier-to-read pieces. Looking from another angle, here are the opinions from last term based on their ARI scores.

At the top of the chart, Justice Sotomayor authored the majority opinion in Murray v. UBS, while at the other end of the graph, Justice Gorsuch authored Erlinger v. United States.
A separate way to examine writing complexity is to look at lexical density. This captures the informational content of a piece of writing by measuring the rate of content words in a document. In these measures, content words are generally defined as nouns, verbs, adjectives, and adverbs, as opposed to function words like prepositions and auxiliary verbs.
There is also an array of formulas in this family. One of the first, the Type-Token Ratio (TTR), is simply the number of distinct word types divided by the total number of tokens (roughly, running words) in a text. (Strictly speaking, TTR measures lexical diversity, the variety of word choice, though it is commonly discussed alongside density.) Higher TTRs tend to indicate greater variety of word choice, while lower TTRs are associated with more repetitive language. While writings with higher TTR scores are not necessarily better pieces of writing, they may be more engaging to read. A later iteration of TTR is the CTTR (the “C” refers to corrected), which better adjusts for text length by dividing types by the square root of twice the number of tokens: CTTR = types / √(2 × tokens). This is the measure used here to analyze the lexical densities of the opinions from this past Supreme Court Term.
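Continuing the sketch from above, the CTTR scores for each opinion might be computed like this (again with quanteda, and again a sketch rather than the exact pipeline):

```r
library(quanteda)
library(quanteda.textstats)

# Tokenize the opinions; dropping punctuation so tokens approximate words.
toks <- tokens(corp, remove_punct = TRUE)

# TTR  = types / tokens
# CTTR = types / sqrt(2 * tokens)   (the corrected type-token ratio)
lexdiv <- textstat_lexdiv(toks, measure = c("TTR", "CTTR"))

# Most lexically dense opinions first.
lexdiv[order(-lexdiv$CTTR), ]
```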

As you can see, the lexical density measure is not only distinct from the readability measure, but also orders the cases very differently. Even accounting for opinion length, some of the longest and most discussed cases from last term sit at the top according to lexical density. While the opinions do not track perfectly along these lines, there seems to be a strong relationship between a case’s length and salience and its lexical density. When we break this measure down by justice we see:

Interestingly, Gorsuch authored both the most lexically dense and the most readable opinions from last term. To see both measures on the same axes, the next graph is a scatter plot of these two variables by authoring justice.

The line on the graph shows a downward trend: justices who score higher on lexical density tend to have easier-to-read opinions, and vice versa. Gorsuch again appears to be the top justice on both measures, which accords with the individual justice graphs.
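For readers following along in R, a scatter plot like this one could be assembled from the per-opinion tables in the earlier sketches, here with ggplot2 (the by_justice and lexdiv objects are the hypothetical ones defined above):

```r
library(ggplot2)

# Per-justice CTTR averages, parallel to the ARI table built earlier.
lexdiv$justice <- docvars(corp, "justice")
cttr_by_justice <- aggregate(CTTR ~ justice, data = lexdiv, FUN = mean)

# One row per justice with both measures.
both <- merge(by_justice, cttr_by_justice, by = "justice")

# Scatter plot with a fitted line; a negative slope corresponds to the
# downward trend described above (higher CTTR, lower ARI).
ggplot(both, aes(x = CTTR, y = ARI)) +
  geom_point() +
  geom_text(aes(label = justice), vjust = -0.8, size = 3) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Lexical density (CTTR)", y = "Readability (ARI)")
```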
Substance
Just as computational methods make written opinions readily comparable, similar automated approaches lend themselves to comparing opinion content as well. This makes granular comparisons of the subjects of the justices’ opinions much more accessible.
So, what did the justices write about this past term (note that the horizontal axis shows the relative frequency of the words to one another)?

Obviously, these data are predominantly helpful if you have knowledge of the cases the justices decided this past term. Some of the words make sense in the abstract, such as “president” for Roberts and “trademark” for Thomas. Nonetheless, it would help to know that “8 U. S. C. §1229(a)” was the statutory provision at the heart of Justice Alito’s majority opinion in the Campos-Chaves case, and that Justice Jackson looked at the “Montgomery GI Bill” in Rudisill v. McDonough.
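A per-justice frequency breakdown like the one charted above can be generated from a document-feature matrix; here is a rough quanteda sketch, not the exact pipeline behind the graph:

```r
library(quanteda)
library(quanteda.textstats)

# Build a document-feature matrix, dropping punctuation and stopwords
# so that substantive terms rise to the top.
dfmat <- tokens(corp, remove_punct = TRUE) |>
  tokens_remove(stopwords("en")) |>
  dfm()

# The five most frequent remaining words for each authoring justice.
textstat_frequency(dfmat, n = 5, groups = docvars(dfmat, "justice"))
```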
We may also be interested in specific terms that we know were used in cases this past term. To drill down at this level, we want terms general enough to come up in multiple opinions, but not so general that they fail to reflect an important element of specific cases. Three possible words from last term are “agency,” since there were multiple agency deference cases; “speech,” due to the multiple cases examining First Amendment issues this past term; and “criminal,” since this tends to come up in a defined set of cases each term. The following graphs show the frequency of each word in the opinions it arises in, as well as where it arises in each opinion.

Along with some obvious findings, like Loper Bright focusing on “agency” and Trump v. United States looking at “criminal,” this graph shows where these terms were present in other cases, the relative importance of these terms in each case, and which cases touch on several of these attributes (like “speech” and “agency” coming up multiple times in the majority opinion in Murthy v. Missouri).
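Dispersion plots like these, which mark where a term falls within each opinion, can be drawn with quanteda’s keyword-in-context tools. A minimal sketch, reusing the hypothetical toks object from earlier:

```r
library(quanteda)
library(quanteda.textplots)

# Locate every occurrence of each term within each opinion...
hits_agency   <- kwic(toks, pattern = "agency")
hits_speech   <- kwic(toks, pattern = "speech")
hits_criminal <- kwic(toks, pattern = "criminal")

# ...and plot their relative positions, one row per document.
textplot_xray(hits_agency, hits_speech, hits_criminal)
```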
Concluding Thoughts
One key takeaway from this article is that both writing quality and written content can be analyzed using automated methods. While these methods do not engage in the deep analysis possible with qualitative methods, they examine similar attributes across a large number of cases and make those attributes comparable between cases in ways that qualitative methods alone cannot.
In terms of writing quality, we saw several ways of examining the cases, and that both the opinions and the justices can be categorized based on the readability and lexical density of this past term’s opinions. Justice Gorsuch appears to be the top-ranking justice on these measures for the past term.
The content analysis does not provide a justice-based spectrum like the quality analysis does. Instead, it allows us to quickly dissect the opinions from last term, either to test assumptions we already hold or to surface main case attributes when we lack prior conceptions of the cases. This also allows for comparisons between cases and justices.
* Quanteda in R was used for the analyses in this post.
Find Adam on X/Twitter and LinkedIn. He’s also on Threads @dradamfeldman and on Bluesky Social @dradamfeldman.bsky.social