Yesterday’s opinions in Lockhart showcased some unusual themes for the Court. Between the majority and dissenting opinions, the Justices referenced the Kansas City Royals and the movies Star Wars and Zoolander. Atypical, to say the least. Here I take a deeper look at the language of Lockhart’s majority opinion using a few text mining and analysis tools.
To begin, I removed all quoted text and all citations from the opinion. This allows for a focus on the words unique to the opinion itself. Then I analyzed the sentiment of the majority opinion using the qdap package in R and the methods described at http://www.r-bloggers.com/statistics-meets-rhetoric-a-text-analysis-of-i-have-a-dream-in-r/ . With these tools I tracked the sentiment as the opinion proceeds. The graphic depicts the movement of the opinion’s sentiment:
As the graph shows, much of the text is either not valenced or uncategorized by the sentiment analysis tool. The curve shows that where the language does carry a positive or negative sentiment, it is predominantly negative, especially at the beginning of the opinion.
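For readers who want to see the mechanics of tracking sentiment sentence by sentence, here is a minimal sketch in Python. It is not the qdap code behind the graphic. The word lists are a tiny hypothetical lexicon, the sentence splitter is naive, and the score (positive hits minus negative hits, divided by the square root of the word count) is a simplification that ignores negators and amplifiers.

```python
import math
import re

# Hypothetical toy lexicon; a real analysis would use a full sentiment dictionary.
POSITIVE = {"clear", "support", "sensible", "natural"}
NEGATIVE = {"abuse", "crime", "violate", "ambiguous"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_polarity(sentence):
    """Score one sentence: (positive hits - negative hits) / sqrt(word count)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / math.sqrt(len(words))


def polarity_track(text):
    """Return one polarity score per sentence, in reading order."""
    return [sentence_polarity(s) for s in split_sentences(text)]


sample = "The statute punishes abuse. The rule is clear. Congress wrote a list."
track = polarity_track(sample)
print(track)
```

Sentences with no lexicon hits score zero, which is one reason a plot like this tends to hug the neutral line for long stretches. A splitter this simple would also break on legal abbreviations, so real opinion text would need more careful sentence detection.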
Next I used the tm package in R to learn more about the actual language in the opinion. I pre-processed the text using several standard steps: removing numbers and punctuation, converting all characters to lowercase, removing stop words such as “the,” “to,” and “and,” and stemming the text so that words with the same root, such as resist, resisted, and resists, are combined under the same word stem.
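The pre-processing steps can be sketched in a few lines of Python. This is an illustration only, not the tm pipeline used for the post: the stop word list is a small hypothetical sample, and `crude_stem` is a stand-in suffix stripper rather than a real Porter stemmer.

```python
import re

# Tiny illustrative stop word list; real lists run to well over a hundred words.
STOP_WORDS = {"the", "to", "and", "a", "of", "in", "is"}


def crude_stem(word):
    """Strip a few common suffixes; a rough stand-in for a Porter stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, drop digits and punctuation, remove stop words, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits and punctuation become spaces
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


tokens = preprocess("The defendant resisted, resists, and will resist in 2016.")
print(tokens)
```

Here resisted, resists, and resist all collapse to the same stem. Note that this sketch replaces punctuation with a space, so hyphenated words split into two tokens; a tool that deletes punctuation outright would fuse them into one.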
After employing these pre-processing tools, I generated a matrix with word frequencies. The most common terms in the opinion were:
antecedent, chapter, congress, also, court, crimes, federal, involving, last, list, lockhart, phrase, predicates, rule, see, sexual, sexualabuse, state, three
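To make the idea of a word-frequency matrix concrete, here is a small Python sketch. It assumes the text has already been tokenized and split into documents (for example, paragraphs); the function names and the sample tokens are hypothetical and are not taken from the opinion's actual counts.

```python
from collections import Counter


def term_document_matrix(documents):
    """Map each term to a row of counts, one column per tokenized document."""
    vocab = sorted({term for doc in documents for term in doc})
    per_doc = [Counter(doc) for doc in documents]
    return {term: [counts[term] for counts in per_doc] for term in vocab}


def frequent_terms(matrix, min_count):
    """Terms whose total count across documents is at least min_count."""
    return sorted(t for t, row in matrix.items() if sum(row) >= min_count)


# Hypothetical tokenized paragraphs, for illustration only.
docs = [
    ["court", "rule", "list"],
    ["court", "phrase", "rule"],
    ["court", "antecedent"],
]
matrix = term_document_matrix(docs)
top = frequent_terms(matrix, 2)
print(top)
```

Summing each row gives a term's overall frequency, and a threshold on that sum produces a short list of common terms like the one above.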
Graphically the word frequencies appear as:
To get an even greater perspective on the language used in the opinion, I created a wordcloud from the term-frequency matrix with the wordcloud package in R. The wordcloud uses both the size and the color of the terms to depict their relative frequencies.
With these terms we can better understand not only the significant language in the opinion but also the tools of analysis used by the Court. From the wordcloud alone, many of the instruments of statutory interpretation are apparent (e.g., interpretation, limiting, context, and structure).
This quantitative look at the Lockhart opinion may bring to light features not readily apparent in a first read and may help focus the reader on salient elements of the opinion’s language.
Next up…network analysis.