A Little Bit of Data Can Be a Lot Dangerous

By Aaron N. Taylor
 
What if I were to tell you that as ice cream consumption increases so does the murder rate? Would you conclude that butter pecan was prompting people to go out and kill others? Probably not. Anyone who has taken an introductory statistics course probably recognizes this illustration. It is a common tool for explaining the basic principle that correlation is not a measure of causation. Ice cream doesn’t drive people to kill. Warm weather is the factor that binds the two phenomena. People eat more ice cream when it’s warm. They also interact more with each other, which sometimes (though rarely) ends in murder.
 
I recently penned a commentary titled "For Diversity: Let's Talk Less About Pipelines and More About Why Blacks Are Not Admitted." In it, I argue that the law school admission process disproportionately excludes Black people from legal education for reasons that are unsupported by relevant data. In making my point, I referenced research conducted by a few law schools showing the measurable, but limited, value of the LSAT in predicting bar exam performance and the acquisition of lawyering skills.
 
I received many responses in my inbox. Some favorable. Some not. I am writing, though, to address a public rebuttal by Robert Steinbuch, a law professor at the University of Arkansas at Little Rock. Steinbuch challenged my thesis by presenting a data table showing that graduates of his law school who entered with lower LSAT scores passed the bar exam at lower rates. For Steinbuch, the trends seem to be definitive proof that the LSAT does indeed predict bar exam performance in ways that justify its outsized role in the admissions process.
 
Steinbuch commits a common but dangerous error of interpretation. He assumes that a linear association between LSAT scores and bar performance reflects a predictive, or even causal, relationship. This is a classic conflation of correlation and causation – an error that leads to conclusions that are unsupported and often erroneous.
 
Correlations describe only the extent to which two variables move together (or diverge). They say nothing about the impact of one variable on the other. To a knowledgeable observer, Steinbuch's data table prompts more questions than answers – most significantly, why are lower LSAT scores associated with higher bar exam failure rates? A simplistic reading of trends is insufficient to answer this question (and others), and any suggestion to the contrary is simply wrong.
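For readers who want to see the mechanics, the ice cream illustration can be reproduced in a few lines of code. The numbers below are entirely made up for demonstration: a hidden confounder ("temperature") drives two variables that have no causal link to each other, yet the raw correlation between them looks impressive – and vanishes once the confounder is controlled for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical, simulated data: temperature drives both variables;
# neither variable causes the other.
temp = rng.normal(75, 10, n)
ice_cream = 2.0 * temp + rng.normal(0, 10, n)   # sales rise with heat
murders = 0.5 * temp + rng.normal(0, 5, n)      # so do interactions gone wrong

# The raw correlation between ice cream and murders looks strong...
r = np.corrcoef(ice_cream, murders)[0, 1]

def residuals(y, x):
    """Residuals of y after removing a linear fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# ...but after removing temperature's influence from both variables
# (a partial correlation), the association all but disappears.
r_partial = np.corrcoef(residuals(ice_cream, temp),
                        residuals(murders, temp))[0, 1]

print(f"raw correlation: {r:.2f}, controlling for temperature: {r_partial:.2f}")
```

The point of the sketch is not the particular coefficients, which are invented, but the pattern: a strong trend line between two variables is perfectly consistent with neither one influencing the other.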
 
To acquire a substantive understanding of bar performance, researchers should first identify the factors that potentially influence it. Factors such as law school class rank, undergraduate GPA, and, of course, LSAT score are obvious. Others, such as law school course selection, socioeconomic background, employment status during bar preparation, and other background or life circumstances, should also be considered.
 
Once possible factors have been identified, the next step is to test the extent to which they predict bar performance. Regression analysis is the appropriate statistical tool. Regressions estimate the extent to which change in one variable (e.g., LSAT score) predicts change in another (e.g., bar performance). Put simply, regression analyses do what Steinbuch erroneously believes his trend analysis did.
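A minimal sketch of what such an analysis measures, using simulated and entirely hypothetical numbers (not data from any law school): scores are generated so that the LSAT contributes only weakly to the outcome relative to everything else, and an ordinary least-squares fit then recovers the share of variance the LSAT accounts for – the R-squared statistic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical, simulated data -- not real admissions or bar figures.
lsat = rng.normal(155, 6, n)
# The bar outcome depends weakly on LSAT plus a large unexplained
# component (standing in for coursework, finances, study time, etc.).
bar = 0.5 * lsat + rng.normal(0, 8, n)

# Ordinary least-squares fit: bar ~ slope * lsat + intercept
slope, intercept = np.polyfit(lsat, bar, 1)
pred = slope * lsat + intercept

# R-squared: the share of variance in bar outcomes the LSAT explains.
ss_res = np.sum((bar - pred) ** 2)
ss_tot = np.sum((bar - bar.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"variance explained by LSAT: {r_squared:.0%}")
```

In a simulation built this way, a clear upward trend line coexists with a modest R-squared – which is exactly the distinction between spotting a trend and quantifying predictive value.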
 
Using regression analyses, researchers at Texas Tech found that LSAT scores explained 13 percent of the variance in bar exam scores among their graduates – a palpable effect, but likely much smaller than most people would have guessed. But if the LSAT has only limited predictive value, why does there seem to be such a clear trend where lower scores are associated with higher exam failure? The answer lies in the extent to which LSAT performance serves as a proxy for other things. Could it be that access to financial resources, which can allow for better LSAT prep and fewer distractions during bar prep, is the tie that binds the LSAT and the bar exam? Maybe. Maybe not. We don't really know. And that's the problem.
 
Better information about bar performance could help schools better assess failure risks and design more effective bar passage interventions. Unfortunately, studies like Texas Tech's are rare. Filling the void is conventional wisdom that is often rooted in unsupported intuition and cringe-worthy misinterpretations of data. A little bit of data can be a lot dangerous. We need more and better data.
 
The author is executive director of the AccessLex Center for Legal Education Excellence and an associate professor of law at Saint Louis University.