This article originally appeared in The Bar Examiner print edition, Spring 2023 (Vol. 92, No. 1), pp. 61–63.

By Marilyn Wellington
Measurement Bias and the Bar Exam: A Continued Focus for the NextGen Exam
Ensuring a fair testing experience for all examinees is central to NCBE’s mission to build a competent, ethical, and diverse legal profession. One crucial part of that work, for both the current bar exam and the NextGen exam now under development, is ensuring all examinees have an equal opportunity to demonstrate their knowledge and skills.
The purpose of the bar exam as a whole, and of each question on the exam, is to provide jurisdiction admission authorities with information about whether examinees possess the knowledge and skills necessary to begin practicing law consistent with public protection. To be fair, the exam must be carefully developed to minimize bias.
What Is Measurement Bias?
To the general public, the term “bias” describes a preference for or against a particular group or groups. In testing, measurement bias is a term of art that takes the popular definition further to encompass measurable differences in exam performance based on factors such as race, ethnicity, or disability, among others.
Measurement bias occurs when scores on a test or a test item are systematically lower or higher for some definable examinee group due to factors unrelated to the knowledge and skills being evaluated. NCBE has in place a robust process to minimize measurement bias in test questions, from drafter training to independent review, to statistical analysis of how the questions perform in an actual testing environment. In this column, I’ll review that process, but let’s begin by taking a closer look at measurement bias.
A lower mean score or a lower pass rate for a group of examinees does not, in and of itself, mean a test is biased. Measurement bias can manifest itself at the level of the individual test question or at the level of the total test score and can be due to factors unrelated to the purpose of the test.
Types of Test Question Bias
To understand what we mean by test question bias, let’s look at an example. Consider a question meant to test examinees’ knowledge of products liability law. The question’s scenario involves someone buying a sandwich, but instead of calling it a sandwich, the question uses a term that only gets used in certain parts of the country: the question explains that the person bought a hoagie or a hero. Do you need to know what a hoagie is to understand who might be liable if the hoagie gives someone food poisoning? Of course not. But if you’re unfamiliar with the term, you may stumble on the question and even choose the wrong answer. If that happens, the question has failed to do its job.
The fact that you got the question wrong doesn’t tell us that you lack knowledge about products liability; instead, it may simply indicate that you come from a part of the country where sandwiches are never called hoagies. Given the use of this regional term, an entire subset of examinees may be distracted by it and have trouble with this question, while examinees for whom the word hoagie is familiar would not be. We now have a situation where the question provides useful, accurate information about one subset of examinees, but less useful, less accurate information about another subset; the results of the exam mean different things for different groups of examinees. This is a classic example of test question bias.
Of course, unfamiliar terminology is just one way that bias can affect a test question. Bias may also arise when a question’s content is particularly disturbing to some examinees because of their personal histories or life experiences. To the extent that the knowledge and skills being tested have been identified through a thorough practice analysis as representative of important and frequent tasks among newly licensed lawyers, it may not be possible to eliminate such content entirely. But regardless of the type of bias we might be dealing with, we want to prevent it from affecting test results. The crucial question, then, is what we do to minimize bias.
NCBE Efforts Regarding Bias Pre- and Post-Exam
At NCBE, we have a multilayered system of checks to ensure that our exam questions are not affected by either type of bias discussed above. We begin by taking systematic steps during the exam writing and review process to prevent bias from being included in test questions in the first place. Once the questions are in front of examinees—first as pretest questions, then as scored questions—we have additional safeguards in place, involving the use of statistical analysis, to alert us to any effect of bias.
For example, Multistate Bar Examination (MBE) questions are written by drafting committees composed of six to eight practicing attorneys, judges, and law school faculty members. Preventing bias during the drafting process begins with ensuring diverse committee membership that brings multiple perspectives and experiences to the table; something in a question that might seem unremarkable to one drafter could be a red flag for another.
Once selected to serve on a drafting committee, all drafters receive training and guidelines specifically designed to help them avoid problematic test content. The training and guidelines support them as they draft new questions and perform an initial review. After drafters have completed the initial versions of their questions, these versions go to a separate set of external reviewers, who are asked to look for any aspect of a test question that could lead to bias in the test results.
These initial stages of internal and external review go a long way toward ensuring that the questions placed in front of examinees are fair and unbiased. Because this is so important, additional checks are in place that rely not on drafters and reviewers but on data about how examinees perform on the questions. Even after a question has made it through the entire drafting and review process, if a subset of examinees performs significantly worse (or better) on that question than expected, the question will warrant further review.
What kind of data do we look at in checking for bias after a question has been administered? This is where the ability to compare performance on one question to other questions on the exam becomes important. In the simplest terms, if an examinee generally does well on the exam but performs poorly on one question in a way that doesn’t seem consistent with their performance on other questions, it could be a sign that there’s an issue with the question. If an entire subset of examinees performs more poorly on a particular question than expected, then we will investigate further with the question of bias in mind.
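The check described above, comparing how a subgroup performs on one question against how similarly able examinees perform overall, is known in psychometrics as an analysis of differential item functioning (DIF). The article does not specify which statistic NCBE uses, so as a hedged illustration only, the sketch below computes the widely used Mantel–Haenszel common odds ratio: examinees are matched into strata by total score, and within each stratum the odds of answering the item correctly are compared between a reference group and a focal group. The function name and the data are hypothetical.

```python
from collections import defaultdict

def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio across score strata.

    records: iterable of (group, score_stratum, correct) tuples, where
    group is "ref" or "focal" and correct is 1 (right) or 0 (wrong).
    A value near 1.0 suggests the item functions similarly for both
    groups once examinees are matched on overall ability.
    """
    # One 2x2 table per stratum: [ref correct, ref wrong,
    #                             focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, stratum, correct in records:
        t = tables[stratum]
        if group == "ref":
            t[0 if correct else 1] += 1
        else:
            t[2 if correct else 3] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n  # ref correct, focal wrong
        den += b * c / n  # ref wrong, focal correct
    return num / den if den else float("inf")

# Hypothetical data: at each matched score level, the two groups
# answer the item correctly at comparable rates.
data = (
    [("ref", 1, 1)] * 30 + [("ref", 1, 0)] * 70 +
    [("focal", 1, 1)] * 28 + [("focal", 1, 0)] * 72 +
    [("ref", 2, 1)] * 60 + [("ref", 2, 0)] * 40 +
    [("focal", 2, 1)] * 62 + [("focal", 2, 0)] * 38
)
ratio = mantel_haenszel_odds_ratio(data)  # close to 1.0: no flag
```

In practice, an item whose ratio departs substantially from 1.0 for some examinee group would be flagged for the kind of human review the column describes, since the statistic signals a potential problem but cannot explain its cause.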
Item review is an ongoing process, as NCBE reviews performance statistics following each exam administration. Ensuring that test questions provide a fair and equal testing experience for all examinees is a continuing priority and process.
The process of review and statistical analysis described here is part of our routine exam development for the current bar exam. It is also central to the item development work and pilot, field, and prototype testing currently underway for the NextGen exam. (For more information about pilot, field, and prototype testing, see my previous column.)
Of course, item development and testing are only part of the ongoing work of preparing to launch the NextGen bar exam. We continue to engage regularly with the broader legal community about the new exam—learning from you and gathering your input. This spring and summer, NCBE will release the Content Scope Outlines, which will establish the specific topics that will appear on the new exam, plus a preview of sample test items. Research—including research into measurement bias—continues, and we extend our heartfelt thanks to the law schools and law students that have already participated in pilot testing or will be joining us in the months ahead. Law school and law student participation in these processes is critical to ensuring the quality, validity, and value of the NextGen exam.
I hope this column has provided some insight into the rigorous processes NCBE uses to help ensure a fair bar exam. Item review is only one—but crucial—piece of a much larger puzzle in ensuring a fair testing experience and equal access to the legal profession for everyone.
To stay up to date on development of the future bar exam, subscribe to the NextGen website at nextgenbarexam.ncbex.org/subscribe.
Marilyn J. Wellington is the Chief Strategy and Operations Officer for the National Conference of Bar Examiners.
The Next Generation of the Bar Exam
In January 2021, the NCBE Board of Trustees approved the recommendations of NCBE’s Testing Task Force for the redesign of the bar examination to ensure that it continues to test the knowledge, skills, and abilities required for competent entry-level legal practice in a changing profession.
The board appointed an Implementation Steering Committee (ISC), which is charged with general oversight of the implementation of the findings and recommendations from the Testing Task Force study. NCBE has established multiple workgroups, working in consultation with the ISC, to develop the next generation of the bar examination and ensure a smooth transition for candidates, jurisdictions, and law schools. The workgroups include focus areas such as test development and psychometrics; test delivery; diversity, fairness, and inclusion; and outreach. Workgroups focused on developing the content of the new exam and drafting exam questions are composed of law professors, legal practitioners, and judges and justices.
Contact us to request a PDF file of the original article as it appeared in the print edition.