This article originally appeared in The Bar Examiner print edition, Winter 2025-2026 (Vol. 94, No. 4), pp. 38–40.

By Wendy Light and Erica Shoemaker
The grading and training process for the NextGen UBE has evolved on its journey from pilot test to field test to prototype exam and, most recently, to the Beta test administered in preparation for operational launch. What began as a manual, spreadsheet-driven system has developed into a technology-enabled process that ensures fairness, consistency, and efficiency. This evolution reflects years of testing, feedback, troubleshooting, and innovation aimed at supporting graders, jurisdictions, and candidates.
Then: Manual Processes and Early Challenges
In the early stages, grading relied heavily on spreadsheets and manual workflows. A handful of validity responses1 were embedded into grader assignments, and calibration was absent during pilot and field test grading. Although this approach allowed for basic scoring during the development phase, its capabilities were limited: grader drift was difficult to detect and correct, manual data compilation and analysis were cumbersome, and graders had only a small set of tools to maintain alignment with scoring criteria. Double grading 100 percent of examinee responses on all exam questions was not feasible for the pilot or field tests. Training was similarly limited. Early on, graders received scoring guides with rubrics and benchmarks, but opportunities for hands-on practice and calibration were minimal. Virtual workshops were offered for the prototype exam, but not for earlier testing stages.2
Transition: Introducing Grading Tools and Structured Training
Recognizing these early limitations, NCBE’s NextGen grading team leveraged insights from grader think-aloud sessions, focus groups, and surveys to refine grading processes, tools, and training resources. The NextGen test development team introduced more rigorous rubrics, detailed grading guidance, and annotated benchmark responses that explained why each score was assigned—improving alignment, accuracy, and consistency.
Training also evolved to include improved modules, content-focused virtual sessions, and limited scoring practice opportunities. Grading moved to a platform that offered improvements but lacked key capabilities needed for full implementation of the NextGen UBE. Additional validity responses were embedded and evenly distributed to alert graders when their scoring drifted. Double grading was applied to 100 percent of responses, but reconciliation remained limited, and real-time monitoring or mentoring was not yet possible.
Although further refinement was needed to strengthen both training and grading delivery, these changes marked significant progress and laid the groundwork for the innovations woven into the Beta test and, ultimately, operational launch.
Transition from Basic to Beta
Grader training has since evolved into a comprehensive, flexible approach that combines self-paced learning, interactive sessions, and hands-on practice to ensure graders are fully prepared. NCBE now offers two self-paced modules available during the training window. These modules explain the NextGen UBE grading process, introduce constructed-response question types, and provide guidance on how to use scoring guides and the grading platform.
To complement these resources, NCBE's grading staff hosted virtual Q&A sessions and targeted training on reconciliation processes, ensuring graders were ready to tackle the Beta test. Hands-on practice remains central: graders complete practice scoring to reinforce their understanding of scoring criteria and platform functionality.
Feedback from previous pilots, field tests, and prototype exams has shaped these resources to meet graders’ needs while respecting the time they devote to grading exams. Beta test graders scored annotated practice sets for each question type and reviewed an annotated content training set for their assigned question set or performance task—building familiarity with scoring criteria and platform use before live grading begins.
The Beta test covered three times the content of a typical operational exam, requiring streamlined training solutions. Grading workshops, including those delivered virtually, will remain a cornerstone of operational training going forward.
Now: The Internet Testing Systems Grading Platform and Beta Test Innovations
The NextGen grading process is entering a new era with the introduction of the Internet Testing Systems (ITS) grading platform for the Beta test. This platform builds on previous improvements while adding features designed to enhance efficiency and fairness.
Team leaders and administrators can now track grading progress and accuracy in real time, ensuring consistency and fairness throughout the grading window. Calibration begins as soon as a grader starts scoring and continues throughout the process. Embedded validity responses—hidden among live examinee responses—alert graders if their scoring begins to drift from established standards, reinforcing ongoing calibration. These features not only help maintain alignment with scoring criteria but also provide graders with immediate feedback, reducing variability and improving overall reliability. By combining real-time monitoring with continuous calibration, the system safeguards examinee outcomes and supports graders in delivering accurate, consistent scores.
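To make the drift-alert mechanism concrete, here is a minimal sketch in Python of how embedded validity responses could drive calibration feedback. The class name, tolerance, and window size are illustrative assumptions for this sketch, not details of the ITS platform.

```python
# Hypothetical sketch of validity-response drift monitoring.
# Names, thresholds, and data structures are illustrative assumptions,
# not NCBE's or ITS's actual implementation.

from dataclasses import dataclass, field

@dataclass
class GraderMonitor:
    """Tracks a grader's scores on embedded validity responses."""
    tolerance: float = 1.0        # assumed: max acceptable average deviation
    window: int = 5               # assumed: number of recent checks to consider
    deviations: list = field(default_factory=list)

    def record(self, assigned_score: int, true_score: int) -> None:
        """Record the grader's score against the known score point."""
        self.deviations.append(assigned_score - true_score)

    def is_drifting(self) -> bool:
        """Flag the grader if recent validity scores trend off-standard."""
        recent = self.deviations[-self.window:]
        if not recent:
            return False
        # Drift here means consistent bias beyond tolerance across the
        # window, not a single outlier.
        mean_dev = sum(recent) / len(recent)
        return abs(mean_dev) > self.tolerance

# Example: a grader scoring validity responses high on average
monitor = GraderMonitor()
for assigned, true in [(4, 3), (5, 3), (4, 3), (3, 2), (4, 2)]:
    monitor.record(assigned, true)
print(monitor.is_drifting())  # True -> prompt recalibration feedback
```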
Every constructed-response question is double graded, meaning two independent graders evaluate each response to ensure accuracy and consistency. If their scores fall outside established tolerances, the system automatically initiates one of two reconciliation paths: team leader review or consensus group discussion. This structured process ensures that one grader's judgment does not disproportionately influence an examinee's result, reinforcing fairness and reliability across the board.
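As an illustration of this routing rule, the following Python sketch shows how a pair of independent scores might be triaged. The specific tolerance and the cutoff between the two reconciliation paths are assumptions made for illustration; the article states only that out-of-tolerance score pairs go either to team leader review or to consensus group discussion.

```python
# Hypothetical sketch of double-grading reconciliation routing.
# Tolerance and cutoff values are illustrative assumptions only.

from enum import Enum

class Resolution(Enum):
    AGREED = "scores within tolerance"
    TEAM_LEADER = "team leader review"
    CONSENSUS = "consensus group discussion"

def reconcile(score_a: int, score_b: int,
              tolerance: int = 1, consensus_gap: int = 3) -> Resolution:
    """Route a pair of independent scores to a resolution path."""
    gap = abs(score_a - score_b)
    if gap <= tolerance:
        return Resolution.AGREED       # scores count as consistent
    if gap < consensus_gap:
        return Resolution.TEAM_LEADER  # modest disagreement: mentoring review
    return Resolution.CONSENSUS        # large disagreement: group discussion

print(reconcile(4, 4))  # Resolution.AGREED
print(reconcile(4, 2))  # Resolution.TEAM_LEADER
print(reconcile(5, 1))  # Resolution.CONSENSUS
```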
Beyond resolving discrepancies, reconciliation serves as a learning opportunity. Graders engage in team leader mentoring or collaborative discussions during consensus sessions, allowing them to revisit scoring criteria, clarify expectations, and strengthen alignment. These interactions not only resolve individual disagreements but also help graders recalibrate, reducing variability and improving overall scoring accuracy.
The journey from manual spreadsheets to a technology-enhanced grading platform reflects NCBE’s commitment to designing and improving a system that ensures fairness, consistency, and efficiency. By integrating technology, robust training, and rigorous grading materials, the NextGen UBE grading process offers jurisdictions a fair, consistent, and transparent approach. Each step—guided by grader feedback and analysis—has created a system that supports graders and safeguards examinee outcomes, setting a strong foundation for the operational launch of the NextGen UBE in July 2026.
Notes
- Validity responses are “responses that are clear representations of each score point that are interspersed into each grader’s assigned set of responses. This tool helps ensure grader scores align with the grading criteria in the rubrics throughout the grading process.” For this and more NextGen grading terms, see Wendy Light; Rosemary Reshetar, EdD; and Erica Shoemaker, “The Testing Column: Grading the MEE, MPT, and the NextGen Bar Exam: Ensuring Fairness to Candidates,” 93(1) The Bar Examiner 69–72 (Spring 2024).
- Descriptions of the testing stages can be found in “The Testing Column: NextGen Research Brief: Pilot Testing,” 93(2) The Bar Examiner 43–48 (Summer 2024).
Wendy Light is the Constructed Response Scoring Manager for the National Conference of Bar Examiners.
Erica Shoemaker is the Constructed Response Scoring Specialist for the National Conference of Bar Examiners.