Standardized Tests Aren’t Going Anywhere. So What Do We Do?

Listen to the interview with Jenn Borgioli Binis:


Over the last decade or so, we’ve settled into a choreographed dance around large-scale, state-mandated standardized test scores. First, an education leader stands behind a podium with charts and graphs, releases a memo, or otherwise puts the most recent scores out in the world. Then comes the cacophony of responses, which typically include concern, condemnation, dismissal, and theorizing. Next, people who have nothing to do with the design, administration, taking, or scoring of the tests use them to critique states, districts, schools, and/or individual teachers. Some will use them to demonstrate that a particular intervention or program is (in)effective. Eventually, the noise fades out until the next round of testing is complete, and the dance begins anew.

Yet, while all that noise is noising, scores from other standardized tests are used with little muss or fuss. To name just a few: A child’s score on a standardized test, combined with other information such as teacher observations and medical referrals, can be used to facilitate access to targeted and critical early intervention education services. Young people willingly get into a car with a stranger who will use a standardized test to assess their driving ability. Thousands of professionals, from auto mechanics to lawyers, take one or more standardized tests as part of entry into their chosen field. Just this month, a group of researchers published their findings on the impact of district capital projects and used test scores to draw conclusions. Their work will undeniably help school leaders across the country make more informed decisions about district spending and likely couldn’t have been done without large-scale standardized tests.

Although standardized tests defy simple categorization, we can, at least, define them. Generally speaking, a standardized test is an assessment wherein the conditions and test (directions, prompts, questions, etc.) are the same for every test taker and their answers are scored with the same answer key or rubric. Put like that, they feel benign, almost harmless. After all, framed that way, teachers give hundreds of standardized tests a year, even those who do learner-centered assessment, project-based learning, or otherwise collect evidence of student learning in ways that are considered alternative or non-traditional.

And yet, that benign definition doesn’t explain why parents and communities started a standardized testing opt-out movement. We recognize that there’s something fundamentally different about large-scale, externally mandated standardized tests that rely on multiple-choice questions. Teachers and students can point to how the NCLB-related tests have negatively impacted education. Meanwhile, politicians, pundits, and researchers argue that schools and students would be worse off without them.

And here we are…back at the end of the first paragraph. Despite their usefulness, standardized tests can cause harm. The conversations around them can be circular and the positions we take on them feel unmovable and intractable. It is possible, though, to break through the noise and interrupt the dance by moving away from the question of all or nothing and towards the idea of reducing harm.

Harm Reduction as a Problem-Solving Approach

Many years ago, I had a chance conversation with a superintendent who compared externally mandated, large-scale standardized tests to prescription drugs. The problem, as he saw it, wasn’t the tests — or the painkillers — themselves. It was that there were simply too many. In his view, reducing the number, or even going cold turkey, would be a net good.

That conversation took place long before the opioid epidemic was known outside the communities trying to deal with its impact, and before ethicists and pain management specialists warned against unilaterally pulling opioids from the market. Maybe, as with the opioid epidemic, when it comes to standardized tests we need to focus less on a “cold turkey” solution and more on harm reduction.

The National Harm Reduction Coalition (NHRC) offers 8 principles for thinking about the concept of harm reduction and how communities can work to keep as many of their members as safe as possible. While specific to the issue they were created for — drug use — these principles provide a starting point for thinking about the problems posed by standardized tests. They aren’t solutions, but they do offer lessons from those who have years of experience learning what it takes to reduce harm in the midst of an epidemic. With these lessons in mind, we can craft our own moves to reduce the harm caused by standardized tests.

Problem: Standardized tests are not going away.
Move 1: Accept it.

It’s likely not a coincidence that NHRC’s first principle is about acceptance:

[This organization] accepts, for better or worse, that licit and illicit drug use is part of our world and chooses to work to minimize its harmful effects rather than simply ignore or condemn them.

This principle models how to acknowledge that we need standardized tests and are better served if we let go of the idea that we need to, or should, get rid of them. In a recent conversation about standardized tests and grades, education historian Ethan Hutt pointed out that an important function served by standardized tests is synchronization:

“Our system is very decentralized — we don’t have common standards, curricula, or textbooks — which means that despite being in the same grade or even the same course, students in different areas can have very different experiences. Yet, at key moments, we need the disparate pieces of our system to fit together; that’s what standardized testing facilitates.”

As a concrete example of why that synchronization is necessary, multiple states allow a high degree of local control over the curriculum. At the same time, their state constitutions speak to public education, which is connected to children’s right to an education. Large-scale standardized tests are a way for state leaders to ensure they fulfill that right. Tests also facilitate special education services, instruction, grades, reporting, placement, and more.

We can negotiate and discuss how they’re used in each of these contexts, but getting rid of them isn’t possible. As a reminder, the opposite of a standardized test is a bespoke assessment: a teacher-designed individual assessment for every student, every time the teacher wants to document that student’s learning. Even if we got rid of some standardized tests, using only bespoke assessments simply isn’t feasible given class sizes, existing demands on teachers’ time, and the demands of designing high-quality assessments. Standardized tests are part of the educational landscape, and they’re not going away.

However, accepting standardized tests as part of our world does not require backing down on calls to minimize, improve, or critique them. And our next move will help us get better at those critiques.

Problem: Not all standardized tests are created equal.
Move 2: Strive for precise language.

Consider the NHRC’s use of the phrase “licit and illicit” in principle 1. The group purposefully uses language to help the public understand that drugs belong to one of two categories: those that are approved for consumption by a government agency and those that are not. The language they use is value-neutral — it’s not about good or bad, help or harm, as not all illicit substances are harmful and not all licit ones are safe — but it is clear and specific. Likewise, we’re better served by being as precise as possible when we discuss testing.

As an example of what that entails, consider the More Teaching Less Testing Act from NY Representative Jamaal Bowman. The text of the bill contains the line, “in 2015, a typical American student took 112 mandated standardized tests across the length of their elementary and secondary education years.” Yet, of those 112 tests, only 16 or 17 were large-scale, federally-mandated but state-specific standardized tests (Math and ELA, once a year in grades 3-8 and once in high school, plus possibly a science assessment). That’s a lot of adjectives, but specificity is more helpful than generalities when discussing standardized tests. It helps us see that roughly 96 of the tests students took over 13 years were state- or locally-mandated, which means we can move to solutions without federal involvement. (Move 5 explores some of the ways states are making changes.) Using more adjectives when talking about standardized tests may feel small and meaningless, but being more precise can help us have more productive conversations about the problem and possible solutions.

Making moves around precision can also help us as readers and news consumers. As Jen Serravallo and Kelly Cartwright explored in their recent conversation on reading instruction, large-scale tests have shaped schools in big and small ways. As we all work to negotiate tensions related to how students are taught to read, some advocates are using large-scale testing scores to advance their arguments. As an example, one organization points to New York state “reading” tests, but the blueprint (and name) for the state’s test shows it’s an English Language Arts test with a heavy writing component. Talking precisely about reading scores requires tests like DIBELS or aimswebPlus, but those scores generally aren’t public. The challenge is that by focusing on state test scores, advocates make it that much harder to talk about solutions.

Striving for precise language moves us from generalities to specifics, from abstract to concrete. Small moves around language as speakers, listeners, and readers, though, are only a start.

Problem: The tests take up time and energy.
Move 3: Redirect that time and energy at the classroom level.

One of those solutions lies with harm reduction principle 8, which states that harm reduction “does not attempt to minimize or ignore the real and tragic harm and danger that can be associated with illicit drug use.” As a reminder, the overarching idea of harm reduction is safety. The educational equivalent is ensuring students make it through the year without being harmed by tests. This includes their experiences preparing for the tests, taking them, and living with the consequences of their test scores.

At the more concrete teacher-and-student level, a few ways to minimize harm and create safer schools are to:

Stop doing year-round test prep.
Stop buying test prep workbooks.
Consider dialing back the habit of treating every assessment as “practice” for the state tests. Instead, shift to learner-centered test prep that focuses more on helping students think about who they are as test takers, the nature of the tests, what purpose they serve, and what makes them different. More resources on this approach are available here (brief) and here (extended explanation with examples based on NYS tests).
Reconsider the language, tone, and vocabulary used when discussing the tests at or around test takers. Do we speak of the tests in terms that communicate they are a routine part of school or are we unconsciously telling students they’re the most important thing that will happen to them that year? Do we focus on increasing scores or decreasing stress? Do we lift the burdens students carry around the tests or do we add to them?

If the above feels like too much or not possible, there is another option. When Glennon Doyle was a classroom teacher, she and a colleague, Amy Greene, wrote a fantastic book on integrating test prep into the reading workshop and treating large-scale standardized tests as a genre unto themselves. If nothing else, this approach gives classroom teachers permission and confidence to minimize the presence of large-scale, state- and federally-mandated tests in their classrooms.

Problem: Scores are used inappropriately or to gatekeep and punish.
Move 4: Interrogate how scores are used at the district level.

A consequence of state test scores being readily available is that they’re used for many different things, often in ways the tests weren’t intended to be used. One way to minimize inappropriate usage is to reconsider students’ relationships to the scores.

Principle 5 says, in effect, that we need to include those affected in crafting solutions. While districts can’t always include students in standardized testing design, they can include them in decisions on how their scores are used or shared with their families. Districts that incorporate structures like student-led conferences and learning stories could consider folding state test scores into those conversations as a way to lower the temperature around them and help students understand how they’re used. As part of that shift, schools could introduce students and teachers to the Code of Fair Testing Practices in Education, which advises that those who use test scores to make decisions should “avoid using a single test score as the sole determinant of decisions about test takers. Interpret test scores in conjunction with other information about individuals.”

Principle 7 is also useful at the district-level as it encourages us to recognize the realities of “poverty, class, racism, social isolation, past trauma” and other social inequalities. For the purpose of standardized tests, this principle serves as a reminder that efforts around assessment are not separate from the necessary work of culturally responsive teaching, equity, trauma-informed practices, and research-based pedagogy. This work can and should also incorporate active steps to minimize bias around standardized tests, including:

ensuring standardized test scores aren’t being used to keep marginalized students out of particular courses or experiences
taking stock of district-level conversations and pressures around large-scale tests and dialing back as much as possible
creating structures to ensure administrators and teachers who design their own standardized tests are taking steps to address and reduce assessment bias (more here)
attending to the phenomenon that is Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure.” Anya Kamenetz gets more into this tension in her fantastic book, The Test: Why Our Schools Are Obsessed with Standardized Testing – But You Don’t Have to Be.

Problem: What gets measured, matters.
Move 5: Ensure state tests measure what matters.

All of us can use more precise language. Some of us are in the position to change students’ experiences with large-scale standardized testing or shift district policies. Only a few of us, though, can bring about systems-level change.

Move 5, which is a really big ask for everyone in schools, is to have hope and confidence that change is happening at the state level. It builds on Harm Reduction Principle 3, which names “quality of individual and community life and well-being,” and “not necessarily cessation of all drug use,” as “the criteria for successful interventions and policies.” Move 5 is a reminder that the goal isn’t better or fewer standardized tests; the goal is a healthier system and more positive school experiences for students and teachers.

It’s difficult to draw a straight line, but it sure does seem that the current approach to large-scale, federally-mandated tests, especially for ELA, has created a whole bunch of unintended negative consequences. As an example, two trends are often mentioned in the list of things that need to be addressed regarding reading and ELA instruction. First, the mandated ELA tests are content-free; students do not need particular content knowledge to demonstrate mastery of the standards. Rather, they’re expected to use ELA skills to correctly answer the multiple-choice questions and respond to writing prompts. Understandably, teachers across the country in testing grades shifted to stressing skills, and now pundits and researchers are raising concerns about the over-emphasis on skills over content.

Second, the tests generally use shorter passages, which, again understandably, means many students are seeing more short passages and fewer longer texts in their ELA class, leading to concerns about students’ lack of reading stamina. There’s also how scores are used to rank schools and the amount of time actual testing takes. In short, there are a lot of issues with large-scale, federally-mandated tests.

Change is happening though! Some of the efforts are based on projects that have been around for a while. For example, New York’s long-running Performance Assessment Consortium and the rise of portfolio or competency-based schools in Pennsylvania, California, and Illinois are all multi-year projects that provide models for how states can come at large-scale assessment and accountability in a different way.

Other efforts, such as The Beyond Test Scores Project, are newer. Jack Schneider, Hutt’s co-author on the recent and very helpful book Off the Mark: How Grades, Ratings, and Rankings Undermine Learning (but Don’t Have To), is leading the project, which includes the Massachusetts Consortium for Innovative Education Assessment and the Education Commonwealth Project. What makes the project so exciting is that they’re looking at the entire system, from student-level grades, to the nature of large-scale assessment measures, to how communities assess the quality of their schools.

Another front to keep an eye on is the Every Student Succeeds Act (the most recent reauthorization of the Elementary and Secondary Education Act, which replaced No Child Left Behind). Researchers and advocates have offered ways to address the issues with the law, including clarifying accountability measures by eliminating single letter or number scores for schools, especially those based on student test scores. Another solution entails shifting from one big test to smaller, shorter mini-tests given over the year. All of which is to say, many people, groups, and organizations are in a position to effect change and are trying to solve what they can. Consider reaching out to your state education department to see if there are opportunities for educator involvement.

It’s absolutely understandable that people want us to go cold turkey and quit large-scale, federally-mandated standardized testing as it currently exists. It’s a commendable goal and one worth working towards. In the meantime, we know harm reduction works. According to the National Institutes of Health, there are decades of research to show harm reduction strategies “provide significant individual and public health benefits.” While imperfect, each of these moves can contribute to a healthier system and, in small and big ways, interrupt the annual dance around the scores and perhaps even give our weary feet a rest.

Resources and Recommended Texts

Melton Doyle, G. & Greene, A. (2007). Test Talk: Integrating Test Preparation Into Reading Workshop. Stenhouse Publishers.

Hutt, E. & Schneider, J. (2023). Off the Mark: How Grades, Ratings, and Rankings Undermine Learning (but Don’t Have To). Harvard University Press.

Kamenetz, A. (2015). The Test: Why Our Schools Are Obsessed with Standardized Testing – But You Don’t Have to Be. PublicAffairs.

Quinn, D. M. (2020). Experimental Evidence on Teachers’ Racial Bias in Student Evaluation: The Role of Grading Scales. Educational Evaluation and Policy Analysis, 42(3), 375–392. https://doi.org/10.3102/0162373720932188

Reese, W. J. (2013). Testing Wars in the Public Schools: A Forgotten History. Harvard University Press.

Stewart, A. (2015). First Class: The Legacy of Dunbar, America’s First Black Public High School. Chicago Review Press.

Wainer, H. (2011). Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies. Princeton University Press.

Watters, A. (2021). Teaching Machines. The MIT Press.


Nora says:

January 22, 2024

I agree that we are not going to get rid of all testing but the issue is how many, what they are used for, and the effect they have on what and how teaching unfolds. Tests need to match what they are assessing. There are times that their use does not align with the way teaching and learning occurs in some school contexts. We need multiple measures to determine teaching and learning effectiveness. What qualities are we looking for in our students? Can we use one test to assess this? I doubt it.

The other big problem I see is the emphasis placed on speed of response that is built into most standardized tests. Some children, given more thinking time, might score higher and be able to show their understanding of math or reading or whatever is being tested.

I am a strong believer in documentation of learning. Judy Harris Helm has a wonderful book, written some time ago, Windows on Learning, in which she breaks assessment down into 3 windows: a window on the group, a window on the individual learner, and a window on the teacher’s teaching.

Margaret Harris-Shoates says:
January 22, 2024

You’ve brought up some great points, Nora. Thank you for sharing that resource!

John Schuler says:

Thanks for the great article. As a middle school principal I just refused to embrace the high-stakes idea of these standardized tests. I told my staff these assessments were simply part of our assessment continuum and the results gave us information we needed to help us know where our curriculum focus fell short and data on students so we could help them grow in their learning. The biggest challenges we had were from scheduling our grade levels to take completely different tests with different time needs, meeting the needs of all the special testing requirements for students, and, of course, technology issues. The time and money we spent on this one assessment was not at all worth what we got out of it.

Margaret Harris-Shoates says:
January 22, 2024

John, thanks for sharing this valuable perspective!

Femi Higgins says:

February 1, 2024

I hope we can invite more voices to this conversation for part two, especially BIPOC people. First, these exams aren’t culturally relevant at all. We expect students to understand concepts and language not used in their communities or countries. This interferes with comprehension of the question. Most importantly, we give the same tests to EL students, who score lower than native English speakers despite being proficient in reading and math in their home languages.

You can provide a standard test where students are asked the same questions and scored the same, but the people who create the questions aren’t centering students of color at all. This is why culturally relevant/sustaining pedagogy is vital. In my opinion, iReady falls short on this.

Lastly, some schools use project-based assessments instead of the state exams. These are far more comprehensive than the high-stakes testing we see across the nation. In fact, there are a few schools in NYC that adopted project-based assessments, abandoning the Regents exams as a whole. This should’ve been part of the conversation. Check out the New York Performance Standards Consortium.

http://www.performanceassessment.org/

Jennifer Borgioli Binis says:
February 4, 2024

Apologies for missing your comment! I’m a fan of the Consortium and am always delighted to give them a shout-out again! The original draft of the post included a much longer section on their great work, and on NYS’ new efforts around performance tasks at lower grades, which unfortunately got reduced to a single line.

Your other points are important and good ones. The National Council on Measurement in Education (NCME), one of the professional organizations for those who design large-scale standardized tests, has recently taken steps to ensure members are attending to the issues you raised. Their policy statements can be seen here: https://www.ncme.org/resources-publications/position-statements/equity

I’m also aware of work by those who work on large-scale tests, including reading inventories, who are building on the work of Asa Hilliard and others regarding bias and culturally-responsive practices. I wasn’t able to put my hands on links to their work but I’ll shout if I can find them.
