Solutions Archives - The Hechinger Report
https://hechingerreport.org/tags/solutions/

PROOF POINTS: AI writing feedback ‘better than I thought,’ top researcher says
https://hechingerreport.org/proof-points-writing-ai-feedback/ | Mon, 03 Jun 2024

Researchers from the University of California, Irvine, and Arizona State University found that human feedback was generally a bit better than AI feedback, but AI was surprisingly good. Credit: Getty Images

This week I challenged my editor to face off against a machine. Barbara Kantrowitz gamely accepted, under one condition: “You have to file early.”  Ever since ChatGPT arrived in 2022, many journalists have made a public stunt out of asking the new generation of artificial intelligence to write their stories. Those AI stories were often bland and sprinkled with errors. I wanted to understand how well ChatGPT handled a different aspect of writing: giving feedback.

My curiosity was piqued by a new study, published in the June 2024 issue of the peer-reviewed journal Learning and Instruction, that evaluated the quality of ChatGPT’s feedback on students’ writing. A team of researchers compared AI with human feedback on 200 history essays written by students in grades 6 through 12, and they determined that human feedback was generally a bit better. Humans had a particular advantage in advising students on something to work on that would be appropriate for where they are in their development as writers. 

But ChatGPT came close. On a five-point scale that the researchers used to rate feedback quality, with a 5 being the highest quality feedback, ChatGPT averaged a 3.6 compared with a 4.0 average from a team of 16 expert human evaluators. It was a tough challenge. Most of these humans had taught writing for more than 15 years or they had considerable experience in writing instruction. All received three hours of training for this exercise plus extra pay for providing the feedback.

ChatGPT even beat these experts in one aspect; it was slightly better at giving feedback on students’ reasoning, argumentation and use of evidence from source materials – the features that the researchers had wanted the writing evaluators to focus on.

“It was better than I thought it was going to be because I didn’t have a lot of hope that it was going to be that good,” said Steve Graham, a well-regarded expert on writing instruction at Arizona State University, and a member of the study’s research team. “It wasn’t always accurate. But sometimes it was right on the money. And I think we’ll learn how to make it better.”

Average ratings for the quality of ChatGPT and human feedback on 200 student essays

Researchers rated the quality of the feedback on a five-point scale across five different categories. Criteria-based refers to whether the feedback addressed the main goals of the writing assignment, in this case, to produce a well-reasoned argument about history using evidence from the reading source materials that the students were given. Clear directions refers to whether the feedback included specific examples of something the student did well and clear directions for improvement. Accuracy means whether the feedback advice was correct without errors. Essential features refers to whether the suggestion on what the student should work on next is appropriate for where the student is in his or her writing development and is an important element of this genre of writing. Supportive tone refers to whether the feedback is delivered with language that is affirming, respectful and supportive, as opposed to condescending, impolite or authoritarian. (Source: Fig. 1 of Steiss et al., “Comparing the quality of human and ChatGPT feedback of students’ writing,” Learning and Instruction, June 2024.)

Exactly how ChatGPT is able to give good feedback is something of a black box even to the writing researchers who conducted this study. Artificial intelligence doesn’t comprehend things in the same way that humans do. But somehow, through the neural networks that ChatGPT’s programmers built, it is picking up on patterns from all the writing it has previously digested, and it is able to apply those patterns to a new text. 

The surprising “relatively high quality” of ChatGPT’s feedback is important because it means that the new artificial intelligence of large language models, also known as generative AI, could potentially help students improve their writing. One of the biggest problems in writing instruction in U.S. schools is that teachers assign too little writing, Graham said, often because teachers feel that they don’t have the time to give personalized feedback to each student. That leaves students without sufficient practice to become good writers. In theory, teachers might be willing to assign more writing or insist on revisions for each paper if students (or teachers) could use ChatGPT to provide feedback between drafts. 

Despite the potential, Graham isn’t an enthusiastic cheerleader for AI. “My biggest fear is that it becomes the writer,” he said. He worries that students will not limit their use of ChatGPT to helpful feedback, but ask it to do their thinking, analyzing and writing for them. That’s not good for learning. The research team also worries that writing instruction will suffer if teachers delegate too much feedback to ChatGPT. Seeing students’ incremental progress and common mistakes remains important for deciding what to teach next, the researchers said. For example, seeing loads of run-on sentences in your students’ papers might prompt a lesson on how to break them up. But if you don’t see them, you might not think to teach it. Another common concern among writing instructors is that AI feedback will steer everyone to write in the same homogenized way. A young writer’s unique voice could be flattened out before it even has the chance to develop.

There’s also the risk that students may not be interested in heeding AI feedback. Students often ignore the painstaking feedback that their teachers already give on their essays. Why should we think students will pay attention to feedback if they start getting more of it from a machine? 

Still, Graham and his research colleagues at the University of California, Irvine, are continuing to study how AI could be used effectively and whether it ultimately improves students’ writing. “You can’t ignore it,” said Graham. “We either learn to live with it in useful ways, or we’re going to be very unhappy with it.”

Right now, the researchers are studying how students might converse back-and-forth with ChatGPT like a writing coach in order to understand the feedback and decide which suggestions to use.

Example of feedback from a human and ChatGPT on the same essay

In the current study, the researchers didn’t track whether students understood or employed the feedback, but only sought to measure its quality. Judging the quality of feedback is a rather subjective exercise, just as feedback itself is a bundle of subjective judgment calls. Smart people can disagree on what good writing looks like and how to revise bad writing. 

In this case, the research team came up with its own criteria for what constitutes good feedback on a history essay. They instructed the humans to focus on the student’s reasoning and argumentation, rather than, say, grammar and punctuation.  They also told the human raters to adopt a “glow and grow strategy” for delivering the feedback by first finding something to praise, then identifying a particular area for improvement. 

The human raters provided this kind of feedback on hundreds of history essays from 2021 to 2023, as part of an unrelated study of an initiative to boost writing at school. The researchers randomly grabbed 200 of these essays and fed the raw student writing – without the human feedback – to version 3.5 of ChatGPT and asked it to give feedback, too.

At first, the AI feedback was terrible, but as the researchers tinkered with the instructions, or the “prompt,” they typed into ChatGPT, the feedback improved. The researchers eventually settled upon this wording: “Pretend you are a secondary school teacher. Provide 2-3 pieces of specific, actionable feedback on each of the following essays…. Use a friendly and encouraging tone.” The researchers also fed the assignment that the students were given, for example, “Why did the Montgomery Bus Boycott succeed?” along with the reading source material that the students were provided. (More details about how the researchers prompted ChatGPT are explained in Appendix C of the study.)
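For readers curious what that kind of request might look like in code, here is a minimal sketch using the OpenAI Python client. The model name, the function structure and the way the assignment and source materials are attached are my assumptions for illustration; the researchers’ actual prompts are detailed in Appendix C of their study.

```python
# A minimal sketch, not the study's actual pipeline: send one essay to the
# ChatGPT API with a prompt modeled on the wording quoted above.
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

def request_feedback(assignment: str, sources: str, essay: str) -> str:
    # How the assignment and source materials are appended here is an
    # assumption for illustration, not the researchers' exact format.
    prompt = (
        "Pretend you are a secondary school teacher. Provide 2-3 pieces of "
        "specific, actionable feedback on the following essay. "
        "Use a friendly and encouraging tone.\n\n"
        f"Assignment: {assignment}\n\n"
        f"Source materials: {sources}\n\n"
        f"Student essay: {essay}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the study used ChatGPT version 3.5
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(request_feedback("Why did the Montgomery Bus Boycott succeed?", sources_text, essay_text))
```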

The humans took about 20 to 25 minutes per essay. ChatGPT’s feedback came back instantly. The humans sometimes marked up sentences by, for example, showing a place where the student could have cited a source to buttress an argument. ChatGPT didn’t write any in-line comments and only wrote a note to the student. 

Researchers then read through both sets of feedback – human and machine – for each essay, comparing and rating them. (It was supposed to be a blind comparison test and the feedback raters were not told who authored each one. However, the language and tone of ChatGPT were distinct giveaways, and the in-line comments were a tell of human feedback.)

Humans appeared to have a clear edge with the very strongest and the very weakest writers, the researchers found. They were better at pushing a strong writer a little bit further, for example, by suggesting that the student consider and address a counterargument. ChatGPT struggled to come up with ideas for a student who was already meeting the objectives of a well-argued essay with evidence from the reading source materials. ChatGPT also struggled with the weakest writers. The researchers had to drop two of the essays from the study because they were so short that ChatGPT didn’t have any feedback for the student. The human rater was able to parse out some meaning from a brief, incomplete sentence and offer a suggestion. 

In one student essay about the Montgomery Bus Boycott, reprinted above, the human feedback seemed too generic to me: “Next time, I would love to see some evidence from the sources to help back up your claim.” ChatGPT, by contrast, specifically suggested that the student could have mentioned how much revenue the bus company lost during the boycott – an idea that was mentioned in the reading source materials. ChatGPT also suggested that the student could have mentioned specific actions that the NAACP and other organizations took. But the student had actually mentioned a few of these specific actions in his essay. That part of ChatGPT’s feedback was plainly inaccurate. 

In another student writing example, also reprinted below, the human straightforwardly pointed out that the student had gotten an historical fact wrong. ChatGPT appeared to affirm that the student’s mistaken version of events was correct.

Another example of feedback from a human and ChatGPT on the same essay

So how did ChatGPT’s review of my first draft stack up against my editor’s? One of the researchers on the study team suggested a prompt that I could paste into ChatGPT. After a few back and forth questions with the chatbot about my grade level and intended audience, it initially spit out some generic advice that had little connection to the ideas and words of my story. It seemed more interested in format and presentation, suggesting a summary at the top and subheads to organize the body. One suggestion would have made my piece too long-winded. Its advice to add examples of how AI feedback might be beneficial was something that I had already done. I then asked for specific things to change in my draft, and ChatGPT came back with some great subhead ideas. I plan to use them in my newsletter, which you can see if you sign up for it here. (And if you want to see my prompt and dialogue with ChatGPT, here is the link.) 

My human editor, Barbara, was the clear winner in this round. She tightened up my writing, fixed style errors and helped me brainstorm this ending. Barbara’s job is safe – for now. 

This story about AI feedback was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

PROOF POINTS: We have tried paying teachers based on how much students learn. Now schools are expanding that idea to contractors and vendors
https://hechingerreport.org/proof-points-outcomes-based-contracting/ | Mon, 27 May 2024

Schools spend billions of dollars a year on products and services, including everything from staplers and textbooks to teacher coaching and training. Does any of it help students learn more? Some educational materials end up mothballed in closets. Much software goes unused. Yet central-office bureaucrats frequently renew their contracts with outside vendors regardless of usage or efficacy.

One idea for smarter education spending is for schools to sign smarter contracts, where part of the payment is contingent upon whether students use the services and learn more. It’s called outcomes-based contracting and is a way of sharing risk between buyer (the school) and seller (the vendor). Outcomes-based contracting is most common in healthcare. For example, a health insurer might pay a pharmaceutical company more for a drug if it actually improves people’s health, and less if it doesn’t. 

Although the idea is relatively new in education, many schools tried a different version of it – evaluating and paying teachers based on how much their students’ test scores improved – in the 2010s. Teachers didn’t like it, and enthusiasm for these teacher accountability schemes waned. Then, in 2020, Harvard University’s Center for Education Policy Research announced that it was going to test the feasibility of paying tutoring companies by how much students’ test scores improved. 

The initiative was particularly timely in the wake of the pandemic.  The federal government would eventually give schools almost $190 billion to reopen and to help students who fell behind when schools were closed. Tutoring became a leading solution for academic recovery and schools contracted with outside companies to provide tutors. Many educators worried that billions could be wasted on low-quality tutors who didn’t help anyone. Could schools insist that tutoring companies make part of their payment contingent upon whether student achievement increased? 

The Harvard center recruited a handful of school districts who wanted to try an outcomes-based contract. The researchers and districts shared ideas on how to set performance targets. How much should they expect student achievement to grow from a few months of tutoring? How much of the contract should be guaranteed to the vendor for delivering tutors, and how much should be contingent on student performance? 

The first hurdle was whether tutoring companies would be willing to offer services without knowing exactly how much they would be paid. School districts sent out requests for proposals from online tutoring companies. Tutoring companies bid and the terms varied. One online tutoring company agreed that 40 percent of a $1.2 million contract with the Duval County Public Schools in Jacksonville, Florida, would be contingent upon student performance. Another online tutoring company signed a contract with Ector County schools in the Odessa, Texas, region that specified that the company had to accept a penalty if kids’ scores declined.

In the middle of the pilot, the outcomes-based contracting initiative moved from the Harvard center to the Southern Education Foundation, another nonprofit, and I recently learned how the first group of contracts panned out from Jasmine Walker, a senior manager there. Walker had a first-hand view because until the fall of 2023, she was the director of mathematics in Florida’s Duval County schools, where she oversaw the outcomes-based contract on tutoring. 

Here are some lessons she learned: 

Planning is time-consuming

Drawing up an outcomes-based contract requires analyzing years of historical testing data, and documenting how much achievement has typically grown for the students who need tutoring. Then, educators have to decide – based on the research evidence for tutoring – how much they could reasonably expect student achievement to grow after 12 weeks or more. 

Incomplete data was a common problem

The first school district in the pilot group launched its outcome-based contract in the fall of 2021. In the middle of the pilot, school leadership changed, layoffs hit, and the leaders of the tutoring initiative left the district.  With no one in the district’s central office left to track it, there was no data on whether tutoring helped the 1,000 students who received it. Half the students attended 70 percent of the tutoring sessions. Half didn’t. Test scores for almost two-thirds of the tutored students increased between the start and the end of the tutoring program. But these students also had regular math classes each day and they likely would have posted some achievement gains anyway. 

Delays in settling contracts led to fewer tutored students

Walker said two school districts weren’t able to start tutoring children until January 2023, instead of the fall of 2022 as originally planned, because it took so long to iron out contract details and obtain approvals inside the districts. Many schools didn’t want to wait and launched other interventions to help needy students sooner. Understandably, schools didn’t want to yank these students away from those other interventions midyear. 

That delay had big consequences in Duval County. Only 451 students received tutoring instead of a projected 1,200.  Fewer students forced Walker to recalculate Duval’s outcomes-based contract. Instead of a $1.2 million contract with $480,000 of it contingent on student outcomes, she downsized it to $464,533 with $162,363 contingent. The tutored students hit 53 percent of the district’s growth and proficiency goals, leading to a total payout of $393,220 to the tutoring company – far less than the company had originally anticipated. But the average per-student payout of $872 was in line with the original terms of between $600 and $1,000 per student. 
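To see roughly how that contract math works, here is a simplified sketch. The assumption that the contingent portion pays out in direct proportion to the share of goals met is mine for illustration; the actual contract terms were more detailed (the article later mentions rate cards), which is why this simplified estimate lands a bit below the reported $393,220.

```python
# A simplified sketch of outcomes-based contract math (not the actual
# Duval County rate card): a guaranteed base plus a contingent portion
# paid in proportion to the share of student goals met.
def outcomes_based_payout(total_contract: float,
                          contingent_amount: float,
                          share_of_goals_met: float) -> float:
    guaranteed = total_contract - contingent_amount
    return guaranteed + contingent_amount * share_of_goals_met

# Rough figures from the downsized Duval County contract described above.
estimate = outcomes_based_payout(464_533, 162_363, 0.53)
print(f"Simplified estimate: ${estimate:,.0f}")   # about $388,000; the reported payout was $393,220

# The reported payout, spread over the 451 tutored students:
print(f"Per student: ${393_220 / 451:,.0f}")      # about $872, within the original $600-$1,000 range
```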

The bottom line is still uncertain

What we don’t know from any of these case studies is whether similar students who didn’t receive tutoring also made similar growth and proficiency gains. Maybe it’s all the other things that teachers were doing that made the difference. In Duval County, for example, proficiency rates in math rose from 28 percent of students to 46 percent of students. Walker believes that outcomes-based contracting for tutoring was “one lever” of many. 

It’s unclear if outcomes-based contracting is a way for schools to save money. This kind of intensive tutoring – three times a week or more during the school day – is new and the school districts didn’t have previous pre-pandemic tutoring contracts for comparison. But generally, if all the student goals are met, companies stand to earn more in an outcomes-based contract than they would have otherwise, Walker said.

“It’s not really about saving money,” said Walker.  “What we want is for students to achieve. I don’t care if I spent the whole contract amount if the students actually met the outcomes, because in the past, let’s face it, I was still paying and they were not achieving outcomes.”

The biggest change with outcomes-based contracting, Walker said, was the partnership with the provider. One contractor monitored student attendance during tutoring sessions, called her when attendance slipped and asked her to investigate. Students were given rewards for attending their tutoring sessions and the tutoring company even chipped in to pay for them. “Kids love Takis,” said Walker. 

Advice for schools

Walker has two pieces of advice for schools considering outcomes-based contracts. One, she says, is to make the contingency amount at least 40 percent of the contract. Smaller incentives may not motivate the vendor. For her second outcomes-based contract in Duval County, Walker boosted the contingency amount to half the contract. To earn it, the tutoring company needs the students it is tutoring to hit growth and proficiency goals. That tutoring took place during the current 2023-24 school year. Based on mid-year results, students exceeded expectations, but full-year results are not yet in. 

More importantly, Walker says the biggest lesson she learned was to include teachers, parents and students earlier in the contract negotiation process.  She says “buy in” from teachers is critical because classroom teachers are actually making sure the tutoring happens. Otherwise, an outcomes-based contract can feel like yet “another thing” that the central office is adding to a teacher’s workload. 

Walker also said she wished she had spent more time educating parents and students on the importance of attending school and their tutoring sessions. “It’s important that everyone understands the mission,” said Walker. 

Innovation can be rocky, especially at the beginning. Now the Southern Education Foundation is working to expand its outcomes-based contracting initiative nationwide. A second group of four school districts launched outcomes-based contracts for tutoring this 2023-24 school year. Walker says that the rate cards and recordkeeping are improving from the first pilot round, which took place during the stress and chaos of the pandemic. 

The foundation is also seeking to expand the use of outcomes-based contracts beyond tutoring to education technology and software. Nine districts are slated to launch outcomes-based contracts for ed tech this fall. Walker’s next dream is to design outcomes-based contracts around curriculum and teacher training. I’ll be watching. 

This story about outcomes-based contracting was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

PROOF POINTS: AI essay grading is already as ‘good as an overburdened’ teacher, but researchers say it needs more work
https://hechingerreport.org/proof-points-ai-essay-grading/ | Mon, 20 May 2024

Grading papers is hard work. “I hate it,” a teacher friend confessed to me. And that’s a major reason why middle and high school teachers don’t assign more writing to their students. Even an efficient high school English teacher who can read and evaluate an essay in 20 minutes would spend 3,000 minutes, or 50 hours, grading if she’s teaching six classes of 25 students each. There aren’t enough hours in the day. 

Could ChatGPT relieve teachers of some of the burden of grading papers? Early research is finding that the new artificial intelligence of large language models, also known as generative AI, is approaching the accuracy of a human in scoring essays and is likely to become even better soon. But we still don’t know whether offloading essay grading to ChatGPT will ultimately improve or harm student writing.

Tamara Tate, a researcher at the University of California, Irvine, and an associate director of her university’s Digital Learning Lab, is studying how teachers might use ChatGPT to improve writing instruction. Most recently, Tate and her seven-member research team, which includes writing expert Steve Graham at Arizona State University, compared how ChatGPT stacked up against humans in scoring 1,800 history and English essays written by middle and high school students. 

Tate said ChatGPT was “roughly speaking, probably as good as an average busy teacher” and “certainly as good as an overburdened below-average teacher.” But, she said, ChatGPT isn’t yet accurate enough to be used on a high-stakes test or on an essay that would affect a final grade in a class.

Tate presented her study on ChatGPT essay scoring at the 2024 annual meeting of the American Educational Research Association in Philadelphia in April. (The paper is under peer review for publication and is still undergoing revision.) 

Most remarkably, the researchers obtained these fairly decent essay scores from ChatGPT without training it first with sample essays. That means it is possible for any teacher to use it to grade any essay instantly with minimal expense and effort. “Teachers might have more bandwidth to assign more writing,” said Tate. “You have to be careful how you say that because you never want to take teachers out of the loop.” 

Writing instruction could ultimately suffer, Tate warned, if teachers delegate too much grading to ChatGPT. Seeing students’ incremental progress and common mistakes remains important for deciding what to teach next, she said. For example, seeing loads of run-on sentences in your students’ papers might prompt a lesson on how to break them up. But if you don’t see them, you might not think to teach it. 

In the study, Tate and her research team calculated that ChatGPT’s essay scores were in “fair” to “moderate” agreement with those of well-trained human evaluators. In one batch of 943 essays, ChatGPT was within a point of the human grader 89 percent of the time. On a six-point grading scale that researchers used in the study, ChatGPT often gave an essay a 2 when an expert human evaluator thought it was really a 1. But this level of agreement – within one point – dropped to 83 percent of the time in another batch of 344 English papers and slid even farther to 76 percent of the time in a third batch of 493 history essays.  That means there were more instances where ChatGPT gave an essay a 4, for example, when a teacher marked it a 6. And that’s why Tate says these ChatGPT grades should only be used for low-stakes purposes in a classroom, such as a preliminary grade on a first draft.

ChatGPT scored an essay within one point of a human grader 89 percent of the time in one batch of essays

Corpus 3 refers to one batch of 943 essays, which represents more than half of the 1,800 essays that were scored in this study. Numbers highlighted in green show exact score matches between ChatGPT and a human. Yellow highlights scores in which ChatGPT was within one point of the human score. Source: Tamara Tate, University of California, Irvine (2024).

Still, this level of accuracy was impressive because even teachers disagree on how to score an essay and one-point discrepancies are common. Exact agreement, which only happens half the time between human raters, was worse for AI, which matched the human score exactly only about 40 percent of the time. Humans were far more likely to give a top grade of a 6 or a bottom grade of a 1. ChatGPT tended to cluster grades more in the middle, between 2 and 5. 

Tate set up ChatGPT for a tough challenge, competing against teachers and experts with PhDs who had received three hours of training in how to properly evaluate essays. “Teachers generally receive very little training in secondary school writing and they’re not going to be this accurate,” said Tate. “This is a gold-standard human evaluator we have here.”

The raters had been paid to score these 1,800 essays as part of three earlier studies on student writing. Researchers fed these same student essays – ungraded –  into ChatGPT and asked ChatGPT to score them cold. ChatGPT hadn’t been given any graded examples to calibrate its scores. All the researchers did was copy and paste an excerpt of the same scoring guidelines that the humans used, called a grading rubric, into ChatGPT and told it to “pretend” it was a teacher and score the essays on a scale of 1 to 6. 
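To picture that setup, here is a minimal sketch of what a zero-shot, rubric-based scoring request could look like through the OpenAI Python client. The rubric wording, model choice and function structure are my assumptions for illustration, not the research team’s actual script.

```python
# A minimal, illustrative sketch of rubric-based essay scoring without graded
# examples, in the spirit of the setup described above (not the study's code).
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

def score_essay(rubric_excerpt: str, essay: str) -> str:
    prompt = (
        "Pretend you are a teacher. Using the rubric below, score the following "
        "essay on a scale of 1 to 6. Reply with the score and a one-sentence reason.\n\n"
        f"Rubric: {rubric_excerpt}\n\nEssay: {essay}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the free 3.5 version; the team found the newer 4.0 scores more accurately
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```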

Older robo graders

Earlier versions of automated essay graders have had higher rates of accuracy. But they were expensive and time-consuming to create because scientists had to train the computer with hundreds of human-graded essays for each essay question. That’s economically feasible only in limited situations, such as for a standardized test, where thousands of students answer the same essay question. 

Earlier robo graders could also be gamed, once a student understood the features that the computer system was grading for. In some cases, nonsense essays received high marks if fancy vocabulary words were sprinkled in them. ChatGPT isn’t grading for particular hallmarks, but is analyzing patterns in massive datasets of language. Tate says she hasn’t yet seen ChatGPT give a high score to a nonsense essay. 

Tate expects ChatGPT’s grading accuracy to improve rapidly as new versions are released. Already, the research team has detected that the newer 4.0 version, which requires a paid subscription, is scoring more accurately than the free 3.5 version. Tate suspects that small tweaks to the grading instructions, or prompts, given to ChatGPT could improve existing versions. She is interested in testing whether ChatGPT’s scoring could become more reliable if a teacher trained it with just a few, perhaps five, sample essays that she has already graded. “Your average teacher might be willing to do that,” said Tate.

Many ed tech startups, and even well-known vendors of educational materials, are now marketing new AI essay robo graders to schools. Many of them are powered under the hood by ChatGPT or another large language model, and I learned from this study that accuracy rates can be reported in ways that make the new AI graders seem more accurate than they are. Tate’s team calculated that, on a population level, there was no difference between human and AI scores. ChatGPT can already reliably tell you the average essay score in a school or, say, in the state of California. 

Questions for AI vendors

At this point, it is not as accurate in scoring an individual student. And a teacher wants to know exactly how each student is doing. Tate advises teachers and school leaders who are considering using an AI essay grader to ask specific questions about accuracy rates on the student level: What is the rate of exact agreement between the AI grader and a human rater on each essay? How often are they within one point of each other?
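To make those two questions concrete, here is a rough sketch of how a district could compute both rates from a sample of essays scored by an AI grader and by a human. The score lists below are invented for illustration.

```python
# A minimal sketch: compute exact and within-one-point agreement between
# an AI grader and a human rater on the same essays (scores are invented).
def agreement_rates(ai_scores, human_scores):
    pairs = list(zip(ai_scores, human_scores))
    exact = sum(a == h for a, h in pairs) / len(pairs)
    adjacent = sum(abs(a - h) <= 1 for a, h in pairs) / len(pairs)
    return exact, adjacent

ai_scores = [2, 3, 4, 4, 5, 3, 2, 6]       # hypothetical AI scores on a 1-6 scale
human_scores = [1, 3, 4, 6, 5, 2, 2, 6]    # hypothetical human scores on the same essays
exact, adjacent = agreement_rates(ai_scores, human_scores)
print(f"Exact agreement: {exact:.0%}")      # share of essays with identical scores
print(f"Within one point: {adjacent:.0%}")  # share of essays differing by at most one point
```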

The next step in Tate’s research is to study whether student writing improves after having an essay graded by ChatGPT. She’d like teachers to try using ChatGPT to score a first draft and then see if it encourages revisions, which are critical for improving writing. Tate thinks teachers could make it “almost like a game: how do I get my score up?” 

Of course, it’s unclear if grades alone, without concrete feedback or suggestions for improvement, will motivate students to make revisions. Students may be discouraged by a low score from ChatGPT and give up. Many students might ignore a machine grade and only want to deal with a human they know. Still, Tate says some students are too scared to show their writing to a teacher until it’s in decent shape, and seeing their score improve on ChatGPT might be just the kind of positive feedback they need. 

“We know that a lot of students aren’t doing any revision,” said Tate. “If we can get them to look at their paper again, that is already a win.”

That does give me hope, but I’m also worried that kids will just ask ChatGPT to write the whole essay for them in the first place.

This story about AI essay scoring was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

PROOF POINTS: When schools experimented with $10,000 pay hikes for teachers in hard-to-staff areas, the results were surprising
https://hechingerreport.org/proof-points-when-schools-experimented-with-10000-pay-hikes-for-teachers-in-hard-to-staff-areas-the-results-were-surprising/ | Mon, 08 Apr 2024

School leaders nationwide often complain about how hard it is to hire teachers and how teaching job vacancies have mushroomed. Fixing the problem is not easy because those shortages aren’t universal. Wealthy suburbs can have a surplus of qualified applicants for elementary schools at the same time that a remote, rural school cannot find anyone to teach high school physics. 

A study published online in April 2024 in the journal Educational Evaluation and Policy Analysis illustrates the inconsistencies of teacher shortages in Tennessee, where one district had a surplus of high school social studies teachers, while a neighboring district had severe shortages. Nearly every district struggled to find high school math teachers. 

Tennessee’s teacher shortages are worse in math, foreign languages and special education

A 2019–2020 survey of Tennessee school districts showed staffing challenges for each subject. Tech = technology; CTE = career and technical education; ESL = English as a second language. Source: Edwards et al (2024), “Teacher Shortages: A Framework for Understanding and Predicting Vacancies.” Educational Evaluation and Policy Analysis.

Economists have long argued that solutions should be targeted at specific shortages. Pay raises for all teachers, or subsidies to train future teachers, may be good ideas. But broad policies to promote the whole teaching profession may not alleviate shortages if teachers continue to gravitate toward popular specialties and geographic areas. 

High school math teacher shortages were widespread in Tennessee

Surpluses of high school social studies teachers were next door to severe shortages

Elementary school teacher shortages were problems in Memphis and Nashville, but not in Knoxville

Perceived staffing challenges from a 2019-20 survey of Tennessee school districts. Source: Edwards et al (2024), “Teacher Shortages: A Framework for Understanding and Predicting Vacancies.” Educational Evaluation and Policy Analysis.

Some school systems have been experimenting with targeted financial incentives. Separate groups of researchers studied what happened in two places – Hawaii and Dallas, Texas – when teachers were offered significant pay hikes, ranging from $6,000 to $18,000 a year, to take hard-to-fill jobs. In Hawaii, special education vacancies continued to grow, while the financial incentives to work with children with disabilities unintentionally aggravated shortages in general education classrooms. In Dallas, the incentives lured excellent teachers to high-poverty schools. Student performance subsequently skyrocketed so much that the schools no longer qualified for the bump in teacher pay. Teachers left and student test scores fell back down again. 

This doesn’t mean that targeted financial incentives are a bad or a failed idea. But the two studies show how the details of these pay hikes matter because there can be unintended consequences or obstacles. Some teaching specialties – such as special education – may have challenges that teacher pay hikes alone cannot solve. But these studies could help point policymakers toward better solutions. 

I learned about the Hawaii study in March 2024, when Roddy Theobald, a statistician at the American Institutes for Research (AIR), presented a working paper, “The Impact of a $10,000 Bonus on Special Education Teacher Shortages in Hawai‘i,” at the annual conference of the Center for Analysis of Longitudinal Data in Education Research. (The paper has not yet been peer-reviewed or published in an academic journal and could still be revised.)

In the fall of 2020, Hawaii began offering all of its special education teachers an extra $10,000 a year. If teachers took a job in an historically hard-to-staff school, they also received a bonus of up to $8,000, for a potential total pay raise of $18,000. Either way, it was a huge bump atop a $50,000 base salary.  

Theobald and his five co-authors at AIR and Boston University calculated that the pay hikes reduced the proportion of special education vacancies by a third. On the surface, that sounds like a success and other news outlets reported it that way. But special-ed vacancies actually rose over the study period, which coincided with the coronavirus pandemic, and ultimately ended up higher than before the pay hike. 

What was reduced by a third was the gap between special ed and general ed vacancies. Vacancies among both groups of teachers initially plummeted during 2020-21, even though only special ed teachers were offered the $10,000. (Perhaps the urgency of the pandemic inspired all teachers to stay in their jobs.) Afterwards, vacancies began to rise again, but special ed vacancies didn’t increase as fast as general ed vacancies. That’s a sign that special ed vacancies might have been even worse had there been no $10,000 bonus. 

As the researchers dug into the data, they discovered that this relative difference in vacancies was almost entirely driven by job switches at hard-to-staff schools. General education teachers were crossing the hallway and taking special education openings to make an extra $10,000. Theobald described it as “robbing Peter to pay Paul.”

These job switches were possible because, as it turns out,  many general education teachers initially trained to teach special education and held the necessary credentials. Some never even tried special ed teaching and decided to go into general education classrooms instead. But the pay bump was enough for some to reconsider special ed. 

Hawaii’s special education teacher vacancies initially fell after $10,000 pay hikes in 2020, but subsequently rose again

The dots represent the vacancy rates for two types of teachers. Source: Theobald et al, “The Impact of a $10,000 Bonus on Special Education Teacher Shortages in Hawai‘i,” CALDER Working Paper No. 290-0823

This study doesn’t explain why so many special education teachers left their jobs in 2021 and 2022 despite the pay incentives or why more new teachers didn’t want these higher paying jobs. In a December 2023 story in Mother Jones, special education teachers in Hawaii described difficult working conditions and how there were too few teaching assistants to help with all of their students’ special needs. Working with students with disabilities is a challenging job, and perhaps no amount of money can offset the emotional drain and burnout that so many special education teachers experience.

Dallas’s experience with pay hikes, by contrast, began as a textbook example of how targeted incentives ought to work. In 2016, the city’s school system designated four low-performing, high-poverty schools for a new Accelerating Campus Excellence (ACE) initiative. Teachers with high ratings could earn an extra $6,000 to $10,000 (depending upon their individual ratings) to work at these struggling elementary and middle schools. Existing teachers were screened to keep their jobs and only 20 percent of the staff passed the threshold and remained. (There were other reforms too, such as uniforms and a small increase in instructional time, but the teacher stipends were the main thrust and made up 85 percent of the ACE budget.)

Five researchers, including economists Eric Hanushek at Stanford University’s Hoover Institution and Steven Rivkin at the University of Illinois Chicago, calculated that test scores jumped immediately after the pay incentives kicked in while scores at other low-performing elementary and middle schools in Dallas barely budged. Student achievement at these previously lowest-performing schools came close to the district average for all of Dallas. Dallas launched a second wave of ACE schools in 2018 and again, the researchers saw similar improvements in student achievement. Results are in a working paper, “Attracting and Retaining Highly Effective Educators in Hard-to-Staff Schools.” I read a January 2024 version. 

The program turned out to be so successful at boosting student achievement that three of the four initial ACE schools no longer qualified for the stipends by 2019. Over 40 percent of the high-performing teachers left their ACE schools. Student achievement fell sharply, reversing most of the gains that had been made.

For students, it was a roller coaster ride. Amber Northern, head of research at the Thomas B. Fordham Institute, blamed adults for failing to “prepare for the accomplishment they’d hoped for.”

Still, it’s unclear what should have been done. Allowing these schools to continue the stipends would have eaten up millions of dollars that could have been used to help other low-performing schools. 

And even if there were enough money to give teacher stipends at every low-performing school, there’s not an infinite supply of highly effective teachers. Not all of them want to work at challenging, high poverty schools. Some prefer the easier conditions of a high-income magnet school. 

These were two good faith efforts that showed the limits of throwing money at specific types of teacher shortages. At best, they are a cautionary tale for policymakers as they move forward. 

This story about teacher pay was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Proof Points newsletter.

PROOF POINTS: Controversies within the science of reading
https://hechingerreport.org/proof-points-controversies-within-the-science-of-reading/ | Mon, 26 Feb 2024

Four meta-analyses conclude that it’s more effective to teach phonemic awareness with letters, not as an oral-only exercise. Credit: Allison Shelley for EDU

Educators around the country have embraced the “science of reading” in their classrooms, but that doesn’t mean there’s a truce in the reading wars. In fact, controversies are emerging about an important but less understood aspect of learning to read: phonemic awareness. 

That’s the technical name for showing children how to break down words into their component letter sounds and then fuse the sounds together. In a phonemic awareness lesson, a teacher might ask how many sounds are in the word cat.  The answer is three: “k,” “a,” and “t.” Then the class blends the sounds back into the familiar sounding word: from “kuh-aah-tuh” to “kat.” The 26 letters of the English alphabet produce 44 phonemes, which include unique sounds made from combinations of letters, such as “ch” and “oo.” 

Many schools have purchased scripted oral phonemic awareness lessons that do not include the visual display of letters. The oral lessons are popular because they are easy to teach and fun for students. And that’s the source of the current debate. Should kids in kindergarten or first grade be spending so much time on sounds without understanding how those sounds correspond to letters? 

A new meta-analysis confirms that the answer is no. In January 2024, five researchers from Texas A&M University published their findings online in the journal Scientific Studies of Reading. They found that struggling readers, ages 4 to 6, no longer benefited after 10.2 hours of auditory instruction in small group or tutoring sessions, but continued to make progress if visual displays of the letters were combined with the sounds. That means that instead of just asking students to repeat sounds, a teacher might hold up cards with the letters C, A and T printed on them as students isolate and blend the sounds.

Meta-analyses sweep up all the best research on a topic and use statistics to tell us where the preponderance of the evidence lies. This newest 2024 synthesis follows three previous meta-analyses on phonemic awareness in the past 25 years. While there are sometimes shortcomings in the underlying studies, the conclusions from all the phonemic meta-analyses appear to be pointing in the same direction. 

“If you teach phonemic awareness, students will learn phonemic awareness,” which isn’t the goal, said Tiffany Peltier, a learning scientist who consults on literacy training for teachers at NWEA, an assessment company. “If you teach blending and segmenting using letters, students are learning to read and spell.” 

Phonemic awareness has a complicated history. In the 1970s, researchers discovered that good readers also had a good sense of the sounds that constitute words. This sound awareness helps students map the written alphabet to the sounds, an important step in learning to read and write. Researchers proved that these auditory skills could be taught and early studies showed that they could be taught as a purely oral exercise without letters.

But science evolved. In 2000, the National Reading Panel outlined the five pillars of evidence-based reading instruction: phonemic awareness, phonics, fluency, vocabulary and comprehension. This has come to be known as the science of reading. By then, more studies on phonemic awareness had been conducted and oral lessons alone were not as successful. The reading panel’s meta-analysis of 52 studies showed that phonemic awareness instruction was almost twice as effective when letters were presented along with the sounds. 

Many schools ignored the reading panel’s recommendations and chose different approaches that didn’t systematically teach phonics or phonemic awareness. But as the science of reading grew in popularity in the past decade, phonemic awareness lessons also exploded. Teacher training programs in the science of reading emphasized the importance of phonemic awareness. Companies sold phonemic programs to schools and told teachers to teach it every day. Many of these lessons were auditory, including chants and songs without letters.

Researchers worried that educators were overemphasizing auditory training. A 2021 article, “They Say You Can Do Phonemic Awareness Instruction ‘In the Dark’, But Should You?” by nine prominent reading researchers criticized how phonemic awareness was being taught in schools. 

Twenty years after the reading panel’s report, a second meta-analysis came out in 2022 with even fresher studies but arrived at the same conclusion. Researchers from Baylor University analyzed over 130 studies and found twice the benefits for phonemic awareness when it was taught with letters. A third meta-analysis was presented at a poster session of the 2022 annual meeting of the Society for the Scientific Study of Reading.  It also found that instruction was more effective when sounds and letters were combined.

On the surface, adding letters to sounds might seem identical to teaching phonics. But some reading experts say phonemic awareness with letters still emphasizes the auditory skills of segmenting words into sounds and blending the sounds together. The visual display of the letter is almost like a subliminal teaching of phonics without explicitly saying, “This alphabetic symbol ‘a’ makes the sound ‘ah’.” Others explain that there isn’t a bright line between phonemic awareness and phonics and they can be taught in tandem.

The authors of the latest 2024 meta-analysis had hoped to give teachers more guidance on how much classroom time to invest on phonemic awareness. But unfortunately, the classroom studies they found didn’t keep track of the minutes. The researchers were left with only 16 high-quality studies, all of which were interventions with struggling students. These were small group or individual tutoring sessions on top of whatever phonemic awareness lessons children may also have been receiving in their regular classrooms, which was not documented. So it’s impossible to say from this meta-analysis exactly how much sound training students need. 

The lead author of the 2024 meta-analysis, Florina Erbeli, an education psychologist at Texas A&M, said that the 10.2 hours number in her paper isn’t a “magic number.” It’s just an average of the results of the 16 studies that met her criteria for being included in the meta-analysis. The right amount of phonemic awareness might be more or less, depending on the child. 

Erbeli said the bigger point for teachers to understand is that there are diminishing returns to auditory-only instruction and that students learn much more when auditory skills are combined with visible letters.

I corresponded with Heggerty, the market leader in phoneme awareness lessons, which says its programs are in 70 percent of U.S. school districts. The company acknowledged that the science of reading has evolved and that’s why it revised its phonemic awareness program in 2022 to incorporate letters and introduced a new program in 2023 to pair it with phonics. The company says it is working with outside researchers to keep improving the instructional materials it sells to schools. Because many schools cannot afford to buy a new instructional program, Heggerty says it also explains how teachers can modify older auditory lessons.

The company still recommends that teachers spend eight to 12 minutes a day on phonemic awareness through the end of first grade. This recommendation contrasts with the advice of many reading researchers who say the average kid doesn’t need this much. Many researchers say that phonemic awareness continues to develop automatically as the child’s reading skills improve without advanced auditory training. 

NWEA literacy consultant Peltier, whom I quoted earlier, suggests that phonemic awareness can be tapered off by the fall of first grade. More phonemic awareness isn’t necessarily harmful, but there’s only so much instructional time in the day. She thinks that precious minutes currently devoted to oral phonemic awareness could be better spent on phonics, building vocabulary and content knowledge through reading books aloud, classroom discussions and writing.

Another developer of a phonemic awareness program aimed at older, struggling readers is David Kilpatrick, professor emeritus at the State University of New York at Cortland. He told me that five minutes a day might be enough for the average student in a classroom, but some struggling students need a lot more. Kilpatrick disagrees with the conclusions of the meta-analyses because they lump different types of students together. He says severely dyslexic students need more auditory training. Extra time is needed for advanced auditory work that helps these students build long-term memories, he said, and the meta-analyses didn’t measure that outcome.

Another reading expert, Susan Brady, professor emerita at the University of Rhode Island, concurs that some of the more advanced manipulations can help some students. Moving a sound in and out of a word can heighten awareness of a consonant cluster, such as taking the “l” out of the word “plant” to get “pant,” and then inserting it back in again.* But she says this kind of sound subtraction should only be done with visible letters. Doing all the sound manipulations in your head is too taxing for young children, she said.

Brady’s concern is the misunderstanding that teachers need to teach all the phonemes before moving on to phonics. It’s not a precursor or a prerequisite to reading and writing, she says. Instead, sound training should be taught at the same time as new groups of letters are introduced. “The letters reinforce the phoneme awareness and the phoneme awareness reinforces the letters,” said Brady, speaking at a 2022 teacher training session. She said that researchers and teacher trainers need to help educators shift to integrating letters into their early reading instruction. “It’s going to take a while to penetrate the belief system that’s out there,” she said.

I once thought that the reading wars were about whether to teach phonics. But there are fierce debates even among those who support a phonics-heavy science of reading. I’ve come to understand that the research hasn’t yet answered all our questions about the best way to teach all the steps. Schools might be over-teaching phonemic awareness. And children with dyslexia might need more than other children. More importantly, the science of reading is the same as any other scientific inquiry. Every new answer may also raise new questions as we get closer to the truth. 

*Clarification: An earlier version of this story suggested a different example of removing the “r” sound from “first,” but “r” is not an independent phoneme in this word. So a teacher would be unlikely to ask a student to do this particular sound manipulation.

This story about phonemic awareness was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Proof Points newsletter.

The post PROOF POINTS: Controversies within the science of reading appeared first on The Hechinger Report.

PROOF POINTS: Tracking student data falls short in combating absenteeism at school https://hechingerreport.org/proof-points-tracking-student-data-falls-short-in-combating-absenteeism-at-school/ Mon, 12 Feb 2024 11:00:00 +0000

Chronic absenteeism has surged across the country since the pandemic, with more than one out of four students missing at least 18 days of school a year. That’s more than three lost weeks of instruction a year for more than 10 million school children. An even higher percentage of poor students, more than one out of three, are chronically absent. 

Nat Malkus, a senior fellow at the American Enterprise Institute, a conservative think tank, calls chronic absenteeism – not learning loss – “the greatest challenge for public schools.” At a Feb. 8, 2024 panel discussion, Malkus said, “It’s the primary problem because until we do something about that, academic recovery from the pandemic, which is significant, is a pipe dream.” 

The number of students who have missed at least 18 days or 10 percent of the school year remained stubbornly high after schools reopened. More than one out of three students in high poverty schools were chronically absent in 2022.

One district in the Southeast tried to tackle its post-pandemic surge in absenteeism with a computer dashboard that tracks student data and highlights which students are in trouble or heading toward trouble. Called an early warning system, tracking student data this way has become common at schools around the country.  (I’m not identifying the district because a researcher who studied its efforts to boost attendance agreed to keep it anonymous in exchange for sharing the outcomes with the public.) 

The district’s schools had re-opened in the fall of 2020 and were operating fully in person, but students could opt for remote learning upon request. Yet nearly half of the district’s students weren’t attending school regularly during the 2020-21 year, either in person or remotely. One out of six students had crossed the “chronically absent” threshold of 18 or more missed days. That doesn’t count quarantine days at home because the student contracted or was exposed to Covid. 

The early warning system color-coded each student for absences. Green designated an “on track” student who regularly came to school. Yellow highlighted an “at risk” student who had missed more than four percent of the school year. And red identified “off track” students who had not come to school 10 percent or more of the time. During the summer of 2021, school staff pored over the colored dots and came up with battle plans to help students return.
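For readers who want to see the thresholds concretely, here is a minimal sketch of the color-coding rule described above. It is illustrative only: the study keeps the district and its software anonymous, so the function name and inputs are hypothetical, and only the cutoffs (more than 4 percent of days missed for yellow, 10 percent or more for red) come from the reporting.

```python
def attendance_flag(days_missed: int, days_enrolled: int) -> str:
    """Illustrative sketch of the dashboard's color coding, not the district's code.

    'green'  = on track (regular attendance)
    'yellow' = at risk (missed more than 4% of the school year)
    'red'    = off track / chronically absent (missed 10% or more)
    """
    if days_enrolled <= 0:
        raise ValueError("days_enrolled must be positive")
    missed_share = days_missed / days_enrolled
    if missed_share >= 0.10:
        return "red"
    if missed_share > 0.04:
        return "yellow"
    return "green"

# Example: 18 missed days in a typical 180-day year hits the 10% threshold.
print(attendance_flag(18, 180))  # -> "red"
```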

A fellow at Harvard University’s Center for Education Policy Research studied what happened the following 2021-22 school year. The results, published online in the journal Educational Evaluation and Policy Analysis on Feb. 5, 2024, were woefully disappointing:  the attendance rates of low-income students didn’t improve at all. Low-income students with a track record of missing school continued to miss as much school the next year, despite efforts to help them return. 

The only students to improve their attendance rates were higher income students, whose families earned too much to qualify for the free or reduced price lunch program. The attendance of more advantaged students who had been flagged red for “off track” (chronically absent) improved by 1 to 2 percentage points. That’s good, but four out of five of the red “off track” students came from low-income families. Only 20 percent of the pool of chronically absent students had been helped … a bit.

The selling point for early warning systems is that they can help identify students before they’re derailed, when it’s easier to get back into the routine of going to school. But, distressingly, neither rich nor poor students who had been flagged yellow for being “at risk” saw an improvement in attendance.

Yusuf Canbolat, the Harvard fellow, explained to me that early warning systems only flag students. They don’t tell educators how to help students. Every child’s reason for not coming to school is unique. Some are bullied. Others have asthma and their parents are worried about their health. Still others have fallen so behind in their schoolwork that they cannot follow what’s going on in the classroom. 

Common approaches, such as calling parents and mailing letters, tend to be more effective with higher-income families, Canbolat explained to me. They are more likely to have the resources to follow through with counseling or tutoring, for example, and help their child return to school. 

Low-income families, by contrast, often have larger problems that require assistance schools cannot provide. Many low-income children lost a parent or a guardian to Covid and are still grieving. Many families in poverty need housing, food, employment, healthcare, transportation or even help with laundry. That often requires partnerships with community organizations and social service agencies. 

Canbolat said that school staff in this district tried to come up with solutions that were tailored to a child’s circumstances, but giving a family the name of a counseling center isn’t the same as making sure the family is getting the counseling it needs. And there were so many kids flagged for being at risk that the schools could not begin to address their needs at all. Instead, they focused on the most severe chronic absence cases, Canbolat said.

Hedy Chang, executive director of Attendance Works, a nonprofit that is working with schools to improve attendance, said that a case management approach to absenteeism isn’t practical when so many students aren’t coming to school. Many schools, she said, might have only one or two social workers focusing on attendance and their caseloads quickly become overloaded.  When nearly half of the students in a school have an attendance problem, system-wide approaches are needed, Chang said.

One systematic approach, she said, is to stop taking an adversarial tone with families — threatening parents with fines or going to court, or students with suspensions for truancy violations. “That doesn’t work,” Chang said. 

She recommends that schools create more ways for students to build relationships with adults and classmates at school so that they look forward to being there. That can range from after-school programs and sports to advisory periods and paying high schoolers to mentor elementary school students. 

“The most important thing is kids need to know that when they walk into school, there’s someone who cares about them,” said Chang.

Despite the disappointing results of using an early warning system to combat absenteeism, both researchers and experts say the dashboards should not be jettisoned. Chang explained that they still help schools understand the size and the scope of their attendance problem, see patterns and learn if their solutions are working. 

I was shocked to read in a recent School Pulse Panel survey conducted by the Department of Education in November 2023 that only 15 percent of school leaders said they were “extremely concerned” about student absences. In high-poverty neighborhoods, there was more concern, but still only 26 percent. Given that the number of students who are chronically absent from schools has almost doubled to 28 percent from around 15 percent before the pandemic, everyone should be very concerned. If we don’t find a solution soon, millions of children will be unable to get the education they need to live a productive life. And we will all pay the price.

This story about school early warning systems was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Proof Points newsletter.

The post PROOF POINTS: Tracking student data falls short in combating absenteeism at school appeared first on The Hechinger Report.

PROOF POINTS: How to get teachers to talk less and students more https://hechingerreport.org/proof-points-how-to-get-teachers-to-talk-less-and-students-more/ Mon, 15 Jan 2024 11:00:00 +0000

Example of the talk meter shown to Cuemath tutors at the end of the tutoring session. Source: Figure 2 of Demszky et al., “Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform.”

Silence may be golden, but when it comes to learning with a tutor, talking is pure gold. It’s audible proof that a student is paying attention and not drifting off, research suggests. More importantly, the more a student articulates his or her reasoning, the easier it is for a tutor to correct misunderstandings or praise a breakthrough. Those are the moments when learning happens.

One India-based tutoring company, Cuemath, trains its tutors to encourage students to talk more. Its tutors are in India, but many of its clients are American families with elementary school children. The tutoring takes place at home via online video, like a Zoom meeting with a whiteboard, where both tutor and student can work on math problems together. 

The company wanted to see if it could boost student participation, so it collaborated with researchers at Stanford University to develop a “talk meter,” a sort of Fitbit for the voice, for its tutoring site. Thanks to advances in artificial intelligence, the researchers could separate the audio of the tutors from that of the students and calculate the ratio of tutor-to-student speech.

In initial pilot tests, the talk meter was posted on the tutor’s video screen for the entire one-hour tutoring session, but tutors found that too distracting. The study was revised so that the meter pops up every 20 minutes or three times during the session. When the student is talking less than 25 percent of the time, the meter goes red, indicating that improvement is needed. When the student is talking more than half the time, the meter turns green. In between, it’s yellow. 

Example of the talk meter shown to tutors every 20 minutes during the tutoring session. Source: Figure 2 of Demszky et al., “Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform.”
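To make those cutoffs concrete, here is a minimal sketch of how a meter color could be computed once tutor and student talk time have been separated. It is an illustration under stated assumptions, not the Stanford team’s actual code: the function and its inputs are hypothetical, and only the thresholds (red below 25 percent student talk, green above 50 percent, yellow in between) come from the study description above.

```python
def talk_meter_color(student_seconds: float, tutor_seconds: float) -> str:
    """Illustrative sketch of the talk meter's color rule, not the study's code.

    The real system used AI to separate tutor and student audio and showed
    the meter every 20 minutes; here we simply apply the reported thresholds.
    """
    total = student_seconds + tutor_seconds
    if total == 0:
        return "red"  # no speech captured yet
    student_share = student_seconds / total
    if student_share < 0.25:
        return "red"      # improvement needed
    if student_share <= 0.50:
        return "yellow"
    return "green"        # student talking more than half the time

# Example: 6 minutes of student talk vs. 14 minutes of tutor talk -> 30% -> "yellow".
print(talk_meter_color(6 * 60, 14 * 60))
```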

More than 700 tutors and 1,200 of their students were randomly assigned to one of three groups: one where the tutors were shown the talk meter, another where both tutors and students were shown the talk meter, and a third “control” group which wasn’t shown the talk meter at all for comparison.

When just the tutors saw the talk meter, they tended to curtail their explanations and talk much less. But despite their efforts to prod their tutees to talk more, students increased their talking only by 7 percent. 

When students were also shown the talk meter, the dynamic changed. Students increased their talking by 18 percent. Introverts especially started speaking up, according to interviews with the tutors. 

The results show that teaching and learning are a two-way street. It’s not just about coaching teachers to be better at their craft. We also need to coach students to be better learners.

“It’s not all the teacher’s responsibility to change student behavior,” said Dorottya Demszky, an assistant professor in education data science at Stanford University and lead author of the study. “I think it’s genuinely, super transformative to think of the student as part of it as well.”

The study hasn’t yet been published in a peer-reviewed journal and is currently a draft paper, “Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform,” so it may still be revised. It is slated to be presented at the March 2024 annual conference of the Society of Learning Analytics in Kyoto, Japan. 

In analyzing the sound files, Demszky noticed that students tended to work on their practice problems with the tutor more silently in both the control and tutor-only talk meter groups. But students started to verbalize their steps aloud once they saw the talk meter. Students were filling more of the silences.

In interviews with the researchers, students said the meter made the tutoring session feel like a game.  One student said, “It’s like a competition. So if you talk more, it’s like, I think you’re better at it.” Another noted:  “When I see that it’s red, I get a little bit sad and then I keep on talking, then I see it yellow, and then I keep on talking more. Then I see it green and then I’m super happy.” 

Some students found the meter distracting.  “It can get annoying because sometimes when I’m trying to look at a question, it just appears, and then sometimes I can’t get rid of it,” one said.

Tutors had mixed reactions, too. For many, the talk meter was a helpful reminder not to be long-winded in their explanations and to ask more probing, open-ended questions. Some tutors said they felt pressured to reach a 50-50 ratio and that they were unnaturally holding back from speaking. One tutor pointed out that it’s not always desirable for a student to talk so much. When you’re introducing a new concept or the student is really lost and struggling, it may be better for the teacher to speak more. 

Surprisingly, kids didn’t just fill the air with silly talk to move the gauge. Demszky’s team analyzed the transcripts in a subset of the tutoring sessions and found that students were genuinely talking about their math work and expressing their reasoning. The use of math terms increased by 42 percent.

Unfortunately, there are several drawbacks to the study design. We don’t know whether students’ math achievement improved because of the talk meter. The students were learning different material in different grades and different countries, and there was no single standardized test that could be given to them all.

Another confounding factor is that students who saw the talk meter were also given extra information sessions and worksheets about the benefits of talking more. So we can’t tell from this experiment if the talk meter made the difference or if the information on the value of talking aloud would have been enough to get them to talk more.

Excerpts from transcribed tutoring sessions in which students are talking about the talk meter. Source: Table 4 of Demszky et al., “Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform.”

Demszky is working on developing a talk meter app that can be used in traditional classrooms to encourage more student participation. She hopes teachers will share talk meter results with their students. “I think you could involve the students a little more: ‘It seems like some of you weren’t participating. Or it seems like my questions were very closed ended? How can we work on this together?’”

But she said she’s treading carefully because she is aware that there can be unintended consequences with measurement apps. She wants to give feedback not only on how much students are talking but also on the quality of what they are talking about. And natural language processing still has trouble with English in foreign accents and background noise. Beyond the technological hurdles, there are psychological ones too.

 “Not everyone wants a Fitbit or a tool that gives them metrics and feedback,” Demszky acknowledges.

This story about student participation was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Proof Points newsletter.

The post PROOF POINTS: How to get teachers to talk less and students more appeared first on The Hechinger Report.

PROOF POINTS: Four lessons from post-pandemic tutoring research https://hechingerreport.org/proof-points-four-lessons-from-post-pandemic-tutoring-research/ Mon, 08 Jan 2024 11:00:00 +0000

Research points to intensive daily tutoring as one of the most effective ways to help academically struggling children catch up. There have been a hundred randomized control trials, but one of the most cited is of a tutoring program in Chicago high schools, where ninth and 10th graders learned an extra year or two of math from a daily dose of tutoring. That’s the kind of result that could offset pandemic learning losses, which have remained devastating and stubborn nearly four years after Covid first erupted, and it’s why the Biden Administration  has recommended that schools use their $190 billion in federal recovery funds on tutoring.

This tutoring evidence, however, was generated before the pandemic, and I was curious about what post-pandemic research says about how tutoring is going now that almost 40 percent of U.S. public schools say they’re offering high-dosage tutoring and more than one out of 10 students (11 percent) are receiving it this 2023-24 school year. Here are four lessons. 

  1. Why timing matters

Scheduling tutoring time during normal school hours and finding classroom space to conduct it are huge challenges for school leaders. The schedule is already packed with other classes and there aren’t enough empty classrooms. The easiest option is to tack tutoring on to the end of the school day as an after-school program.

New Mexico did just that and offered high school students free 45-minute online video sessions three times a week in the evenings and weekends. The tutors were from Saga Education, the same tutoring organization that had produced spectacular results in Chicago. Only about 500 students signed up out of more than 34,000 who were eligible, according to a June 2023 report from MDRC, an outside research organization. Researchers concluded that after-school tutoring wasn’t a “viable solution for making a sizable and lasting impact.” The state has since switched to scheduling tutoring during the school day.

Attendance is spotty too. Many after-school tutoring programs around the country report that even students who sign up don’t attend regularly.

  2. A hiring dilemma

The job of tutor is now the fastest-growing position in the K–12 sector, but 40 percent of schools say they’re struggling to hire tutors. That’s not surprising in a red-hot job market, where many companies say it’s tough to find employees. 

Researchers at MDRC in a December 2023 report wrote about different hiring strategies that schools around the country are using. I was flabbergasted to read that New Mexico was paying online tutors $50 an hour to tutor from their homes. Hourly rates of $20 to $30 are fairly common in my reporting. But at least the state was able to offer tutoring to students in remote, rural areas where it would otherwise be impossible to find qualified tutors.

Tutoring companies are a booming business. Schools are using them because they take away the burden of hiring, training and supervising tutors. However, Fulton County, Georgia, which includes Atlanta, found that a tutoring company’s curriculum might have nothing to do with what children are learning in their classrooms and that there’s too little communication between tutors and classroom teachers. Tutors were quitting at high rates and being replaced with new ones; students weren’t able to form the long-term relationships with their tutors that researchers say are critical to the success of tutoring.

When Fulton County schools hired tutors directly, they were more integrated into the school community. However, schools considered them to be “paraprofessionals” and felt there were more urgent duties than tutoring that they needed to do, from substitute teaching and covering lunch duty to assisting teachers. 

Chicago took the burden off schools and hired the tutors from the central office. But schools preferred tutors who were from the neighborhood because they could potentially become future teachers. The MDRC report described a sort of catch-22. Schools don’t have the capacity to hire and train tutors, but the tutors that are sent to them from outside vendors or a central office aren’t ideal either. 

Oakland, Calif., experienced many of the obstacles that schools are facing when trying to deliver tutoring at a large scale to thousands of students. The district attempted to give kindergarten through second grade students a half hour of reading tutoring a day. As described by a December 2023 case study of tutoring by researchers at the Center for Reinventing Public Education (CRPE), Oakland struggled with hiring, scheduling and real estate. It hired an outside tutoring organization to help, but it too had trouble recruiting tutors, who complained of low pay. Finding space was difficult. Some tutors had to work in the hallways with children. 

The good news is that students who worked with trained tutors made the same gains in reading as those who were given extra reading help by teachers. But the reading gains for students were inconsistent. Some students progressed less in reading than students typically do in a year without tutoring. Others gained almost an additional year’s worth of reading instruction – 88 percent more.

  3. The effectiveness of video tutoring

Bringing armies of tutors into school buildings is a logistical and security nightmare. Online tutoring solves that problem. Many vendors have been trying to mimic the model of successful high dosage tutoring by scheduling video conferencing sessions many times a week with the same well-trained tutor, who is using a good curriculum with step-by-step methods. But it remains a question whether students are as motivated to work as hard with video tutoring as they are in person. Everyone knows that 30 hours of Zoom instruction during school closures was a disaster. It’s unclear whether small, regular doses of video tutoring can be effective. 

In 2020 and 2021, there were two studies of online video tutoring. A randomized control trial in Italy produced good results, especially when the students received tutoring four times a week. The tutoring was less than half as potent when the sessions fell to twice a week, according to a paper published in September 2023. Another study in Chicago found zero results from video tutoring. But the tutors were unpaid volunteers and many students missed out on sessions. Both tutors and tutees often failed to show up.

The first randomized controlled trial of a virtual tutoring program for reading was conducted during the 2022-23 school year at a large charter school network in Texas. Kindergarten, first and second graders received 20 minutes of video tutoring four times a week, from September through May, with an early reading tutoring organization called OnYourMark. Despite the logistical challenges of setting up little children on computers with headphones, the tutored children ended the year with higher DIBELS scores, a measure of reading proficiency for young children, than students who didn’t receive the tutoring. One-to-one video tutoring sometimes produced double the reading gains as video tutoring in pairs, demonstrating a difference between online and in-person tutoring, where larger groups of two and three students can be very effective too. That study was published in October 2023. 

Video tutoring hasn’t always been a success. A program by Intervene K-12, a tutoring company, received high marks from reviewers at Johns Hopkins University, but outside evaluators didn’t find benefits when it was tested on students in Texas. In an unpublished study, the National Student Support Accelerator, a Stanford University organization that is promoting and studying tutoring, found no difference in year-end state test scores between students who received the tutoring and those who received other small-group support. Study results can depend greatly on whether the comparison control group is getting nothing or another extra-help alternative.

Matthew Kraft, a Brown University economist who studies tutoring, says there hasn’t been an ideal study that pits online video tutoring directly against in-person tutoring to measure the difference between the two. Existing studies, he said, show some “encouraging signs.” 

The most important thing for researchers to sort out is how many students a tutor can work with online at once. It’s unclear if groups of three or four, which can be effective in person, are as effective online. “The comments we’re getting from tutors are that it’s significantly different to tutor three students online than it is to tutor three students in person,” Kraft said.

In my observations of video tutoring, I have seen several students in groups of three angle their computers away from their faces. I’ve watched tutors call students’ names over and over again, trying to get their attention. To me, students appear far more focused and energetic in one-to-one video tutoring.

  4. How humans and machines could take turns

A major downside to every kind of tutoring, both in-person and online, is its cost. The tutoring that worked so well in Chicago can run $4,000 per student. It’s expensive because students are getting over a hundred hours of tutoring and schools need to pay the tutors’ hourly wages. Several researchers are studying how to lower the costs of tutoring by combining human tutoring with online practice work. 

In one pre-pandemic study that was described in a March 2023 research brief by the University of Chicago’s Education Lab, students worked in groups of four with an in-person tutor. The tutors worked closely with two students at a time while the other two students worked on practice problems independently on ALEKS, a widely used computerized tutoring system developed by academic researchers and owned by McGraw-Hill. Each day the students switched:  the ALEKS kids worked with the tutor and the tutored kids turned to ALEKS. The tutor sat with all four students together, monitoring the ALEKS kids to make sure they were doing their math on the computer.
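The day-to-day alternation is easy to picture as a simple rotation. The sketch below is illustrative only, assuming a group of four students split into two pairs that swap between the human tutor and the practice software each day; it is not the Education Lab’s scheduling code, and the student names are placeholders.

```python
# Illustrative rotation for one tutor and four students, as described above:
# two students work with the tutor while the other two practice on software
# (e.g., ALEKS), and the pairs swap each day.
students = ["Student 1", "Student 2", "Student 3", "Student 4"]
pairs = [students[:2], students[2:]]

for day in range(1, 5):  # four example days
    with_tutor = pairs[(day - 1) % 2]
    on_software = pairs[day % 2]
    print(f"Day {day}: tutor works with {with_tutor}; software practice for {on_software}")
```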

The math gains nearly matched what the researchers had found in a prior study of human tutoring alone, where tutors worked with only two students at a time and required twice as many tutors. The cost was $2,000 per student, much less than the usual $3,000-$4,000 per student price tag of the human tutoring program.

Researchers at the University of Chicago have been testing the same model with online video tutoring, instead of in-person, and said they are seeing “encouraging initial indications.” Currently, the research team is studying how many students one tutor can handle at a time, from four to as many as eight students, alternating between humans and ed tech, in order to find out if the sessions are still effective.

Researchers at Carnegie Mellon University conducted a similar study of swapping between human tutoring and practicing math on computers. Instead of ALEKS, this pilot study used Mathia, another computerized tutoring system developed by academic researchers and owned by Carnegie Learning. This was not a randomized control trial, but it did take place during the pandemic in 2020-21. Middle school students doubled the amount of math they learned compared to similar students who didn’t receive the tutoring, according to Ken Koedinger, a Carnegie Mellon professor who was part of the research team. 

“AI tutors work when students use them,” said Koedinger. “But if students aren’t using them, they obviously don’t work.” The human tutors are better at motivating the students to keep practicing, he said. The computer system gives each student personalized practice work, targeted to their needs, instant feedback and hints.

Technology can also guide the tutors. With one early reading program, called Chapter One, in-person tutors work with young elementary school children in the classroom. Chapter One’s website keeps track of every child’s progress. The tutor’s screen indicates which student to work with next and what skills that student needs to work on.  It also suggests phonics lessons and activities that the tutor can use during the session.  A two-year randomized control trial, published in December 2023, found that the tutored children – many of whom received short five-minute bursts of tutoring at a time – outperformed children who didn’t receive the tutoring. 

The next frontier in tutoring, of course, is generative AI, such as ChatGPT. Researchers are studying how students learn directly from Khan Academy’s Khanmigo, which gives step-by-step, personalized guidance, like a tutor, on how to solve problems. Other researchers are using this technology to help coach human tutors so that they can better respond to students’ misunderstandings and confusion. I’ll be looking out for these studies and will share the results with you.

This story about video tutoring was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Hechinger newsletter.

The post PROOF POINTS: Four lessons from post-pandemic tutoring research appeared first on The Hechinger Report.

PROOF POINTS: Schools keep buying online drop-in tutoring. The research doesn’t support it https://hechingerreport.org/proof-points-schools-keep-buying-online-drop-in-tutoring-the-research-doesnt-support-it/ Mon, 16 Oct 2023 10:00:00 +0000

Ever since schools reopened and resumed in-person instruction, districts have been trying to help students catch up from pandemic learning losses. The Biden Administration has urged schools to use tutoring. Many schools have purchased an online version that gives students 24/7 access to tutors. Typically, communication is through text chat, similar to communicating with customer service on a website. Students never see their tutors or hear their voices. 

Researchers estimate that billions have been spent on these online tutoring services, but so far, there’s no good evidence that they are helping many students catch up. And many students need extra help. According to the most recent test scores from spring 2023, 50 percent more students are below grade level than before the pandemic; even higher achieving students remain months behind where they should be.

Low uptake

The main problem is that on-demand tutoring relies on students to seek extra help. Very few do. Some school systems have reported usage rates below 2 percent. A 2022 study by researchers at Brown University of an effort to boost usage among 7,000 students at a California charter school network found that students who needed the most help were the least likely to try online tutoring and only a very small percentage of students used it regularly. Opt-in tutoring could “exacerbate inequalities rather than reduce them,” warned a  September 2023 research brief by Brown University’s Annenberg Center, Results for America, a nonprofit that promotes evidence-backed policies, the American Institutes for Research and NWEA, an assessment firm.

In January 2023, the independent research firm Mathematica released a more positive report on students’ math gains with an online tutoring service called UPchieve, which uses volunteers as tutors. It seemed to suggest that high school students could make extraordinary math progress from online homework help.

UPchieve is a foundation-funded nonprofit with a slightly different model. Instead of schools buying the tutoring service from a commercial vendor, UPchieve makes its tutors freely available to any student in grades eight to 12 living in a low-income zip code or attending a low-income high school. Behind the scenes, foundations cover the cost to deliver the tutoring, about $5 per student served. (Those foundations include the Bill & Melinda Gates and the Overdeck Family foundations, which are also among the many funders of The Hechinger Report.)

UPchieve posted findings from the study in large font on its website: “Using UPchieve 9 times caused student test scores to meaningfully increase” by “9 percentile rank points.” If true, that would be equivalent to doubling the amount of math that a typical high school student learns. That would mean that students learned an extra 14 weeks worth of math from just a few extra hours of instruction. Not even the most highly regarded and expensive tutoring programs using professional tutors who are following clear lesson plans achieve this.

The study garnered a lot of attention on social media and flattering media coverage “for disrupting learning loss in low-income kids.” But how real was this progress? 

Gift card incentives

After I read the study, which was also commissioned by the Gates foundation, I immediately saw that UPchieve’s excerpts were taken out of context. This was not a straightforward randomized controlled trial, comparing what happens to students who were offered this tutoring with students who were not. Instead, it was a trial of the power of cash incentives and email reminders. 

For the experiment, Mathematica researchers had recruited high schoolers who were already logging into the UPchieve tutoring service. These were no ordinary ninth and 10th graders. They were motivated to seek extra help, resourceful enough to find this tutoring website on their own (it was not promoted through their schools) and liked math enough to take extra tests to participate in the study. One group was given extra payments of $5 a week for doing at least 10 minutes of math tutoring on UPchieve, and sent weekly email reminders. The other group wasn’t. Students in both groups received $100 for participating in the study.

The gift cards increased usage by 1.6 hours or five to six more sessions over the course of 14 weeks. These incentivized students “met” with a tutor for a total of nine sessions on average; the other students averaged fewer than four sessions. (As an aside, it’s unusual that cash incentives would double usage. Slicing the results another way, only 22 percent of the students in the gift-card group used UPchieve more than 10 times compared with 14 percent in the other group. That’s more typical.) 

At the end of 14 weeks, students took the Renaissance Star math test, an assessment taken by millions of students across the nation. But the researchers did not report those test scores. That’s because they were unlucky in their random assignment of students. By chance, comparatively weaker math students kept getting assigned to receive cash incentives. It wasn’t an apples-to-apples comparison between the two groups, a problem that can happen in a small randomized controlled trial. To compensate, the researchers statistically adjusted the final math scores to account for differences in baseline math achievement. It’s those statistically adjusted scores that showed such huge math gains for the students who had received the cash incentives and used the tutoring service more.

However, the huge 9 percentile point improvement in math was not statistically significant. There were so few students in the study – 89 in total – that the results could have been a fluke. You’d need a much larger sample size to be confident.

A caution from the researcher 

When I interviewed one of the Mathematica researchers, he was cautious about UPchieve and on-demand tutoring in general.  “This is an approach to tutoring that has promise for improving students’ math knowledge for a specific subset of students:  those who are likely to proactively take up an on-demand tutoring service,” said Greg Chojnacki, a co-author of the UPchieve study. “The study really doesn’t speak to how promising this model is for students who may face additional barriers to taking up tutoring.”

Chojnacki has been studying different versions of tutoring and he says that this on-demand version might prove to be beneficial for the “kid who may be jumping up for extra help the first chance they get,” while other children might first need to “build a trusting relationship” with a tutor they can see and talk to before they engage in learning. With UPchieve and other on-demand models, students are assigned to a different tutor at each session and don’t get a chance to build a relationship. 

Chojnacki also walked back the numerical results in our interview. He told me not to “put too much stock” in the exact amount of math that students learned. He said he’s confident that self-motivated students who use the tutoring service more often learned more math, but it could be “anywhere above zero” and not nearly as high as 9 percentile points – an extra three and a half months worth of math instruction.

UPchieve defends “magical” results

UPchieve’s founder, Aly Murray, told me that the Mathematica study results initially surprised her, too. “I agree they almost seem magical,” she said by email. While acknowledging that a larger study is needed to confirm the results, she said she believes that online tutoring without audio and video can “lead to greater learning” than in-person tutoring “when done right.”

“I personally believe that tutoring is most effective when the student is choosing to be there and has an acute need that they want to address (two things that are both uniquely true of on-demand tutoring),” she wrote. “Students have told us how helpful it is to get timely feedback and support in the exact moment that they get confused (which is often late at night in their homes while working on their homework). So in general, I believe that on-demand tutoring is more impactful than traditional high-dosage tutoring models on a per tutoring session or per hour of tutoring basis. This could be part of why we were able to achieve such outsized results despite the low number of sessions.”

Murray acknowledged that low usage remains a problem. At UPchieve’s partner schools, only 5 percent of students logged in at least once during the 2022-23 year, she told me. At some schools, usage rates fell below 1 percent. Her goal is to increase usage rates at partner schools to 36 percent. (Any low-income student in grades eight to 12 can use the tutoring service at no cost and their schools don’t pay UPchieve for the tutoring either, but some “partner” schools pay UPchieve to promote and monitor usage.) 

The downside to homework help

Helping students who are stuck on a homework assignment is certainly nice for motivated kids who love school, but relying on homework questions is a poor way to catch up students who are the most behind, according to many tutoring experts. 

“I have a hard time believing that students know enough about what they don’t know,” said Susanna Loeb, a Stanford University economist who founded the National Student Support Accelerator, which aims to bring evidence-based tutoring to more students. 

For students who are behind grade level, homework questions often don’t address their gaps in basic math foundations. “Maybe underneath, they’re struggling with percentages, but they’re bringing an algebra question,” said Loeb. “If you just bring the work of the classroom to the tutor, it doesn’t help students very much.” 

Pre-pandemic research of once-a-week after-school homework help also produced disappointing results for struggling students. Effective tutoring starts with an assessment of students’ gaps, Loeb said, followed by consistent, structured lessons.

Schools struggle to offer tutors for all students

With so little evidence, why are schools buying on-demand online tutoring? Pittsburgh superintendent Wayne Walters said he was unable to arrange for in-person tutoring in all of his 54 schools and wanted to give each of his 19,000 students access to something. He signed a contract with Tutor.com for unlimited online text-chat tutoring in 2023-24. 

“I’m going forward with it because it’s available,” Walters said. “If I don’t have something to provide, or even offer, then that limits opportunity and access. If there’s no access, then I can’t even push the needle to address the most marginalized and the most vulnerable.”

Walters hopes to make on-demand tutoring “sexy” and appealing to high schoolers accustomed to texting. But online tutoring is not the same as spontaneous texting between friends. One-minute delays in tutors’ replies to questions can test students’ patience. 

On-demand tutoring can appear to be an economical option. Pittsburgh is able to offer this kind of tutoring, which includes college admissions test prep for high schoolers, to all 19,000 of its students for $600,000. Providing 400 students with a high-dosage tutoring program – the kind that researchers recommend – could cost $1.5 million. There are thousands of Pittsburgh students who are significantly behind grade level. It doesn’t seem fair to deliver high-quality in-person tutoring to only a lucky few.  

However, once you factor in actual usage, the economics of on-demand tutoring looks less impressive. In Fairfax County, Va., for example, only 1.6 percent of students used Tutor.com. If Pittsburgh doesn’t surpass that rate, then no more than 300 of its students will be served.
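A back-of-the-envelope calculation, using only the figures reported in this story, shows how much usage changes the economics. The numbers below are rough illustrations, not district accounting: Pittsburgh’s $600,000 contract and 19,000 students, Fairfax County’s 1.6 percent usage rate, and the $1.5 million-for-400-students estimate for high-dosage tutoring cited above.

```python
# Back-of-the-envelope math with the figures reported in this story (rough illustration).
contract_cost = 600_000       # Pittsburgh's on-demand tutoring contract
enrolled_students = 19_000    # students given access
usage_rate = 0.016            # Fairfax County's reported usage rate (1.6%)

expected_users = enrolled_students * usage_rate          # ~304 students
cost_per_enrolled = contract_cost / enrolled_students    # ~$32 per student with access
cost_per_actual_user = contract_cost / expected_users    # ~$1,974 per student who uses it

# For comparison, the high-dosage estimate cited above: $1.5 million for 400 students.
high_dosage_per_student = 1_500_000 / 400                # $3,750 per tutored student

print(f"{expected_users:.0f} expected users")
print(f"${cost_per_enrolled:,.0f} per enrolled student vs. ${cost_per_actual_user:,.0f} per actual user")
print(f"${high_dosage_per_student:,.0f} per student for high-dosage tutoring")
```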

There are no villains here. School leaders are trying to do the best they can and be fair to everyone. Hopes are raised when research suggests that online on-demand tutoring can work if they can succeed in marketing to students. But they should be skeptical of studies that promise easy solutions before investing precious resources. That money could be better spent on small-group tutoring that dozens of studies show is more effective for students.

This story about drop-in tutoring was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

The post PROOF POINTS: Schools keep buying online drop-in tutoring. The research doesn’t support it appeared first on The Hechinger Report.

PROOF POINTS: Lowering test anxiety in the classroom https://hechingerreport.org/proof-points-lowering-test-anxiety-in-the-classroom/ Mon, 25 Sep 2023 10:00:00 +0000

In education circles, it’s popular to rail against testing, especially timed exams. Tests are stressful and not the best way to measure knowledge, wrote Adam Grant, an organizational psychologist at the University of Pennsylvania’s Wharton School, in a Sept. 20, 2023, New York Times essay. “You wouldn’t want a surgeon who rushes through a craniectomy, or an accountant who dashes through your taxes.”

It’s tempting to agree. But there’s another side to the testing story, with a lot of evidence behind it. 

Cognitive scientists argue that testing improves learning. They call it “retrieval practice” or “test-enhanced learning.” In layman’s language, that means that the brain learns new information and skills by being forced to recall them periodically. Remembering consolidates information and helps the brain form long-term memories. Of course, testing is not the only way to accomplish this, but it’s easy and efficient in a classroom.

Several meta-analyses, which summarize the evidence from many studies, have found higher achievement when students take quizzes instead of, say, reviewing notes or rereading a book chapter. “There’s decades and decades of research showing that taking practice tests will actually improve your learning,” said David Shanks, a professor of psychology and deputy dean of the Faculty of Brain Sciences at University College London. 

Still, many students get overwhelmed during tests. Shanks and a team of four researchers wanted to find out whether quizzes exacerbate test anxiety.  The team collected 24 studies that measured students’ test anxiety and found that, on average, practice tests and quizzes not only improved academic achievement, but also ended up reducing test anxiety. Their meta-analysis was published in Educational Psychology Review in August 2023. 

Shanks says quizzes can be a “gentle” way to help students face challenges. 

“It’s not like being thrown into the deep end of a swimming pool,” said Shanks. “It’s like being put very gently into the shallow end. And then the next time a little bit deeper, and then a little bit deeper. And so the possibility of becoming properly afraid just never arises.”

Why test anxiety diminishes is unclear. It could be because students are learning to tolerate testing conditions through repeated exposure, as Shanks described. Or it could be because quizzes are helping students master the material and perform better on the final exam. We tend to be less anxious about things we’re good at. Unfortunately, the underlying studies didn’t collect the data that could resolve this academic debate.

Shanks doesn’t think competency alone reduces test anxiety. “We know that many high achieving students get very anxious,” he said. “So it can’t just be that your anxiety goes down as your performance goes up.” 

To minimize test anxiety, Shanks advises that practice tests be low stakes, either ungraded or ones that students can retake multiple times. He also suggests gamified quizzes to make tests more fun and entertaining. 

Some of this advice is controversial.  Many education experts argue against timed spelling tests or multiplication quizzes, but Shanks recommends both. “We would strongly speculate that there is both a learning benefit from those tests and a beneficial impact on anxiety,” he said. 

Shanks said a lot more research is needed. Many of the 24 existing studies were small experiments and of uneven quality, and measuring test anxiety through surveys is an inexact science. The underlying studies covered a range of school subjects, from math and science to foreign languages, and took place in both classrooms and laboratory settings, studying students as young as third grade and as old as college. Nearly half the studies took place in the United States with the remainder in the United Kingdom, Malaysia, Nigeria, Iran, Brazil, the Netherlands, China, Singapore and Pakistan. 

Shanks cautioned that this meta-analysis should not be seen as a “definitive” pronouncement that tests reduce anxiety, but rather as a summary of early research in a field that is still in its “infancy.” One big issue is that the studies measured average test anxiety for students. There may be a small minority of students who are particularly sensitive to test anxiety and who may be harmed by practice tests. These differences could be the subject of future research. 

Another issue is the tradeoff between boosting achievement and reducing anxiety. The harder the practice test, the more beneficial it is for learning. But the lower the stakes for a quiz, the better it is for reducing anxiety. 

Shanks dreams of finding a Goldilocks “sweet spot” where “the stakes are not so high that the test begins to provoke anxiety, but the stakes are just high enough to get the full benefit of the testing effect. We’re miles away from having firm answers to subtle questions like that.” 

This story about test anxiety was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

The post PROOF POINTS: Lowering test anxiety in the classroom appeared first on The Hechinger Report.
