Great Expectations: Studying Expectancy’s Effects

Robert Rosenthal bumped into the complex and far-reaching power of expectations in 1956, when he thought he had ruined his doctoral research. For his Ph.D. thesis in clinical psychology at UCLA, he aimed to find out whether people project disappointment with themselves onto others. However, the subjects randomly selected for a disappointing experience on a test showed a significant difference from other subjects even before the test. Rosenthal realized he must have made a mistake. “I made them different before the experiment,” he said. “That was spooky.”

Because he knew which group would fail when he first met with them, he had contaminated them with his expectations without knowing how. This finding elevated his research to another level. “My thesis committee thought I was onto something new: unconscious experimenter bias.”

He could find only one similar finding in published studies. In 1934, A.S. Barr found that different researchers obtained different results from the same experiment.

Rosenthal, now distinguished professor of psychology at the University of California at Riverside, later did a study to examine interactions between researchers and subjects. Reviewing their behavior on film, he found obvious differences in tone and body language that could affect experimental outcomes. Although experimenters were then thought to be as neutral as thermometers taking a temperature, he said, they were not. But his colleagues were annoyed. “They thought I was impugning human studies in scientific research,” said Rosenthal. “So I talked to scientists who ran rats. They know that problem goes on with humans. That’s why they run rats. But how do they know it doesn’t happen with rats?”

He experimented with rats learning to run through mazes. He told one group of experimenters that their rats were bred to be “dull,” while the other group was told they had exceptionally bright rats, although the rats were randomly chosen for each group. Nevertheless, rats arbitrarily designated as smarter tended to find their way through mazes faster and more precisely, and other scientists replicated those results. Rosenthal attributed differences in maze proficiency to the ways experimenters handled the rats. “‘Bright rats’ were handled more warmly, like pets,” he said.

Rosenthal did the research while he was director of the clinical psychology program at the University of North Dakota, but was unable to publish it until he became a lecturer on clinical psychology at Harvard. He was soon asked by Lenore Jacobson, a San Francisco elementary school principal who saw the study, if he would be interested in testing expectation effects on her students. Within two weeks he was on a plane with a plan.

In the study, teachers each received a list of students in their class who were expected to “bloom” academically within the next few months, supposedly based on the results of a test. During that year, reasoning scores of “bloomers” tended to improve more than those of other students, while verbal scores did not.

Mexican-Americans—the school minority—showed the most improvement when marked as bloomers, though it was less for boys who were more identifiably Mexican. Also, students in lower track classes gained less than students in average classes, who gained the most.

Stereotyping and stratifying students on tracks have increasingly been identified as sources of debilitating expectations, says Rhona Weinstein, University of California—Berkeley psychologist and author of Achieving College Dreams.

Elementary school students, especially in earlier grades and new to the teacher, were most responsive to newly elevated expectations, Rosenthal and Jacobson found. “A teacher who has known a student for a while is more likely to roll their eyes” when higher expectations are suggested, says Rosenthal.

On Aug. 8, 1967, the New York Times ran a story at the bottom of the front page about Rosenthal and Jacobson’s study and the book they wrote about it, Pygmalion in the Classroom. Stories at the top of the page were about police killings of African Americans in Detroit riots and the slow pace of school integration in the South.

Yet despite the visibility of prejudice and inequality, Rosenthal’s study drew ire from both teachers’ organizations and education scholars. “Albert Shanker,” later president of the American Federation of Teachers, “wrote a diatribe condemning the study. He said it hurt teacher credibility,” Rosenthal recalled. “But I thought the study showed teachers’ importance.”

Meanwhile, scholars disparaged the study’s strategy. In 1971, Stanford professors Janet Elashoff and Richard Snow published Pygmalion Reconsidered, an entire book dedicated to critiquing the Pygmalion study, with responses from Rosenthal and Don Rubin, a Harvard statistician. Their targets ranged from measures of student improvement to teacher involvement, as they noted that some teachers hardly looked at their list of “bloomers” and even threw it out.

However, by 1978, 345 studies of expectancy effects had been conducted in numerous situations. Rosenthal and Rubin evaluated the studies in a meta-analysis that confirmed the phenomenon.

As efforts to replicate the Pygmalion study also multiplied, a 1984 meta-analysis of 18 similar studies, by Stephen Raudenbush, gave the original study credence. He found that the expectancy effect was strongest when teachers received the false information about “blooming” within two weeks of meeting a student.

Meanwhile, Rosenthal became interested in the nonverbal cues that could be responsible for the expectancy effect. He identified a teacher’s “warmth” as a key factor. But efforts to determine which combination of gestures registered with students as warmth failed.

“Counting how many times teachers smiled, nodded and leaned forward,” he said, did not account for a teacher’s effect. Rather, what worked was having a group of people watch films of teachers’ interactions and rate them from nasty to warm on a scale of one to nine. Those ratings corresponded with a teacher’s effect.

Three other factors also signaled teachers’ expectations, thus apparently affecting student achievement. One factor was that teachers who expected more of students taught more material. For instance, Rosenthal recalled encountering a teacher who had not been told the comprehension limits of students with Down Syndrome and taught them substantially more than “experts” expected.

Also, teachers who expect more wait longer for a student to respond to a question. And when a student errs in answering a question, high expectation teachers take more time to explain the answer to them.

Rosenthal’s investigation also led him and others to scrutinize nonverbal dynamics in a variety of relationships, discerning ways that expectancy effects suffuse human interactions. For instance, in doctors’ interactions with patients, a concerned tone is less likely to precede malpractice lawsuits, Rosenthal and his former student, Nalini Ambady, concluded from their study.

Also, Rosenthal, with his former student, Peter Blanck, and Santa Clara County Municipal Court Judge LaDoris Cordell, found that a judge’s voice, when giving jury directions, often indicates what the jury’s decision will be. It is the “music” in a communication, not the words, that predicts the jury’s inclination.

Now, some 500 studies of expectancy effects have been conducted in numerous contexts, and meta-analyses confirm patterns in the data, says Rosenthal. When managers believe workers to be highly productive, they often are, Rosenthal and Jacobson noted in Pygmalion, citing studies. What drivers expect of other drivers on the road tends to be actualized, a 1964 study by R.E. Shore found. Expectations of nice or nasty behavior by others tend to be validated.

In Pygmalion, Rosenthal and Jacobson also cited research indicating people tend to be comfortable with what is expected, even if unpleasant, as predictability may be reassuring. Perhaps evolution favors the ability to predict, they said.

Because of the potential for experimenter bias, Rosenthal, now 83, has little faith in individual studies, he says. He has devoted much of his career, at Harvard and then at UCR, to investigating the implications of statistics in psychology. He was co-chair of the Task Force on Statistical Inference of the American Psychological Association.

What has been missing from scientific literature in the half century since the initial Pygmalion study, says Rosenthal, is the research “gold standard”: a randomized controlled study evaluating the benefits of a planned intervention based on knowledge of the expectation effect. In such a study, the experimental condition is compared to a condition that “looks, feels, and smells” like the experimental condition but is not, says Rosenthal, like a placebo pill that looks like the new drug but lacks active ingredients.

What has delayed such experiments, he says, is the challenge of “bottling” expectation in a replicable way. But half a world away, Christine Rubie-Davies, education professor at the University of Auckland, enlisted Rosenthal’s assistance in designing such an experiment, a three-year study.

What motivated her was her experience as deputy principal of an elementary school. “I noticed that in some classes all students did way better than in other classes,” she said. “The teachers in those classes expected all children to do well—and they did.”

She also watched Maori children from lower socioeconomic groups thrive when more was expected of them and then go on to higher education. Rubie-Davies was the first in her own family to go to college, rising to her own expectations, she said.

However, she noted, “New Zealand has the highest within class ability grouping rate of all OECD countries (Organization for Economic Cooperation and Development). Interestingly, we also have the highest disparity between our highest and lowest achievers. Ability grouping is entrenched in our system to such an extent that many teachers find it impossible to conceive of how you could teach without using ability groups. Yet the research evidence shows that ability grouping has very little effect on student learning, but large negative effects on student self-esteem and life chances.”

Whether or not she succeeded in “bottling” high expectations and warmth, Rubie-Davies’ results resembled those of the original Pygmalion study. In several schools, teachers randomly assigned to a control group took standard teacher development courses. Meanwhile, teachers randomly assigned to the experimental group learned practices identified by Rosenthal and Jacobson as typical of high expectation teachers.

The challenge of infusing a teacher’s behavior with “warmth” was met by teaching the teachers strategies for improving “class climate” that had been demonstrated in previous research. However, her biggest challenge was breaking the ability grouping habit.

“I had to offer teachers all kinds of ideas for how they could teach without using ability grouping,” said Rubie-Davies. “They then brainstormed and planned together and came up with some excellent ideas. Some moved right away from ability grouping; some found it much harder to dispense with them altogether. Most did make at least some changes.”

By the end of the year, math scores improved substantially, but reading scores did not. This outcome resembled that of the original Pygmalion study, in which reasoning scores improved significantly, but verbal scores did not. Perhaps reading is more affected by parental involvement at home, psychologist Rhona Weinstein has suggested.

The new high expectation practices were not all in place until the last part of the school year in Rubie-Davies’ study. In the second year of the three-year study, trained teachers will train other teachers, and more of school culture will be affected.

However, says Rosenthal, “I’d be happier if the study was replicated in other countries and continents. One study is never enough. More replicating will be necessary before we can talk about successfully bottling the phenomenon.”


Jessica Cohen is a freelance writer based in Pennsylvania. She most recently reported on health issues related to fracking for Utne Reader.