What can we learn from the Stanford Prison Experiment? Nothing, besides how not to do research. Brian Dunning did an excellent job of pointing out some of the study’s flaws. Here are a few of them:
- First, the issue of selection bias… In this case, Zimbardo advertised to students to participate in an experiment about “prison life”. Clearly, a large segment of the general population would be repulsed by such a concept, and you’ve got to have questions about anyone attracted to that idea. Thus, all applicants to the Stanford Prison Experiment were preselected for comfort with the idea of “prison life”.
- Most of the Stanford guards did not exhibit any cruel or unusual behavior, often being friendly and doing favors for the prisoners. The most notorious guard, nicknamed John Wayne, explained that he was simply trying to emulate Strother Martin’s character from Cool Hand Luke. Other analysts have found it difficult to support Zimbardo’s conclusions, since the allegedly poisonous environment did not affect most participants, and the most notorious participant explained that his motivation came from a completely different source.
- Zimbardo himself was also criticized for actively participating in the experiment as one of the characters. He was the prison superintendent. Although he may have restrained himself from having any influence on the experiment, the fact that he put himself in the position of ultimate active authority over the guards’ behavior calls this into question. Many designers of such experiments would summarily throw out such a study based on this alone.
- Some researchers have also questioned why Zimbardo neglected the effect of individual personalities, instead generally attributing all behavior to the prison environment. How did John Wayne’s behavior as a guard compare to his behavior outside the experiment? Was he generally a friendly guy, or might he already have been a royal jerk? We don’t know, so there was insufficient data to conclude that his behavior was changed by the experiment.
John Mark, who was a guard, said this (emphasis mine):
I didn’t think it was ever meant to go the full two weeks. I think Zimbardo wanted to create a dramatic crescendo, and then end it as quickly as possible. I felt that throughout the experiment, he knew what he wanted and then tried to shape the experiment—by how it was constructed, and how it played out—to fit the conclusion that he had already worked out. He wanted to be able to say that college students, people from middle-class backgrounds—people will turn on each other just because they’re given a role and given power.
Based on my experience, and what I saw and what I felt, I think that was a real stretch. I don’t think the actual events match up with the bold headline. I never did, and I haven’t changed my opinion.
It should also be pointed out that the ‘experiment’ wasn’t even designed to provide evidence of anything (unlike Milgram’s experiments). It amounted to putting people in an artificial environment, intervening in various ways to support Zimbardo’s opinions, and watching what happened. There weren’t different conditions to compare.
This isn’t the only time Zimbardo was an active participant in a study. In 1969, Zimbardo left abandoned cars in two neighborhoods. Without saying anything else about it, it should already be clear to you that we will learn nothing from this ‘experiment,’ since we will only have 2 data points (and the locations clearly were not randomly selected). In the Palo Alto location, nothing happened for a week. So, Zimbardo decided to take a sledgehammer to the car. Other people soon joined in. From this we were supposed to learn something about human nature.
This is the worst kind of research, and yet Zimbardo has been extremely influential. What’s going on here? Do we really love his message so much (that ordinary people will do evil things) that we are blind to his methods?
The message Zimbardo was trying to send wasn’t that good people do evil things, but that people doing evil things aren’t necessarily or even usually evil. The question he was asking was NOT “Can we get sane people to do horrible things?” a la Milgram, but “Given evil behavior in evil situations, what can we infer about people’s base personalities?” and the answer turned out to be nothing at all.
Noting all the horrific behavior that occurs during genocides and wars and all that stuff, Zimbardo set out to find whether this behavior was caused by people being evil.
The selection bias might have been unfortunate, but I don’t know who you’re supposed to advertise to, other than people interested in prison life (and free money) when you’re running a study about prisons. The subjects were screened to be especially mentally healthy. In other words, the crazy mean behavior by the guards had nothing to do with secret desires to be cruel. It was just the role they thought they were supposed to play.
“Most of the Stanford guards did not exhibit any cruel or unusual behavior, often being friendly and doing favors for the prisoners,” isn’t the least bit true, unless “occasionally allowing the prisoners to use the bathroom” counts as a favor. I randomly selected a page number from the (exceedingly long) story in The Lucifer Effect and picked the worst thing on it. From page 112:
“Eventually the guards triumph again; forcing their way into both cells and hauling big bad boy 5704 back into solitary. This time they are taking no chances. They tie him up hands and feet, using their cord taken off the cell doors, before dumping him into the Hole.”
Not as well written as I remember, but…
It’s hard (impossible?) to use non-narrative prose to convey experiences. But the book is full of that and much worse. I was horrified the entire time by what I was reading, and I came away with a terrifying sense of what was going on there. Two of the prisoners went nuts and had to be removed from the experiment even before Zimbardo called it off prematurely. The prisoners staged multiple rebellions, which the guards quashed. They frequently complained to Superintendent Zimbardo but never asked to leave. Zimbardo notes how he got unintentionally caught up in the experiment himself: how he “offered a deal” to one of the prisoners to become a snitch, rather than suggesting he leave the experiment, and how he snapped at someone who asked him what the independent variable of the study was. Since The Lucifer Effect is based on three decades of research after the Stanford Prison Experiment, I doubt he did these things on purpose to make his book more compelling.
“This isn’t the only time Zimbardo was an active participant in a study. In 1969, Zimbardo left abandoned cars in two neighborhoods. [...] and the locations clearly were not randomly selected.” Of course the locations weren’t randomly selected. They were selected to be a rich area and a poor area to see if this made any difference. It’s also hardly fair to call two week-long periods, one with zero activity and the other with immediate and continuous theft, two individual data points, because they only used two cars. From https://www.criminology.fsu.edu/crimtheory/zimbardo.htm:
“Of his experiments to test this theory, one is fairly well known and one is widely known. We will first look at the lesser-known vandalism experiment. In 1969, Zimbardo placed one 1959 Oldsmobile auto on a street across from the Bronx campus of New York University (a ghetto area), and one on a street in Palo Alto, California near the Stanford University campus (a rather affluent area). ‘The license plates of both cars were removed and the hoods opened to provide the necessary releaser signals (Zimbardo, 1969).’ Within three days, the car in the Bronx was completely stripped, the result of 23 separate incidents of vandalism. The car in Palo Alto sat unmolested for over a week. Zimbardo and two of his graduate students decided to provide an example by using a sledgehammer to bash the car. They found that after they had taken the first blow, it was extremely difficult to stop. Observers, who were shouting encouragement, finally joined in the vandalism until the car was completely wrecked.”
Nobody knew the car was part of a study. Is this really zero evidence? Is the conclusion that people will join perceived peers in uncivilized activity so shocking that it needed rigorous science to establish it? How many cars would he have needed for this to be a valid study?
“It should also be pointed out that the ‘experiment’ wasn’t even designed to provide evidence of anything” is an unfair claim based on a misunderstanding of Zimbardo’s point. He wasn’t trying to prove that people can be tricked into doing bad things, as this work had already been mostly done by his former high school classmate Stanley Milgram. He selected mentally healthy individuals and, yes, deliberately put them in situations where they were sure to act horrendously (though indeed far worse than he actually expected). From this he concluded that a person doing evil may yet be entirely mentally healthy. Is this not reasonable?
And what about Guard “John Wayne” Hellman? Apparently he was “merely” imitating the guards from Cool Hand Luke. The idea was that this is what he learned that guards were supposed to act like, so that’s how he acted. Is that not reasonable?
I’ve probably seemed bizarrely defensive of Zimbardo in this comment. But it’s very wrong to call his work “the worst kind of research.”
I appreciate these comments, because I probably was a little hard on Zimbardo. However, it’s hard for me to see any difference between his ‘experiment’ and any reality TV show. In fact, I think it was essentially the first reality TV show. People were selected and placed in artificial (interesting) conditions. They were videotaped. The ‘producer’ (Zimbardo) was involved and made sure things were interesting. The videos were edited into an entertaining, and at times shocking, documentary. I’m sure we can learn things from The Real World too, but as a scientist I’m bothered by the sensationalism and lack of rigor.
As for the cars, it’s fine to design the study to have some cars in affluent areas and some in poor neighborhoods. Then you would do stratified random selection of locations. But with just 1 car in a non-randomly selected poor neighborhood, and just 1 car in an affluent neighborhood, the study is both not generalizable and underpowered. Further, his direct participation in it is just bizarre (but entertaining).
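To make “stratified random selection of locations” concrete, here is a minimal Python sketch. Everything in it is hypothetical: the site names, the 20 candidate streets per stratum, the 5 cars per stratum, and the function name `stratified_sample` are assumptions chosen for illustration, not details from Zimbardo’s study.

```python
import random
from collections import defaultdict


def stratified_sample(sites, per_stratum, seed=None):
    """Randomly pick `per_stratum` sites from each stratum.

    `sites` is a list of (site_name, stratum) pairs. Returns a dict
    mapping each stratum to the list of site names drawn from it.
    """
    rng = random.Random(seed)

    # Group the candidate sites by stratum (e.g. "affluent" vs "poor").
    by_stratum = defaultdict(list)
    for name, stratum in sites:
        by_stratum[stratum].append(name)

    chosen = {}
    for stratum, names in by_stratum.items():
        if per_stratum > len(names):
            raise ValueError(f"not enough candidate sites in stratum {stratum!r}")
        # Sample without replacement inside the stratum.
        chosen[stratum] = rng.sample(names, per_stratum)
    return chosen


# Hypothetical sampling frame: 20 candidate streets per stratum.
sites = [(f"affluent-{i}", "affluent") for i in range(20)]
sites += [(f"poor-{i}", "poor") for i in range(20)]

# Place 5 cars per stratum instead of 1 hand-picked location each.
selection = stratified_sample(sites, per_stratum=5, seed=0)
for stratum, names in selection.items():
    print(stratum, names)
```

With several randomly chosen sites per stratum you can at least estimate how much outcomes vary within each kind of neighborhood, which a single hand-picked car per neighborhood cannot tell you.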
In fact, I think it was essentially the first reality TV show.
It was close! Many of the same criticisms apply.