Replication Crisis: A Defense of Psychology

The replication crisis has engulfed different branches of science, and the social sciences have been hit the hardest.

Psychology, and social psychology in particular, has been hit by a spate of scandals and failures to replicate.

Whole theories that had been widely accepted as proven are being questioned and disputed (Priming, Ego Depletion).
Even classical studies that have been part of psychology curricula for decades are under heavy attack (Stanford Experiment, Milgram Experiment).

The blowback has been hard, and understandably so.
Trust in the social sciences has eroded.

Keep Calm & Stay Rational

But if behavioral economics taught us anything, it is that reactions and corrections tend to turn into over-reactions and over-corrections.

This article argues that some authors, commentators, and even scientists might be taking it too far.
I will make the point that the failure of some studies to replicate does not (necessarily) call general psychological theories into question and, often, does not even refute the principles behind the study that failed to replicate.

Replication Crisis Knee-Jerk Reaction

The replication crisis is one of the hottest topics at scientists’ tables. And understandably so.

Most good scientists avoid overgeneralizing. But as the news seeped from academia into the world of non-academics, the reactions have not been equally discerning.

ScienceOfPeople.com, a website on applied psychology claiming to run its own science lab (and which puts “science” in its own URL) writes that (bold font is mine):

Much of what you know about psychology may be a lie.

Other popular bloggers who dabble in science and psychology, like Nat Eliason, jumped on the bandwagon (bold is mine):

It wouldn’t be crazy to assume that any new claim in psychology is more likely to be false than true

And even more reputable publications like The Independent didn’t hold back.
If anything, they went even further when they dropped the term “psycho-babble” (bold is mine and so is the triple question mark):

Psychology has long been the butt of jokes (???) about its deep insight into the human mind (…) and now a study has revealed that much of its published research really is psycho-babble.

Is “much of what you know about psychology” a lie, is much of psychological research “psycho-babble”, and is research “more likely to be false than true”?

No, no and no.

I argue that the knee-jerk reaction, as is often the case with humans, has gone too far.

We are now witnessing an overreaction, where everyone is calling everything into question.
And as in any witch hunt, the bigger the prey, the more glory (and power) for the hunter.

That many articles that purport to “debunk pop-psychology myths” are going viral (mine included) is the proof that we are in the middle of a “psychology sh*t storm” fueled not by actual scientists, but by amateurs.

The Dark & Basest Human Drives Are Fanning The Crisis

Typical human biases and some of the darkest human drives are also stoking the flames higher.

“Frenemy relationships”, envy among researchers and the search for glory in the current “hunt to debunk” have ushered in what Leonard Martin called a “McCarthy era for psychology”.

Scientists whose life’s work has been called into question have sometimes reacted not like cool-minded scientists, but like the ego-driven creatures we often are (again: psychology at play).
And that hasn’t helped either.

But as scientists, we all have a moral duty to rise above our lowest and darkest drives.
And we all have a moral duty to seek the truth.

And to seek the truth, we cannot limit ourselves to destroying and “debunking”.
In this replication crisis, we must throw out the bathwater, but we must keep the baby.

This article will seek to bring some clarity on what’s bathwater, and what’s the baby we must keep.
And it will call for a more balanced approach to this “house cleaning” that is the replication crisis.

How It All Happened

Replications have been failing since replications first started, so how we date the beginning of the current replication crisis is, at least partially, a subjective task.

Wikipedia says the term “replication crisis” was coined in the early 2010s as the world became more and more aware of the high rate of replication failures.

Slate magazine says the replication crisis started in 2011, when a paper by Bem “proved” that ESP is real (i.e., that people can predict the future).
Bem’s paper used widely accepted research methods, so either ESP is real and people can predict the future or… our accepted minimum thresholds for proper research were not high enough.

And scientists, being the skeptical bunch they are, increasingly opted for the latter.

That was the beginning of the replication crisis, and acute observers could already see the dark clouds gathering over science and psychology.

Daniel Kahneman, author of “Thinking, Fast and Slow”, wrote in 2012 that he could see “a train wreck looming”.
And he was right (ESP be damned).

Five years later, Kahneman would admit that he had made a major mistake in placing too much faith in low-powered studies (i.e., studies with too few participants to reliably detect the effects they look for).
And he was only one of many to have been fooled.
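
To get a feel for what “low-powered” means in practice, here is a rough simulation sketch in Python. It is an illustration only, not something taken from Kahneman or from any of the studies he cited: the function name `estimate_power`, the assumed true effect of 0.4 standard deviations and the group sizes are all made-up assumptions, and significance is judged with a simple z-test approximation.

```python
import math
import random
import statistics


def estimate_power(n_per_group, true_effect=0.4, n_studies=1000, seed=1):
    """Estimate how often a two-group study detects an effect that is real.

    Assumptions (illustrative only): scores are normally distributed with
    SD 1, the true difference between the groups is `true_effect`, and a
    result counts as significant when a two-sided z-test approximation
    passes the usual 5% threshold.
    """
    rng = random.Random(seed)
    z_critical = 1.96  # two-sided 5% threshold for a z-test
    detections = 0
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.mean(treated) - statistics.mean(control)
        std_err = math.sqrt(
            statistics.variance(control) / n_per_group
            + statistics.variance(treated) / n_per_group
        )
        if abs(diff / std_err) > z_critical:
            detections += 1
    return detections / n_studies


if __name__ == "__main__":
    for n in (20, 200):
        print(f"{n} participants per group: power ~ {estimate_power(n):.2f}")
```

Under these assumptions, the study with 20 participants per group should miss the (real) effect most of the time, while the one with 200 per group should catch it almost always.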

In 2015 the results of the Reproducibility Project were published, and they were unflattering, to say the least.
More than half of the 100 studies, all published in prominent psychology journals, failed to replicate.

At the time of writing (May 2019), the reproducibility crisis is still underway.

Replication Crisis in Psychology: The Causes

Many explanations have been put forward for why we are dealing with endemic levels of replication failures.

Some of them are specific to psychology, while others apply to the sciences at large:

  • Outright fraudulent research (i.e., doctored results, made-up numbers, etc.)
  • Questionable research practices (grey-area practices that are not outright illegal)
    • Selective reporting or partial publication of data (reporting only some of the study conditions or data)
    • Optional stopping (choosing when to stop data collection, often based on the statistical significance of tests)
    • p-value rounding (rounding p-values down to 0.05 to suggest statistical significance)
    • Manipulation of outliers (either removing outliers or leaving them in a dataset to manufacture statistical significance)
  • Publication biases 
    • Publishing only studies with significant or interesting results 
    • Confirmation bias (once a theory has been accepted as true, publications and researchers seek to confirm the theory to get published) 
  • Skewed incentives for researchers and publishers
    • Researchers have to publish
    • Researchers have more success and financial rewards with “new”, “exciting” and “self-help friendly” research
    • Journals can have an incentive to publish “new” and “exciting” theories
    • Few replication studies mean little risk for scientists who cheat or “embellish”
  • Researchers’ agendas (i.e., researchers might really believe in their theory and then “find a way” to make it happen)
  • Human errors and biases
    • “Priming” effects on researchers who expect certain results
    • Poor grasp of math and statistics by psychologists
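
Of the practices listed above, optional stopping is probably the easiest one to demonstrate. The sketch below is a toy simulation under stated assumptions, not a model of any real study: there is no true effect at all, the data are normal with a known SD of 1, and the “researcher” either tests once at the end or peeks after every batch of 10 participants and stops as soon as the result looks significant. The function name and all parameters are hypothetical.

```python
import math
import random


def false_positive_rate(peek, max_n=100, batch=10, n_studies=2000, seed=7):
    """Share of null-effect studies that end up "significant".

    peek=False: a single z-test once all `max_n` participants are in.
    peek=True:  a z-test after every batch, stopping at the first
                significant result (optional stopping).
    """
    rng = random.Random(seed)
    z_critical = 1.96  # two-sided 5% threshold
    false_positives = 0
    for _study in range(n_studies):
        total = 0.0
        n = 0
        while n < max_n:
            for _participant in range(batch):
                total += rng.gauss(0.0, 1.0)  # the true mean is zero
            n += batch
            z = (total / n) * math.sqrt(n)  # z-score of the running mean
            is_last_look = n >= max_n
            if (peek or is_last_look) and abs(z) > z_critical:
                false_positives += 1
                break
    return false_positives / n_studies


if __name__ == "__main__":
    print("test once at the end:", false_positive_rate(peek=False))
    print("peek after each batch:", false_positive_rate(peek=True))
```

With a single test at the end, the false positive rate should sit near the nominal 5%. With ten peeks at the data, it should come out several times higher, even though nothing was fabricated.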

They all contribute, and probably there are even more.
We can address some of these issues, and with some, we are already doing so.

But first, I want to review the studies and theories that we should defend from “over-zealous debunkers”.

Theories & Studies Worth Defending

In this section, I will analyze and fact-check significant studies that have come under undeservedly heavy attacks.

By “undeservedly heavy attack” I mean that the studies, albeit flawed, don’t deserve to be branded as “debunked” and thrown away.
Many studies that failed to replicate indeed still offer valid insights and, in almost all of the cases, they don’t cancel the validity of the general psychological concept.

#1. Power Posing

You know this one, don’t you?

If you have been reading any self-help literature or if you are part of any self-help circle you must know this one. 

Amy Cuddy’s TED talk amassed a massive 16 million views on YouTube alone.
And yet, I’m pretty sure that it’s only a tiny fraction of those viewers who are aware of the massive issues with “power posing”.

Power Posing Replication Crisis

The original power posing study had only 42 participants, and its findings on hormonal changes failed to replicate.

The arguments against power posing are so convincing that Dana Carney, co-author of the original paper, abandoned the idea publicly in 2016.
In her open letter, she declared very directly that she does “not believe that ‘power pose’ effects are real”.
And, I quote her:

The evidence against the existence of power poses is undeniable

Amy Cuddy clung to her original idea (I suppose 16 million views can lead to a major dose of cognitive dissonance) and after more research, she published a new paper confirming the effects of power poses on people’s own feelings (and not anymore on hormones).

Oh, P.S.:
“power posing” underwent a major rebranding under Cuddy’s “fight on” strategy.
Now it’s called “postural feedback”, so make sure you follow the correct branding guidelines :).

Defense: Feelings of Confidence Are Real

Although it’s not true that “power posing” induces any hormonal changes, it’s true that people do feel more powerful and confident.

And it’s also true that the way you stand will change people’s perception of you, which will, in turn, most likely change your interactions and, again, the way you feel about yourself.

#2. Stanford Prison Experiment

The Stanford Prison Experiment (SPE) is a huge classic of social psychology.

And like many other classics, it’s been under heavy attack recently.
Philip Zimbardo, the author of the original experiment and of “The Lucifer Effect”, has been a staunch defender of his baby (and by baby I mean SPE).
But the staunch defense only further galvanized the attackers.

Discussing SPE, Brian Resnick from Vox asks Zimbardo:

So does that not invalidate the conclusion?

From then on the interview resembled a sparring session of attack and counter-attack rather than a discussion on human nature (or the scientific merits of the experiment).

Another article, whose title, “The Lifespan of a Lie”, tells you all you need to know about the author’s agenda, says:

I hope there does come a point (…) where Zimbardo’s narrative dies

Really?
Hoping for Zimbardo’s narrative to die rather than the truth to come out?

This is what I’m talking about when I say that the replication crisis is fueled by power dynamics and personal agendas rather than simply a matter of proper science.

Stanford Experiment Replication Crisis

The “Stanford Experiment” is not an experiment in the way that scientists use the word “experiment”.
It’s not replicable and Zimbardo himself has always said so, calling it instead “a demonstration”.

But still, I recognize the value of the criticism. 

There are good reasons to caution anyone reading, studying or researching SPE.
Especially after it grew so big that the media and Hollywood made a mockery out of the psychology behind it.

Defense: Environments and Social Roles Do Shape Us

However, in spite of all the very valid criticism and limitations regarding its scientific nature, the “Stanford Experiment” still tells us lots of deep (and uncomfortable) truths about human nature.

The realization that normal people can act evil and that social roles and extreme environments can heavily affect our behaviors are socio-psychological truths.
There is a whole body of research on how groups and social roles impact the individual, and much of it is built around Social Identity Theory.

Denying that society and social roles can impact our behavior for the worse, as many critics seem to do, is a failure to understand human nature.

As Zimbardo said:

SPE serves as a cautionary tale of what might happen to any of us if we underestimate the extent to which the power of social roles and external pressures can influence our actions.

I agree with that.
And that’s true no matter how you stand in relation to the “Stanford Experiment”.

#3. Milgram Experiment

Riding on the iconoclastic high wave sweeping through all classical studies, “Milgram’s Experiment” couldn’t have possibly come out of it unscathed.

Milgram Experiment Replication Crisis

Gina Perry, an Australian psychologist, has written a whole book on the supposed “lies” behind the “Milgram Experiment”.

I haven’t read the book yet but her interview here and her paper here are pretty clear.
In short, she says that only half of the subjects believed the experiment was real and of those who believed it was real, 66% disobeyed the researcher.

Defense: Authority Does Exert Powerful Influence

At this point, I cannot judge how reliable and how good the data behind Milgram’s experiment really are.

But a recent study in Poland yielded results similar to the original experiment, and this meta-analysis showed a high level of obedience pretty much everywhere.

Oh, and if we want to look beyond data and labs, we can just take a look at history and reality.
And they both show quite clearly that authority can lead people to do some pretty evil stuff (or did you really think it was just the evil Germans?).

#4. Grit

Angela Duckworth is the researcher behind the construct of “grit”, which is a mixture of tenacity and passion.
She claimed that passion and perseverance are the mix that more than anything else predicted success.

She also coined a simple “success” equation:

Talent x Effort = Skill 

Skill x Effort = Achievement

The concept was a smash hit in the self-help literature.
Possibly because it told people exactly what they wanted to hear: follow your passion and as long as you put in the effort you’ll be successful.

Pop-psychology self-help gurus jumped on it and helped propel her book “Grit” into a best-seller and her TED talk to millions of views.

Note: Angela Duckworth said she was unhappy with the title that was given to her TED Talk, and yet… She did very little to make the content of the talk factually accurate.
As a matter of fact, she further oversold her (inflated) findings.

Grit Replication Crisis

Two studies (with 213 students and 498 students) did not find “grit” to correlate highly with results (but conscientiousness did).

And a meta-study from 88 samples shows only a modest correlation between grit and performance.

Duckworth also seems to have been overselling Grit both in her book and in her TED talk. 
Even in her original research, the difference between the grittiest cadets and cadets overall was very small (98% of the grittiest cadets made it through, versus 95% of all cadets).
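
The same two percentages can be framed in very different ways, which is how a small difference ends up sounding big. The snippet below is plain arithmetic on the 98% and 95% figures quoted above; the function name is hypothetical and nothing here comes from Duckworth’s own analysis.

```python
def compare_rates(gritty_pass=0.98, overall_pass=0.95):
    """Return two framings of the same gap between completion rates.

    absolute_gain: extra share of cadets who make it through.
    relative_dropout_reduction: how much smaller the dropout rate is,
    as a fraction of the overall dropout rate.
    """
    absolute_gain = gritty_pass - overall_pass
    gritty_dropout = 1 - gritty_pass
    overall_dropout = 1 - overall_pass
    relative_dropout_reduction = 1 - gritty_dropout / overall_dropout
    return absolute_gain, relative_dropout_reduction


if __name__ == "__main__":
    gain, reduction = compare_rates()
    print(f"absolute gain in completion: {gain:.0%}")
    print(f"relative reduction in dropout: {reduction:.0%}")
```

Three extra cadets in a hundred making it through is the modest framing. “60% fewer dropouts” is the headline framing. Both describe exactly the same numbers.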

Check out my pop-psychology article if you want to read more, but the point I’m making here is that, in spite of the obvious limitations, not everything about “grit” as in “passion and perseverance” must be thrown away.

Defense: Perseverance & Passion Are Crucial in Life

Angela Duckworth sought to measure both perseverance and passion.

And yes, she has embellished and oversold her findings. And yes, she probably made up this new construct that is a bit too similar to previous psychological traits.

But anyone arguing that perseverance -as in the ability to stick to your work when things get difficult- and passion -as in the amount of pleasure you derive from your work- are not important in life is lying.

Perseverance over the long run is what helps transform big dreams into realities.
And working on what we care about is what makes our lives more meaningful and, well… Happier.

So do stick with your projects as long as it makes sense. Don’t give up at the first difficulties and do pursue what you like and what stimulates you.

#5. The Marshmallow Test

The central message of the “Marshmallow Test” goes like this: “children who were able to resist the temptation of eating a marshmallow now in exchange for two marshmallows later went on to become more successful in life”.

That seemed to prove that control over urges and delay of gratification were key to success. 
And that you could test that early on with a simple test.

The original authors of the study, of course, went on to write a popular book named, you’d never guess, “The Marshmallow Test”.
And the video of the experiment amassed millions of views.

Marshmallow Test Replication Crisis

Tyler Watts revisited the study and came out with a 50% smaller correlation between delayed gratification and “success” at 15 years old.

And what was the correlation when accounting for intelligence and family background?
Almost nothing.
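
How can a correlation shrink to almost nothing once you account for background factors? The toy simulation below shows the mechanism in its most extreme form. It is not Watts’s data or method: it simply assumes a hidden “background” variable that drives both delay time and later outcomes, with no direct link between the two, and all names and numbers are made up for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Remove the part of ys that a straight line on xs can explain."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


def confounded_correlation(n=5000, seed=3):
    """Return (raw, adjusted) correlations between delay and outcome."""
    rng = random.Random(seed)
    background = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: both variables depend on background, not on each other.
    delay = [b + rng.gauss(0, 1) for b in background]
    outcome = [b + rng.gauss(0, 1) for b in background]
    raw = pearson(delay, outcome)
    adjusted = pearson(
        residuals(delay, background), residuals(outcome, background)
    )
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = confounded_correlation()
    print(f"raw correlation: {raw:.2f}")
    print(f"after accounting for background: {adjusted:.2f}")
```

In this made-up world the raw correlation looks respectable, and it vanishes once the shared background is accounted for. Real data are messier, but this is the kind of pattern the revisited study points to.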

He writes:

Associations between delay time and measures of behavioral outcomes at age 15 were much smaller and rarely statistically significant.

There you go.

Defense: Delaying Gratification Is Crucial to Success

The “Marshmallow Test” is obviously a severely limited study. For example, who says that all kids liked marshmallows in equal measure? 

And, between us nutrition nuts, who said that kids were better off with double the amount of poisonous lumps of sugar? 🙂

But jokes aside, would anyone deny that the ability to strategically delay gratification can be a crucial aspect of success?

Maybe you know the tale of “The Ant and the Grasshopper”.
And if you don’t, you can probably agree that what often makes the difference between turning dreams into reality and never achieving anything lies in the ability to work hard.

And, often, working hard includes and entails the delay of gratification.
Which is exactly the point of the “Marshmallow Test”.

#6. Fixed & Growth Mindset

Carol Dweck’s research (and book) says that children who believe their abilities are liable to improve through efforts and failures learn more and improve much faster than children who believe that their traits are fixed and can’t be changed.

Dweck’s original paper suggested that a growth mindset can be taught with appropriate feedback and direction from teachers.

Growth Mindset Replication Crisis

More than facing a “replication crisis”, Carol Dweck’s findings are simply difficult to replicate.
A proper growth mindset study entails teaching the mindset and then measuring the results over time, which is neither easy nor quick.

But Li and Bates’ effort to replicate Dweck’s initial results fell largely flat.
Carol Dweck replied that the experiments are difficult to replicate and must be done well.
To which Nick Brown quipped that “if the experiments are so difficult to replicate, why does Dweck think they can be easily replicated by school teachers”. 

He has a point.

A meta-analysis measuring goal-setting, work towards the goals and goal achievement was kinder to Dweck’s growth mindset research, albeit with some caveats.
It says:

The present meta-analysis suggests that mindsets matter.
One important conclusion from the present meta-analysis is that the associations of implicit theories with self-regulation are not straightforward and that perhaps the literature would be better served by asking when and how implicit theories are consequential for self-regulation rather than asking if incremental theories are generally beneficial.

However, most papers on growth mindset have been written or co-authored by Carol Dweck herself, so one must wonder about the validity of a meta-analysis where a good chunk of the papers is from the same author who’s been criticized for embellishing her results.

Defense: Mindsets Are Important Psychological Traits With Major Implications

There are quite a few serious issues with Dweck’s research and I am grateful to all the people who are pointing them out.
More research is also needed here.

Yet it’s inconceivable to me that the difference between believing you can improve and believing you can’t improve won’t have a large impact on your life.

And that impact shows not only in “success” or “school grades” (who cares about school grades anyway?), but also in your mental health and in the way you approach life.

I personally could see in myself how a fixed mindset held me back many times in life.
And I can see how nudging myself towards more of a growth mindset has helped me out in countless ways.

Sure, maybe you can’t easily teach a “growth mindset” and then see a huge jump in school grades. 
That wouldn’t be realistic anyway, and I suspect that’s part of the issue: the impact is not so obvious and easy to measure.
But developing a growth mindset will improve your life. And significantly so.

#7. Priming

Priming in a nutshell is this: if I subconsciously make you think about something, I will activate other related thoughts that will (subconsciously) change your behavior.

John Bargh is the Yale psychologist who first “proved” subconscious priming.
The original experiment has been cited thousands of times and spawned dozens of similar studies which all seemed to prove that, indeed, priming works.

Priming Replication Crisis

Priming failed to replicate.
Bargh vehemently defended his research but, eventually, the tide seemed to heavily turn against priming.

Defense: Priming Might Not Be As Powerful, But It’s Real

Here is the funny thing: as priming failed to replicate in the study’s subjects, it actually worked… in the researchers.

Indeed, when Doyen told the researchers that the subjects had been primed to walk more slowly, the researchers recorded slower walking times than the automated measurement devices did.

Doyen’s study is indeed called: “It’s All in the Mind, but Whose Mind?”

It’s ironic that the paper that started the “assault” on priming actually confirms that priming works.
Just in a different way.

Probably priming does not change behavior as powerfully and unconsciously as we thought, but it exists.
And it works.

And the psychological phenomenon that underpins priming, the “elevating” of related concepts, is also real.
Says Colin Camerer:

Priming has turned out to be the least replicable phenomenon.
It’s a shame because the underlying concept—that thinking about one thing elevates associations to related things—is undoubtedly true.

The Replication Crisis is An Opportunity

The replication crisis in psychology -and in all other sciences as well- is a wonderful opportunity for science.

Says Howard Kurtzman, executive director for science at the American Psychological Association:

The outcomes point to the need for reforms in research, review and publication practices.

Indeed.

I am happy we are going through this crisis.
It is showing us all, beyond any reasonable doubt, that we have been operating below our potential and below science’s potential.
And that’s always a sign that we can advance towards better systems and processes.

This replication crisis does not sap faith in science but restores it.
It means that we were able to catch issues and mistakes, and we now have the opportunity to fix those mistakes.

This can be our chance to be reborn stronger and better than ever.
Whether we take this opportunity is up to each one of us.

Incentives Might Need to Be Addressed

Darrell Huff showed it long ago in his “How to Lie With Statistics”: lying with statistics is easy.

If we put incentives on researchers to find correlations, come up with new theories, deliver TED talks and publish groundbreaking papers, then we are inviting them to embellish those statistics.
And researchers will find a way to do just that.

And the way our current research systems are structured, scientists are incentivized to bend statistics and embellish their results.

Says Brian Nosek, executive director of the Center for Open Science:

We can nudge the incentives driving our behavior so that researchers are rewarded for more transparent and reproducible research

Exactly.

Of course, there are other issues we need to address as well, but addressing researchers’ and journals’ incentives is where I would start.

Other Areas of Intervention

Here are a few more ideas to improve science:

  • Pre-registration

Pre-registration means that before a researcher begins his study, he publishes (i.e., registers) online what his goal is, what data he will collect and what type of analysis he will be running.

That way, he won’t be able to “tweak” the numbers and run different analyses on the data to come up with “something more meaningful”.
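
Why does it matter how many analyses get run? Because every extra analysis is another lottery ticket for a fluke. The back-of-the-envelope sketch below assumes there is no real effect and that the analyses are independent of each other (a simplification, since analyses of the same dataset usually overlap); the function name is hypothetical.

```python
def chance_of_false_positive(n_analyses, alpha=0.05):
    """Probability that at least one analysis comes out "significant"
    by luck alone, when no real effect exists.

    Assumption (illustrative): the analyses are independent, each with
    a false positive rate of `alpha`.
    """
    return 1 - (1 - alpha) ** n_analyses


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        chance = chance_of_false_positive(k)
        print(f"{k:>2} analyses: {chance:.0%} chance of at least one fluke")
```

Under these assumptions, a researcher who tries ten different analyses has roughly a 40% chance of finding at least one “significant” result in pure noise. Committing to a single analysis in advance keeps that chance at the advertised 5%.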

  • Institutionalize replications (ie.: papers’ auditing)

Replication is unsexy.
The biggest spoils in research always go to the ones that do something original and the biggest windfall of them all goes to those who champion a new theory, a new concept or a new marketable technology.

Yet replication is exactly what can nip fake pop-psychology claims and self-help myths in the bud, before they fester and spread for years.

An institutionalized and independent replication center will also serve as an incentive not to mess with the data.
It’s like an independent watchdog for proper science.

We’re Already Improving

Some research sponsors are already improving their policies to stress the importance of reproducibility and more journals are now requiring pre-registration.

More scientists are issuing their mea culpas, and psychologists know that’s a precondition for positive change.

Transparency is now all the rage and data sharing for cross-checking is more expected than ever.

Replication research is getting more press, more attention and, as well, more funds. 
And that’s good.

The replication crisis is already improving science.
We must be grateful for it, and at the same time we must keep on pushing.

SUMMARY

The replication crisis in psychology and social psychology has hit us hard.

But this is not a crisis of psychology -or science, for that matter-.
The main foundations of psychology and social psychology are still all there, and always will be.

If you want to get along with people, persuade and influence people, and if you want to upgrade yourself, then you need to understand social psychology and individuals’ psychology.

This article made the point that even the studies that have been hit hardest by the replication crisis remain hugely important milestones in our understanding of the human mind.

But this is not to say that there are no issues. 
There are issues in the researchers’ incentives and in the publication system.
And properly addressing those issues will make science and psychology better and stronger than they’ve ever been.

It’s already happening, and that’s great news.
