Articles that highlight the problems/limitations of clinical trials in general (** = key article)

Placebo effects are weak: regression to the mean is the main reason ineffective treatments appear to work. Colquhoun, D. (blog) 2015. link to blog

Placebo Myths debunked. Novella, S. (blog) 2017.
In this article, Novella explains that the placebo effect isn't one thing, but rather a collection of spurious effects that can affect trial outcomes. These include: a) regression to the mean (people often enrol in studies at their worst, so there is a good likelihood of some spontaneous improvement during the trial); b) bias in perceiving and reporting subjective symptoms where improvement is expected; and c) indirect effects, such as increased compliance with other treatment recommendations. The article further points out that the placebo effect is unlikely to be "real", because in virtually all cases no accompanying improvement is seen on objectively measurable outcomes. The only likely exception is certain psychological complaints, such as anxiety, where the promise of possible improvement might have a direct beneficial effect on the person's state of mind. link to blog link to S4ME discussion of article

** Bias due to lack of patient blinding in clinical trials. A systematic review of trials randomizing patients to blind and nonblind sub-studies. Hróbjartsson A, Emanuelsson F, Skou Thomsen AS, Hilden J, Brorson S. International Journal of Epidemiology. 2014 May 30;43(4):1272-83. link to article
This study asks the question: "How much of an effect does blinding have on the outcomes of a clinical trial?" The researchers selected all clinical trials they could find that comprised two phases: one where participants were fully aware of what treatment they were getting, and another where they were fully blinded. The authors found that participants' own ratings of their health/symptoms were powerfully affected by their knowledge of their treatment allocation.
Treatment effects were much bigger when participants knew which treatment group they were in. However, objective measures, and ratings of health/symptoms made by a (blinded) observer, were not affected in this way. This study shows that when participants have a strong expectation that a treatment will work, their perceptions of its effectiveness are massively inflated; consequently, trials that cannot be blinded (e.g., psychotherapy trials) should avoid outcomes that rely entirely on patients' own self-reports.

Empirical evidence of study design biases in randomized trials: systematic review of meta-epidemiological studies. Page MJ, Higgins JP, Clayton G, Sterne JA, Hróbjartsson A, Savović J. PLoS ONE. 2016 Jul 11;11(7):e0159267. link to article

Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. Wood L, Egger M, Gluud LL, et al. BMJ. 2008 Mar 13;336(7644):601-5. link to article

Articles that focus specifically on trials of psychotherapy or other behavioural interventions (** = key article)

The pervasive problem with placebos in psychology: why active control groups are not sufficient to rule out placebo effects. Boot WR, Simons DJ, Stothart C, Stutts C. Perspectives on Psychological Science. 2013 Jul;8(4):445-54. link to article
When evaluating any sort of behavioural treatment (e.g. CBT, graded exercise), the researcher has to decide what sort of attention the comparison group will receive. This article argues that it's not enough simply to offer an "active" comparison treatment, such as relaxation therapy or medical care. You also have to make sure that people receiving the comparison treatment have the same expectation of improvement as those receiving the treatment of interest. Failure to do this means the study is fundamentally flawed and no meaningful conclusions can be drawn from it.
The article offers some suggestions for dealing with this problem, including alternative designs and ways of measuring participants' expectations.

A systematic review of trial registration and selective outcome reporting in psychotherapy randomized controlled trials. Bradley HA, Rucklidge JJ, Mulder RT. Acta Psychiatrica Scandinavica. 2017 Jan 1;135(1):65-77. link to article
This article reviewed a large cohort of high-profile randomised trials of psychotherapy interventions. The majority of these trials were registered (that is, the researchers had published a protocol stating ahead of time what they would measure). However, only five percent reported their outcomes as specified - that is, without any selective reporting favouring the most positive outcomes. The article concludes that we cannot currently have confidence in the conclusions of even the highest-quality psychotherapy trials.

** Why ineffective psychotherapies appear to work: A taxonomy of causes of spurious therapeutic effectiveness. Lilienfeld SO, Ritschel LA, Lynn SJ, Cautin RL, Latzman RD. Perspectives on Psychological Science. 2014 Jul;9(4):355-87. link to article
This article discusses the psychological factors that can give clinicians and others the false impression that a therapy is actually helping the client. These factors include our tendency to see meaningful causal relations even when they are not there, our tendency to focus on evidence that confirms, rather than challenges, our beliefs (confirmation bias), and our inclination to overestimate how much effect our own actions have on outcomes (illusion of control). The article then lists 26 different phenomena that can lead therapists to overestimate the effectiveness of a treatment, either in their own practice or in a formal clinical trial.
Examples include: the tendency for clients to report improvements that align with the therapist's expectations and to underplay those that do not; the tendency for psychotherapy to change perceptions, so that people report fewer symptoms and less distress even though the reality is unchanged ('response shift bias'); regression to the mean (people often present for therapy or a formal trial at their worst, so on average there is likely to be some improvement); and effort justification.

The researcher's own therapy allegiances: A 'wild card' in comparisons of treatment efficacy. Luborsky L, Diguer L, Seligman DA, et al. Clinical Psychology: Science and Practice. 1999;6(1):95-106. link to abstract
Trials designed to measure the effectiveness of psychotherapy are often conducted by researchers with a strong vested interest in one particular treatment. This meta-analysis looks at whether these allegiances have any effect on outcomes. The researchers selected a series of psychotherapy studies that compared more than one treatment, and created a measure of "researcher allegiance" for each (based on factors such as the researchers' previous publications). They found that this measure of researcher allegiance accounted for over two-thirds of the variance in trial outcomes. In other words, one of the strongest factors affecting the outcomes of psychotherapy trials is what the researchers themselves believe.

Articles that are critical of current scientific methods more generally (not just clinical trials)

A vast graveyard of undead theories: Publication bias and psychological science's aversion to the null. Ferguson CJ, Heene M. Perspectives on Psychological Science. 2012 Nov;7(6):555-61. link to article

Why science is not necessarily self-correcting. Ioannidis JPA. Perspectives on Psychological Science. 2012 Nov;7(6):645-54. link to article

Citation bias favoring statistically significant studies was present in medical research. Jannot AS, Agoritsas T, Gayet-Ageron A, Perneger TV. Journal of Clinical Epidemiology. 2013 Mar 31;66(3):296-301. link to abstract

False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Simmons JP, Nelson LD, Simonsohn U. Psychological Science. 2011 Nov;22(11):1359-66. link to article

Why Most Published Research Findings Are False. Ioannidis JPA. PLoS Medicine. 2005;2(8):e124. link to article

0.05 or 0.005? P-value Wars Continue. Novella, S. (blog) 2017. link to blog
Most scientific articles base their conclusions on statistical tests (e.g. t-test, correlation, regression, analysis of variance). These tests generate an outcome called a p-value, which estimates how likely it is that an effect as large as the one observed in the study could have occurred by chance alone. For decades, researchers have agreed that if the p-value is less than .05, the effect is treated as probably real (an effect that large would arise by chance alone less than 1 time in 20). This blog discusses a recent paper that recommended the significance threshold be changed from .05 to .005. The paper's authors argued this would improve replicability: with the stricter criterion of .005, it is more likely that a subsequent replication of the study will also be positive. However, other researchers - including the author of this blog - disagree. They argue that it's our obsession with the p-value itself that's the problem, not the value it's set to.
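The .05 versus .005 debate above is easy to see in a simulation. The sketch below (an illustrative example of my own, not taken from the blog or the paper it discusses) runs many simulated "experiments" in which the true effect is exactly zero, computes a two-sided z-test p-value for each, and counts how often each threshold is crossed: roughly 1 in 20 null experiments pass p < .05, and roughly 1 in 200 pass p < .005.

```python
# Illustrative sketch: false-positive rates under the null at two p-value
# thresholds. All names and parameters here are arbitrary choices for the demo.
import math
import random

def z_test_p_value(sample):
    """Two-sided p-value for H0: mean = 0, with known sd = 1 (simple z-test)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)  # sample mean has sd 1/sqrt(n)
    # Standard normal tail probability via the error function
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

random.seed(1)
n_experiments, n_per_experiment = 20_000, 30

# Every experiment draws from N(0, 1): the null hypothesis is exactly true.
p_values = [
    z_test_p_value([random.gauss(0.0, 1.0) for _ in range(n_per_experiment)])
    for _ in range(n_experiments)
]

frac_05 = sum(p < 0.05 for p in p_values) / n_experiments
frac_005 = sum(p < 0.005 for p in p_values) / n_experiments
print(f"'Significant' at p < .05:  {frac_05:.3f}")   # close to 0.05 (1 in 20)
print(f"'Significant' at p < .005: {frac_005:.4f}")  # close to 0.005 (1 in 200)
```

Note that this shows only the per-test false-positive rate; the critics' point stands either way, since with enough undisclosed analytic flexibility (as Simmons et al. demonstrate) a researcher can clear almost any fixed threshold.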