James Coyne: "When psychotherapy trials have multiple flaws…"

Andy

Multiple flaws pose more threats to the validity of psychotherapy studies than would be inferred when the individual flaws are considered independently.

We can learn to spot features of psychotherapy trials that are likely to lead to exaggerated claims of efficacy for treatments, or claims that will not generalize beyond the sample being studied in a particular clinical trial. We can look to the adequacy of sample size, and spot what the Cochrane Collaboration has defined as risk of bias in their handy assessment tool.

We can look at the case mix in the particular sites where patients were recruited. We can examine the adequacy of the diagnostic criteria used for entering patients into a trial. We can examine how blinded the trial was: not only in terms of who assigned patients to particular conditions, but also whether the patients, the treatment providers, and the evaluators knew the condition to which particular patients had been assigned.

And so on. But what about combinations of these factors?
We typically do not pay enough attention to multiple flaws in the same trial. I include myself among the guilty. We may suspect that flaws are seldom simply additive in their effects, but we don’t consider whether there may even be synergism in their negative effects on the validity of a trial. As we will see in this analysis of a clinical trial, multiple flaws can pose more threats to the validity of a trial than we might infer when the individual flaws are considered independently.

The particular paper we are probing is described in its discussion section as the “largest RCT to date testing the efficacy of group CBT for patients with CFS.” It also takes on added importance because two of the authors, Gijs Bleijenberg and Hans Knoop, are considered leading experts in the Netherlands. The treatment protocol was developed over time by the Dutch Expert Centre for Chronic Fatigue (NKCV, http://www.nkcv.nl; Knoop and Bleijenberg, 2010). Moreover, these senior authors dismiss any criticism and even ridicule critics. This study is cited as support for their overall assessment of their own work. Gijs Bleijenberg claims:

Cognitive behavioural therapy is still an effective treatment, even the preferential treatment for chronic fatigue syndrome.

But: “Not everybody endorses these conclusions, however their objections are mostly baseless.”
https://www.coyneoftherealm.com/blogs/mind-the-brain/when-psychotherapy-trials-have-multiple-flaws
 
This is a hard hitting piece.

James Coyne said:
I at first did not believe what I was reading when I encountered this statement in the results section.

The mean time between baseline and second assessment was 6.2 months (SD = 0.9) in the control condition and 12.0 months (SD = 2.4) in the intervention group. This difference in assessment duration was significant (p < 0.001) and was mainly due to the fact that the start of the therapy groups had to be frequently postponed because of an irregular patient flow and limited treatment capacities for group therapy at our clinic. In accordance with the treatment manual, the second assessment was postponed until the fourteenth group session was accomplished. The mean time between the last group session and the second assessment was 3.3 weeks (SD = 3.5).
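That gap is so large that the significance is unsurprising. As a rough sanity check on the reported p < 0.001: a Welch's t-test computed from just the quoted means and SDs gives an enormous t statistic for any plausible group size. The sample sizes below are hypothetical, since the quote does not give them; only the means and SDs come from the paper.

```python
import math

# Reported summary statistics from the quoted results section
m_ctrl, sd_ctrl = 6.2, 0.9    # months, control condition
m_int, sd_int = 12.0, 2.4     # months, intervention group

# Group sizes are NOT reported in the quote; these are hypothetical
n_ctrl = n_int = 100

# Welch's t-test from summary statistics
se = math.sqrt(sd_ctrl**2 / n_ctrl + sd_int**2 / n_int)
t = (m_int - m_ctrl) / se

# Welch-Satterthwaite degrees of freedom
num = (sd_ctrl**2 / n_ctrl + sd_int**2 / n_int) ** 2
den = ((sd_ctrl**2 / n_ctrl) ** 2 / (n_ctrl - 1)
       + (sd_int**2 / n_int) ** 2 / (n_int - 1))
df = num / den

print(f"t = {t:.1f}, df = {df:.0f}")
```

With these assumed group sizes, t comes out around 22, far beyond the roughly 3.4 needed for p < 0.001 at that df, and the conclusion is insensitive to the exact n chosen.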
 
He's got a summary near the beginning, though it's still good to click through so his blog gets the hits:
This is a long read blog post. I will offer a summary for those who don’t want to read through it, but who still want the gist of what I will be saying. However, as always, I encourage readers to be skeptical of what I say and to look to my evidence and arguments and decide for themselves.

Authors of this trial stacked the deck to demonstrate that their treatment is effective. They are striving to support the extraordinary claim that group cognitive behavior therapy fosters not only better adaptation, but actually recovery from what is internationally considered a physical condition.

There are some obvious features of the study that contribute to the likelihood of a positive effect, but these features need to be considered collectively, in combination, to appreciate the strength of this effort to guarantee positive results.

This study represents the perfect storm of design features that operate synergistically:



Referral bias – Trial conducted in a single specialized treatment setting known for advocating psychological factors maintaining physical illness.

Strong self-selection bias of a minority of patients enrolling in the trial seeking a treatment they otherwise cannot get.

Broad, overinclusive diagnostic criteria for entry into the trial.

The active treatment condition carried a strong message about how patients should respond to outcome assessment: with reports of improvement.

An unblinded trial with a waitlist control lacking the nonspecific elements (placebo) that confound the active treatment.

Subjective self-report outcomes.

Specifying a clinically significant improvement that required only that a primary outcome be less than what was needed for entry into the trial.

Deliberate exclusion of relevant objective outcomes.

Avoidance of any recording of negative effects.
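The "clinically significant improvement" point in the list above is worth making concrete: if improvement merely means scoring below the entry cutoff, then a trivially small change can count as success. The scale and cutoff values below are hypothetical, purely for illustration; the actual questionnaires and thresholds used in the trial are not given here.

```python
# Hypothetical illustration of overlapping entry and improvement criteria.
ENTRY_CUTOFF = 35      # hypothetical: fatigue score >= 35 required to enter the trial
IMPROVED_BELOW = 35    # hypothetical: "improved" = any score below the entry cutoff

def eligible(score: int) -> bool:
    """Hypothetical entry rule: severe enough fatigue to join the trial."""
    return score >= ENTRY_CUTOFF

def improved(score: int) -> bool:
    """Hypothetical outcome rule: below the level needed for entry."""
    return score < IMPROVED_BELOW

baseline, follow_up = 35, 34   # a one-point change on a self-report scale

print(eligible(baseline))      # enters the trial
print(improved(follow_up))     # counts as clinically significant improvement
```

A patient who drops a single point on a self-report scale flips from "ill enough to enrol" to "clinically significantly improved", which is why such a criterion sets a very low bar.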
 
I am sure I will enjoy reading this in due course. But I have a sneaking suspicion this is just a rehash of Esther12's PM to me about three years ago, which I had the opportunity to rehash for the Journal of Health Psychology. PACE started off bad, but every opportunity to mitigate that badness was turned around into a synergistic worsening.

This one though does sound a bit Pythonesque. Sort of 'if you think Fatang-Fatang is a mouthful try Harquin Fim Tim Lim Bim Bus Stop Fatang Fatang Ole Biscuit Barrel'.
 
I am sure I will enjoy reading this in due course. But I have a sneaking suspicion this is just a rehash of Esther12's PM to me about three years ago, which I had the opportunity to rehash for the Journal of Health Psychology. PACE started off bad, but every opportunity to mitigate that badness was turned around into a synergistic worsening.

You got it. This one has a few extra surprises though. The timing of assessment varied between the control group and the intervention group, and the authors dismissed the need to collect data on potential harms. Both of these sound highly unusual to me (at least in context of fairly consistent reports of harm in surveys).
 
I enjoyed that. No rehash! I don't remember reading much from other people about this study previously. Maybe because it came out when everyone was PACE obsessed? I don't think I've ever read the paper referred to.

Paper is currently here: http://www.karger.com.https.sci-hub.tv/Article/Abstract/438867

Does it seem weird that this paper was only published in 2015, when the trial registration says it was completed in 2006? http://www.isrctn.com/ISRCTN15823716

Interesting that they said this:

In contrast to our previous work [4], we communicated recovery in terms of fatigue and disabilities as general goal of the intervention.

I thought this group has generally been pretty gung-ho with that - even more so this time?!

Great to see more attention being brought to this:

We cannot let the authors of this trial off the hook for their dependence on subjective self-report outcomes. They are instructing patients that recovery is the goal, which implies that it is an attainable goal. We can reasonably be skeptical about acclaim of recovery based on changes in self-report measures. Were the patients actually able to exercise? What was their exercise capacity, as objectively measured? Did they return to work?

These authors have included such objective measurements in past studies, but did not include them as primary outcomes, nor, in some cases, even report them in the main paper reporting the trial.

Wiborg JF, Knoop H, Stulemeijer M, Prins JB, Bleijenberg G. How does cognitive behaviour therapy reduce fatigue in patients with chronic fatigue syndrome? The role of physical activity. Psychol Med. 2010 Jan 5:1

The senior authors’ review fails to mention their three studies using actigraphy that did not find effects for CBT. I am unaware of any studies that did find enduring effects.

Perhaps this is what they mean when they say the protocol has been developed over time – they removed what they found to be threats to the findings that they wanted to claim.

Weird about the different times between baseline and second assessment. I don't think I've ever seen that before.

So, outcomes were assessed for the intervention group shortly after completion of therapy, when nonspecific (placebo) effects would be stronger, but a mean of six months later than for patients assigned to the control condition.

In some ways, I wondered if that would actually make much difference, seeing as the wait-list control group was already so poor at controlling for nonspecific effects?

Some parts I was a bit less sure on:

Personally, I'm not too keen on what I've seen of Beck's 'collaborative empiricism', and I suspect Knoop & co could claim they use some form of it, although I probably need to look into this more. The problems we see around ME/CFS often make me think the worst of related work, and I always fear that supposedly 'collaborative' approaches can end up being used to manage patients according to the assumptions of the professional. [edit: I was thinking about this last night - while they may not explicitly distance themselves from 'collaborative empiricism', their approach to CBT does sound quite different.]

Is this bit right?:

Actually, the US Center for Disease Control would now reject this trial because these entry criteria are considered obsolete, overinclusive, and not sufficiently exclusive of other conditions that might be associated with chronic fatigue.*

Fair to say that they seem to consider the Oxford criteria obsolete, but all the other various criteria still seem to be floating around. The * links to the concern that patients may not have been properly assessed before entry into the trial, but that doesn't make the criteria obsolete. I admit that I don't really know what I'm talking about with all the different CFS criteria, but this point seemed overblown to me.

I'm often wary of debates about defining ME/CFS (and including things like the quote from Hooper) within the context of this sort of research (although I recognise these discussions can be important in other social and political contexts). There is still a lot of uncertainty around ME/CFS and how it should be defined, and if these trials were providing good evidence that their intervention was helping some group of patients, that would still be of interest. What most concerns me is that problems with the research so often make it impossible to say that anyone is being helped. Some of my caution with this point may stem from years of watching discussions being taken off-track by irrelevant 'CFS and the stigma of MH' warbling.
 
It's great that a negative analysis has been published first by someone outside of the ME community - shows the tides are turning and that other people are now looking more at the way trials have been carried out and reported.

James Coyne has been blogging about ME and other psychological research for a while. I'm pleased too that he's exposing the flaws in the research, but I don't know what the readership is for his blogs.
I'm not sure the people who need to hear this analysis would regard blogs as 'published' in the sense that they are not peer reviewed.
 