Cochrane Review: 'Exercise therapy for chronic fatigue syndrome' 2017, Larun et al. - Recent developments, 2018-19

@Caroline Struthers, these emails may be of interest to you.
Thank you, and especially to @Marit @memhj for getting hold of them. I have now had a chance to read these emails and they are horribly fascinating. And of course completely missing the point about the folly of relying on subjective outcomes in a review of unblinded trials, because of course that would necessitate starting the review again. So the nitpicky argument over the "defensible" way to describe the size of an effect is a waste of everyone's time, as we all know. The sole concern of both sides is to save face rather than to help people with ME.
 
And of course completely missing the point about the folly of relying on subjective outcomes in a review of unblinded trials, because of course that would necessitate starting the review again.

The most bizarre thing is their grading of the evidence. HADS they score "very low", but the combined fatigue measures (which they should have realised they can't combine, because one (CFQ) measures *change* in individual fatigue since becoming ill and the other (CIS) measures pseudo-absolute individual fatigue over the past 2 weeks) are scored "low" or "moderate" depending on which Likert/non-Likert version they used. It seems to indicate that their GRADE score depends on whether the measure provides a significant result in the review, and not on whether it is of sufficient *quality* and suitability to do so.

If HADS scores "very low", then all the scales should score "very low", because they all suffer from the same problems.
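The Likert/non-Likert point above can be made concrete with a toy sketch (the patient responses here are invented for illustration; only the two scoring schemes reflect how the Chalder Fatigue Questionnaire is actually scored): the same eleven responses can show a large improvement under one scheme and none at all under the other.

```python
# Toy illustration: the same 11-item Chalder Fatigue Questionnaire
# responses scored two ways. Each item has four response options (0-3).
# Likert scoring sums the raw options (total range 0-33);
# bimodal scoring collapses them to 0/0/1/1 (total range 0-11).

def likert(responses):
    # sum raw option values: 0, 1, 2, 3 per item
    return sum(responses)

def bimodal(responses):
    # options 0 and 1 count as 0; options 2 and 3 count as 1
    return sum(1 for r in responses if r >= 2)

# Hypothetical patient: every item moves from option 3 to option 2
before = [3] * 11
after = [2] * 11

print(likert(before) - likert(after))    # 11-point improvement on Likert
print(bimodal(before) - bimodal(after))  # zero improvement on bimodal
```

So a result that looks "moderate" under one marking scheme can vanish entirely under the other, which is why grading the two versions differently is so hard to defend.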
 
If HADS scores "very low", then all the scales should score "very low", because they all suffer from the same problems.

I can't remember that part of the discussion now, but could it be that they had fewer trials with HADS as an outcome, and that affected the rating? On the assumption that the more garbage you put in, the better?
 
Exercise therapy did not worsen symptoms for people with CFS. Serious side effects were rare in all groups, but limited information makes it difficult to draw firm conclusions about the safety of exercise therapy.
[my bold]

How can a scientific paper include the above? Are they suggesting that adverse safety events would not constitute a worsening of symptoms? You just cannot make a concrete assertion like "did not" when the next sentence says you had insufficient information to draw firm conclusions! This is like dealing with shape-shifters. I really do worry that these people see worsening of symptoms as completely divorced from harms, whereas surely they are all on a common spectrum of outcomes, from being a little bit worse all the way through to being dead, and everything in between.
 
Also, while I've not looked at these particular studies, some MCID work seems designed to give researchers an artificially low MCID. For example, asking patients to select question answers showing an MCID on questionnaires that include very different items that are scored equally.

I thought that the MCID was concerned with having a value that must be big enough to improve a patient's health, rather than just a significant difference between arms.

In terms of the questions: if the items are very different (and scored the same), they simply shouldn't be adding the scores up - it gives a value that is meaningless. Personally I would sack any statistician who is happy doing this, as they haven't thought through the underlying semantics of the data and how it relates to the score they are giving. Sometimes questions are ordered, such as whether you find it easy to walk a block or a mile (but the amounts are so different that it does seem perverse to give them the same score, with no underlying physical justification). Worse is when there are also related questions, say about walking up steps, as these clearly correlate: if you can walk a mile, then chances are you will also find steps easier. So while improving on one question you are also likely to improve on another.
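A minimal sketch of the problem with summing unequal items (the item wordings, equal weights, and patient answers are all invented for illustration, not taken from any real questionnaire): two patients end up with identical totals despite very different capability.

```python
# Toy example: three yes/no physical-function items, all scored equally
# (1 = can do it, 0 = cannot). Wordings and weights are hypothetical.
patient_a = {
    "walk one block": 1,
    "walk one mile": 1,
    "climb a flight of steps": 0,
}
patient_b = {
    "walk one block": 1,
    "walk one mile": 0,
    "climb a flight of steps": 1,
}

score_a = sum(patient_a.values())
score_b = sum(patient_b.values())

# Identical totals, yet patient A can walk a mile and patient B cannot:
# the sum erases the very different difficulty of the tasks. And because
# the items correlate (someone who can walk a mile will usually also
# manage steps), one real improvement can move several items at once,
# inflating the apparent change in the total.
print(score_a, score_b)  # 2 2
```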
 
Thank you, and especially to @Marit @memhj for getting hold of them. I have now had a chance to read these emails and they are horribly fascinating. And of course completely missing the point about the folly of relying on subjective outcomes in a review of unblinded trials, because of course that would necessitate starting the review again. So the nitpicky argument over the "defensible" way to describe the size of an effect is a waste of everyone's time, as we all know. The sole concern of both sides is to save face rather than to help people with ME.

They do show that Cochrane is morally bankrupt, in that their aim is to get something the reviewers are happy with and that doesn't fall at the most obvious level of criticism. They don't seem to care about accuracy, the efficacy of treatments, or patient safety - just that the review writers are happy and won't get embarrassed.
 
I thought that the MCID was concerned with having a value that must be big enough to improve a patient's health, rather than just a significant difference between arms.

Oh - I thought it was that the change resulting from treatment (so requiring a control group comparison) was substantial enough to mean the treatment was seen as worthwhile. If it's not assessed on the basis of a difference between arms then isn't that more worrying?

Thinking back, the PACE researchers gave the % for their MCID per arm, not on the basis of a difference between arms. I'm going to be annoyed with myself if I've been reading MCID papers while having forgotten exactly how the MCID is used!

In terms of the questions: if the items are very different (and scored the same), they simply shouldn't be adding the scores up - it gives a value that is meaningless. Personally I would sack any statistician who is happy doing this, as they haven't thought through the underlying semantics of the data and how it relates to the score they are giving. Sometimes questions are ordered, such as whether you find it easy to walk a block or a mile (but the amounts are so different that it does seem perverse to give them the same score, with no underlying physical justification). Worse is when there are also related questions, say about walking up steps, as these clearly correlate: if you can walk a mile, then chances are you will also find steps easier. So while improving on one question you are also likely to improve on another.

This seems another one of those problems that is so commonplace in medical research that we're unlikely to be able to make much progress with it.
 
The most bizarre thing is their grading of the evidence. HADS they score "very low", but the combined fatigue measures (which they should have realised they can't combine, because one (CFQ) measures *change* in individual fatigue since becoming ill and the other (CIS) measures pseudo-absolute individual fatigue over the past 2 weeks) are scored "low" or "moderate" depending on which Likert/non-Likert version they used. It seems to indicate that their GRADE score depends on whether the measure provides a significant result in the review, and not on whether it is of sufficient *quality* and suitability to do so.

If HADS scores "very low", then all the scales should score "very low", because they all suffer from the same problems.

Any method should involve understanding the semantics of a scale: what the questions mean, what the likely distributions are, and how linear they are with change. Unfortunately these basic steps seem to get forgotten because someone writes a paper claiming a scale is valid, and then it is assumed to be valid and linear - even when switching to an alternative, incompatible marking scheme.

HADS has real problems. Coyne likened it to Elvis (still seen, but shouldn't be). Some of the questions are answered similarly whether the problem is fatigue or depression, so it has no real discriminatory power.
 
Oh - I thought it was that the change resulting from treatment (so requiring a control group comparison) was substantial enough to mean the treatment was seen as worthwhile. If it's not assessed on the basis of a difference between arms then isn't that more worrying?

Yes, I missed that bit out. I guess it is that the difference between the arms needs to be big enough that it makes a real difference to a patient.
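A minimal sketch of the distinction being resolved here, with invented numbers: each arm's within-arm change can exceed an MCID even when the between-arm difference (the part attributable to the treatment) does not.

```python
# Toy numbers (invented) showing why the between-arm difference,
# not the within-arm change, is what should be held against an MCID.
MCID = 2.0  # hypothetical minimal clinically important difference

treatment_change = 3.5  # mean improvement in the treatment arm
control_change = 2.5    # mean improvement in the control arm
                        # (placebo response, regression to the mean, etc.)

between_arm = treatment_change - control_change  # 1.0

# Each arm improves by more than the MCID, but the effect of the
# treatment itself falls short of it.
print(treatment_change >= MCID)  # True
print(control_change >= MCID)    # True
print(between_arm >= MCID)       # False
```

Quoting a per-arm change against an MCID, as described above for PACE, therefore flatters the treatment: the control arm's change is counted as if the treatment produced it.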
 
Exercise therapy did not worsen symptoms for people with CFS. Serious side effects were rare in all groups, but limited information makes it difficult to draw firm conclusions about the safety of exercise therapy.

[my bold]

How can a scientific paper include the above? Are they suggesting that adverse safety events would not constitute a worsening of symptoms? You just cannot make a concrete assertion like "did not" when the next sentence says you had insufficient information to draw firm conclusions! This is like dealing with shape-shifters.

And how can the Cochrane GET review claim no worsening of symptoms while the AHRQ concluded the following about harms from GET using essentially the same evidence.
"GET was associated with higher numbers of reported adverse events compared with counseling therapies or controls. Harms were generally inadequately reported across trials."

"Harms were not well reported, although in one trial patients receiving GET reported more adverse events compared with those receiving cognitive behavior therapy (CBT), adaptive pacing, or usual care; one trial reported more withdrawals of patients receiving GET, one trial had a high percentage of patients refusing repeat exercise testing, and several other trials reported more withdrawals of patients receiving GET, all compared with controls."​
 
And how can the Cochrane GET review claim no worsening of symptoms while the AHRQ concluded the following about harms from GET using essentially the same evidence.
"GET was associated with higher numbers of reported adverse events compared with counseling therapies or controls. Harms were generally inadequately reported across trials."

Because they only looked at trials that reported harms (i.e. PACE), and they dismissed the reported issues as unrelated to GET.

We should remember that Crawley used Cochrane's safety claims to justify her GET trial on children, claiming they looked at the numbers from all the trials, even though they did say most didn't report harms data.

The real issue is drop-outs (and the lack of reporting and recording of harms - if patients are told treatments are safe they are less likely to report harms, and therapists are not trained to look for or report harms - see @Graham's latest paper).
 
Wow, never thought I’d see an acknowledgement, both that these criticisms are coming from many sources, including “dispassionate” ones, and that concluding that this is moderate certainty evidence is indefensible.
Well, it is an attack on the credibility of science. Imagine if this dumbing down spread through all of medicine. It's a genuine slippery slope that could end up downgrading the value and effectiveness of all medical practice, including "respectable" diseases.
 
Because they only looked at trials that reported harms (i.e. PACE), and they dismissed the reported issues as unrelated to GET.

We should remember that Crawley used Cochrane's safety claims to justify her GET trial on children, claiming they looked at the numbers from all the trials, even though they did say most didn't report harms data.

The real issue is drop-outs (and the lack of reporting and recording of harms - if patients are told treatments are safe they are less likely to report harms, and therapists are not trained to look for or report harms - see @Graham's latest paper).

And if patients did report harm, would they be believed, or would it be dismissed/downplayed as part of their wrong illness beliefs?
 
[my bold]

How can a scientific paper include the above? Are they suggesting that adverse safety events would not constitute a worsening of symptoms? You just cannot make a concrete assertion like "did not" when the next sentence says you had insufficient information to draw firm conclusions! This is like dealing with shape-shifters. I really do worry that these people see worsening of symptoms as completely divorced from harms, whereas surely they are all on a common spectrum of outcomes, from being a little bit worse all the way through to being dead, and everything in between.
They simply have no idea what they're talking about. Somehow that's waved off, as if they could not possibly be mistaken, but that is the core issue here: they are completely out of their depth, and they insist it is the patients who are confused about our own illness - that it's reality that is wrong, not their personal interpretation of it.
 
I thought that the MCID was concerned with having a value that must be big enough to improve a patient's health, rather than just a significant difference between arms.
Given it is based on the smallest change in a clinical outcome deemed to be important, I presume it would be valid to compare the difference between a control arm with no intervention and a trial arm with the intervention; the difference measured would be the same as the 'D' in MCID, wouldn't it?

ETA: I see you resolved this a bit later.
 
Any method should involve understanding the semantics of a scale
Especially arbitrary scales. None of HADS, SF-36 or CFQ is of much relevance to ME, in combination or in isolation. Their choice is nothing more than a matter of preference; they are not validated or adequate for this purpose, especially when objective measures are possible, like working days, physical fitness, etc.

There is a lot of discussion over precision points for a scale that is about as inaccurate as a scale can possibly get. It's exactly as precise as measuring people's height in whole numbers of football fields. One can argue for days about how significant a score of 0 or 1 is; that does not make it a meaningful measurement, any more than measuring heat with adverbs because someone finds Celsius too cumbersome or Fahrenheit scary or something.
 
And if patients did report harm woud they be believed or would it be dismissed/downplayed as part of their wrong illness beliefs?
We know the answer to that. The logic over the safety of those treatments is entirely circular: they are safe because there are no reports of harm, and it is not possible to report harm because they are known to be safe. The trials did not do due diligence, and neither does the implementation of those treatments in practice, which we have now confirmed.
 