
cognitive behavioural and personalised exercise interventions for fatigue in inflammatory rheumatic diseases (LIFT), 2022, Bachmair et al

Discussion in 'Other health news and research' started by chillier, Jun 28, 2022.

  1. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    Merged thread

    A cursory look at this study.

    Participants

    The participants had the following conditions and were all stable (low inflammation markers):
    The inclusion criterion regarding fatigue was that it had to be persistent (>3 months) and clinically significant, defined as “≥6 on numerical rating 0–10 scale measuring average level of fatigue during the past 7 day”.

    Interventions

    The active interventions (personalised exercise a.k.a. PEP and CBT) were delivered by therapists over the phone, with at most 7 one-to-one sessions of up to 45 mins per participant, over 14 weeks. The number of sessions received by each participant was decided individually between them and their therapist, but a booster session was added at 22 weeks for all participants. PEP participants received a median of 5 sessions (IQR 1-8) and CBT participants a median of 8 sessions (IQR 2-8).

    Both interventions were based on the familiar fear-avoidance model of exertion from the PACE trial, which is directly cited (ref. 12, as @Andy pointed out):
    The usual care consisted of “a Versus Arthritis education booklet for fatigue” (no therapist involved), which seems to be this one from a quick Google search. It essentially explains adaptive pacing as in the PACE trial, i.e. trying to gradually increase exercise while recommending “the four P’s”: problem solving, planning, prioritising and pacing.

    Outcomes

    The co-primary outcomes were the Chalder fatigue scale (CFS, rated 0-33) and the Fatigue Severity Scale (FSS), both evaluated at 10, 28 and 56 weeks. The following mean changes at 6 months post-treatment (56 weeks) are reported:

    - PEP compared to usual care:

    CFS: -3.03, 97.5% CI [-5.05, -1.02] (p = 0.0007)

    FSS: -0.64 [-0.95, -0.33] (p = 0.0058)

    - CBT compared to usual care:

    CFS: -2.36 [-4.28, -0.44] (p < 0.0001)

    FSS: -0.58 [-0.87, -0.28] (p < 0.0001)


    The authors chose half a standard deviation as the minimally important difference (MID, i.e. the minimal change that can be considered clinically significant) on the Chalder fatigue scale:
    However, this seems inappropriate. First, their choice of half a SD is based on a small review of 32 studies that reported quality-of-life outcomes; this rule of thumb has been contested. [1] Second, they chose a smaller MID than those reported previously for lupus & rheumatoid arthritis on the CFS and the FSS, which respectively ranged from 2.3 to 3.3 and 0.5 to 1.2. [2] Third, the interventions are directly based on the PACE trial, as was the GETSET trial; in contrast, a MID of 3 points for the CFS was chosen in the latter. Importantly, all reported confidence intervals for the CFS and the FSS include values under their respective MIDs.
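    To make the point about the confidence intervals concrete, here is a quick sketch using the figures quoted above (the MIDs of 2 and 0.5 points are the author-specified ones; this only restates the arithmetic, it is not a re-analysis):

```python
# Sketch: an improvement is a negative change, so for an effect to be
# clinically significant with confidence, the whole interval should lie
# at or beyond -MID. Values are copied from the post above.
results = {
    # comparison: (CI lower, CI upper, author-specified MID)
    "PEP vs usual care, Chalder": (-5.05, -1.02, 2.0),
    "PEP vs usual care, FSS":     (-0.95, -0.33, 0.5),
    "CBT vs usual care, Chalder": (-4.28, -0.44, 2.0),
    "CBT vs usual care, FSS":     (-0.87, -0.28, 0.5),
}
for name, (lo_ci, hi_ci, mid) in results.items():
    beyond_mid = hi_ci <= -mid  # True only if the full CI exceeds the MID
    print(f"{name}: [{lo_ci}, {hi_ci}] entirely beyond MID {mid}? {beyond_mid}")
# All four print False: every interval includes sub-MID values.
```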

    It is unfortunate that the authors reported a “clinically significant” change of “medium size” when 1) their results leave it uncertain that the MIDs were even reached, and 2) the small differences compared to usual care are likely owed to response bias on the questionnaires rather than to a true effect of the intervention.

    In particular, for participants who received 3 or more sessions of PEP, there was a greater improvement on the CFS and the confidence interval did not include the MID. Either exercise was indeed slightly more effective than usual care and CBT, or response bias was particularly pronounced in this subgroup:

    In any case, this is a disappointing result: if the reported improvement were due to the treatment itself rather than to bias, a dose-dependent relationship would likely have emerged. This is not the case.

    Further, while there was a reported improvement with PEP on the Work Productivity and Activity Impairment questionnaire (which concerns the prior 7 days) on “overall work impairment” (p = 0.010), none was observed on the “valued life activities” subscale of this questionnaire. Generally, despite “statistically significant” changes being reported on some secondary outcomes, they were not corrected for multiple comparisons, and other important secondary outcomes were null:
    Conclusion

    Overall, it seems fairly clear from the results that the interventions were not effective. I do not understand why the Lancet’s peer reviewers did not point this out and agreed to publish the article as such, that is, with the claims of improvement that it makes.

    While the authors mention that they will report on cost-effectiveness in a later article, I cannot see how they can make the case that their interventions are cost-effective (3+ months of treatment for barely reaching MIDs).

    Perhaps the most unfortunate part of this trial is that it was funded by the charity Versus Arthritis. Prof Gary MacFarlane, one of the co-authors, is VA’s vice-director; surely this must have helped secure funding.

    [1] Norman, G. R., Sloan, J. A., & Wyrwich, K. W. (2003). Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Medical care, 41(5), 582–592. https://doi.org/10.1097/01.MLR.0000062554.74615.4C

    [2] Nordin, Å., Taft, C., Lundgren-Nilsson, Å., & Dencker, A. (2016). Minimal important differences for fatigue patient reported outcome measures-a systematic review. BMC medical research methodology, 16, 62. https://doi.org/10.1186/s12874-016-0167-6
     
    Last edited: Jul 15, 2022
  2. Peter Trewhitt

    Peter Trewhitt Senior Member (Voting Rights)

    Messages:
    3,669
    Doesn’t this imply they had already prejudged the issue? Unblinded, with subjective outcomes: is this another ocean liner launched towards a predetermined destination?

    I have no idea about the relevance of these interventions in this situation, but sad that this is research that will not advance our knowledge.
     
  3. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
  4. Hutan

    Hutan Moderator Staff Member

    Messages:
    26,926
    Location:
    Aotearoa New Zealand
    That's a great analysis @cassava7. Have you looked to see if you can submit a comment to the journal? Do you plan to send it to the charity that funded the study?


    I haven't read the study, but this bit wasn't quite clear to me:
    I think it's arguable that there was a dose-dependent relationship:
    It sounded as though there was a sort of a dose dependent relationship (2 or less PEP sessions versus 3 or more sessions), even if it's just a single step increase at that threshold?

    I'm also not sure that a dose-dependent relationship is proof of lack of bias in an open-label trial with subjective outcomes. Your earlier comment that response bias "might have been particularly pronounced" in those who completed 3 or more sessions also seems to cast doubt on that idea. Those who have made a lot of effort to be compliant with the treatment, who have invested more in the treatment being helpful, might be more inclined to evaluate the outcome more positively. They also had more contact with the therapists and so were more likely to have been swayed by suggestions that the treatment is effective.
     
  5. dave30th

    dave30th Senior Member (Voting Rights)

    Messages:
    2,246
    Is this Alison Wearden???
     
  6. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    Thank you for your feedback. I have done neither of these things as I do not have the energy to formalize this cursory analysis, even though I would like to send a short letter to the editor.

    In my opinion, it would not be of much use to contact Versus Arthritis directly as they are very much in favour of BPS interventions for chronic pain and fatigue. One of the two laypersons who sat on the NICE committee for the chronic primary pain guideline, which ended up recommending CBT and exercise, is a member of VA.

    Indeed. It seems like for PEP, the number of sessions received maximized response bias on the CFS, which may be more prone to it than the FSS given its mental health-oriented questions. So there seems to be a dose-dependent relationship in response bias, but not in the actual effect of the treatment, as evidenced by the fact that those who followed 3 or more sessions did not fare better on the FSS.

    That said, and importantly, the authors did not report in the appendix whether the difference between {the PEP or CBA (cognitive behavioural approach) “complier” groups (3+ sessions) vs usual care} and {the PEP or CBA “non-complier” groups vs usual care} was statistically significant. Given the small difference in means between these two comparisons for PEP on the CFS (-4.44 vs -3.03, i.e. under the author-specified MID of 2 points) and the overlapping 97.5% confidence intervals ([-5.66, -3.21] vs [-5.05, -1.02]), I am not sure that this comparison is statistically significant, let alone clinically significant; it looks instead like a bit of cherry-picking.
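    A rough way to see how compatible the two quoted estimates are is to compute the overlap of their intervals (a sketch, not a formal test: overlapping confidence intervals do not by themselves establish no difference, but a wide overlap undermines a claim of one):

```python
# Sketch using the 97.5% CIs quoted above for the change on the Chalder
# scale with PEP. Overlap alone is not a significance test; it just
# shows how much of the two intervals is shared.
complier = (-5.66, -3.21)  # participants with 3+ PEP sessions
overall = (-5.05, -1.02)   # the PEP vs usual care estimate quoted earlier

overlap_lo = max(complier[0], overall[0])
overlap_hi = min(complier[1], overall[1])
overlap = max(0.0, overlap_hi - overlap_lo)
print(f"overlap width: {overlap:.2f}")  # 1.84 points of shared interval
```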


    Still, assuming that the result is significant, the hypothesis that the number of sessions maximizes response bias does not hold when comparing both intervention groups. Participants in the PEP arm were less in contact with therapists than those in the CBA arm, as they received a median of 5 vs 8 sessions respectively. Also, according to the appendix, PEP telephone sessions (30 mins, 1h for the first one) were shorter than CBA ones (30-45 mins, 1h for the first two). Yet those who received 3 CBA sessions or more did not achieve a larger decrease on the CFS, which is what one would have expected.

    While the analysis reports that PEP was slightly more effective than CBA on secondary outcomes, I believe few of these results, if any, would have survived correction for multiple comparisons.

    In the end, I am not sure what to make of this finding. It does not seem significant. I would appreciate some feedback on my reasoning.
     
    Last edited: Jul 9, 2022
  7. petrichor

    petrichor Senior Member (Voting Rights)

    Messages:
    320
    I agree with the analysis from @cassava7, which is very good, and also with @Hutan's comments. It's indisputable that these are small effect sizes, especially for the CBT. A 2-3 point reduction on a 33-point scale isn't much, and their choice of minimally important difference is pretty arbitrary.
     
  8. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    Thanks to @Lucibee for highlighting another significant problem with the CFS: the fact that it is uninterpretable as an outcome measure because it is not an absolute scale.

    To summarize briefly: the CFS asks the patient to rate changes in fatigue intensity over time, rather than fatigue intensity itself at a given time.

    While the main timepoint for comparison in the questionnaire is “as usual”, the short description at the beginning confusingly mentions two other timepoints: “in the last month” and “if you have been tired for a long while, then compare yourself to how you felt when you last felt well”. A participant might select (or be made to select) different timepoints every time they fill the questionnaire, even though the same timepoint (that chosen at baseline) should be used for all comparisons.

    For example, the participant could understand “as usual” as “a month ago” the first time, but as “compared to the beginning of the trial” subsequently (instead of using once again “one month before the beginning of the trial”). This is problematic because if their fatigue does not change throughout the trial, they might rate it as “worse than usual” at baseline but “no more than usual” at outcome. This will result in an artificial improvement on the CFS at outcome, only because the comparison timepoints have changed. Similarly, any reported improvement in fatigue (“less than usual” at outcome) will be artificially boosted.
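    To illustrate how a shifted comparison timepoint alone can move the score, here is a hypothetical example assuming the standard 0-3 Likert scoring of the 11 Chalder items (the answer wordings are the scale's; the participant's answers are invented):

```python
# Hypothetical illustration (not trial data): the 11 Chalder items are
# each scored 0-3 on a Likert scale, so the total ranges 0-33. A
# participant who compares to "when they last felt well" at baseline but
# to the start of the trial at outcome scores lower with no real change.
N_ITEMS = 11
LIKERT = {"less than usual": 0, "no more than usual": 1,
          "more than usual": 2, "much more than usual": 3}

baseline = LIKERT["more than usual"] * N_ITEMS      # vs "when last well"
outcome = LIKERT["no more than usual"] * N_ITEMS    # vs trial baseline
artificial_improvement = baseline - outcome
print(artificial_improvement)  # 11 points, far above the 2-point MID
```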

    Importantly, this also introduces variability between each participant on their choices of comparison timepoints. In practice, unless all participants are asked to refer to the same specific timepoint every time they fill the questionnaire (e.g. a month prior to the beginning of the trial or when they last felt well), it is impossible to know which one(s) they will choose.

    Even if participants were to use the same timepoint for all comparisons, recall bias would arise. At baseline, they may be able to remember how fatigued they felt during the preceding month, but at outcome (e.g. 1 year later) they would not be able to recall it accurately. Further, if they “have felt tired for a long while”, they would likely not be able to recall how they felt “when they were last well” with much accuracy, especially the longer they have been ill.

    All in all, within- and between-patient variability in comparison timepoints, together with recall bias, preclude using the CFS as an outcome measure. By contrast, an absolute scale measures fatigue intensity itself at a given time (e.g. rating fatigue in the past 7 days from 0 to 10), which allows for comparisons between arbitrary timepoints.

    As a result, the small improvement in CFS between baseline and at 6 months post-treatment (which is not replicated on the FSS if a more suitable MID than that picked by the authors is used) could simply be a consequence of this issue and/or response bias due to the lack of blinding.

    https://twitter.com/user/status/1545718634189922305
     
    Last edited: Jul 9, 2022
  9. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    12,461
    Location:
    Canada
    This is where the whole paradigm just falls flat into comparing apples and colors, one more reason why evidence-based medicine needs to stop.

    It's discussed a lot, but it makes no sense to speak of "dose" in this context. 10 sessions of CBT do not amount to 10x as much CBT as just one. Not in the sense of what dose means in science, where 100mg of something really is 10x 10mg. CBT is a qualitative thing; it can't be added or multiplied, and is not quantifiable.

    Same thing with measure. Dose has a meaning, it implies a measure. Nothing is measured here. Those words are simply used as incorrectly as would using the colloquial meaning of theory, the whole concept is invalid and is nothing but cheap manipulation of the exact same kind that was commonplace before science became a serious thing. This is straight up from the time when evidence was rhetoric and peer review was an adoring crowd.
     
  10. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    @rvallee I am not sure that I agree with your view. Indeed, referring to a “dose” of a non-pharmacological intervention as opposed to a medication may be somewhat of a stretch or a shortcut, but the number of sessions is quantifiable and still counts.

    Indeed, what the patient and the therapist agree on doing in the first CBT or PEP session(s) needs to be implemented and adjusted over the next sessions, not only because of the graded approach to increasing activity but also because of setbacks and other difficulties that the patient may encounter. One would reasonably expect that more sessions lead to better and sustained implementation of physical activity, and perhaps to reduced fatigue. Conversely, it is also reasonable to expect that there is a cap on the number of sessions above which more of them does not improve outcomes further. In effect, this is a dose-dependent relationship as for a drug.

    (Interestingly, it can be argued that unlike most drugs, a lower number of sessions or “dosage” could lead to more adverse effects because the patient is not informed enough on how to implement the therapist’s guidance properly — if the intervention is truly safe in the first place.)

    Something is in fact being measured: bias(es), as artifacts of the combination of the flawed outcome measures used in this trial and its design. What is being measured may not be relevant to the question the trial is attempting to address (i.e. are GET and CBT effective in inflammatory rheumatic diseases?), but it is precisely because we have to assess whether the trial succeeds in addressing this question that it is important to clarify what is being measured.
     
    Last edited: Jul 9, 2022
  11. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    12,461
    Location:
    Canada
    It's a massive stretch. Qualitative things don't become quantified just because there's an independent number somewhere. This means as much as talking about 10 mindfulnesses. Nothing is being measured here; things are rated, scored. Ratings and scores can be meaningful, but they are different from quantitative measures that have a graded scale, one where 2x as many of something has a literal quantified meaning. Here every session is explicitly different; they are not equal. In fact nothing is equal here.

    It's just borrowing language from medicine and chemistry to make this sound more credible than it is. Given everything I've seen from this field, this is obviously the explicit intent. This is what pseudoscience is made of: take BS, make it sound scientific with language and framing borrowed from science.
     
  12. MSEsperanza

    MSEsperanza Senior Member (Voting Rights)

    Messages:
    2,857
    Location:
    betwixt and between
    I probably won't be able to post properly for a while, so I'll just leave this RAFT trial here (I don't think it has been posted on the forum yet) ...

    Reducing Arthritis Fatigue - clinical Teams using cognitive behavioural approaches (RAFT)

    https://www.isrctn.com/ISRCTN52709998
     
  13. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    12,461
    Location:
    Canada
    Uh. Weird. Essentially the same as the CBT arm of PACE. But they assume it works and jumped straight to cost-effectiveness:
    Overall trial end date: 2018. No citation provided for how the intervention "proved" to be effective. Registered as a controlled trial. Not properly controlled, aka not controlled.

    This entire paradigm is useless, worse than useless even, it's counterproductive. It works on timescales of years simply to answer bad questions and can't even do that, everything is interpreted and biased. There's a strong case to make that overall, this is the most incompetent thing in all the professions. Literally all of them. And by a wide margin.
     
  14. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    The results with follow-up at 2 years have been published: https://ard.bmj.com/content/78/4/465.long

    As in LIFT, the inclusion criterion for fatigue in RAFT was 6/10 or more and patients in the control group were given the education booklet on fatigue from Versus Arthritis. Those in the intervention arm received 7 CBT sessions delivered by a pair comprising an occupational therapist and a nurse.

    The primary outcome was the Bristol Rheumatoid Arthritis Fatigue Numerical Rating Scale (BRAF-NRS, 0-10) at 26 weeks: −0.59 (95% CI −1.11 to −0.06), with similarly modest changes on some secondary outcomes and no change on 14 others.

    At two years, the BRAF-NRS outcome was −0.49 (95% CI −0.83 to −0.14); interestingly, there were again small changes on some secondary outcomes, but not necessarily the same ones as those that “improved” at 26 weeks, and no change on most secondary outcomes.

    Given that different secondary outcomes “improved” at 26 weeks and at the 2-year follow-up, it is sad to think that RAFT likely measured random noise or, at best, response bias (as did the LIFT trial).

    Edit: RAFT was funded by the NIHR with a £1.3M grant… https://fundingawards.nihr.ac.uk/award/11/112/01
     
    Last edited: Jul 10, 2022
  15. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    My letter to the editor was turned down because “we feel that the issues raised were discussed sufficiently in the Discussion section of the article and the linked Comment.” (This is clearly not the case.)

    I am posting it below in case it might serve in the future. Thanks to @Lucibee for her suggestions.



    The implausible clinical relevance of the LIFT trial

    Bachmair et al’s claim that telephone-delivered cognitive behavioural and personalised exercise interventions (CBA & PEP) “produced and maintained statistically and clinically significant reductions in the severity and impact of fatigue” in their trial LIFT [1] is disputable.

    LIFT’s open-label nature, reliance on patient-reported outcome measures (PROM) and lack of an active comparator (e.g. counseling) means that performance and attention biases were not controlled for, leading to expectation and response biases. These were compounded by the interventions’ aims of “[replacing] unhelpful beliefs and behaviours (…) with more adaptive ones” and “[normalising] misperceptions of effort”. As documented for trials of similar interventions, such biases alone can explain modest reported changes on PROMs. [2]

    The authors assumed a minimally important difference (MID) of 2 and 0·5 points on the Chalder fatigue scale (CFS) and Fatigue Severity Scale (FSS), respectively. However, a trial of similar interventions led by one member of the study group retained a 3 point MID for the CFS, [3] and previous trials for rheumatoid arthritis, lupus and multiple sclerosis used MIDs of 2·3 to 3·3 and 0·5 to 1·2 for the CFS and the FSS, respectively. [4] This also casts doubt over the clinical significance of LIFT’s primary outcomes, especially given their imprecision (large confidence intervals that include the author-specified MIDs). [1]

    Further, the Chalder fatigue scale is unsuitable as an outcome measure as it measures changes in fatigue intensity rather than absolute fatigue. Since it confusingly mentions three timepoints (“as usual”, “in the last month”, “when you last felt well”), it is plausible that at baseline, a participant may understand “as usual” as when they last felt well and indicate their fatigue is “worse than usual”, but at outcome, they may interpret it as the time of enrollment and answer “no more than usual”. They would thus score lower at outcome despite no change in their fatigue, resulting in an artificial improvement. The scale is also prone to recall bias.

    While the Bristol Rheumatoid Arthritis Fatigue Multidimensional Questionnaire secondary outcome is statistically significant, the modest reported improvements (mean reductions under 7 points) may not be clinically significant. [1, 5] The lack of improvement in pain, anxiety and physical health-related quality of life further questions the reported improvements in fatigue. [1]

    Overall, owing to LIFT’s inherent biases, its true primary outcomes likely approach the lower bounds of their confidence intervals, or may even fall below them, and thus lack clinical significance.

    References

    1 Bachmair E, Martin K, Aucott L, et al. Remotely delivered cognitive behavioural and personalised exercise interventions for fatigue severity and impact in inflammatory rheumatic diseases (LIFT): a multicentre, randomised, controlled, open-label, parallel-group trial. Lancet Rheumatology 2022; doi: 10.1016/S2665-9913(22)00156-4.

    2 Tack M, Tuller DM, Struthers C. Bias caused by reliance on patient-reported outcome measures in non-blinded randomized trials: an in-depth look at exercise therapy for chronic fatigue syndrome. Fatigue: Biomedicine, Health & Behavior 2020; 8: 181–192. doi: 10.1080/21641846.2020.1848262.

    3 Clark LV, Pesola F, Thomas JM, Vergara-Williamson M, Beynon M, White PD. Guided graded exercise self-help plus specialist medical care versus specialist medical care alone for chronic fatigue syndrome (GETSET): a pragmatic randomised controlled trial. Lancet 2017; 390: 363–373. doi: 10.1016/S0140-6736(16)32589-2.

    4 Nordin Å, Taft C, Lundgren-Nilsson Å, Dencker A. Minimal important differences for fatigue patient reported outcome measures-a systematic review. BMC Med Res Methodol 2016; 16: 62. doi: 10.1186/s12874-016-0167-6.

    5 Dures EK, Hewlett SE, Cramp FA, et al. Reliability and sensitivity to change of the Bristol Rheumatoid Arthritis Fatigue scales. Rheumatology (Oxford). 2013; 52: 1832–1839. doi: 10.1093/rheumatology/ket218
     
  16. MSEsperanza

    MSEsperanza Senior Member (Voting Rights)

    Messages:
    2,857
    Location:
    betwixt and between
    Thank you for your work on this.

    Have you considered posting it as a comment on PubPeer?

    Link to the study paper on PubPeer: https://pubpeer.com/publications/2ADB2E81D20ED8148DDBC42196E470

    About PubPeer see this thread: https://www.s4me.info/threads/retra...ication-comments-on-pubpeer.1472/#post-267340
     
  17. cassava7

    cassava7 Senior Member (Voting Rights)

    Messages:
    985
    Last edited: Jul 18, 2022
