Review: Effect of exercise for depression: systematic review and network meta-analysis of randomised controlled trials, 2024, Noetel et al

JohnTheJack

Moderator
Staff member
Effect of exercise for depression: systematic review and network meta-analysis of randomised controlled trials
BMJ 2024; 384 doi: https://doi.org/10.1136/bmj-2023-075847 (Published 14 February 2024)
Cite this as: BMJ 2024;384:e075847

Michael Noetel, senior lecturer1, Taren Sanders, senior research fellow2, Daniel Gallardo-Gómez, doctoral student3, Paul Taylor, deputy head of school4, Borja del Pozo Cruz, associate professor56, Daniel van den Hoek, senior lecturer7, Jordan J Smith, senior lecturer8, John Mahoney, senior lecturer9, Jemima Spathis, senior lecturer9, Mark Moresi, lecturer4, Rebecca Pagano, senior lecturer10, Lisa Pagano, postdoctoral fellow11, Roberta Vasconcellos, doctoral student2, Hugh Arnott, masters student2, Benjamin Varley, doctoral student12, Philip Parker, pro vice chancellor research13, Stuart Biddle, professor1415, Chris Lonsdale, deputy provost13

Abstract
Objective
To identify the optimal dose and modality of exercise for treating major depressive disorder, compared with psychotherapy, antidepressants, and control conditions.

Design
Systematic review and network meta-analysis.

Methods
Screening, data extraction, coding, and risk of bias assessment were performed independently and in duplicate. Bayesian arm based, multilevel network meta-analyses were performed for the primary analyses. Quality of the evidence for each arm was graded using the confidence in network meta-analysis (CINeMA) online tool.

Data sources
Cochrane Library, Medline, Embase, SPORTDiscus, and PsycINFO databases.
Eligibility criteria for selecting studies Any randomised trial with exercise arms for participants meeting clinical cut-offs for major depression.

Results
218 unique studies with a total of 495 arms and 14 170 participants were included. Compared with active controls (eg, usual care, placebo tablet), moderate reductions in depression were found for walking or jogging (n=1210, κ=51, Hedges’ g −0.62, 95% credible interval −0.80 to −0.45), yoga (n=1047, κ=33, g −0.55, −0.73 to −0.36), strength training (n=643, κ=22, g −0.49, −0.69 to −0.29), mixed aerobic exercises (n=1286, κ=51, g −0.43, −0.61 to −0.24), and tai chi or qigong (n=343, κ=12, g −0.42, −0.65 to −0.21).

The effects of exercise were proportional to the intensity prescribed. Strength training and yoga appeared to be the most acceptable modalities. Results appeared robust to publication bias, but only one study met the Cochrane criteria for low risk of bias. As a result, confidence in accordance with CINeMA was low for walking or jogging and very low for other treatments.

Conclusions
Exercise is an effective treatment for depression, with walking or jogging, yoga, and strength training more effective than other exercises, particularly when intense. Yoga and strength training were well tolerated compared with other treatments.

Exercise appeared equally effective for people with and without comorbidities and with different baseline levels of depression. To mitigate expectancy effects, future studies could aim to blind participants and staff.

These forms of exercise could be considered alongside psychotherapy and antidepressants as core treatments for depression.


Except read this thread that absolutely destroys it



A study making false claims about exercise as a treatment? Surely not.
 
Thanks JTJ
I've read the threads and quoted threads for as long as I could. The review looks truly awful, in a multitude of ways. Here's some of the content for people not on twitter.

Mar 11>>
The study suffers from some noticeable, obvious miscoding of effect sizes. For example, the authors reported an effect size of -11.22 Hedges' g. That's MUCH larger than the difference in people's preference for chocolate over feces-flavored ice cream.

Feb 22>
I thought it would be funny to go through that depression meta-analysis by showing progressively more extended plots to get an idea of how outrageously large some of the effects were. But instead I found a citation and coding error and I think there might be more.

The study's most extreme estimate is an SMD of -11.21647 from Wadden et al. (2014). This is enormous and unbelievable. This is like the effect of taking a punch to the face on having a hurt jaw.

What is this miracle treatment Wadden et al. used? According to Noetel et al., it was "Mixed aerobic exercises." So I go to search for Wadden et al. and, well, it doesn't exist. I have to go to the reference information file to see what it is:

Instead of Wadden et al., it is "The Look AHEAD Research Group." Or if you go by the list of people in that group, it's Rubin et al.: https://pubmed.ncbi.nlm.nih.gov/24855155/ Thomas Wadden is the corresponding author. How does the Diabetes Care journal say to cite it? Off to a good start.

The control group for the study is the "Intensive Lifestyle Intervention" group and the treatment group is the "Diabetes Support and Education" group: ILI vs. DSE. Here's how the intervention is described:
Interventions
Study participants were randomly assigned within each center to the ILI or to diabetes support and education (DSE), the comparison condition for this trial. As reported previously (14), the ILI was designed to induce ≥7% weight loss in the first year of the trial and to maintain the weight loss in subsequent years. ILI participants were prescribed a calorie goal of 1,200-1,800 kcal/day (based on initial weight), with <30% of total calories from fat (<10% from saturated fat) and ≥15% of total calories from protein. ILI participants were encouraged to engage in ≥175 min per week of moderate intensity exercise (e.g., brisk walking). The ILI incorporated behavioral strategies, including regular self-recording, to facilitate achievement of the above goals (13).
ILI participants were seen in a combination of individual and group sessions, weekly for the first 6 months and three times per month for the next 6 months. Sessions addressed behavioral adherence and ways to improve it. In all subsequent years, ILI participants were offered monthly in-person sessions. In addition, two to three times per year, they were invited to participate in a once-weekly 6-8 week group program designed to achieve a specific weight loss (e.g., 3-5 lbs.) or activity goal.
DSE participants were invited to three group sessions per year for the first 4 years. These sessions addressed general issues of diet, exercise, and social support using a standardized protocol. One group session per year was provided thereafter.

This is not just "Mixed aerobic exercises" and adherence to an exercise plan wasn't monitored. Looking good. Noetel et al. reported that the sample size was 480 and it dropped to 392 by the follow-up four years later. The effect was on Beck Depression Inventory (BDI) score.

You look through the study and you see... it's not clear where they got their sample sizes or their effect estimates. In the plots of effects on BDI scores, you see: A) HRs for BDI>=10 for people initially <10; B) mean BDI; C) mean BDI for baseline <10; D) mean BDI for baseline >=10.

Do you see 480 on that plot? Because I do. It's for the ILI group (the control) in plot D. Do you see 392 on that plot? Because I do. It's for the DSE group at year 3 (not 4) in plot A. The initial and final sample sizes? Misreported. But it gets worse. What's the effect?

Noetel et al. reported an SMD of 11.21647 for this. That's very precise. Which effect does it refer to? The significant INCREASE for DSE over ILI in A? (Wrong sign.) The significant INCREASE for DSE over ILI in C? (Wrong sign.) The nonsignificant differences in B or D?

None of these effects in years 3 or 4 makes any sense for us. In fact, it seems the effect wasn't based on a comparison with the active control group at all; it was based on the DSE group's change in plot D, from a mean of 14.04 to 9.32, with SDs of 0.42 initially and 1.24 after.

Cohen's d for that difference is... 5.099. That's not quite 11.21647. OK, maybe they miscoded the outcome name and didn't use the BDI; maybe they used the physical (PCS) and mental (MCS) quality of life scores, or usage of antidepressant medication.
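To sanity-check that figure: a pre-post Cohen's d from the means and SDs quoted above (using the equal-n pooled-SD formula; whether Noetel et al. pooled this way is an assumption) comes out around 5.1, nowhere near 11.21647.

```python
import math

# Means and SDs for the DSE group's change in plot D, as quoted above.
m_pre, m_post = 14.04, 9.32
sd_pre, sd_post = 0.42, 1.24

# Pre-post Cohen's d with an equal-n pooled SD: sqrt((s1^2 + s2^2) / 2).
sd_pooled = math.sqrt((sd_pre**2 + sd_post**2) / 2)
d = (m_pre - m_post) / sd_pooled
print(round(d, 3))  # 5.099
```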

It's not those, and the supplement also shows it's not antidepressant medication usage. Wrong sample sizes, wrong effects, bad description of the intervention, and if you look in the code, it's not clear how the effects were computed; they're just loaded in already messed up!

Now I've only looked at this one estimate so far, but this was the biggest effect and it was so large it should have thrown up alarm bells. And yet, it was still reported and no reviewer caught it. I don't believe the rest of the studies are coded up perfectly either. Do you?

There are tons of estimates in this meta-analysis, so it would be a lot of effort to check them all. I would really like to know how this and every other effect was computed, but it doesn't seem like the study provides any clear answers.

I hope I've missed something and someone can tell me what it was since, right now, this looks bad! Study, open access: https://ncbi.nlm.nih.gov/pmc/articles/PMC10870815/ Data, partial code: https://osf.io/nzw6u/ Responses: https://bmj.com/content/384/bmj-2023-075847/rr-1, https://bmj.com/content/384/bmj-2023-075847/rr-0

... there's more
It's simply not possible to replicate the authors' effect sizes with their provided data. Most effect sizes are not even within 1 Hedges' g of what they reported, and 1 g is a huge effect.
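For anyone replicating at home, the standard route to Hedges' g from two arms' summary statistics is Cohen's d times the small-sample correction J = 1 - 3/(4(n1+n2) - 9); the numbers below are hypothetical, not taken from the paper's data.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Hedges' g from two groups' summary stats: Cohen's d with the
    small-sample correction J = 1 - 3/(4*(n1+n2) - 9)."""
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * j

# Hypothetical example: exercise arm mean BDI 10 (SD 4, n=30)
# vs control arm mean BDI 13 (SD 5, n=30).
g = hedges_g(10, 4, 30, 13, 5, 30)
print(round(g, 3))
```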

I asked the authors what happened. They replied by uploading some new code. Their new code showed that they did not estimate the effects they verbally described estimating. In fact, they didn't estimate treatment effects; they estimated change scores.
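The distinction matters: a between-group SMD compares the exercise arm's change against the control arm's, while a change-score effect standardises an arm's improvement against itself and ignores the control entirely. A minimal sketch with hypothetical numbers (not from the paper):

```python
# Hypothetical toy numbers: treatment arm moves from mean 20 to 14,
# control arm from 20 to 18, both with between-person SD 5.
t_pre, t_post, c_pre, c_post, sd = 20.0, 14.0, 20.0, 18.0, 5.0

# Between-group treatment effect: difference in change ACROSS arms.
smd_treatment_effect = ((t_pre - t_post) - (c_pre - c_post)) / sd  # 0.8

# Within-group "change score": the treatment arm against itself,
# ignoring the control arm entirely -- not a treatment effect.
smd_change_score = (t_pre - t_post) / sd  # 1.2

print(smd_treatment_effect, smd_change_score)  # 0.8 1.2
```

Here the change-score number is 50% larger than the actual treatment effect, which is the direction of inflation you'd expect whenever the control arm also improves.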

Even if this wasn't the case, there is extremely low power and there's a good deal of publication bias.

The power was so poor that, among studies classified as having a low risk of selective reporting (doubtful given other mistakes), the mean effect was estimated at 0, with CIs from almost -1 to 1.
The meta-analysis assures us that the overall poor quality of studies is a non-issue, but there's this weird figure in the supplement indicating that studies with a low risk of selective reporting had a zero mean effect?? From the first figure, that's a big chunk of their studies.

Among the studies whose estimates were included in the meta-analysis, the authors had some citation errors (e.g., Wadden et al.), but they also completely left out some papers (e.g., Philips 2018).

Even if all the citations were fine and there appeared to be plenty of power and no publication bias—sadly, not the case—there are more issues. For example, many studies' interventions are improperly described:

And many studies' samples were dubiously relevant:
Given that the average depression score for the treatment group is just barely above their cutoff for "depressed" (.5), it seems plausible that a lot of non-depressed subjects were included - I have no idea what is going on here

Some studies even seemed to be 'too good to be true' and quite likely fake, or at least so unrealistic that they should not have been used.
studies that definitely happened - with no mention of dropouts or missing data! truly incredible similarity of post-test depressive symptoms for the treatment group, who all have remarkably similar mild-to-moderate depression (cf. same instrument in a sample of medical students)
How many studies listed as having an active control actually had one? Not many
importantly their results go into the toilet when Bayesian-compared to pooled "active controls" and most of their "active controls" are trash (by my count only five were placebo pill controlled)
Unfortunately, rubbish meta-analysis seems to be a theme for these authors. They have a previous meta-analysis on treating schizophrenia with exercise. Among their citations in that meta-analysis are studies that make zero mention of schizophrenia.
among other things the authors seem to have included a study on exercise in heart failure patients that doesn't even mention schizophrenia?? but it's one of the highest-quality studies lol

In fact, some of the studies they categorized as "high-quality" evidence of schizophrenia treatment effects make no mention of schizophrenia. Why were they cited? Can't say. But I can say that this group needs to have all of their work investigated.
 
Oooh, it has to be that study that only had a single certainty rating that isn't very low, and that one is still low...

It is!

[Attached image: Effect-of-exercise-for-depression-systematic-review.png]

Would you look at that, like a mirror of the ME studies, all of which are either highly biased, or biased, and vastly overhype pre-determined opinions that have no relation to the data.

The fact that this "study" is titled as if they are examining whether exercise is effective, while their objective clearly and overtly assumes that it is, says it all. I've been saying this for years, and I have not changed my mind: literally all of this body of research will eventually be entirely discredited, at its very best infamous for vastly overestimating and overhyping mediocre results out of a GIGO process. It will be studied as the peak era of pseudoscience and become medicine's "collapsed bridges" lesson, of the kind engineering fields all share, a growing phase that medicine sadly hasn't gone through yet.

The dance being such a positive outlier is so obviously in the same category as "we gave candies to this group and it just happened to be the group that gave the best rating".

In the end, what this entire body of evidence amounts to is that people enjoy things they enjoy, and that there are relatively small and mostly imperceptible benefits to exercise. Those benefits come at a cost; if that cost is lowered they grow, but most of the obstacles are socioeconomic, so the idea of medical prescribing is just the height of foolishness. If exercise were as good for health as it's been touted, the main difficulty would be in getting people to exercise less; most people would be borderline addicted to it.
 