Review: Interventions for the management of long covid (post-covid condition): living systematic review, 2024, Zeraatkar, Flottorp, Garner, Busse+

One focuses on the null, and the other on the MID. It seems like they are supposed to pick one.
It seems like there are three possible options:
1. Point estimate over MID, compare CI to null
- evidence of any effect
2. Point estimate over MID, compare CI to MID
- evidence of important effect
3. Point estimate below MID, compare CI to MID
- evidence against an important effect?
And I think the blurb is about choosing between 1 and 2? Did they do option 2 for any outcomes as well?
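For concreteness, the three options above can be sketched as a small decision function. This is a hypothetical illustration only: the function name, the labels, and the convention that benefit is a positive difference are my assumptions, not anything from the review.

```python
# Hypothetical sketch of the three options discussed above.
# Assumes benefit is a positive difference and the MID is on the same scale.

def rate_effect(point: float, ci_low: float, ci_high: float, mid: float) -> str:
    """Classify an estimate using the three approaches discussed above."""
    if point >= mid:
        # Option 2: compare the CI to the MID (important effect)
        if ci_low > mid:
            return "evidence of an important effect (option 2)"
        # Option 1: compare the CI to the null (any effect)
        if ci_low > 0:
            return "evidence of some effect, possibly unimportant (option 1)"
        return "imprecise: CI crosses the null"
    # Option 3: point estimate below MID, compare the CI to the MID
    if ci_high < mid:
        return "evidence against an important effect (option 3)"
    return "imprecise: CI crosses the MID"

print(rate_effect(point=3.0, ci_low=1.0, ci_high=5.0, mid=2.0))
```

The inconsistency discussed in this thread is that the review describes option 2 but, for some estimates, applies the weaker option 1 test.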
 
Do you mean no important effect? We see that there's likely a nonzero effect since the CI doesn't cross null
Yes. GRADE calls it 'little or no effect'. It's a bit beside the point but is described in this paper:
Rating certainty when the target threshold is the null and the point estimate is close to the null | BMJ Evidence-Based Medicine

If the point estimate is lower than the MID, GRADE finds it misleading to rate the certainty of a nonzero effect. Instead, it recommends rating the certainty of little or no effect, which comes down to checking whether the CI crosses the MID.
 
It seems like there are three possible options:
1. Point estimate over MID, compare CI to null
- evidence of any effect
2. Point estimate over MID, compare CI to MID
- evidence of important effect
3. Point estimate below MID, compare CI to MID
- evidence against an important effect?
Approach 3 is what they've done with the Vortioxetine trial, and it is possible to combine this with approach 1 for point estimates that are higher than the MID. So that would be ok.

The thing that doesn't fit is that they themselves claim: "When the point estimate exceeded the MID, we rated certainty of an important effect". This isn't true. It would mean following approach 2 while they used approach 1.
 
If the point estimate is lower than the MID, GRADE finds it misleading to rate the certainty of a nonzero effect. Instead, it recommends rating the certainty of little or no effect, which comes down to checking whether the CI crosses the MID.
But how do you combine that when you’re rating multiple studies in a review?

If you have «low certainty of an important effect» in study X, and «medium certainty of little or no effect» in study Y, what’s the conclusion?
 
But how do you combine that when you’re rating multiple studies in a review?

If you have «low certainty of an important effect» in study X, and «medium certainty of little or no effect» in study Y, what’s the conclusion?
The conclusion would be: 'We are uncertain that treatment X improves symptoms' and 'Treatment Y probably has little or no effect on symptoms.'

And why would you ever not rate in relation to the MID if it’s available? How does it matter if the effect is lower than 0 or not if it’s still lower than the MID? The intervention does not have a clinical effect regardless.
Agree. GRADE allows it (probably because it was the standard in the past) but it makes little or no sense.

As a reviewer you want to determine whether there is a clinically significant effect or not. I do not see why it would be useful to determine whether there is an effect that may or may not be clinically significant; that doesn't answer any question.

Most reviews recommend treatments and claim that they reduce symptoms, so in all these cases they mean an effect that patients will notice. It doesn't make sense to make such statements for effects that are not clinically significant.

I think reviewers misuse this to avoid downgrading evidence for treatments that they fancy. It's one of the many issues with the Cochrane review on GET.
 
If @forestglip ‘s observation earlier is correct, they are applying two different criteria. One focuses on the null, and the other on the MID. It seems like they are supposed to pick one.
I don't see an issue with the Vortioxetine rating, but other estimates (from non-behavioral interventions such as hyperbaric oxygen or transcranial stimulation) were downgraded by two levels for imprecision.

There seems to be an inconsistency where non-behavioral interventions were downgraded by two levels, while similar estimates from behavioral interventions were not downgraded at all.

Here's an overview from the blog I wrote about it:
The BMJ review on Long Covid interventions - ME/CFS Skeptic

As justification for downgrading by two levels, the paper states: 'Likely much too few participants to achieve prognostic balance.' This terminology of 'prognostic balance' is used in older GRADE statements on imprecision, such as this one:
GRADE guidelines 6. Rating the quality of evidence--imprecision - PubMed

The reasoning behind it is the same as for the optimal information size (OIS): sample sizes need to be large enough to have confidence in the estimate. Even if the CIs look good, one may downgrade for imprecision if the sample size is too small.
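As a rough illustration of the OIS idea, the conventional two-group sample-size formula gives the number of participants needed per group to detect a difference of one MID. The numbers below are assumed for illustration, not taken from the review.

```python
from statistics import NormalDist

def ois_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate optimal information size per group for a continuous outcome:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * (sd / delta)^2."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    return 2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sd / delta) ** 2

# e.g. detecting a difference of one MID of 5 points, with SD 10:
print(round(ois_per_group(delta=5, sd=10)))  # about 63 per group
```

If a trial's arms fall well below this kind of threshold, GRADE allows downgrading for imprecision even when the CI itself appears narrow.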

I think it's ok to downgrade the estimates for hyperbaric oxygen or transcranial stimulation in this way, but then the same approach should have been applied to the CBT estimates. The difference in sample size (70 versus 114) is not large enough to warrant downgrading the first by two levels and the other not at all.
 
Aren’t the treatments on the same intervention, though? Or have I misunderstood something?
Not sure what you mean, but if there are multiple trials on the same intervention/outcome, they are pooled using meta-analysis: the trials are combined into one estimate, and the GRADE approach is then applied only to that pooled estimate.

If these are estimates of the same intervention but for different symptoms then the conclusion would be:
'We are uncertain that treatment A improves symptoms X' and 'Treatment A probably has little or no effect on symptoms Y.'
 