Had a brief look at the Google Translated version of the document.
They seem to have taken a rather stringent view on case definitions: only the more recent case definitions that require PEM (CCC, ICC, IOM) are believed to capture the correct patient population. In their view, older case definitions such as the Fukuda criteria identify a more heterogeneous group of patients, which can lead to misdiagnosis or overdiagnosis. At the same time, the report highlights the vagueness in the definition and operationalization of PEM and says that this should be a focus of further research.
The section on case definitions focuses on the sole study (Nacul et al. 2011) that provided a prevalence estimate for the CCC, namely 0.1% in the UK. This was likely an underestimate because it was based on diagnoses that GPs had already made (and we have good reasons to believe GPs are underdiagnosing ME/CFS patients). The report extrapolates the 0.1% figure to Germany, resulting in approximately 70,000 adult patients. For children, there is a US estimate by Jason et al. which found a much higher prevalence of 0.75%, but the report says it could not extrapolate this to Germany because this prevalence study did not provide information on age standardization.
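As a back-of-the-envelope check of that extrapolation (my own sketch — the ~70 million adult population figure for Germany is my assumption, not a number taken from the report):

```python
# Rough sanity check of the report's extrapolation (my own sketch).
# Assumption: roughly 70 million adults in Germany (not stated in the report).
german_adults = 70_000_000
ccc_prevalence = 0.001  # 0.1%, from Nacul et al. 2011 (UK)

estimated_patients = german_adults * ccc_prevalence
print(f"{estimated_patients:,.0f}")  # prints "70,000"
```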
When evaluating treatments, they first looked at existing reviews. They found 8 but only the NICE guideline was deemed sufficiently up-to-date and of high quality, so they used it as a starting point. The English translation states on page 30:
“All data from NICE 2021 was extracted as part of the evidence mapping. The data were not cross-checked with or supplemented from the primary publications and no own calculations of results were performed.”
So they are heavily reliant on the review by NICE, but then they did something rather strange. The literature search by NICE found 85 randomized controlled trials (RCTs) testing all sorts of treatments for ME/CFS. In the NICE guidance, all of these trials were reviewed and assessed. Trials in which fewer than 95% of patients had PEM were still included but downgraded by one level under the GRADE system because of indirectness. This meant that if the trials were high quality, they could still provide important evidence, but that evidence was simply deemed a bit more uncertain because the patient population no longer fitted what we now define as ME/CFS.
IQWiG took a more radical approach: they disregarded 77 of the 85 RCTs because participants were not required to have PEM or because the trial did not provide enough information on this. At the same time, they also tweaked the criterion that NICE used: instead of 95%, they used a threshold of 80%. Trials in which 80% or more of participants were reported to have PEM were included. This meant that the PACE trial was included even though it selected patients using the Oxford criteria. The Dutch trial of CBT by Janse et al. from 2018 was also included even though it selected patients using the Fukuda criteria.
When evaluating the evidence on GET, only 2 trials were considered: the PACE trial and the GETSET trial. Similarly, when evaluating the evidence on CBT only 2 trials were considered: the PACE trial and the trial by Janse et al. In other words, it’s pretty clear that this report heavily relies on data from the PACE trial.
When it comes to evaluating the quality of evidence, IQWiG seems to be in line with NICE. They say that the risk of bias was high for all 3 trials across all outcomes. The English translation explains (page 92):
“The reason for this was the lack of blinding of the patients and the treating persons. Even if it is not possible to blind these people to the interventions examined, the open study design endangers the equality of treatment (performance including co-intervention bias in the case of concomitant treatments that can be influenced). In addition, in the absence of blinding, patients' specific expectations of the test intervention can be encouraged and thus the evaluation of subjective or subjectively collected endpoints can be statistically significantly influenced (detection bias)”
They even say that they didn't evaluate the quality of evidence for the different outcome measures because all 3 trials were already considered high risk of bias. If I understand correctly, all estimates were considered the lowest quality of evidence, just like in the NICE review.
Now the estimates themselves. For GET, they pooled the PACE and GETSET data and found an effect: a standardized mean difference (SMD) of 0.37, with a 95% confidence interval of [0.19, 0.53]. If I understand correctly, the report itself states that this effect was not big or clear enough to be considered relevant. They used the following rule: an effect is considered meaningful only if the confidence interval for the SMD lies completely outside the irrelevance range, which they defined as [-0.2, 0.2]. And because 0.19 < 0.2, this was not the case for the GET effect. There was also data on GET at longer follow-up times, but here the control group in the GETSET trial actually performed better, so the report says no conclusion can be made because the follow-up data were too heterogeneous.
The data they used for the estimate above came from a comparison of GET against usual care. So there really was no credible control group: in both trials, patients who received GET also received usual care on top of it. Something similar was true for the evaluation of CBT: in the trial by Janse et al., patients in the “control group” were simply put on a waiting list to receive CBT as well. So it was patients who got CBT versus patients waiting to get CBT. The effect size they found by pooling this trial with the PACE trial data on CBT was an SMD of 0.39, with a 95% confidence interval of [0.21, 0.57]. This was considered a relevant effect because the lower bound (0.21) was above the irrelevance threshold (0.2).
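The relevance rule described above is simple enough to sketch in a few lines (my own illustration, assuming I have read the rule correctly; the function name is made up, the numbers are the ones reported):

```python
def is_relevant(ci_low, ci_high, irrelevance=(-0.2, 0.2)):
    """An effect counts as relevant only if its confidence interval
    lies entirely outside the irrelevance range."""
    lo, hi = irrelevance
    return ci_high < lo or ci_low > hi

# GET (PACE + GETSET): SMD 0.37, 95% CI [0.19, 0.53]
print(is_relevant(0.19, 0.53))  # False: the lower bound 0.19 falls inside the range

# CBT (PACE + Janse et al.): SMD 0.39, 95% CI [0.21, 0.57]
print(is_relevant(0.21, 0.57))  # True: the whole interval lies above 0.2
```

Note how sensitive this is: a lower bound of 0.19 fails the rule while 0.21 passes it, even though the two pooled effect sizes are nearly identical.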
Despite all of this, the report seems to suggest that there is evidence for GET and CBT. On page 154, for example, it states:
“the available evidence is interpreted to the effect that GET can be beneficial at least for milder ME/CFS degrees of severity.”
Surveys on patient harms are disregarded because these were not considered a reliable enough source of evidence, and the RCTs did not show adverse effects of GET. The report states:
“surveys (cross-sectional studies, retrospective before-after studies) are methodologically unsuitable for deriving reliable statements on the benefit or harm of a treatment.”
The report also seems to suggest that reports of harm are due to improper delivery of GET. Again, on page 154 it states:
“Patients should be warned before an undifferentiated use of a GET that this is only carried out by medical or physiotherapeutic specialists who have sufficient knowledge and experience with the clinical picture ME/CFS.”
When it comes to other treatments, the report indicates no reliable conclusion can be made, which is similar to the conclusion by NICE and other reviews.
The report recommends that the care for patients with ME/CFS should be strengthened starting with factual information for the general public and better teaching content for healthcare workers. It states that
“The information on ME/CFS will be published as a ‘topic’ on www.gesundheitsinformation.de after the project is completed and will be updated regularly.” The report also recommends biomedical research and says that studies were set up in 2022, including a Germany-wide ME/CFS registry coordinated by centres in Munich and Berlin.