Making sense of a systematic review - 2021 David Tovey

Discussion in 'Other health news and research' started by Sly Saint, Dec 16, 2024.

  1. Sly Saint

    Sly Saint Senior Member (Voting Rights)

    Messages:
    10,239
    Location:
    UK
    Abstract
    The article in last month’s issue explored the place of systematic reviews in informing health care decision making. This article describes the core components of a high-quality systematic review of health care interventions. These components include an assessment of the risk of bias for individual studies, evaluation of the overall certainty of the body of evidence for the main outcomes, meta-analysis and the interpretation of the data. An understanding of these differing elements is the key to recognising the potential and important limitations of any systematic review.

    paywalled

    Making sense of a systematic review - David Tovey, 2021
     
  2. InitialConditions

    InitialConditions Senior Member (Voting Rights)

    Messages:
    1,851
    Location:
    North-West England
    Does anyone have access to this?
     
  3. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,059
    Location:
    London, UK
    The beginning:
    Clinical case scenario
    You notice towards the end of your morning surgery that you are due to see Mr and Mrs Smith. You saw Mr Smith alone last week to consider his medication. He is 78 years old and had a small stroke 12 months ago. At the previous consultation you noted that he was not taking any treatment aimed at lowering blood pressure. At the time, you recalled noticing a recent Cochrane systematic review on the role of blood pressure reduction in patients who have had a stroke or transient ischaemic attack.
    You realise that you have 10 minutes to try to make sense of the article, even though it runs to over 50 pages. You find the GRADE Summary of Findings tables and by tracking along each of the rows you get a measure of the review’s findings.
    When Mr and Mrs Smith arrive, you bring up the Summary of Findings table on your screen and ask the couple what matters most to them. You explain that there is a range of differing outcomes, both benefits and harms. The couple exchange looks and smile at one another. Mrs Smith explains for both of them. ‘The stroke frightened both of us, so our first priority is not to go through that again. This is important to us, even if it means him putting up with some side effects.’ The review shows that there is moderate-certainty evidence of a reduction in the likelihood of stroke – nearly 20% in relative terms, but translating to around 2% in absolute terms. You explain this finding to the couple. You note that the agents most studied were angiotensin-converting enzyme (ACE) inhibitors and diuretics. On this basis, and having no contraindications, Mr Smith agrees to start an ACE inhibitor.
    The basics
    In last month’s article, a framework for assessing the validity and quality of systematic reviews was described (Tovey, 2020). This paper will focus on the appraisal of evidence included within the review. When considering the effects of any health care intervention, decision makers need some key information from systematic reviews. First, for any intervention they wish to know whether there are any differences in effect compared with the most plausible alternative (which may be an active treatment or placebo). Second, they need to understand the direction and magnitude of such differences. Lastly, evidence users require an assessment of the degree of certainty that effects found in the relevant studies will translate into the real world. In this article I consider the last question first, since the measure of our uncertainty is critical to everything that follows.
    Risk of bias
    Traditionally, randomised controlled trials (RCTs) have been seen as the top of the primary evidence pyramid. In most circumstances, this remains valid when one is considering the effectiveness or harm of a health care intervention. However, RCTs are not equal in terms of their design, conduct or reporting. For the last 15 years, assessments of RCT quality have been replaced by assessments of their risk of bias using specific and explicit criteria. The Cochrane risk of bias tool has become the accepted best practice method (Higgins, 2019).
    For each RCT included in a systematic review there should be an overall rating: one of ‘high’, ‘some concerns’ or ‘low’. Bias is assessed for the key outcomes of each trial across a number of fixed domains. These are:

    Bias arising from the randomisation process: For example, if the researchers did not manage the process to ensure allocation to the different interventions was strictly random and concealed from the participants and investigators

    Bias due to deviations from intended interventions: For example, if participants or researchers were likely to be aware of an individual’s allocation during the trial (failure of blinding)

    Bias due to missing outcome data: For example, if key outcome data were not available for all patients due to drop out or other factors

    Bias in measurement of the outcome: For example, if the measurement of the outcome might have been influenced by factors such as the researcher’s awareness of the allocation

    Bias in selection of the reported result: For example, if outcomes that had been identified as important in the protocol were not included in trial reports
    In each case ‘signalling questions’ are used to identify whether bias is likely to have occurred, and algorithms link the responses generated to a judgement. The overall assessment for the study is then the most pessimistic rating across the individual domains. Thus, if for any domain a high risk of bias is judged to be present, the overall rating is ‘high risk’. If across a number of domains there are ‘some concerns’, this may also lead to an overall rating of ‘high risk’. Judging the risk of bias is a challenging process, requiring a high level of methodological understanding and skill. However, it is essential to consider the validity of the results of each study included in the systematic review.
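    The ‘most pessimistic rating’ rule described above can be sketched in a few lines of Python. This is my own minimal illustration, not code from the article or the Cochrane tool; in particular, the text leaves unspecified how many ‘some concerns’ domains escalate the overall rating to ‘high risk’, so the threshold used here is an assumption:

```python
from enum import IntEnum

class Rating(IntEnum):
    LOW = 0
    SOME_CONCERNS = 1
    HIGH = 2

def overall_risk_of_bias(domain_ratings, concerns_threshold=3):
    """Overall study rating is the most pessimistic rating across domains.
    Several 'some concerns' judgements can also tip the overall rating to
    HIGH; the threshold of 3 is an illustrative assumption, since the text
    only says 'a number of domains'."""
    worst = max(domain_ratings)
    if worst == Rating.HIGH:
        return Rating.HIGH
    if sum(r == Rating.SOME_CONCERNS for r in domain_ratings) >= concerns_threshold:
        return Rating.HIGH
    return worst
```

    For example, a trial rated ‘low’ on four domains but ‘high’ on one is rated ‘high risk’ overall, and so is a trial with ‘some concerns’ across three domains.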
    Evaluation of the certainty of a body of evidence
    The degree of certainty, or overall quality, of a body of evidence in delivering valid and replicable results does not solely depend on risk of bias. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group began in 2000 with the task of describing the quality of a body of evidence in an intuitive, meaningful and transparent manner. The group is an informal collaboration that includes many of the world’s foremost evidence synthesis researchers. The GRADE approach now sits at the heart of most high-quality systematic reviews and their translation into guidelines. The most common output of the GRADE process is the ‘Summary of Findings’ table, which combines many of the key findings of a systematic review and provides a crucial link to the development of high-quality guidelines (Guyatt et al., 2011).
    There are a number of key elements of GRADE of which readers should be aware and that explicitly aim to make the output of a systematic review more applicable to decision making. These are summarised in Box 1.
    Box 1. Characteristics of GRADE.
    • Requires a priori identification of up to seven outcomes that are most important to decision makers. At least one outcome should represent potential harm
    • Quality (or certainty) is assessed for each outcome across studies
    • The assessment is based on fixed and transparent criteria. Evidence from RCTs may be downgraded, whereas evidence from non-randomised studies can be either up- or down-graded
    • Effect estimates are provided in relative and absolute terms
    • Certainty ratings are presented in intuitive and accessible form: high, moderate, low and very low. These can be translated into easily understandable language for interpretation by non-specialists
    For each comparison, systematic review authors are invited to identify at the outset and before considering any results, up to seven outcomes that are most important to decision makers. This seeks to avoid both the risk of preferential selection of outcomes on the basis of the greatest effect, and also long lists of potential outcomes that are not only confusing to the evidence user but also may not reflect the outcomes most important to patients and their carers. At least one outcome should represent a potential harmful effect of the intervention. This is critical, as many systematic reviews (and published trial reports) are rightly criticised for inadequately researching and reporting harmful effects.
    Quality (or more recently ‘certainty’) is judged per outcome across the relevant studies. This approach differs from traditional evidence grading systems, for which the studies, not outcomes are the unit of interest. The GRADE approach is therefore more intuitive and useful to decision makers, who are more interested in whether the intervention leads to specific outcomes than what was found by each individual study.
    The quality (certainty) of evidence is judged on the basis of fixed criteria, but it is possible either to downgrade evidence from randomised trials (which start out rated as ‘high’) or to downgrade or upgrade the certainty of evidence from non-randomised studies. Traditional models have invariably favoured RCT evidence, which may be misleading in some circumstances.
    The terminology used avoids inaccessible, ambiguous and misleading numbers or letters (e.g. grade 2 or A evidence) in favour of accessible terms (high, moderate, low, very low), where the rationale is presented to the reader transparently via footnotes. This is important since the judgements made, on whether to downgrade by one or two points, are partly subjective. The factors that can lead to a downgrading of evidence are intuitive and important. For each consideration it is possible to downgrade from a baseline of ‘high’ for randomised studies and ‘low’ for evidence from non-randomised studies. For example:

    Where the body of evidence is thought to be at a high risk of bias

    Where the body of evidence is ‘indirect’ and therefore may be potentially inapplicable to decision making. Such indirectness might be related to the population, mode of delivery or dosage of either intervention or comparison, or outcome measure (for example, indirect outcomes measures such as measures of bone density, as a proxy for fractures)

    Where the results of the studies are inconsistent (heterogeneity)

    Where the results of the evidence are imprecise (for example, the confidence intervals of the absolute effects are sufficiently wide that they are consistent with both an important effect and no important effect)

    Where there is evidence of reporting bias (publication or selective outcome reporting bias)
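    Taken together, this start-and-adjust logic can be sketched as follows. This is my own minimal illustration under the assumptions stated in the text – randomised evidence starts at ‘high’ and can only move down, non-randomised evidence starts at ‘low’ and can move in either direction – not code from GRADE itself:

```python
CERTAINTY_LEVELS = ["very low", "low", "moderate", "high"]

def grade_certainty(randomised, downgrades, upgrades=0):
    """Sketch of GRADE's start-and-adjust logic: randomised evidence
    starts at 'high' and can only be downgraded; non-randomised evidence
    starts at 'low' and may also be upgraded (e.g. for a large effect)."""
    start = 3 if randomised else 1
    level = start - downgrades + (0 if randomised else upgrades)
    return CERTAINTY_LEVELS[max(0, min(3, level))]
```

    So a body of RCT evidence downgraded once (say, for risk of bias) is rated ‘moderate’, while non-randomised evidence with no adjustments stays at ‘low’.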
    In addition to judging quality (certainty), the GRADE approach also steers researchers to present effect estimates in both relative and absolute terms. This is important to guide decision making, since it is known that presenting only relative effects leads to exaggerated (usually over-optimistic) perceptions of any effects (Gigerenzer, 2002). Presenting relative effects alone (e.g. Odds Ratios or Relative Risks) can make it impossible, or at least challenging, for the reader to calculate the likely absolute effect, expressed either in natural frequencies, fractions or ‘Numbers Needed to Treat (or Harm)’. Outcome measures are described in more detail later.
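    The arithmetic behind this can be shown in a short worked sketch. The 10% baseline risk below is an assumed figure chosen for illustration, not a number from the review:

```python
def absolute_effect(baseline_risk, relative_risk):
    """Convert a relative effect estimate into absolute terms."""
    treated_risk = baseline_risk * relative_risk
    arr = baseline_risk - treated_risk            # absolute risk reduction
    nnt = 1 / arr if arr > 0 else float("inf")    # number needed to treat
    return arr, nnt

# An assumed baseline stroke risk of 10% with a relative risk of 0.80
# (roughly the 'nearly 20% relative reduction' in the case scenario):
arr, nnt = absolute_effect(0.10, 0.80)
# arr is about 0.02 (2 percentage points); nnt is about 50
```

    A 20% relative reduction sounds large, but at this baseline it amounts to 2 percentage points in absolute terms – roughly 50 patients treated for each stroke avoided – which is why GRADE asks for both presentations.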
    Figure 1 describes the Summary of Findings table relating to the clinical case scenario (Zonneveld et al., 2018). This example includes seven outcomes of interest. Unusually, these all represent potential benefits of the interventions studied. This is probably due to the different harmful effects of the drugs studied. Evidence is downgraded for risk of bias in the underlying studies for two outcomes, and both relative and absolute effects are described.
     
  4. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,059
    Location:
    London, UK
    It reads to me as pretty received dogma stuff about GRADE and RoB.
    I have sent full text to IC.
     
  5. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    14,568
    Location:
    Canada
    Easy to see why this industry is such a total failure. This is a laughable process. It starts by placing RCTs at the top of the pyramid, but in every other profession evidence as low quality as RCTs is generally not even looked at; it's far too biased and arbitrary. No one in my profession would even look at anything this flimsy, and there are industries with even higher standards.

    So the process of so-called evidence-based medicine and systematic reviews is pretty much "leave the thinking to us". In the sense that the thinking is a quality control process. Except it doesn't do that at all, it controls nothing and the rules are completely arbitrary, with more exemptions than anything else. And the overall quality of clinical trials is so abysmal that you can't polish those turds into anything other than polished turds.

    It's very telling that in the examples he gives about factors that should downgrade studies, the entire evidence base for pretty much everything biopsychosocial/psychobehavioral not only rates highest, but it technically rates twice, since he puts in there a condition for downgrading being that the entire body of evidence is at high risk of bias. Which it sure does, for all the other examples listed, and then some.

    But this type of evidence is instead held up as superior, unquestionable. It can remain so for literal decades while nothing is done to improve the quality of the "manufacturing", the trials themselves, and instead the quality control process, so-called evidence-based medicine, keeps lowering the bar and ignoring their own principles.

    Tovey took years to act on the awful Cochrane GET review, and it checks every single example for downgrading to a high degree, and several more. And who knows if they would have gone ahead with it had he stayed in his position, but I assume they wouldn't have.

    This can really only be described as an odd turd-polishing industry that isn't concerned with processing turds any more than it is concerned with their polishing process.

    And somehow there's nothing we can do about this. This madness is held above the scientific method, precisely because it's so arbitrary that after years of propping this up as the response to bad biased studies, when it effectively accomplished the exact opposite, it's too embarrassing to admit. It turned the bad studies into an actual too-big-to-fail industry in its own right, one that produces turds not because it meets any need, but instead of doing that. The best turd producers and the best turd polishers working side-by-side producing nothing but fine grain polished turds.

    This is very likely the absolute worst industrial process ever used by professionals, and by a wide margin. And this is exactly why it's untouchable. It's like an entire industry had been captured by corruption, bribes, embezzlement, and so on, where the only way to be promoted is to be the best at being corrupt. There's no fixing this. Ever. Only replacing it entirely, and that's not about to happen any time soon.

    Meanwhile our lives literally depend on this being fixed and there's no chance of this happening while the biggest opportunity to solve this problem in history is being wasted, entirely because of how thoroughly corrupted the system has become.

    And of course with science being attacked illegitimately from all sides, even though this process has nothing to do with science, criticizing it rightfully puts us in the same category as trolls and conspiracy theorists. That's just great. I love everything about this system and how it works out for the greater good. /s
     
