I've mostly focused on the issue of blinding. On this aspect, the new tool is less bad than one would think after first reading the paper. I'll try to explain below:
The old version
Suppose that a trial did not blind patients or therapists and that it used subjective outcomes. In the old tool, a reviewer has to answer the question whether "the outcome is likely to be influenced by lack of blinding". So in the case of subjective outcomes, the answer would be yes and the trial would be rated as high risk of bias. If the trial also didn't blind outcome assessors and it is likely that the outcome is influenced by lack of blinding, the trial would also be rated as high risk of bias for this domain.
So the trial would be rated as high risk of bias for at least 2 out of 7 domains. It can, however, get good scores (low risk of bias) for the other domains. And there is no rule for how these risk of bias domains should be added up to assess a trial. The Cochrane handbook writes:
Cochrane Handbook said:
any assessment of the overall risk of bias involves consideration of the relative importance of different domains. A review author will have to make judgements about which domains are most important in the current review. For example, for highly subjective outcomes such as pain, authors may decide that blinding of participants is critical.
Unfortunately, Larun et al. didn't do that. In the GET review, they wrote that "risk of bias across studies was relatively low." They briefly mentioned that not blinding patients and therapists while using subjective outcomes might cause bias, but then argued that "many patient charities are opposed to exercise therapy for chronic fatigue syndrome (CFS), and this may in contrast reduce the effect." In other words, the treatment must be really working because so many patients oppose it!
The new tool
The new tool is different as it specifies how the scores for the trials should be added up to the overall risk of bias, which is probably one of the reasons why it is more complex. Sterne et al. write:
Sterne et al. 2019 said:
The overall risk of bias generally corresponds to the worst risk of bias in any of the domains.
So if a study scores high risk of bias in one of the domains, it should be rated as high risk of bias overall, or more precisely, high risk of bias for the particular outcome assessed, because RoB 2 encourages splitting up the assessment per outcome or result. The previous version just noted that one could split up the results into objective and subjective outcomes if that is thought to be helpful.
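To make the "worst domain wins" rule concrete, here is a minimal Python sketch. It assumes the three judgement levels used in RoB 2 (low, some concerns, high); the function name and the domain labels are my own, made up for illustration. Note that Sterne et al. say the overall rating "generally" corresponds to the worst domain, so this is a simplification rather than the full rule.

```python
# Ordering of the judgement levels from best to worst.
# Assumption: three levels, as used in RoB 2.
SEVERITY = {"low": 0, "some concerns": 1, "high": 2}


def overall_risk_of_bias(domain_judgements):
    """Return the worst judgement found across all domains.

    `domain_judgements` maps a domain label (hypothetical names here)
    to one of the keys of SEVERITY.
    """
    return max(domain_judgements.values(), key=lambda level: SEVERITY[level])


# Hypothetical unblinded trial with a subjective outcome:
# good scores everywhere except outcome measurement.
trial = {
    "domain 1": "low",
    "domain 2": "low",
    "domain 3": "low",
    "domain 4": "high",
    "domain 5": "low",
}
overall = overall_risk_of_bias(trial)
```

With this rule, four good domains cannot compensate for one bad one, which is exactly the difference from the old tool, where no combination rule was given.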
So back to our hypothetical unblinded trial: how would it be rated in the new risk of bias tool?
In domain 2 they ask whether patients and therapists were aware of the assigned intervention. But that domain only assesses one part of the bias caused by a lack of blinding and, in my view, it's the least important part. It deals with changes to the interventions received because people in the trial were aware of the assignments. So for example, if patients know they are in the control group they might follow other treatments during the trial (co-intervention), or therapists might treat patients differently if they know they are in the intervention group. This can only lead to a high risk of bias if many conditions are met. So let's skip this domain.
The bias due to lack of blinding that we are interested in is assessed in domain 4: "Risk of bias in measurement of the outcome". Skip the first two questions; it really starts at 4.3, where they ask whether outcome assessors were aware of the intervention received. The thing is that "for participant-reported outcomes, the outcome assessor is the study participant." So in our hypothetical trial the outcome assessors were not blinded. The next question asks "Could assessment of the outcome have been influenced by knowledge of intervention received?" This is another way of asking whether the outcome was subjective (such as pain/fatigue questionnaires) or objective (such as all-cause mortality). The elaboration reads:
Supplementary material Sterne et al. 2019 said:
Knowledge of the assigned intervention could influence participant-reported outcomes (such as level of pain), observer-reported outcomes involving some judgement, and intervention provider decision outcomes. They are unlikely to influence observer-reported outcomes that do not involve judgement, for example all-cause mortality
That seems to be the case for our hypothetical trial. So far so good.
It's mostly the next question that annoys me. After having already asked whether it is possible that the outcome was influenced by knowledge of the intervention received, it now asks whether this influence was likely or not. It doesn't give an example where it wasn't likely, and I can't really think of a scenario where this is the case. As Jonathan Edwards' letter points out, the examples of outcomes likely being influenced by unblinding are rather extreme, like a physiotherapist who assessed the intervention he himself delivered or a homoeopathy trial with patient-reported symptoms. What I really don't like in their elaboration is the following sentence:
Supplementary material Sterne et al. 2019 said:
When there are strong levels of belief in either beneficial or harmful effects of the intervention, it is more likely that the outcome was influenced by knowledge of the intervention received.
Both the PACE authors and Larun et al. have argued that bias due to lack of blinding is not really a big concern because patients don't like GET. In the PACE-trial expectations for GET were no higher than for APT. I suspect researchers might use such measurements of expectations before the trial to argue that bias due to lack of blinding is not really a big concern.
In my view, question 4.5 in the new RoB tool is redundant and at risk of being misused, so it would be better to delete it.
But back to our imagined trial: an honest reviewer would say that it is indeed likely that "assessment of the outcome was influenced by knowledge of intervention received", so answering Yes to question 4.5. And yes to 4.5 means high risk of bias for this domain (it doesn't really matter anymore what the answers to 4.1 and 4.2 were). High risk of bias in one domain means that the overall risk of bias for that outcome of the trial should be rated as high risk of bias as well.
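The path through domain 4 that I followed above can be sketched in Python as well. This is only the route described in this post (questions 4.3, 4.4 and 4.5), not the full RoB 2 algorithm: it ignores 4.1 and 4.2, reduces each answer to a plain yes/no, and only says whether this particular route ends in high risk of bias. The function and parameter names are hypothetical.

```python
def high_risk_via_blinding_path(assessors_aware, could_influence, likely_influenced):
    """Simplified sketch of the blinding route through domain 4.

    assessors_aware   -- 4.3: were outcome assessors aware of the intervention?
                         (for participant-reported outcomes, the assessor
                         is the participant)
    could_influence   -- 4.4: could knowledge of the intervention have
                         influenced the assessment?
    likely_influenced -- 4.5: is it likely that it did?

    Returns True only if all three answers are yes. A False result does
    NOT mean low risk; it only means this route does not give "high".
    """
    if not assessors_aware:
        return False
    if not could_influence:
        return False
    return likely_influenced


# Hypothetical trial: unblinded patients filling in fatigue questionnaires.
honest_reviewer = high_risk_via_blinding_path(True, True, True)

# Same trial, but a reviewer who argues at 4.5 that influence was unlikely.
lenient_reviewer = high_risk_via_blinding_path(True, True, False)
```

The second call shows why 4.5 is the weak spot: the trial is identical, and a single judgement call at the last question is enough to keep it out of the high risk category on this route.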
So overall, it seems that in RoB 2 it's more difficult to rate a trial as high risk of bias for each domain. But if it is rated as high risk of bias for any of the domains, that should be the overall rating as well. Whether that's an improvement or a step backwards will probably depend on how seriously reviewers take the overall rating of bias. If they are allowed to ignore it and just say 'some domains were good, some were bad', then the bar will be lowered: the traffic light representation of bias will show more green lights. If, however, the overall risk of bias is taken seriously and required to be included in the conclusions of the review paper, then this could be a step forward, because this method is able to represent the fact that one major shortcoming in a trial is sufficient to see it as flawed (high risk of bias), no matter how good the other parts are.