Bias due to a lack of blinding: a discussion

So I was talking about the likelihood of a lack of blinding being associated with exaggerated effect estimates in any of the different scenarios they were looking at. Is it right that none of those analyses found a significant result?
Yes, that's how I understood it (see figure 2 in the paper).

The analysis of blinding of patients in trials with patient-reported outcomes came closest to finding an effect, with a ROR (95% confidence interval) of 0.91 (0.61 to 1.34). The confidence interval is quite wide, so it's possible that future replication attempts will find a significant effect and that this selection of meta-analyses happened to be one that showed no effect by coincidence. All in all, the number of meta-analyses for each analysis wasn't that big - for example, 18 for blinding of patients in trials with patient-reported outcomes. It could be that two or three of these were misleading (the unblinded trials found low effects for other reasons) and messed up the analysis.
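
For anyone not familiar with the statistic, here's a minimal sketch (my own illustration with made-up numbers, not the paper's code or data) of how a ratio of odds ratios is typically formed in a meta-epidemiological analysis like this one: the trials in each meta-analysis are split by blinding status, an effect is pooled in each subgroup, and the two pooled odds ratios are divided; the per-meta-analysis RORs are then averaged.

```python
# Hypothetical numbers for a single meta-analysis (assumed, for illustration only).
or_blinded = 0.70      # pooled odds ratio in trials rated as patient-blinded
or_nonblinded = 0.64   # pooled odds ratio in trials rated as not blinded

# Ratio of odds ratios for this meta-analysis.
ror = or_nonblinded / or_blinded
print(f"ROR = {ror:.2f}")  # ROR = 0.91

# With odds ratios below 1 indicating benefit, a ROR below 1 under this
# convention would mean the non-blinded trials showed a more favourable
# (possibly exaggerated) effect than the blinded ones.
```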

The supplementary material gives an overview of the individual meta-analyses.

[attached screenshot of the supplementary table listing the meta-analyses]

Would like to have a look at a few examples of blinded and not-blinded comparisons to see what's happening here, although that will be a lot of work (there were 132 trials contributing to this comparison). Below is what the 18 reviews are about (found by googling the codes):

[attached screenshot listing the 18 Cochrane reviews]
 
I'm not sure how it would be possible to adequately blind a lot of these trials anyway (in terms of the patient knowing whether they received the intervention or not). The potential heterogeneity is huge. Grouping them by outcome seems to miss the point.
 
Would like to have a look at a few examples of blinded and not-blinded comparisons to see what's happening here, although that will be a lot of work (there were 132 trials contributing to this comparison). Below is what the 18 reviews are about (found by googling the codes):

[attached screenshot listing the 18 Cochrane reviews]

Thanks. That's really useful.

I just looked at the first participant blinded study in the MS review: https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD009131.pub3/epdf/full

That was Brissart 2012:

https://sci-hub.tw/https://www.tandfonline.com/doi/abs/10.1080/13554794.2012.701644

It's not clear to me why it was rated as the participants being blinded:

Patients were randomised into one of these two groups:

Cognitive program: ProCog-SEP (Appendix A). The ProCog-SEP program (Brissart & Leroy, 2011) is composed of 13 sessions of exercises based on cognitive impairment reported in MS literature. In order to optimize metacognition, each cognitive function was explained at the beginning of each session, in order to link exercises with everyday life. Thirteen bi-monthly sessions were proposed during 6 months. These sessions consisted on exercises about semantic memory and lexical access (sessions 1, 2, and 8), visual episodic memory (session 3), verbal episodic memory (session 4), working memory (session 5), associative memory (session 6) and executive functions (session 7). The last 5 sessions were composed of “multifunction” exercises. In this program, before each session, we explained the main cognitive topic. For example, about working memory, we explained that this system takes place in most daily activities...with examples. We discuss about problems that patients could experience, as when you go into another room, you can forget why you came there. We suggest strategies to implement. And we do the exercises on this topic.

Discussion program (Appendix B). The Discussion program is composed of 13 sessions of neutral discussions and various non-cognitive exercises. For example, sessions 6 to 8 were about culinary specialties, in order to encourage patients talking about gastronomy, exchanging recipes. Thirteen bi-monthly sessions were proposed during 6 months. Patients are not informed of the program in which they are included.

This is a simple blind study: the neuropsychologist, who assessed patients before and after program, is not informed if the patient has completed cognitive intervention or control group.

Limit of this study is of course the small effective, but it is very difficult to develop cognitive intervention in young adult people. Most of patients stopped their participation when they are in control group; or go in sessions but not regularly. Another difficulty is to develop a double blind study in cognitive rehabilitation, because patients can understand easily in which group they are assigned.

To me (after a five-minute skim) it looks like the Cochrane reviewers were wrong to class participants as blinded in that trial, but that's not to say that the authors of this new BMJ blinding review also classed these participants as blinded: they said that they had done their own checking, contacting authors when things were not clear. That's made me realise that it's going to be even more difficult to read these results in the context of the original trials, as we only have info on how the individual trials were assessed by the Cochrane review authors, not by the new BMJ paper's authors.
 
I think Fiona Godlee's editorial illustrates the hopelessness of the current academic situation - where 'informed opinion' is created by people who lack any common sense understanding of the material. This seems to be the world we are now in - presumably because motivations have changed. The old motivation for being rigorous in biomedical science has presumably disappeared under the forces of business interest.

If the stalwarts here can make enough sense of this study to find obvious holes, I welcome that. I am too old and tired, I am afraid.

And we should always remember E12's original comment to me that, for therapist-delivered treatments for ME, the problem of lack of blinding is as acute as it could possibly be, because the treatment is specifically designed to induce subjective bias. At least that is written in black and white in what will be my published statement to NICE (in addition, of course, to being out there in published studies such as Wilshire et al and Geraghty).
 
The more I read this, the more disturbed I get. I'm now looking at the attached commentary, "Fool's gold? Why blinded trials are not always best", yet the majority of the examples they use to argue that blinding is bad are instances where blinding *hasn't worked* (for whatever reason). That's not the same thing. Sometimes it is really hard to find an adequate placebo that does the job properly - but that's a separate issue. In those instances, it is important to take other steps to try to minimise bias - like not relying on subjective outcomes, for example. *ahum*

I had a similar (frustrating) experience while teaching med students about RCTs when I was at the Royal Free. My teaching assistant kept undermining the session by saying that actually RCTs were not good trials at all, because the trialists could cheat the randomisation process and pick which patients got the intervention (apparently this is what the psychiatry professor she worked for was doing all the time!!!). That's not the fault of the methodology though - that's the fault of the trialist!

I'm quite amazed that so far there are no rapid responses to the contrary. What is going on???!
 
I agree @Lucibee. One good example of something you can’t really blind would be IV fluids. Some people swear it helps them, while others are simply sipping from their cup. You can’t blind IV therapy, because you know if it’s going in, and sometimes you can tell whether it’s fast or slow.
 
I think it may be very relevant that the 18 Cochrane reviews mentioned above are pretty much all looking at rather obscure interventions that are pretty unlikely to have any great value in the long term. They are 'fiddling around the edges' interventions - not because the conditions being treated are unimportant, but because the treatments are a priori pretty unlikely to matter that much.

Why are there no studies of things that work simply and obviously? Maybe because meta-analyses by their nature deal with things that don't work very well, so that effectiveness is hard to prove in a single trial.

This is all about making it easier to do worthless trials, basically.
 
I agree @Lucibee. One good example of something you can’t really blind would be IV fluids. Some people swear it helps them, while others are simply sipping from their cup. You can’t blind IV therapy, because you know if it’s going in, and sometimes you can tell whether it’s fast or slow.
You cannot blind patients to the fact that they are receiving IV fluids, but you can blind them to whether they are being given any active medication or just placebo, the same as with any other medication.
 
I think that everyone should be cautious about the way this paper is assessed until it has been really dug into. There are fair reasons for concern that it could lead to problems being overlooked, and Sterne's involvement is going to lead to scepticism, but it's worth trying to avoid getting ahead of ourselves.
 
I think this paper is based on a flawed premise: That if, on average, blinded trials show broadly similar treatment effectiveness to that shown by unblinded trials, then blinding is potentially superfluous and unnecessary.

To me this is akin to saying that if, on average, aircraft with rotor blades show broadly similar flying capability to aircraft without rotor blades, then rotor blades on aircraft are potentially unnecessary.

In the same way that some aircraft, helicopters, are designed with rotor blades for very good reason, so also some trials are designed with blinding for very good reason. And if we assume that a reasonable majority of unblinded trials are well designed by competent researchers, then why should those trials not show similar treatment effectiveness to blinded trials also designed by competent researchers?

If you want to assess whether blinding is necessary or not, then you won’t prove anything by comparing trials properly designed to be blinded against trials properly designed to be unblinded. You would need to compare trials designed to be blinded, against those same trials being run without blinding.

I'm not sure this paper proves anything with regards to badly designed unblinded trials; trials that should have been blinded, or used objective outcomes, but did not.
 
Some quick notes on the second trial where the MS Cochrane review rated participants as being blinded, Cerasa 2012: 'Computer-Assisted Cognitive Rehabilitation of Attention Deficits for Multiple Sclerosis: A Randomized Trial With fMRI Correlates'. I'm hoping to find the time to go through them all, and then maybe move on to another review. Sorry if that clogs this thread up a bit.

https://journals.sagepub.com/doi/full/10.1177/1545968312465194?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub=pubmed

They describe it as a double-blind study and made some real efforts to limit problems with bias. When the participants are doing quite different tasks, it's difficult to know to what extent this should be viewed as being 'blind' to allocation. This made me think of a potential source of bias: those particularly keen to promote their results might also be particularly willing to class their trial as double blind? (edit: I'm not sure if I should say this when it's such a prejudice-driven hunch, but there were some things about this paper that made me feel like they were really trying to sell their work. They were from an Italian uni, so that could just be different styles of writing in English?)

The control group used a programme created in-house, which they used at home, while the treatment group used a commercially designed and sold software package at the clinical centre. Is that really double blind?

The patients were not informed about their group assignment, and the rationale about their training performed in our clinical center (experimental group) or at home (control Group).

The experimental group met twice a week for 1-hour sessions for 6 consecutive weeks. Sessions consisted of computer-assisted training of several attention ability and information processing tasks. Cognitive training was performed using the software RehaCom (www.Schuhfried.at). We decided to use this system because RehaCom is Europe’s leading software for CR, and this is now used in more than 1000 hospitals and practices across Europe. Moreover, several previous studies have used RehaCom to evaluate the impact of CR in patients with schizophrenia,27 MS,8,12 and head trauma.28 This software has a specific interface consisting of a special keyboard that has been designed to minimize the interferences caused by motor and coordination disabilities. This tool can be installed on a standard PC and stimulates specific cognitive functions at different levels of complexity and with different modes of input/output. Each cognitive performance and its progress can be recorded and saved on the hard drive. The study treatment consisted of “divided attention,” “attention and concentration,” and “vigilance” programs. In divided attention, the patient is required to simulate a train driver, carefully observing the control panel of the train and the countryside. Several distractions, such as crossing animals and train speed must be taken into account, with increasing levels of difficulty. In the procedure attention and concentration, an individual picture (target) is presented and then compared with a matrix of pictures. The patient has to recognize the target picture (coded as symbols, items, animals, or abstract figures) and select it from the matrix. The abilities to differentiate and to concentrate are trained simultaneously. The level of difficulty rises as the number and complexity of pictures to recognize increases. Finally, during the vigilance program, the patient is trained to sustain his or her attention for a long period of time by providing response times limited to the various items. The task of the patient is to control a conveyor belt and to select the objects that differ from a sample in one or more details.

The control group underwent a similar exposure to computerized tests, consisting of 12 individual 1-hour sessions over a 6-week period (2 sessions per week). In particular, the control group performed a visuomotor coordination task by using an in-house software with which they had to simply respond quickly and accurately to the appearance of target visual stimuli (numbers 2–4–6–8) on the screen by pressing the corresponding number key on the keyboard. The level of difficulty rises as the speed of external stimulation, defined as interstimulus interval, rises from 3 s (level 1) to 500 ms (level 12). All performances were recorded on an external hard disk and analyzed in our research center. All patients completed the training at home with optimal performance. An experienced treating clinician was responsible for setting up the rehabilitation program and assisting patients during the study and control treatments.

Finally, at the end of the 6-week training, participants from both groups were given a blind evaluation, using the same protocol as at a baseline (T1). Two MS patients from the control group and 1 from the experimental group decided to not continue with the protocol. Thus, 12 MS patients from the experimental group and 11 from the control group completed the protocol.

The authors of the paper speculate as to why they may have had better results than some other trials:

However, conflicting findings have been reported about the effectiveness of the various CR techniques.6-12 The lack of conclusiveness in these studies appears to be mainly caused by methodological problems, such as high variability in the selection of clinical phenotypes, the use of unstandardized training tools, and the lack of a control group. On the other hand, the most convincing evidence comes from some studies8,11 that focused cognitive intervention on a limited cognitive domain. In particular, rehabilitation of attention abilities would seem to provide the most evident beneficial effects.

They were particularly selective with their participants:

Cognitive impairment involves approximately 50% of all MS patients and affects long-term memory, sustained attention, verbal fluency, information processing speed, and executive functions. That we have enrolled only 26 MS patients with predominant attention deficits from our initial group (n = 117 patients) highlights the fact that our findings cannot be generalized to all MS populations. Therefore, to sustain the usefulness of our CR therapy for minimizing the emergence of cognitive impairments in MS patients, further studies are warranted that use a larger sample and consider patients with additional cognitive deficits.

The only outcome measure that seemed to have a statistically significant result was the Stroop test, but that seemed to be a relatively big change, and there were only a total of 26 participants. Also, it's certainly possible to imagine a computer-game-like training package improving results on the computer-game-like Stroop test without leading to real improvements in patients' wider lives. It's difficult to judge the extent to which the 'treatment' computer package was training people with programmes similar to the Stroop test, but the control package sounded more simplistic (and boring).

I didn't read this paper closely but it looked like they used a number of objective outcomes and some subjective self-report ones.

tl;dr: This was a small trial that made some efforts to account for the problems of not being a double-blind trial, but I don't think that means it should be classed as a blinded trial. If a pharma trial treated trial participants this differently, it would not be classed as a double-blind trial. Also, though, the outcome they reported a positive result for was the Stroop test, not a self-report outcome.
 
@Esther12 The authors of the BMJ study selected trials from each Cochrane review for their comparison; they didn't use all of them. They said they used 132 trials from these 18 Cochrane reviews, but if you add up all the studies in these reviews you would come to a much larger number. So it's possible that the studies you discuss weren't used in their comparison.

I don't know what 'No. high risk' and 'No. low risk' mean (I assume this refers to the overall risk of bias assessed by the Cochrane review authors), but if you add all those up, the total is 132, so this probably refers to the trials selected from each Cochrane review. For the MS review you were looking at, that would mean they selected only 3 of the 20 studies in the review, presumably the ones with the most adequate comparison to test the effect of blinding.

[attached screenshot of the supplementary table shown above]

I don't think the study or supplementary material give any clues as to which trials (and which outcomes in those trials) were actually used for their comparisons. We also don't know for which trials the blinding status was rated as "Definitely yes", "Probably yes", "Unclear", "Probably no", or "Definitely no". Therefore I've sent an email to the corresponding author today, asking if she could give me a list of the trials and their assessment of blinding for each of them. I hope they respond soon. As you can see from the graph, for most comparisons only a handful of trials were used, so it should be possible to go over these to get an impression of what is being compared.
 
The supplementary material does give the total figures for each blinding assessment:

Definitely no - 16
Probably no - 73
Unclear - 10
Probably yes - 18
Definitely yes - 15

The paper explains that the comparison was made between trials where blinding was assessed as "Definitely yes" or "Probably yes", versus trials assessed as "Unclear", "Probably no", or "Definitely no". So of the 132 trials that looked at blinding of patients, blinding was assessed as "Probably yes" or "Definitely yes" in only 33 trials (25%).
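
As a quick sanity check on those figures (just tallying the totals listed above):

```python
# Patient-blinding assessments as listed in the supplementary material (see the list above).
counts = {
    "Definitely no": 16,
    "Probably no": 73,
    "Unclear": 10,
    "Probably yes": 18,
    "Definitely yes": 15,
}

total = sum(counts.values())                                 # 132 trials in the patient-blinding comparison
blinded = counts["Definitely yes"] + counts["Probably yes"]  # 33 trials rated as blinded
print(total, blinded, f"{100 * blinded / total:.0f}%")       # 132 33 25%
```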

I think this is a bit strange. You would think that the non-blinded trials would be the exception: if blinding is possible, you would expect most randomized trials to have done it, and the trials that didn't manage to blind patients, even though it was possible, to be the rarity. Here it seems to be the other way around.
 
Thanks Michiel - I knew that my approach was highly inefficient, but I was a bit interested in going through the trials of neurological rehabilitation rated as being participant blinded anyway, and thought it could be useful for understanding the BMJ paper, as they must have classed at least one trial as being properly blinded.

I'm too scared to write to authors for further info... isn't that harassment nowadays?!
 
I realize you're being sarcastic, but in essence I'm not asking for 'further info', just basic info about their methodology that probably should have been reported in the supplementary material of their paper. I think a paper should have sufficient information on how the results came about.
 