My estimate is that the sampling should either be local population based, as for the ME Biobank cohort (rather than through internet enquiry), or, if via internet, it should cover as high a percentage of the total ME population as possible. Larger numbers help with the statistical power problem, but this is a different issue.
If the number needed for the statistics is more than you can get through local population searching, which they have decided it is, then I would like to see samples being collected from about 50% of all UK ME cases. If there are 120,000 PWME in the UK then that means 60,000. Their target is 20,000, which I think leaves room for a significant false positive signal to come up due to sampling bias. 60,000 is probably impractical so there has to be a compromise, but I would favour collecting as many samples as they can and then testing a random sample.
Let us say that there is a gene allele A that for some irrelevant hum-drum reason that may be very indirect means that people with this allele are more likely to volunteer for GWAS projects. It might be that they have an unusual blood group and are sought out by blood banks so are used to contact with healthcare systems. Even if when they get ME they cannot give blood they might be readier to volunteer for research. Let us say that 1% of the population have this allele. It might be an allele that goes with choosing science subjects at school. It could be all sorts of things.
If there are 120,000 PWME and 1% carry A, then 1,200 PWME have the allele in total. If 20,000 volunteer then you might find that 800 subjects have A, instead of the expected 200. The number could be up to 1,200 (every carrier volunteering) but will not be that bad. If 60,000 volunteer, with the same degree of bias you still cannot get more than 1,200 and probably will not get more than 1,000, instead of the expected 600, because as more people volunteer the bias from 'keen volunteers' will drop off. For 20,000 subjects you get a fourfold skewing that would show up strongly on the genetic risk analysis. For 60,000 you might get a 1.67-fold skewing that would show up much less.
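For anyone who wants to check the arithmetic, here is a rough sketch in Python. The function name `carrier_skew` and its parameters are just my own labels for illustration, and the observed carrier counts (800 and 1,000) are the made-up figures from the example above, not data from any study.

```python
def carrier_skew(total_cases, volunteers, allele_freq, observed_carriers):
    """Compare carriers seen among volunteers with what unbiased sampling gives.

    All inputs are hypothetical illustration values, not real study data.
    Returns (expected_carriers, fold_skew, max_possible_fold).
    """
    # Carriers in the whole patient population: a hard ceiling on what
    # any volunteer sample can contain.
    carriers_in_population = total_cases * allele_freq
    # Carriers we would expect if volunteering were unrelated to the allele.
    expected = volunteers * allele_freq
    # Observed over-representation relative to unbiased sampling.
    fold = observed_carriers / expected
    # Worst case: every carrier in the population volunteers.
    max_fold = min(carriers_in_population, volunteers) / expected
    return expected, fold, max_fold


if __name__ == "__main__":
    # (volunteers, assumed observed carriers) for the two scenarios above
    for volunteers, observed in [(20_000, 800), (60_000, 1_000)]:
        expected, fold, max_fold = carrier_skew(120_000, volunteers, 0.01, observed)
        print(f"{volunteers} volunteers: expected {expected:.0f} carriers, "
              f"skew {fold:.2f}-fold, ceiling {max_fold:.2f}-fold")
```

The point of the `max_fold` figure is that the ceiling on the skew falls as the sampled fraction rises: with only 1,200 carriers in existence, sampling half of all cases cannot give more than a twofold excess, whereas sampling a sixth of them can give up to sixfold.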
It may be that this is a problem that would show up with other GWAS studies and turns out not to be a big issue. But I don't think one can be sure that there will be no bias with ME just because there is no bias for diabetes or hypertension.