So my main concern still stands: n-of-1 approaches will always end up biased towards overestimating the relevance of specific variants for symptoms that might be better explained otherwise. It bears asking what "well-supported" late-onset symptoms of a heterozygous variant actually means. As you say, for genotypes that don't present severely and with early onset within families, what we have as evidence of pathogenicity is most often a handful of case studies, which become case studies precisely because there is something of note about them. So in each of those instances (correct me if I'm wrong) we don't have information on what proportion of individuals carry that potentially pathogenic heterozygous variant and are completely fine (presumably the larger share, given that the case studies are explicitly presented as unusual). Heterozygous mouse models, similarly, largely come from uniform genetic backgrounds that might cause those symptoms to appear when they almost never would in a human.
Which then raises the question of how we can be even reasonably confident that the variant is actually relevant to those instances of late-onset, milder symptoms. If we had a large cross-sectional study tracking people with that variant and saw that a similar phenotype comes up quite often, then I would be more confident (and please let me know if this is where your information is actually coming from and was just not cited for many of the variants in that table). But when the literature is only a handful of cases, as you say, these may well reflect the effect of another factor. That could be an unknown "second hit" working synergistically with the original variant, or it could be that we're just highlighting two or three people who happened to develop a chronic illness explaining the symptoms (and the only reason it got documented is that someone was already interested in the variant they have). In that case, this might just be a repeated practice of comparing individuals in the literature who happened to have a variant and unrelatedly developed something like ME/CFS (which is classified as "well-supported" evidence of pathogenicity) to someone in your cohort who happened to have a variant and unrelatedly developed ME/CFS.
It's entirely possible that the variant of interest actually predisposes to developing a chronic disease like ME/CFS, and that would definitely be interesting information. But in order to determine that you need a big cohort study--going about that at an n-of-1 level is always just going to be an unsubstantiated guess about pathology. The variants that are relevant for a given person might not be the ones that actually get pulled out by the n-of-1 analysis. So trying to use an n-of-1 strategy to narrow down subgroups would not actually be giving you the relevant subgroups.
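To make the coincidence concern concrete, here is a rough back-of-the-envelope sketch. Every number in it (population size, carrier frequency, illness prevalence) is a made-up placeholder, not an estimate from any dataset, and the function name is just for illustration. It only shows how many carrier-plus-illness pairings you would expect if the variant had no effect at all.

```python
# All inputs below are hypothetical placeholders, not estimates from any dataset.

def expected_coincidental_cases(population, carrier_freq, prevalence):
    """Expected number of people who both carry the variant and have the
    illness, assuming the two are statistically independent (that is,
    the variant plays no causal role)."""
    return population * carrier_freq * prevalence


n_coincidental = expected_coincidental_cases(
    population=100_000_000,   # assumed population size
    carrier_freq=1 / 1000,    # assumed heterozygous carrier frequency
    prevalence=0.005,         # assumed illness prevalence
)
print(f"Expected carriers with the illness by chance alone: {n_coincidental:.0f}")
```

If a number like that is large relative to the two or three published cases, the case reports alone can't distinguish a real effect from coincidence.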
If any part of this logic is wildly off base or missing critical context please let me know.
I'm not sure we will shift each other much off our current thinking in a slow back and forth here. I am pretty convinced that, as we said, we need large studies and consideration of the n-of-1s. The n-of-1 is not so hard these days with the tools we have and should easily be part of any project doing WGS or even WES; they truly go hand in hand IMHO.
And I really don't follow the logic on why rare variants can't be relevant. I think about Occam’s razor. Going to break this out from a big paragraph as requested.
For HADHA: if we find 5/100 ME/CFS participants carrying the same rare variant (we currently have 2 in, I think, 40 unrelated individuals; the unrelated count is pending confirmation since we have affected family members), then we exceed a reasonable likelihood threshold. Not proof, but more datapoints.
Where that variant is known pathogenic with documented heterozygous effects (small number of case report/animal findings) and these confirm expressed long-chain fatty-acid abnormalities, oxidative-stress differences.
Where this variant in a homozygous state is associated with a very severe neonatal disease.
Where milder/more moderate later onset forms (often with "milder" variants) have been noted presenting with exercise intolerance/early fatigability, chronic fatigue, weakness, peripheral neuropathy.
Where the disease triggers are known to be illness, infection, strenuous exercise, prolonged fasting, stress.
Where SARS‐CoV‐2 has been shown to be a preceding event in the death of a patient with this variant (homozygous) - acute respiratory failure and refractory hypotension from severe cardiomyopathy, and a different variant (homozygous) with a second death.
Where the impacted mechanism, fatty-acid oxidation/long-chain FA metabolism, has been shown to be altered across a dozen or more ME/CFS/Long COVID/PASC studies.
Where progressive disease worsening has been seen in HADHA-associated disease with even low amounts of exercise due to rhabdomyolysis, and where rhabdomyolysis has been published as a theory related to PEM in ME/CFS.
Where chronic fatigue and exercise intolerance are core symptoms of many mitochondrial diseases, including a number like LHON where there is an early-onset disorder with varied severe symptoms beyond those two, but also an extended phenotype in the carrier parents that is often very much focused on chronic fatigue, brain fog, and exercise intolerance.
Where HADHA has been noted as harboring SNPs related to outcome of COVID severity.
Then I don't think we can say that none of that is relevant. ... Leads shouldn’t be dismissed simply because they are rare or might not be large signals in a heterogeneous cohort. This signal/a signal from the higher order program may even be present in larger datasets; we just need people to look. So we felt we needed to put some testable hypotheses out there. This is why we wrote these things:
"Our findings support the hypothesis that at least a subset of ME/CFS cases represent distinct molecular disorders that converge on shared physiological pathways."
"While replication in larger, more diverse cohorts remains critical, the rarity of many variants identified suggests that population-scale validation for all may be impractical."
"By integrating genomics, transcriptomics, and deep patient-informed phenotyping, we uncovered plausible, testable molecular contributors in over a third of participants."
"We suggest that progress will require experimental designs that integrate individual-level genomic data with deep, participant-informed deep phenotyping, capturing the combined effects of rare and common variants and environmental modifiers on disease expression and progression."
"We identified likely molecular contributors in 13 families, with impacted loci mapping to biologically plausible pathways, offering hypothesis-generating insights consistent with previously reported disease mechanisms and observed symptoms."
"Moving forward, we feel that to refine mechanistic understanding, efforts may need to integrate deep, participant informed phenotyping with assessments of rare variants, polygenic risk, and environmental exposures, while also accounting for the timing of sample collection."
As you note, we’re also testing these variants for enrichment in other diseases using All of Us data. Even with ~600k genomes, small expected counts and often poor phenotype capture (e.g. no ICD-10 code for post-exertional malaise) make it a bit trickier than it sounds, but we’re pursuing it.
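For anyone who wants to play with the numbers, the 5/100 scenario and the 2-in-roughly-40 count mentioned earlier can be framed as a simple binomial tail probability. This is only a minimal sketch: the background carrier frequency below is an assumed placeholder (not the actual population frequency of any HADHA variant), the function name is made up for illustration, and the calculation assumes unrelated participants and a single pre-specified variant.

```python
from math import comb

# Hypothetical background carrier frequency; NOT the real frequency of any variant.
ASSUMED_CARRIER_FREQ = 0.005


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    carriers among n unrelated people if carriers occur at background rate p."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


# Chance of 5 or more carriers among 100 unrelated participants.
p_five_of_100 = binomial_tail(100, 5, ASSUMED_CARRIER_FREQ)
# Chance of 2 or more carriers among 40 unrelated participants.
p_two_of_40 = binomial_tail(40, 2, ASSUMED_CARRIER_FREQ)

print(f"P(>=5 carriers in 100) = {p_five_of_100:.2e}")
print(f"P(>=2 carriers in 40)  = {p_two_of_40:.2e}")
```

The result is very sensitive to the assumed carrier frequency and to how many participants are truly unrelated, so it is a way to frame the question, not an answer.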
I will leave it there. Will be off for a while now, but happy to connect later on in person for follow ups. And always happy for experts to point out errors; that is how we all get closer to the goal.