Stanford Community Symposium 2018: Phair, Metabolic traps, Tryptophan trap

If it's true that 75% of the population might be at risk for ME/CFS via a metabolic trap, it would imply that there are some other factors of considerable importance that are necessary for ME/CFS to actually manifest.

What kind of triggers can induce an increase of tryptophan in the cytosol? I wonder how this list would compare to the list of stressors known or suspected to be capable of triggering ME/CFS.
 
Just read the genetics part of this paper (https://www.mdpi.com/2075-4418/9/3/82/htm) and because Kashi et al. propose that "any predisposing genetic mutation must be common" I turned once again to the huge UK Biobank study with 2017 individuals with self-reported diagnosis of ME/CFS. UK Biobank data does not support this hypothesis involving common DNA variants. View attachment 7851

What does "deemed significant" mean?

Ron Davis recently said at the Invest in ME conference that 75% of the population have IDO2 mutations, but 20 severe patients and 46 additional patients tested all had IDO2 mutations. This is very unusual.

Something is very wrong. Either their method, or this biobank data.
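
For a rough sense of how unusual that would be, here is a minimal sketch. It assumes the 75% figure applies to the same kind of population the patients come from, and that the 66 patients are independent draws. Neither assumption is established in this thread, and the function name is made up for illustration.

```python
def prob_all_carriers(carrier_freq: float, n_people: int) -> float:
    """Chance that n independent people are all carriers,
    given the background carrier frequency."""
    return carrier_freq ** n_people

# 20 severe patients + 46 additional patients = 66 people tested
p_all = prob_all_carriers(0.75, 20 + 46)
print(f"P(all 66 carry an IDO2 variant | 75% background) = {p_all:.1e}")
```

Under those assumptions the chance is on the order of a few in a billion, which is why the two sets of numbers look hard to reconcile.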

"Deemed significant" - this is the conventional genome-wide threshold, i.e. an industry standard. See https://www.ncbi.nlm.nih.gov/pubmed/30349118 or https://www.biorxiv.org/content/10.1101/176834v2 if the former is behind a paywall.
The UK Biobank data has been acquired and analysed to very high, international standards. This data set is human genetics' "go to" data set internationally.

@RDP
 
People who combine systems thinking, maths and knowledge of ME are few and far between. Off the top of my head: Gordon Broderick, who has worked with Klimas, and Craig Robertson, who works with Myhill (his premise on mathematical catastrophe theory as applied to ME is kind of like a bistability switch).

Yes. I suggested all of those as reviewers. Also Travis Craddock in the Broderick group.
 
@Chris Ponting Would you mind providing a detailed explanation of the y-axis scale. No idea what I'm looking at here.

Also, the most common IDO2 mutation, rs10109853, has a genotype frequency of 72%. Could you give an example of how that would show up on the y-axis of your graph if, say, the frequency in the ME/CFS patient population were 95%?
https://www.opensnp.org/snps/rs10109853
 

@Chris Ponting I'm always going to defer to your expertise on genomics. But I am going to ask questions. First question is this. Some people tell me they find no mutations in IDO2 when they look at WGS data and I find this very unlikely since the R248W and the Y359X are so common in the CEU population. I have wondered if their datasets have been filtered to remove SNPs that are more common than some threshold, like AF = 0.03. So an extension of this question is whether the UK BioBank site shows us data that are filtered in any way. I have other questions, but let's start with this one.
 

Sorry: the y-axis is the negative logarithm (base 10) of the p-value. In other words if p=0.01 then -log10p is 2; if p=0.0001, it is 4; etc. Because hundreds of thousands of DNA variants are tested, the p-value has to be very small (-log10 p > 8) before being deemed significant.

The IDO2 mutation rs10109853 has an allele frequency between 0.23 and 0.56 (depending on ancestry; https://gnomad.broadinstitute.org/variant/8-39862881-C-T) - this is the likelihood of this variant being "T" for either one of the two chromosomal copies.
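
The difference between an allele frequency and the fraction of people who carry the variant can be sketched as follows. This assumes Hardy-Weinberg proportions (random mating, no selection), which is a simplification. The function name is made up for illustration, and 0.48 is the MAF that Gene ATLAS lists for this variant.

```python
def carrier_fraction(allele_freq: float) -> float:
    """Fraction of people with at least one copy of the allele,
    assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - allele_freq) ** 2

# 0.23 and 0.56: ancestry-dependent range quoted from gnomAD above
# 0.48: MAF reported by Gene ATLAS for rs10109853
for q in (0.23, 0.48, 0.56):
    print(f"allele frequency {q:.2f} -> carriers {carrier_fraction(q):.0%}")
```

With an allele frequency of 0.48 the carrier fraction comes out near 73%, which is in line with the 72% genotype frequency mentioned in the question.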

On the graph, the p-value for rs10109853 is 0.63 (http://geneatlas.roslin.ed.ac.uk/search/?traits=615&variants=rs10109853+), i.e. a -log10 p of 0.20.

In other words, rs10109853 does not significantly distinguish pwME from others in the very large UK Biobank (500,000 people overall). Hope this helps.
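
The y-axis conversion described above can be reproduced in a few lines. This is only a sketch: the cutoff of 8 is the one quoted in this post, and the function and constant names are made up for illustration.

```python
import math

SIGNIFICANCE_CUTOFF = 8.0  # the "-log10 p > 8" cutoff quoted in the post

def neg_log10(p_value: float) -> float:
    """Convert a p-value to the y-axis scale of the plot."""
    return -math.log10(p_value)

def is_deemed_significant(p_value: float, cutoff: float = SIGNIFICANCE_CUTOFF) -> bool:
    """True if the variant would sit above the significance line."""
    return neg_log10(p_value) > cutoff

# 0.63 is the Gene ATLAS p-value for rs10109853; 1e-9 is a made-up strong signal
for p in (0.01, 0.0001, 0.63, 1e-9):
    print(f"p = {p:g}: -log10 p = {neg_log10(p):.2f}, significant: {is_deemed_significant(p)}")
```

A p-value of 0.63 lands at about 0.20 on the y-axis, far below the line.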
 
There are mutations in the protein-coding parts of IDO2. You can see what and where they are here: https://gnomad.broadinstitute.org/gene/ENSG00000188676.

Probably most importantly, the Y359X variant on this website has a warning flag MNV: "Multinucleotide variant: Variant annotation dubious". I've only just spotted this.
 
Something I learned while trying to understand the biology a bit better: TGF-beta interacts with IDO1 (I find the details a little hard to understand). I know several studies found altered levels of TGF-beta in ME/CFS (while others did not).
 
"Self-reported CFS". The self-report was whether they have ever been diagnosed as having CFS (i.e. ME/CFS) and not whether they were self-diagnosing ME/CFS. One of the central members of the UK Biobank told me that she thinks that "self-reported" data can be more accurate than electronic health records essentially because individuals are experts on what clinicians have diagnosed them with.
Re self-reported ME/CFS:

From memory there are at least three, possibly four studies (that I can't lay my hands on right now) that looked at accuracy of diagnosis. They all found that roughly 50% of people with doctor-diagnosed ME/CFS do not meet strict criteria like CCC or ICC. At least one of those studies was from the UK (the others from Australia and maybe Norway?) so we have to assume a similar rate in the Biobank.

How does this 50/50 chance of a self-reported ME/CFS diagnosis being correct affect interpretation of the Biobank data? Are there any statistical methods that can deal with that much potentially dubious data input?
 
For anyone interested here are the 5 variants @RDP has discussed
R248W = rs10109853
Y359STOP = rs4503083
S252T = rs35212142
I127V = rs4736794
N257K = rs774492001

The Biobank resource lists 4 of them. pv = p-value, MAF = minor allele frequency.

Variant     Chromosome  Position  Ref. allele  beta         pv    MAF   HWE
rs10109853  8           39862881  T            6.7374e-05   0.63  0.48  9.237e-06
rs35212142  8           39862893  T            -0.00034435  0.33  0.04  0.03393
rs4503083   8           39872935  T            0.00015727   0.37  0.19  0.002383
rs4736794   8           39840234  A            0.00010853   0.67  0.08  0.1469
Source : http://geneatlas.roslin.ed.ac.uk/search/?traits=615&variants=rs10109853+ rs4503083+ rs35212142+ rs4736794+ rs774492001+
 
I was only able to read a fraction of the paper, so I am lacking some context (and lots of technical insight), but I did find this sentence a little disheartening. I hope he is just being extra conservative.

"While it is possible that the IDO metabolic trap lays bare the etiology of ME/CFS, the probability that this is so is small."

I used this phrase because the strategy we employed was to search for genes coding for enzymes or transporters with common damaging mutations. Using our most conservative criteria, we still found 208 such genes. I liked IDO2 because it catalyzes the first reaction in an important metabolic pathway, and besides being the source of endogenous NAD production the kynurenine pathway produces neuroactive and immuno-active metabolites. I felt we had to say there is a low probability because even if I'm correct that ME/CFS epidemics/outbreaks require common damaging mutations, there are still 207 other possibilities.

Moreover, as @rvallee says, we need much more than plausibility. We need experimental tests.

But please do not be disheartened. We think there's something important here, and we're investing a lot of time and energy in it.
 
Thank you. Coming from someone with two family members significantly affected, this means a lot.
 
@RDP That makes sense. I just want to say that I feel a tremendous amount of gratitude for you and your work. I find the bistability and common mutations perspectives extremely compelling. My deepest thanks.
 
Chris, Does "Variant annotation dubious" mean that the Broad investigators think the existence of the variant is dubious or do they mean that someone's interpretation of the significance of the variant is dubious? Rob
 
@wigglethemouse In your table from Biobank, there is a column labeled HWE, which I imagine is an abbreviation for Hardy-Weinberg equilibrium. Can you tell me what the numbers in that column signify? Or just a link to the Biobank explanation? Thanks.
 
@strategist This is why I asked Chris if the Biobank data we see online are filtered in any way. As for the IDO2 sequencing method, it is Peidong Shen's multiplexed PCR (mPCR) method, which he developed for other projects (at least two published) and which he simply adapted to target the IDO2 exons and nearby pieces of introns. I am not an expert, but Peidong is, and he's convinced we can detect these SNVs accurately. For those 20 patients where we have both WGS and mPCR sequencing, the results are 100% in agreement.
 
I've also had trouble understanding the columns. You are correct about the meaning. Here is one explanation:
Similarly, testing that Hardy-Weinberg Equilibrium (HWE) holds is a common approach for identifying poor quality SNPs, but departures from HWE can be expected in the context of strong population structure, again because of differences in allele frequency distributions. To account for the effects of population structure, we proceeded in two phases. For SNP-based QC metrics we used only individuals with similar ancestry (so that, for example, HWE is expected). To do this we identified a set of individuals with European ancestry by projecting individuals onto principal components computed from the 1000 Genomes project. We also characterised the population structure unique to UK Biobank by computing principal components using only UK Biobank individuals (after applying SNP QC). We used the UK Biobank-specific principal components analysis (PCA) results to account for population structure in all our sample-based QC metrics.
Source : http://www.ukbiobank.ac.uk/wp-content/uploads/2014/04/UKBiobank_genotyping_QC_documentation-web.pdf

There is also this "Details and considerations of the UK Biobank GWAS"
http://www.nealelab.is/blog/2017/9/11/details-and-considerations-of-the-uk-biobank-gwas
- Specificity over Sensitivity: We are removing around 75% of analysis-ready SNPs, with many well-imputed SNPs with MAF < 0.1% and INFO scores < 0.8. For well-powered phenotypes where most individuals have a non-missing value, we will inevitably miss real associations at the lower end of the MAF spectrum. Additionally, the variants most likely to be under recent positive or negative selection may be excluded by the HWE filter the we’ve chosen (HWE p-value > 1e-10), as we are quite well-powered with 330K individuals to detect even minor deviations from expectation.

EDIT: More HWE info for @RDP
How is the reported HWE computed?
The HWE statistic was computed on the unrelated white-British individuals as defined by UK Biobank.
Source : http://geneatlas.roslin.ed.ac.uk/frequently-asked-questions/
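
To make the idea of an HWE check concrete, here is a minimal sketch of the textbook chi-square version of the test. It is an assumption that the HWE column is a p-value of this general kind; Gene ATLAS may well use a different (for example exact) test. The genotype counts below are hypothetical and are not Biobank data.

```python
import math

def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) for departure from
    Hardy-Weinberg proportions, given counts of the three genotypes.
    Assumes both alleles are present (no expected count of zero)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: the first set matches HWE exactly, the second has too few heterozygotes
for counts in ((250, 500, 250), (300, 400, 300)):
    chi2, p_value = hwe_chi_square(*counts)
    print(f"counts {counts}: chi2 = {chi2:.1f}, HWE p-value = {p_value:.2g}")
```

A small p-value means the genotype counts are unlikely under Hardy-Weinberg equilibrium. As the quoted passages note, that can reflect genotyping problems, population structure or selection, and with very large samples even minor deviations become detectable.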
 
I needed some time to look into this. By reading the Broad investigators' manuscript on MNVs (here: https://www.biorxiv.org/content/biorxiv/early/2019/03/10/573378.1.full.pdf) and looking at the Supplemental Data for this paper this is the conclusion: the annotation of Tyr359Ter/Stop appears to be wrong, and for an interesting reason (see Figure 1 of their paper).

When Single Nucleotide Variants (SNVs) are interpreted they are mostly considered one-by-one and, indeed, if this one is considered alone then the annotation is Tyr359 to a Stop Codon implying that the protein is prematurely truncated making it effectively dead. What the Broad Investigators have done differently is to consider *multiple* variants together, and when they do this they declare that the annotation is dubious. Biologically, there is at least another variant nearby that 'repairs' the annotation to a Stop Codon. [More abstractly, they are calling each of two variants as MINUS, and most of the time these are independent of one another, but here the two MINUS variants (MINUS-MINUS) cancel each other out, making a PLUS, meaning that the protein (with the two MINUS variants) is still fully made and is still functional.]

So my reading of the Broad Investigators' analysis is that the DNA sequence surrounding the variant rs4503083 does *not* damage the IDO2 protein. If so, this casts doubt on the Stanford authors' interpretation of this variant. Whilst rs4503083 alone is known to abolish enzyme activity in an in vitro cell kynurenine production assay (reference [34]), my reading is that the protein made when the second variant is also present, which corrects the protein and keeps it at full length, will not have its activity abolished. Apologies for the complicated answer.
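
The MINUS-MINUS idea can be illustrated with a toy codon. This is a sketch only: the codon and the two base changes are hypothetical and are not the actual IDO2 sequence or the actual variants. Only the four codon-to-amino-acid assignments are standard genetic code.

```python
# Small excerpt of the standard genetic code (DNA codons)
CODON_TABLE = {"TAC": "Tyr", "TAG": "Stop", "CAC": "His", "CAG": "Gln"}

def apply_variants(codon: str, variants: list[tuple[int, str]]) -> str:
    """Return the codon after applying (position, new_base) substitutions."""
    bases = list(codon)
    for position, new_base in variants:
        bases[position] = new_base
    return "".join(bases)

reference = "TAC"      # hypothetical reference codon (Tyr)
snv_a = (2, "G")       # hypothetical SNV: alone it gives TAG, a Stop codon
snv_b = (0, "C")       # hypothetical second SNV in the same codon

for label, variants in (("SNV A alone", [snv_a]),
                        ("SNV B alone", [snv_b]),
                        ("A and B together", [snv_a, snv_b])):
    codon = apply_variants(reference, variants)
    print(f"{label}: {reference} -> {codon} ({CODON_TABLE[codon]})")
```

Annotated one at a time, SNV A looks like a premature Stop. Read together with SNV B on the same copy of the gene, the codon encodes an amino acid and the protein would continue to full length, which is the kind of situation the MNV flag warns about.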
 
Thanks. If we think - conservatively - that half of this cohort of people have one disease (with a genetic contribution) and the other half has another disease (with a genetic contribution) then - if the GWAS/genetics study is sufficiently powered - the *sum* of genetic signals for both would be observed. This, indeed, presents problems for interpretation. However, (1) the cohort should have questionnaire data that allows partitioning of the cohort so that the two genetic signals can be teased apart, and (2) the genetic signals could imply two completely separate biological pathways, allowing their separation. This is why "complex disease" (as it is known) is called this. The cohort can be stratified genetically and by phenotype. Hope this helps.
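
The dilution effect of a mixed cohort can be sketched with simple arithmetic. The numbers are the hypothetical ones raised earlier in the thread (95% in true patients, 72% background, roughly half of diagnoses meeting strict criteria). The assumption that misdiagnosed people carry the variant at the background rate is mine, and the function name is made up for illustration.

```python
def observed_case_frequency(true_case_freq: float,
                            background_freq: float,
                            fraction_correct: float) -> float:
    """Carrier frequency seen in a case cohort where only a fraction of
    'cases' truly have the disease and the rest look like the background."""
    return fraction_correct * true_case_freq + (1.0 - fraction_correct) * background_freq

observed = observed_case_frequency(true_case_freq=0.95,
                                   background_freq=0.72,
                                   fraction_correct=0.5)
print(f"observed carrier frequency in the mixed cohort: {observed:.1%}")
print(f"excess over background: {observed - 0.72:.1%} (vs {0.95 - 0.72:.1%} if all diagnoses were correct)")
```

Under these assumptions the excess over background is halved rather than erased, so a sufficiently powered study could still detect it, but a weaker true signal would be correspondingly harder to see.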
 
I'm not a scientist so apologies if these questions are naive.
Biologically, there is at least another variant nearby that 'repairs' the annotation to a Stop Codon. [More abstractly, they are calling each of two variants as MINUS, and most of the time these are independent of one another, but here the two MINUS variants (MINUS-MINUS) cancel each other out, making a PLUS, meaning that the protein (with the two MINUS variants) is still fully made and is still functional.]
As I read this, the protein is not precisely repaired back to its original form, but more 'patched' so its functionality is restored, albeit with two additional components that cancel out each other's effects. From this, can we be certain that the protein's behaviour is fully restored? And can we be confident it will not exhibit some additional behaviour or interactions that the original would not? Is there a risk in presuming such a refactored protein will behave the same, when it might in fact have some additional side effects up its sleeve that may be hard to preempt?
(1) the cohort should have questionnaire data that allows partitioning of the cohort so that the two genetic signals can be teased apart,
In practice is that as simple as it sounds? If you don't know up front what it is you are trying to tease apart, how hard is it to know what questions to ask in order to do so?
 