Preprint Dissecting the genetic complexity of myalgic encephalomyelitis/chronic fatigue syndrome via deep learning-powered genome analysis, 2025, Zhang+

Discussion in 'ME/CFS research' started by SNT Gatchaman, Apr 17, 2025.

  1. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,241
    Interesting! They're very similar in function, so I think that'd be quite the coincidence if they both appeared by chance.

    Wikipedia
     
    Last edited: May 11, 2025 at 10:32 PM
  2. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    670
    Location:
    USA
    I think that's more good evidence that the function itself is involved in pathogenesis! NME1, NME2, and NME3 also all showed up in the top 115, as well as several proteasome subunits and DLGAPs. This tells me that it doesn't so much matter which part of the pathway is broken, but rather that proper functioning of the pathway itself is important.
     
  3. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    670
    Location:
    USA
    Also I take back my critique about cytotoxic CD4s—I just noticed that the dot plot in figure 4 shows CD4. I must not have been able to read the tiny text when I was looking at the figures on my phone.

    Generally I still prefer to show markers as feature plots, since only a proportion of cells in the cluster express those markers and they might not be the same cells. But for these purposes I think it’s sufficient. I would, however, still like to see more figures showing the actual expression differences of the leading edge genes by cluster.

    [edit: it might not be a signature that’s exclusive, or even strongest, in cytotoxic CD4s but rather that was the only cluster where the pathway enrichment was significant because of other factors]
     
    Last edited: May 11, 2025 at 11:27 PM
    bobbler, hotblack, Sasha and 3 others like this.
  4. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,241
    I looked up the DLGAPs out of curiosity and thought it was interesting that like NLGN1 and NLGN2, the DLGAP genes (DLGAP1, DLGAP2, DLGAP3, DLGAP4 from the study) code for proteins located in a structure called the postsynaptic density. And these ones too:

    SYNGAP1
    From Wikipedia:

    SHARPIN
    Paper: Sharpin, a novel postsynaptic density protein that directly interacts with the shank family of proteins (2001, Mol Cell Neurosci)

    I went through every GeneCards link and searched the pages for mentions of synapses. These ones also mention postsynaptic density: CAMK2A, GRM1, DLG2, HOMER2

    And these ones mention synapses in general: CACNA2D3, ACE, PRKCZ, GABBR1, NRAS, BCL2L1, CA2, PDYN

    Edit: Missed one of the DLGAPs before.
     
    Last edited: May 12, 2025 at 2:29 PM
    Lilas, Kitty, ME/CFS Skeptic and 5 others like this.
  5. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,175
    Interesting that AMPD1 came up fairly high in the list. It's been discussed before in ME/CFS, but I always assumed you needed to have a homozygous mutation. I checked Wikipedia, and it discusses that it may not necessarily be black and white regarding what causes a deficiency.

    https://en.wikipedia.org/wiki/Adenosine_monophosphate_deaminase_deficiency_type_1
    This is very interesting
    Maybe on its own a mutation in AMPD1 is not damaging, but when combined with another disease it makes that issue worse.

    Or it could be that people homozygous with AMPD1 are misdiagnosed with ME/CFS.
     
    Last edited: May 12, 2025 at 3:59 AM
    wastwater, Kitty, bobbler and 4 others like this.
  6. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    670
    Location:
    USA
    That sort of thing seems to come up a lot in chronic illnesses—a mutation that would be deleterious if homozygous, and should be fine if heterozygous, but obviously isn’t under some perfect storm of homeostatic stress. It’s always the assumption that the non-deleterious allele should be able to compensate adequately, but as in all biology it’s important to look at the edge cases.

    In those cases I think it’s useful to look at situations where maximal functionality is really necessary, such as viral infection or nutrient deficiency, where you really need something to turn over properly.
     
  7. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,219
    Location:
    London, UK
    So what about the first 20-25 genes (the list cuts off most easily at 23)?
    Can we make a table of the proteins and functions for these and maybe check further down for sister genes where they show up?

    DNMT3A. DNA methyltransferase 3 alpha Epigenetic DNA methylation -gene expression control
    ADCY10 adenylyl cyclase 10. Formation of cAMP
    PPP2R2A. Protein Phosphatase 2 Regulatory Subunit Balpha. Cell cycle
    NLGN2. Neuroligin 2. Synapse formation
    LEP Leptin Weight control/appetite
    SYNGAP1. Synaptic Ras GTPase Activating Protein 1. Synapses MAPkinase signalling
    AHCYL2. Adenosylhomocysteinase Like 2. Brain signalling
    NLGN1. Neuroligin 1 Synapse formation
    DLGAP4. DLG Associated Protein 4. Synapses
    HDAC1. Histone Deacetylase 1. Regulation of gene transcription
    AMPD2 adenosine monophosphate deaminase 2
    AHCYL1. Adenosylhomocysteinase Like 1. Anti-inflammatory cytokine production
    SHARPIN. SHANK Associated RH Domain Interactor. Signalling in auto inflammation. Synapses
    NME2. NME/NM23 Nucleoside Diphosphate Kinase 2. DNA transcription. Risk factor for EBV-associated lymphoma
    NME1-NME2. NME1-NME2 Readthrough. DNA transcription
    CACNA2D3. Calcium Voltage-Gated Channel Auxiliary Subunit Alpha2delta 3. Linked to brain development and autism. TCR signalling
    NME3. NME/NM23 Nucleoside Diphosphate Kinase 3. Goes with NME1
    ZC3H13. Zinc Finger CCCH-Type Containing 13. RNA splicing
    CAMK2A. Calcium/Calmodulin Dependent Protein Kinase II Alpha. Synapses on dendrites in brain
    PIK3CA. Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha. Insulin responses, brain development
    MAX. MYC Associated Factor X. DNA transcription
    HLA-C HLA-C (MHC I) CD8 T cell receptor and NK cell receptor recognition events
    ACE. Angiotensin I Converting Enzyme
     
    Last edited: May 12, 2025 at 8:07 AM
    AliceLily, Lilas, Kitty and 6 others like this.
  8. SNT Gatchaman

    SNT Gatchaman Senior Member (Voting Rights) Staff Member

    Messages:
    6,740
    Location:
    Aotearoa New Zealand
    FWIW I ran the top 20 genes through Gemma3:27b on my laptop. The prompt was "I'm going to give you a list of 20 genes. Find interactions to explain mechanisms that produce the disease ME/CFS."

    I. Gene Summary & ME/CFS Relevance (Individual Genes)

    Here's a brief overview of each gene and its potential link to ME/CFS. I'll rate the evidence strength as: Strong, Moderate, Weak, or Speculative.

    1. DNMT3A (DNA Methyltransferase 3A): Epigenetic regulation. Moderate. Epigenetic changes are increasingly implicated in ME/CFS, and DNMT3A is a key player in DNA methylation. Altered methylation patterns have been observed in ME/CFS patients.

    2. ADCY10 (Adenylate Cyclase 10): Neurotransmitter signaling, cAMP production. Weak. May be involved in neuronal function and potentially affected by neuroinflammation in ME/CFS.

    3. PPP2R2A (Protein Phosphatase 2 Regulatory Subunit 2A): Cell signaling, regulation of neuronal excitability. Moderate. Linked to immune function and neuronal signaling, both disrupted in ME/CFS.

    4. NLGN2 (Neuroligin 2): Synaptic function, neuronal communication. Moderate. Implicated in neurodevelopmental disorders and potentially relevant to the neurological symptoms of ME/CFS.

    5. LEP (Leptin): Appetite regulation, immune modulation. Moderate. Leptin levels are often dysregulated in ME/CFS, and it plays a role in immune function and inflammation.

    6. SYNGAP1 (Synaptic GTPase-Associating Protein 1): Synaptic plasticity, learning, and memory. Weak. Linked to neurodevelopmental disorders; potential role in cognitive dysfunction in ME/CFS.

    7. AHCYL2 (Acyl-CoA Hydrolase Activating Protein 2): Mitochondrial function, homocysteine metabolism. Moderate. Mitochondrial dysfunction is a hallmark of ME/CFS, and AHCYL2 is crucial for mitochondrial health.

    8. NLGN1 (Neuroligin 1): Synaptic function, neuronal communication. Weak. Similar to NLGN2, potentially involved in neurological aspects of ME/CFS.

    9. DLGAP4 (Discs Large-Associated Protein 4): Synaptic scaffolding, neuronal development. Weak. May contribute to synaptic dysfunction in ME/CFS.

    10. HDAC1 (Histone Deacetylase 1): Epigenetic regulation, gene expression. Moderate. Like DNMT3A, HDAC1 is involved in epigenetic modifications and could contribute to altered gene expression in ME/CFS.

    11. AMPD2 (Adenosine Monophosphate Deaminase 2): Purine metabolism, energy production. Strong. Frequently implicated in ME/CFS, with mutations associated with fatigue and exercise intolerance. Plays a critical role in muscle energy metabolism.

    12. AHCYL1 (Acyl-CoA Hydrolase Activating Protein 1): Mitochondrial function, homocysteine metabolism. Moderate. Similar to AHCYL2, important for mitochondrial health and energy production.

    13. SHARPIN (SHANK-Associated Right-handed Protein 1): Protein trafficking, immune regulation. Weak. Linked to immune deficiencies; potential role in immune dysregulation in ME/CFS.

    14. NME2 (Nucleoside Diphosphate Kinase 2): Energy metabolism, nucleotide synthesis. Moderate. Involved in energy transfer and nucleotide metabolism; potentially affected in ME/CFS due to mitochondrial dysfunction.

    15. NME1-NME2 (Nucleoside Diphosphate Kinase 1-2): Energy metabolism, nucleotide synthesis. Moderate. Similar to NME2, plays a role in energy metabolism.

    16. CACNA2D3 (Calcium Voltage-Gated Channel Auxiliary Subunit Alpha2delta3): Neuronal excitability, pain signaling. Weak. May be involved in neurological symptoms and pain associated with ME/CFS.

    17. NME3 (Nucleoside Diphosphate Kinase 3): Energy metabolism, nucleotide synthesis. Weak. Similar to NME1-2, plays a role in energy metabolism.

    18. ZC3H13 (Zinc Finger C3H1 Type Containing 13): RNA metabolism, immune regulation. Weak. Potential role in immune dysregulation and altered gene expression.

    19. CAMK2A (Calcium/Calmodulin-Dependent Protein Kinase II Alpha): Synaptic plasticity, neuronal signaling. Moderate. Involved in synaptic function and neuronal signaling; potentially affected by neuroinflammation in ME/CFS.

    20. PIK3CA (Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha): Cell signaling, growth, and metabolism. Moderate. Involved in multiple signaling pathways; potential role in immune dysregulation and metabolic abnormalities in ME/CFS.
    II. Potential Interactions & Pathways

    Here's where things get complex. I'll group genes based on shared pathways and potential interactions:

    • Mitochondrial Dysfunction & Energy Metabolism (AHCYL1, AHCYL2, AMPD2, NME1-NME2, NME2, NME3): This is a strong cluster. AHCYL1/2 are critical for mitochondrial function. AMPD2 is directly involved in muscle energy metabolism and is frequently implicated in ME/CFS. NME1-3 are involved in nucleotide and energy transfer. Dysfunction in any of these could contribute to the profound fatigue and post-exertional malaise seen in ME/CFS. Interaction: Mutations or altered expression in AMPD2 could exacerbate the effects of AHCYL1/2 dysfunction, leading to impaired energy production.

    • Synaptic Function & Neuroinflammation (NLGN1, NLGN2, SYNGAP1, DLGAP4, CAMK2A): This group is involved in synaptic plasticity and neuronal communication. Disruptions in these genes could contribute to cognitive dysfunction ("brain fog") and neurological symptoms. Interaction: NLGN1/2 and SYNGAP1 likely interact to regulate synaptic development and function. CAMK2A is a downstream effector of synaptic activity and could be affected by changes in NLGNs or SYNGAP1. Neuroinflammation could further disrupt these interactions.

    • Immune Regulation & Epigenetics (DNMT3A, HDAC1, SHARPIN, ZC3H13, PIK3CA): This group links immune function with epigenetic regulation. DNMT3A and HDAC1 are key epigenetic modifiers. SHARPIN is involved in immune regulation. PIK3CA is a signaling molecule involved in immune cell activation. Interaction: PIK3CA signaling could influence epigenetic modifications mediated by DNMT3A and HDAC1, altering immune cell function. SHARPIN could be affected by these epigenetic changes.

    • Energy Metabolism & Signaling (NME1-3, PIK3CA): PIK3CA is a signaling molecule that can influence energy metabolism. Interaction: PIK3CA signaling could affect the activity of NME enzymes, influencing energy transfer and nucleotide synthesis.
    III. Proposed Mechanisms in ME/CFS

    Based on these interactions, here's a proposed model:
    1. Initial Trigger (e.g., viral infection, stress): An initial trigger leads to systemic inflammation and oxidative stress.
    2. Mitochondrial Dysfunction: The trigger impairs mitochondrial function (AHCYL1/2, AMPD2, NME enzymes), leading to reduced energy production and increased oxidative stress.
    3. Immune Dysregulation: Inflammation and oxidative stress activate immune cells via PIK3CA signaling. Epigenetic modifications (DNMT3A, HDAC1) alter immune cell function, potentially leading to autoimmunity.
    4. Neuroinflammation & Synaptic Dysfunction: Inflammation spreads to the nervous system, activating glial cells and disrupting synaptic function (NLGNs, SYNGAP1, CAMK2A). This contributes to cognitive dysfunction and neurological symptoms.
    5. Feedback Loop: Mitochondrial dysfunction, immune dysregulation, and neuroinflammation create a self-perpetuating cycle, exacerbating symptoms and leading to chronic illness.

    IV. Caveats

    • This is a complex disease, and many other factors likely contribute to ME/CFS.
    • The interactions described above are hypothetical and require further research to validate.
    • Genetic predisposition, environmental factors, and individual variability all play a role.
     
  9. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,219
    Location:
    London, UK
    That is certainly a quick way to get an answer!
    But it seems to have brought in all the usual memes about neuroinflammation and autoimmunity and stuff.

    Nevertheless, grist to the mill.
     
    Lilas, Kitty, MeSci and 6 others like this.
  10. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,219
    Location:
    London, UK
    There seems little doubt that the list points to signalling systems and neural systems in particular. Although it may be of note that neural synapses and immune cell 'synapses' can share mechanisms, and do for some of these genes.

    It looks as if some of these genes do point to skeletal muscle dysfunction being relevant. But the fact that people are well before they get ME/CFS suggests maybe that these kick in once some signalling pathway has disturbed immune or neural regulation and these genes are important for muscle coping in an adverse signalling environment - as suggested already.

    A lot of the genes seem to relate to high-level general regulatory processes such as DNA transcription and in that sense may not give very specific clues but there is still a sense that they may be important for 'learnt' dysregulation - rather as proposed for 'learnt' innate immune responses.

    The fact that leptin has come up again seems unlikely to be a coincidence.

    I think we are seeing real stuff here. And when DecodeME reports I think there must be a high chance that at least some things will chime with this, even if the different methodology may pick out different specific genes. Then we should surely be motoring. It is beginning to make sense I think. And some of the stuff from before we were uncertain about is likely to turn out to have been real. Having a gene signal just makes it so much easier to be confident about.
     
  11. Sasha

    Sasha Senior Member (Voting Rights)

    Messages:
    5,559
    Location:
    UK
    SequenceME intends to examine the DNA of up to 17,000 PwME with whole genome sequencing, presumably because they think the big numbers are necessary in order to give a useful signal. I don't understand much about what's going on in this paper but can we trust results based on only 247 cases?
     
    Kitty, MeSci, hotblack and 1 other person like this.
  12. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,414
    Location:
    Aotearoa New Zealand
    The numbers in SequenceME might just be because that is how many samples they have, so they might as well use them?


    I don't think it's a coincidence that chronic ciguatera looks a lot like ME/CFS, and it seems to involve synaptic dysfunction (calcium as well as sodium gated channels). Gulf War Illness has been related to synaptic dysfunction too. I reckon there is something about synapses stopping working after use, and taking a long time to become useful again.
     
    Last edited: May 12, 2025 at 9:22 AM
  13. Sasha

    Sasha Senior Member (Voting Rights)

    Messages:
    5,559
    Location:
    UK
    That seems unlikely, given that that team seem desperate to get answers for PwME as soon as possible.

    I wonder if they want big numbers to go for a classic approach because deep learning models might not be trustworthy. They seem very black box. I worry that we might overlook the opacity of the method if the results fit with our ideas of what ME/CFS might be. Because lots of ideas have flown about over the years, there are perhaps quite a few models to choose from.

    I'm speaking from a position of pig-ignorance but should we worry that we're projecting our wishes onto these results? Confirmation bias?
     
    Kitty, MeSci, hotblack and 2 others like this.
  14. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,414
    Location:
    Aotearoa New Zealand
    I don't think we have to worry too much, because these are just hints and better information is coming. I don't think people are going to claim they know what ME/CFS is and how to treat it on the basis of this study.
     
    Kitty, MeSci, bobbler and 4 others like this.
  15. Sasha

    Sasha Senior Member (Voting Rights)

    Messages:
    5,559
    Location:
    UK
    I think that raises the question of 'why not', though. If a small, deep-learning study spits out lots of statistically significant genetic associations, do we take them seriously or not? If we have to wait for better information, is this information not reliable? And if it isn't, what's the point of doing such studies?
     
  16. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    1,105
    Location:
    UK
    My understanding is that the bare minimum for whole genome sequencing is 1000, which I think is bigger than this study. But as with GWAS, bigger is much better. Sequencing is incredibly expensive, so I’m pretty sure the larger sample size is for greater power to understand the problem. But I’d be surprised if they got funding for as many as 17,000.
     
  17. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,219
    Location:
    London, UK
    I think it is worth remembering that these numbers for genetic studies are the numbers people think you should collect in order to be reasonably sure that you have picked up most of the significant genetic links. Some genetic links will turn up with very small numbers.

    For instance, 95% of people with ankylosing spondylitis have HLA-B27 whereas in the general population it is 8%. Ten patients and ten controls would almost certainly be significant if you just looked at HLA-B. The numbers you need also depend on how many possibilities are likely to turn up with your chosen method - SNPs, rare gene panel etc.

    I don't think we should discount results on 250 cases if the stats look solid. Moreover, I am pretty sure that these results are not noise. There are definite patterns and replications from before.
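
    As a rough check of the ankylosing spondylitis example above, here is a minimal sketch of a two-sided Fisher exact test on ten patients and ten controls. The counts are assumptions rounded from the quoted 95% and 8% figures (9 of 10 patients and 1 of 10 controls carrying HLA-B27), and the function name is made up for illustration. It is not the method used in the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k carriers in the first row,
        # given fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))


# Assumed counts: rows are patients and controls,
# columns are HLA-B27 carriers and non-carriers.
p_value = fisher_exact_two_sided(9, 1, 1, 9)
print(f"p = {p_value:.4f}")  # p = 0.0011
```

    With these assumed counts the p-value comes out at about 0.001, so a single pre-specified comparison with a very large effect can reach significance with only twenty people. The same result would not survive correction if thousands of variants were being tested at once, which is the point about the number of possibilities your method throws up.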
     
    AliceLily, Lilas, Hutan and 9 others like this.
  18. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    1,105
    Location:
    UK
    I’m pretty sure the experience of GWAS was that you do need decent sample sizes to get reliable results – not just more results. The early history was getting almost everything wrong. The exceptions are where you are looking for very big differences, like your ankylosing spondylitis example, and also APOE4 in Alzheimer’s disease. I’m sure George Davey Smith published papers on this, which led to a big change in methodology, including much bigger sample sizes.
    Perhaps the telling thing was that in the GWAS field they noticed they had a replication crisis and did something about it.

    I don’t know if that experience translates to these newer types of analysis
     
    Last edited: May 12, 2025 at 2:50 PM
    Binkie4, Hutan, geminiqry and 11 others like this.
  19. hotblack

    hotblack Senior Member (Voting Rights)

    Messages:
    708
    Location:
    UK
    I’m jealous of your laptop :) (and have found even the smaller Gemma3 models and certainly Gemini 2.5 Flash useful for exploring topics and helping me understand roughly what papers or people are talking about at times, not sure asking them to deduce is worth much beyond a bit of fun though)

    That’s key isn’t it. These are newer methods to find the signals in the data and it is hoped they will find things more efficiently. But they are newer, different methods from what has been used in the past. GWAS is a known quantity.
    I really like the questioning though @Sasha and think you’re right to raise the points. Hopefully all will become clear and we’ll understand more in the coming months.
     
  20. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,219
    Location:
    London, UK
    I think big numbers are probably more important for SNP fishing, although I know less about the rare gene approach.
     
    Deanne NZ, Kitty, Simon M and 3 others like this.
