Preprint Dissecting the genetic complexity of myalgic encephalomyelitis/chronic fatigue syndrome via deep learning-powered genome analysis, 2025, Zhang+

Discussion in 'ME/CFS research' started by SNT Gatchaman, Apr 17, 2025.

  1. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,190
    This is the Wikipedia entry for NOTCH signalling pathway. No idea what it means.
    https://en.wikipedia.org/wiki/Notch_signaling_pathway
    It also goes on to state that Notch signalling has a role in neuronal function and development (among many other roles).

    I checked my WGS and am heterozygous for a missense mutation on NOTCH1, the highest ranked gene on your list. ClinVar lists the variant as benign, and CADD = 22.5, which seems high.
    I also have a 2nd heterozygous intron variant listed as significant on NOTCH1.
     
    Last edited: May 21, 2025 at 1:33 AM
  2. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    771
    Location:
    USA
    It just refers to all the signaling that happens as a result of the NOTCH protein binding to one of its ligands, which is a pretty broad category. It’s involved in a lot of biological processes, most notably in early embryonic development and growth, which is why mutations in it are highly associated with cancer. In later life it’s very important for neural plasticity, angiogenesis (new blood vessels), basically any situation where you need new cells to grow.

    [Edit: this is a very well characterized pathway so it would not surprise me that the protein-protein interactions are strong in STRING]
     
    Last edited: May 21, 2025 at 1:43 AM
  3. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,349
    Final thing for now because I am exhausted. I ran GSEA with the cellular component collection since I already ran the same one on the Genebass data.

    Link to list of enriched component gene sets in Genebass

    I decided not to use collapsePathways here. Since I am checking whether any gene sets match between the two cohorts, I don't want to remove any pathways that might be in the other dataset.

    So here's the top 30 out of 469 gene sets:
    cc_nocollapse_30.png

    And in text form:
    The only ones in these top 30 that just might be enriched in the Genebass analysis as well (rank and FDR from the Genebass values):

    Rank | Gene Set | unadjusted p | FDR
    5 | GOCC_ESCRT_COMPLEX | 0.004 | 0.237
    9 | GOCC_U5_SNRNP | 0.033 | 0.446
    21 | GOCC_EXCITATORY_SYNAPSE | 0.003 | 0.517

    Edit: Fixed rankings in the final 3 listed genes.

    Edit: Added unadjusted p to final list.
     
    Last edited: May 21, 2025 at 4:27 PM
  4. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,349
    I thought it might be interesting, but when I go to Genebass and filter by any genes with "NOTCH" in the name, none of the six NOTCH genes are significant at all.
     
    hotblack, Kitty, mariovitali and 2 others like this.
  5. DMissa

    DMissa Senior Member (Voting Rights)

    Messages:
    203
    Location:
    Australia
    Finally working through this paper thread armed with a clear brain, some time better spent on paper writing and most importantly a cuppa

    proteasome subunits consistently come up as dysregulated in our work in ME cell lines or primary cells from different tissues, have not looked yet at specific overlaps in subunits or directions between tissues, or at functional assays yet, but there's something here even if subtle. However it may just be the case that something as closely married to a central process such as protein translation would be expected to be affected by probably any flavour of aberrant or disrupted homeostasis. There are also a lot of these subunits, so coming up by chance is also possible (I can do the stats to determine the likelihood of this, but it's been less important than other things I am working on)

    for me the upstream hint is more interesting - what is the mechanism of regulation of the affected gene? that gives you a target to work back against. In terms of downstream function, if one was convinced it was worth pursuing it is directly assayable

    separate note, HDAC1 coming up is interesting https://pmc.ncbi.nlm.nih.gov/articles/PMC6787670/

    aaand I am out of time for now
     
    Last edited: May 21, 2025 at 2:42 AM
    hotblack, Kitty, Sean and 8 others like this.
  6. SNT Gatchaman

    SNT Gatchaman Senior Member (Voting Rights) Staff Member

    Messages:
    6,808
    Location:
    Aotearoa New Zealand
    HDAC1 also in Integrative Multi-Omics Framework for Causal Gene Discovery in Long COVID (2025, Preprint: MedRxiv)

     
    Last edited: May 21, 2025 at 3:06 AM
    hotblack, Kitty, Hutan and 4 others like this.
  7. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,422
    Location:
    London, UK
    Indeed, but then that is true for depressive illness it seems. It is all a bit handwaving, especially when alcohol is supposed to be a cerebral depressant and yet it makes people sing rude songs and toboggan down the Main Street.
     
    Sean, Kitty, hotblack and 1 other person like this.
  8. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    1,111
    Location:
    UK
    Thanks for the responses, @forestglip and @jnmaciuch.

    My concern about AUC wasn't anything to do with diagnosis (it's too low to be useful), but as a way to demonstrate the biological validity of the findings. The authors say:
    My italics above.

    In this case, they are stressing the generalisability re the independent test. But, while I can see the argument on generalizability, I don't really see why Heal2 wouldn't also perform better in cross-validation, even if not to the same extent, if it is better at picking up biologically relevant genes.

    My bigger concern is that the test cohort is so small at 36 cases/21 controls, and I wouldn't want to hang my hat on that when it comes to evidence of biological relevance.
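
    To put a rough number on the small-cohort concern, here is a minimal sketch using the Hanley-McNeil approximation for the standard error of an AUC. The AUC value below is a hypothetical placeholder (the paper's figure is not quoted in this post), and the 500/500 comparison cohort is likewise an assumption, loosely based on the "cohort of around 500" suggested in this post.

```python
import math

def hanley_mcneil_se(auc, n_cases, n_controls):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)

def approx_ci(auc, n_cases, n_controls, z=1.96):
    """Normal-approximation 95% interval, clipped to [0, 1]."""
    se = hanley_mcneil_se(auc, n_cases, n_controls)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)

ASSUMED_AUC = 0.65  # hypothetical value for illustration only

# Test cohort size from the post: 36 cases, 21 controls.
small_cohort_ci = approx_ci(ASSUMED_AUC, 36, 21)
# Assumed larger cohort: 500 cases and 500 controls.
larger_cohort_ci = approx_ci(ASSUMED_AUC, 500, 500)

print("36 cases / 21 controls:", small_cohort_ci)
print("500 cases / 500 controls:", larger_cohort_ci)
```

    The interval for the 36/21 cohort comes out several times wider than for the assumed 500/500 cohort, which is the point being made: a single AUC on so few people is weak evidence either way.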

    A solution would be to use UKB for a test cohort. @Hutan suggested the authors probably tried out UKB data and found little, but we know from the recent Samms/Ponting cohort quality paper that the four cohorts available all have issues. However, 95 cases appear in all 4 cohorts; these are probably reliable diagnoses and would give a bigger cohort than Cornell, with the likelihood of a much larger control group (and one matched very well, since there are 500k in the UKB). Defining a decent cohort of around 500 should not be too difficult.

    That could give greater confidence in all the interesting, more specific results.
     
    Last edited: May 21, 2025 at 8:36 AM
    Snow Leopard, hotblack, Ron and 7 others like this.
  9. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,523
    Location:
    Aotearoa New Zealand
    @forestglip subsequently did the comparative analysis of the identified rare variants on a UK Biobank group labelled CFS (around 2000 people) and found there was little commonality.

    But yes, a high quality ME/CFS group and an analysis that focussed on pathways rather than individual variants could be good.
     
    hotblack, Kitty, forestglip and 3 others like this.
  10. Braganca

    Braganca Senior Member (Voting Rights)

    Messages:
    408
    Curious.. Have any of the authors been invited to this thread, or will some of you update them on what you are finding/ thinking in comments on their preprint?
     
    Kitty, hotblack, forestglip and 4 others like this.
  11. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    1,111
    Location:
    UK
    Sorry, I haven’t been keeping up. Interesting analysis. But 28% of that group reported good or excellent health, and there were other issues. Though I don’t think it’s so bad a cohort, because it’s not simply “self-reported”. People were asked if they had a serious illness or disability diagnosed by a doctor. If they did, a nurse interviewing them later asked which illness, without prompting. We don’t know how the nurse recorded the replies, but there were a large number of options to choose from, including chronic fatigue syndrome (not chronic fatigue), and the nurse could also enter other options in a text box. So it’s quite different from someone checking a CFS tick box.
     
    Snow Leopard, hotblack, Kitty and 2 others like this.
  12. Sasha

    Sasha Senior Member (Voting Rights)

    Messages:
    5,607
    Location:
    UK
    If no one has already, should someone be contacting Zhang et al. to attempt to replicate using the DecodeME database, and feeding back to them any concerns about design and interpretability discussed here before they do so?

    Edit: I forgot that DecodeME is a GWAS and that Zhang et al. look at WGS data... o_O
     
    Last edited: May 21, 2025 at 9:45 AM
    Kitty and Deanne NZ like this.
  13. hotblack

    hotblack Senior Member (Voting Rights)

    Messages:
    803
    Location:
    UK
    Some of the symptoms for us seem to be the body trying to respond to whatever underlying problem exists. At least that’s what it sometimes feels like. That could explain the paradoxical nature?

    I also had a period of quite severe depression in my late teens (at least partially triggered by mefloquine) and that had periods of sleeplessness for days, or more wired, even borderline psychosis and plenty of anxiety, etc. So inhibitory and excitatory can often go hand in hand.

    This experience has helped me see the significant differences between depression and ME/CFS but it would be interesting if there’s an underlying predisposition to both.
     
    bobbler, Sean, Kitty and 1 other person like this.
  14. Kiristar

    Kiristar Senior Member (Voting Rights)

    Messages:
    211
    Like you, I seemed to have an episode of depression some years prior to my actual ME diagnosis, although my experience was nothing but inhibitory. Interestingly, my father had a bipolar diagnosis.

    I do worry a bit how that particular finding might be misused by the BPS lot.
     
    bobbler, Kitty and hotblack like this.
  15. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,349
    I have the same concern about the test set being so small.

    I wouldn't say their model definitely wouldn't replicate on a UK Biobank cohort based on that. The comparison they and I did on the Biobank is much less sophisticated than if they had actually used their model for classification.

    Edit: But issues with participant selection in the Biobank might still water down the already fairly low AUC.
     
    Last edited: May 21, 2025 at 11:28 AM
    Hutan, bobbler, Deanne NZ and 2 others like this.
  16. mariovitali

    mariovitali Senior Member (Voting Rights)

    Messages:
    575
    So I did some preprocessing to extract all genes from the links @forestglip gave, identified which genes are duplicated, and fed them to the information retrieval system I have been using.

    The results from my information retrieval system show that we are looking at the following concepts:

    Methyltransferases, DNA methylation, protein stability and protein interaction, post-translational modifications, proteasome, histone methyltransferases (a subset of methyltransferases), histone deacetylase

    Plugging in the genes that had multiple entries leads to the above results, plus ATP hydrolysis, which makes it onto the ranking list, as well as retinoic acid, STAT3 and G-actin.

    Interestingly, @TamaraRC's hypothesis discusses the BHMT gene, which is a methyltransferase (tagging @DMissa). The thread for this hypothesis can be found here: https://www.s4me.info/threads/a-sys...nergic-imbalance-in-me-cfs-2025-carnac.44116/

    Just in case, I contacted @TamaraRC to see whether issues in any other methyltransferases could lead to problems according to her hypothesis (mentioning this in case anyone else wishes to comment)

    Please note: I have not verified the gene extractions manually so far; I will try to do so ASAP.
     
    Hutan, Jacob Richter, Sean and 2 others like this.
  17. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,349
    Just want to be sure I understand. You noted down all genes that are included in any of the top gene sets? And in what way did you identify whether they're duplicated?
     
    Hutan and Kitty like this.
  18. Evergreen

    Evergreen Senior Member (Voting Rights)

    Messages:
    471
    Stumbled on this and thought of your question, @Hutan . There seem to be a bunch of studies on COVID-19 and sperm.
     
    Hutan, Deanne NZ and hotblack like this.
  19. mariovitali

    mariovitali Senior Member (Voting Rights)

    Messages:
    575
    OK, I began by taking the first set of genes, s1, by following this link: https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/GOCC_SWI_SNF_COMPLEX and clicking on "(show 30 source identifiers mapped to 30 genes)". This is the first set of genes that were extracted.

    Then I did the same for s2 ... s30, which is 30 sets in total.

    I then created a list L1 containing the unique names of these genes and a second list L2 containing only the genes that appeared more than once across the 30 sets. Then I submitted L1 and L2 to the information retrieval system.
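
    The list-building step described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the set names and the gene symbols are hypothetical placeholders rather than the actual MSigDB contents, and the real extraction from the gene set pages is not shown.

```python
from collections import Counter

def summarize_gene_sets(gene_sets):
    """Return (L1, L2): all unique genes, and genes found in more than one set.

    gene_sets maps a gene set name to its list of gene symbols.
    """
    set_membership = Counter()
    for genes in gene_sets.values():
        # Use set() so a gene listed twice within one set counts once.
        set_membership.update(set(genes))
    unique_genes = sorted(set_membership)
    repeated_genes = sorted(g for g, n in set_membership.items() if n > 1)
    return unique_genes, repeated_genes

# Hypothetical toy input standing in for the 30 gene sets.
example_sets = {
    "SET_A": ["GENE1", "GENE2", "GENE3"],
    "SET_B": ["GENE2", "GENE4"],
    "SET_C": ["GENE2", "GENE3", "GENE3"],
}

L1, L2 = summarize_gene_sets(example_sets)
print("L1 (unique genes):", L1)
print("L2 (genes in more than one set):", L2)
```

    Note that a gene landing in L2 only shows that it is annotated to several gene sets; by itself that says nothing about how much weight the model gave it.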
     
  20. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,349
    Oh, I don't think this will tell us much. Most of the genes in those gene sets were not important, and the gene sets themselves might not be the best groupings of the genes that were important. Potentially, the overall gene sets/pathways themselves might provide clues, but I wouldn't do anything with specific genes in them, especially if the attention scores for those genes weren't high at all in the model.

    And replication across gene sets isn't useful. It just means a gene is involved in more than one pathway, and we aren't sure which, if any, is the important one in ME/CFS.
     
    Hutan and hotblack like this.
