Preprint Dissecting the genetic complexity of myalgic encephalomyelitis/chronic fatigue syndrome via deep learning-powered genome analysis, 2025, Zhang+

Discussion in 'ME/CFS research' started by SNT Gatchaman, Apr 17, 2025.

  1. forestglip

    Thanks @jnmaciuch. So maybe this isn't quite as exciting as what I thought was happening, which did seem too good to be true: lots and lots of synaptic genes coming up independently. As far as I understand your description, it's more like this: if the model sees some variants in, say, both NLGN1 and DLGAP3, it decides it might as well invite all the rest of the synaptic genes to the party.

    On another note, I wonder if there's any reason to be concerned about a point brought up by @mariovitali's AI:
    I don't know much about the machine learning world, but most or all of the reported confidence intervals do seem very small, for example AUROC: 0.670 ± 0.003. Maybe with a test set this small, getting an AUROC of 0.670 just by chance isn't actually that unlikely.
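To get a feel for how wide the sampling uncertainty on an AUROC of 0.670 could be, here is a minimal sketch using the Hanley and McNeil (1982) approximation for the standard error of an AUC. The cohort sizes below are hypothetical placeholders, not the preprint's actual test-set numbers, and this is not necessarily how the authors computed their interval.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)

# Hypothetical test-set sizes (assumption: equal numbers of cases and controls).
for n in (25, 50, 200):
    se = hanley_mcneil_se(0.670, n, n)
    print(f"n_pos = n_neg = {n}: 0.670 +/- {1.96 * se:.3f} (approx. 95% CI)")
```

With 50 cases and 50 controls, for instance, this gives a standard error of roughly 0.05, far larger than 0.003, so the plausible range depends heavily on how many people are in the test cohort.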
     
  2. jnmaciuch

    Yeah, it is an artificially small confidence interval, but in my opinion you don’t particularly need one for validating on an independent test cohort. And even widening the confidence interval by an order of magnitude (to roughly ± 0.03) wouldn’t bring the AUC down to 0.5, i.e. no better than random assignment.

    I think the bigger problem is what EndME already mentioned: we just don’t know which of the genes are recapitulated in the test cohort. But if we’re interpreting the results simply in terms of which pathways might be involved, rather than their capacity to distinguish ME/CFS from controls, I think we can afford to be more lax.

    Others may have different feelings on these points.
     
  3. forestglip

    Sure, I don't know how large a CI would be expected if they had only run it once. I was just worried that it would actually be much bigger and cross 0.5, that the genes may be more or less random, and that they happened to get 0.670 by chance on the test set. I'm thinking more about this after seeing that there seems to be pretty much nothing special about any of these genes when looking at the LoF rare variant genes for CFS in the BioBank. I'm still just learning the basics of genetic studies, though, so I could definitely be wrong in this interpretation.
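The "0.670 by chance" worry can be sketched as a small simulation: give cases and controls purely random scores, compute the AUROC, and count how often it reaches 0.670. The cohort sizes, the number of simulations, and the function names are all assumptions for illustration; none of them come from the preprint.

```python
import random

def auroc(pos_scores, neg_scores):
    """Fraction of case/control pairs where the case scores higher (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def chance_rate(observed_auc, n_pos, n_neg, n_sims=1000, seed=0):
    """Share of simulations in which random scores reach at least observed_auc."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        pos = [rng.random() for _ in range(n_pos)]
        neg = [rng.random() for _ in range(n_neg)]
        if auroc(pos, neg) >= observed_auc:
            hits += 1
    return hits / n_sims

# Hypothetical cohort sizes: smaller test sets reach 0.670 by luck more often.
for n in (10, 30, 60):
    print(f"n_pos = n_neg = {n}: chance rate = {chance_rate(0.670, n, n):.3f}")
```

This only covers a model with no real signal evaluated once on a held-out set. It says nothing about which individual genes replicate.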

    Though depression being the top hit seems unlikely to be a coincidence, given the 4,000 other phenotypes it could have been, so it does make me think there are some real genes there at least, though maybe not specific to an ME/CFS mechanism.
     
