Preprint Mobius: Mixture-Of-Experts Transformer Model in Epigenetics of ME/CFS and Long COVID, 2025, Acharya & Jacoby

Discussion in 'ME/CFS research' started by John Mac, May 28, 2025 at 5:32 PM.

  1. John Mac

    John Mac Senior Member (Voting Rights)

    Messages:
    1,088
    https://www.biorxiv.org/content/10.1101/2025.05.25.656018v1

    Abstract

    Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Long COVID are chronic debilitating post-infectious illnesses that collectively affect up to 470 million individuals. Unlike illnesses of comparable scale, there are no validated blood or imaging tests for the clinical diagnosis of these conditions.

    Currently, these conditions are diagnosed through clinical exclusion, resulting in approximately 90% of ME/CFS patients being incorrectly diagnosed as Long COVID patients. This misdiagnosis contributes to delayed care and millions of dollars in healthcare burdens.

    We present Mobius, a transformer-based model that uses autoencoder-derived features from blood DNA methylation to distinguish ME/CFS, Long COVID, and healthy controls.

    Using 852 samples from 14 distinct datasets, our method employs three innovations:
    (i) self-supervised masked pretraining to learn epigenetic patterns,
    (ii) a sparsely-gated mixture-of-experts architecture to handle heterogeneous data, and
    (iii) an adaptive computation time mechanism for dynamic inference.

    Mobius achieved 97.06% accuracy (macro-F1 0.95, AUROC 0.96), outperforming current symptom-based diagnostics (58%) and baseline models such as XGBoost (82%).
    Ablation experiments showed that pretraining added 6% accuracy and that the gating and adaptive depth contributed an additional 7%.
    Our open-source pipeline could enable a much-needed objective blood test for these conditions and guide targeted precision medicine therapies.
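    For readers unfamiliar with item (ii), here is a minimal sketch of what "sparsely-gated mixture-of-experts" routing generally means. It is not the authors' code: `top_k_gate`, `moe_forward`, the gate matrix and the toy experts are all hypothetical names made up for illustration, and the sketch assumes each expert returns a vector of the same length as its input.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(gate_logits: Sequence[float], k: int) -> List[float]:
    """Keep the k largest gate logits, renormalise them with softmax,
    and give every other expert a weight of exactly zero."""
    if not 1 <= k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    chosen = order[:k]
    kept = softmax([gate_logits[i] for i in chosen])
    weights = [0.0] * len(gate_logits)
    for idx, w in zip(chosen, kept):
        weights[idx] = w
    return weights


def moe_forward(x: Vector,
                gate_matrix: Sequence[Sequence[float]],
                experts: Sequence[Callable[[Vector], Vector]],
                k: int = 2) -> Vector:
    """Route x to the top-k experts and return their weighted sum.

    gate_matrix holds one row of (hypothetical) gating weights per expert.
    Assumes every expert returns a vector with the same length as x.
    """
    gate_logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_matrix]
    weights = top_k_gate(gate_logits, k)
    out = [0.0] * len(x)
    for weight, expert in zip(weights, experts):
        if weight == 0.0:
            continue  # sparse routing: experts with zero weight are never run
        y = expert(x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out
```

    The point of the sparsity is that only k experts are evaluated per sample, so different experts can in principle specialise on different source datasets. Whether Mobius routes samples this way is something only the released code could confirm.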
     
  2. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,586
    Location:
    London, UK
    If symptom-based diagnosis is only 58% accurate then who decided what 100% accurate was, in order to match up 97% with the test?
     
  3. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
    97.06% accuracy is very high. One thing is that all ME/CFS samples were pre-pandemic, so some of the difference between ME/CFS and LC might be related to time of sampling.
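    One rough way to check for this kind of confound, sketched under assumptions: ask how well the diagnosis can be predicted from the source dataset alone, without looking at any methylation values. The function name `batch_only_accuracy` and the toy dataset IDs are hypothetical and invented for illustration; none of this comes from the preprint.

```python
from collections import Counter, defaultdict
from typing import Hashable, Iterable, Tuple


def batch_only_accuracy(samples: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Accuracy of a 'classifier' that sees only the batch (dataset) ID
    and always predicts the most common label within that batch.

    Each sample is a (batch_id, label) pair. A value near 1.0 means the
    label is almost fully determined by which dataset a sample came from.
    """
    by_batch = defaultdict(Counter)
    for batch, label in samples:
        by_batch[batch][label] += 1
    total = sum(sum(counts.values()) for counts in by_batch.values())
    if total == 0:
        raise ValueError("no samples given")
    correct = sum(max(counts.values()) for counts in by_batch.values())
    return correct / total


# Invented toy data: every ME/CFS sample comes from one dataset and every
# Long COVID sample from another, so the dataset ID alone gives the label.
confounded = [("dataset_A", "ME/CFS")] * 3 + [("dataset_B", "LC")] * 3
print(batch_only_accuracy(confounded))
```

    If this number is high for the disease classes, a model could score well by learning dataset or sampling-era differences rather than disease biology. Holding out entire datasets during validation is the usual way to probe that.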

    But to not misclassify any samples between HC and disease is striking. It implies there were virtually no misdiagnoses in the samples they used. I guess potentially it's possible, but generally 100% accuracy is suspicious in machine learning.
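    For anyone wanting to see how the headline numbers relate, here is a small sketch that computes accuracy and macro-F1 from a confusion matrix. The matrix in the example is invented purely for illustration (it is not the preprint's confusion matrix), and the function names are my own.

```python
from typing import Sequence


def accuracy(cm: Sequence[Sequence[int]]) -> float:
    """Fraction of samples on the diagonal (rows = true, columns = predicted)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_f1(cm: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    n = len(cm)
    scores = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                         # true class i, predicted other
        fp = sum(cm[r][i] for r in range(n)) - tp    # other class, predicted i
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Invented example, class order HC, ME/CFS, LC: no errors between HC and
# disease, a few confusions between the two disease classes.
toy_cm = [
    [50, 0, 0],
    [0, 10, 2],
    [0, 1, 9],
]
print(accuracy(toy_cm), macro_f1(toy_cm))
```

    With a large, perfectly classified control group, overall accuracy can sit well above macro-F1, because the errors are concentrated in the smaller disease classes. That is one reason per-class results matter more than the single accuracy figure.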

    They don't seem to cite a source, but I assume the 58% comes from some other study that looked at under/over-diagnosis.
     
  4. InitialConditions

    InitialConditions Senior Member (Voting Rights)

    Messages:
    1,895
    Location:
    North-West England
    eh?
     
  5. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    17,586
    Location:
    London, UK
    So the authors alone have a magic MEdiagnostoscope?
     
  6. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
  7. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
    There's a whole fancy website showing off the model and recommending that clinicians use it in their practice: https://mobius.dnanalyzer.org/

    It includes some findings not included in the preprint:
    [Attached image: Screenshot_20250528-144002.png]

     
  8. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,213
    It would be nice if they would apply to the CureME Biobank for blinded samples to prove out their technique before offering the tools for clinical use. I wonder what sort of funding they need to extract the DNA data and methylation patterns from such a cohort.

    Looks like they are planning an API interface and might post the code on GitHub for others to use. The GitHub link points to a nonexistent or private repository. Their last update to their public repositories was 5 years ago.
    https://github.com/mobius?tab=repositories

    The repository has 1,400 stars, which seems relatively large.
     
    Peter Trewhitt, alktipping and Hutan like this.
  9. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,213
    The GitHub link above came from their website. Here is the GitHub repository for the DNAnalyzer listed in the paper.
    https://github.com/VerisimilitudeX/DNAnalyzer

     
  10. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
    The lead author, Piyush Acharya, seems to be in high school. A news article says he was in 10th grade in Jan 2024.

    He apparently started a non-profit called DNAnalyzer three years ago. The GitHub page says "Supported by Anthropic. [...] Our impact has been recognized by Y Combinator, the organizers of the AI World's Fair Expo, and the CEO of DEV." I tried looking up any association of this company with Anthropic or Y Combinator but can't find anything online.

    The website says that in 2023, "DNAnalyzer becomes a fiscally sponsored 501(c)(3) nonprofit organization (EIN: 81-2908499) to advance DNA analysis technology." That EIN belongs to an organization called Hack Foundation, which has been a non-profit since 2016.

    His personal website says he also leads the frontend development team for pickleball.com, and has worked there since Jan 2024.

    Edit: His GitHub also says he is "a researcher at the University of Victoria, where I apply machine learning techniques to retrieval-augmented generation and epigenetics" but I can't find any connection between him and that university.
     
    Last edited: May 28, 2025 at 9:07 PM
  11. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    794
    Location:
    USA
    Having recently supervised high school interns at my previous lab, I’m pretty certain that it’s a similar situation. There’s a whole contingent of high school students who are being encouraged to publicize their coding projects, do internships at local universities resulting in published papers, and register non-profits in their own name, all for their eventual application to an Ivy League college. That’s not to say this kid isn’t impressive or isn’t genuinely interested in these areas. It’s just an increasingly common phenomenon to have high school kids building up resumes and online presences that look like this.
     
    Yann04, hotblack, ukxmrv and 5 others like this.
  12. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,213
    They have a page for him here
    https://jsis.washington.edu/canada/people/piyush-awasthi/

    And another page with his story here
    https://jsis.washington.edu/canada/...ther-in-unity-corbett-scholar-piyush-awasthi/

     
    Last edited: May 28, 2025 at 9:39 PM
    Peter Trewhitt and alktipping like this.
  13. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    794
    Location:
    USA
    Peter Trewhitt and forestglip like this.
  14. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
  15. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,213
  16. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
    I don't think that GitHub page is related, even though they link to it. I think it's just a user named mobius. It includes things related to PlayStation emulators and Star Wars and has existed for at least 12 years.
     
  17. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,402
    I was emailing with the second author, Derek Jacoby. He is in fact working with Piyush. He says they recently presented this paper at the BioSMART conference in Paris, and they hope a peer-reviewed version will be published within the next few weeks.

    He confirmed that, contrary to what the linked website says, the model has not been used by any clinicians, and there has been no probing of the model for specific genes of interest. He'll look into the material presented on the website. He hopes future work will look into specific mechanisms, but this paper will only focus on classification.
     
