Preprint Mobius: Mixture-Of-Experts Transformer Model in Epigenetics of ME/CFS and Long COVID, 2025, Acharya & Jacoby

John Mac

Senior Member (Voting Rights)
https://www.biorxiv.org/content/10.1101/2025.05.25.656018v1

Abstract

Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Long COVID are chronic debilitating post-infectious illnesses that collectively affect up to 470 million individuals. Unlike illnesses of comparable scale, there are no validated blood or imaging tests for the clinical diagnosis of these conditions.

Currently, these conditions are diagnosed through clinical exclusion, resulting in approximately 90% of ME/CFS patients being incorrectly diagnosed as Long COVID patients. This misdiagnosis contributes to delayed care and millions of dollars in healthcare burdens.

We present Mobius, a transformer-based model that uses autoencoder-derived features from blood DNA methylation to distinguish ME/CFS, Long COVID, and healthy controls.

Using 852 samples from 14 distinct datasets, our method employs three innovations:
(i) self-supervised masked pretraining to learn epigenetic patterns,
(ii) a sparsely-gated mixture-of-experts architecture to handle heterogeneous data, and
(iii) an adaptive computation time mechanism for dynamic inference.

Mobius achieved 97.06% accuracy (macro-F1 0.95, AUROC 0.96), outperforming current symptom-based diagnostics (58%) and baseline models such as XGBoost (82%).
Ablation experiments showed that pretraining added 6% accuracy and that the gating and adaptive depth contributed an additional 7%.
Our open-source pipeline could enable a much-needed objective blood test for these conditions and guide targeted precision medicine therapies.
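For readers unfamiliar with the term, here is a minimal sketch of what "sparsely-gated mixture-of-experts" usually means: a gate scores every expert, only the top k experts are run, and their outputs are blended with renormalised weights. This is a generic illustration, not the authors' code. The toy experts, the fixed `gate_logits`, and the function names are all hypothetical, and in a real model the gate scores would come from a learned network applied to the input.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_gate(gate_logits, k):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    # Renormalise over the selected experts only; the rest get zero weight.
    weights = softmax([gate_logits[i] for i in top])
    return dict(zip(top, weights))

def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and blend their outputs."""
    gates = top_k_gate(gate_logits, k)
    return sum(w * experts[i](x) for i, w in gates.items())

# Hypothetical toy experts (a real model would use small neural networks).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
# Hypothetical gate scores, fixed here for illustration.
gate_logits = [0.1, 2.0, -1.0, 2.0]

gates = top_k_gate(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)
print(gates, output)
```

With these made-up scores only experts 1 and 3 are evaluated, which is the "sparse" part: most experts do no work for a given sample.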
 
Furthermore, out of 170 samples in the independent dataset, the classifier misclassified only 1 ME/CFS sample as Long COVID and 2 LC samples as ME/CFS, with no healthy controls misclassified, which indicates excellent specificity.
97.06% accuracy is very high. One caveat is that all ME/CFS samples were pre-pandemic, so some of the difference between ME/CFS and LC might be related to the time of sampling rather than the illness.
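A quick arithmetic check on the quoted figures, as a sketch. It assumes the 97.06% accuracy in the abstract refers to the same 170-sample independent set, which the quoted text does not state outright.

```python
# Arithmetic check on the quoted figures (170 samples, 97.06% accuracy).
total = 170
reported_accuracy = 0.9706

# Whole number of correct calls that best matches the reported accuracy.
implied_correct = round(total * reported_accuracy)
implied_errors = total - implied_correct

# Errors itemised in the quoted passage: 1 ME/CFS -> LC and 2 LC -> ME/CFS.
itemised_errors = 1 + 2
accuracy_if_only_itemised = (total - itemised_errors) / total

print(f"{implied_correct}/{total} = {implied_correct / total:.4%}")
print("errors implied:", implied_errors, "itemised:", itemised_errors)
print(f"accuracy with only the itemised errors: {accuracy_if_only_itemised:.4%}")
```

Under that assumption, 97.06% corresponds to 165 of 170 correct, so five errors, while the passage itemises three. The other two, if the numbers hold, could be disease samples called healthy, which would not count as "healthy controls misclassified".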

But not misclassifying any samples between HC and disease is striking. It implies there were virtually no misdiagnoses in the samples they used. I guess that's possible, but 100% accuracy is generally suspicious in machine learning.

If symptom-based diagnosis is only 58% accurate, then who decided what 100% accurate was, so that the test could be scored at 97%?
They don't seem to cite a source, but I assume that 58% is from some other study that looked at under/over-diagnosis.
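To make the question above concrete, here is a toy sketch of why reference labels matter. It assumes, purely for illustration, that the labels used to score a test were themselves right only 58% of the time. The cohort and the "perfect test" are hypothetical, and the paper may well mean that 58% applies to routine clinical practice rather than to the labels in its datasets.

```python
# Hypothetical cohort: 100 people whose true condition we pretend to know.
n = 100
label_accuracy = 0.58  # figure quoted in the abstract for symptom-based diagnosis

true_state = ["disease"] * n

# Reference labels: assume exactly 58 match the truth and 42 do not.
n_right = round(n * label_accuracy)
reference_label = ["disease"] * n_right + ["other"] * (n - n_right)

# A hypothetical test that always reports the true state.
perfect_test = list(true_state)

# "Accuracy" is measured as agreement with the reference labels, not with the truth.
agreement = sum(p == r for p, r in zip(perfect_test, reference_label)) / n
print(f"measured accuracy of a perfect test: {agreement:.0%}")
```

In this toy setup even a perfect test scores only as well as the labels it is compared against, so a 97% score says the model reproduces the dataset labels very well. Whether those labels are right is a separate question.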
 
There's a whole fancy website showing off the model and recommending that clinicians use it in their practice: https://mobius.dnanalyzer.org/

It would be nice if they would apply to the CureME Biobank for blinded samples to prove out their technique before offering the tools for clinical use. I wonder what sort of funding they need to extract the DNA data and methylation patterns from such a cohort.

Looks like they are planning an API and might post the code on GitHub for others to use. The GitHub link points to a non-existent or private repository. The last update to their public repositories was 5 years ago.
https://github.com/mobius?tab=repositories

The repository has 1400 stars which seems relatively large.
 
The GitHub site above was from their website. Here is the GitHub repository for the DNAnalyzer listed in the paper.
https://github.com/VerisimilitudeX/DNAnalyzer

GitHub Site said:
About DNAnalyzer
DNAnalyzer is a biotechnology research and deployment company. Supported by Anthropic, our mission is to revolutionize DNA analysis by making AI-powered genomic insights accessible to all through on-device computation.

Founded by Piyush Acharya, DNAnalyzer's team includes 46 leading computational biologists and computer scientists from Microsoft Research, the University of Macedonia, and Northeastern University.

Our impact has been recognized by Y Combinator, the organizers of the AI World's Fair Expo, and the CEO of DEV.
 
The lead author, Piyush Acharya, seems to be in high school. A news article says he was in 10th grade in Jan 2024.

He apparently started a non-profit called DNAnalyzer three years ago. The GitHub page says "Supported by Anthropic. [...] Our impact has been recognized by Y Combinator, the organizers of the AI World's Fair Expo, and the CEO of DEV." I tried looking up any association between this company and Anthropic or Y Combinator but couldn't find anything online.

The website says that in 2023, "DNAnalyzer becomes a fiscally sponsored 501(c)(3) nonprofit organization (EIN: 81-2908499) to advance DNA analysis technology." That EIN belongs to an organization called Hack Foundation, which has been a non-profit since 2016.

His personal website says he also leads the frontend development team for pickleball.com, and has worked there since Jan 2024.

Edit: His GitHub also says he is "a researcher at the University of Victoria, where I apply machine learning techniques to retrieval-augmented generation and epigenetics" but I can't find any connection between him and that university.
 
Having recently supervised high school interns at my previous lab, I’m pretty certain that it’s a similar situation. There’s a whole contingent of high school students who are being encouraged to publicize their coding projects, do internships at local universities that result in published papers, and register non-profits in their own name, all for their eventual application to an Ivy League college. That's not to say this kid isn’t impressive or isn’t genuinely interested in these areas. It’s just increasingly common for high school kids to build up resumes and online presences that look like this.
 
Edit: His GitHub also says he is "a researcher at the University of Victoria, where I apply machine learning techniques to retrieval-augmented generation and epigenetics" but I can't find any connection between him and that university.
They have a page for him here
https://jsis.washington.edu/canada/people/piyush-awasthi/

And another page with his story here
https://jsis.washington.edu/canada/...ther-in-unity-corbett-scholar-piyush-awasthi/

The Corbett British Columbia-Washington International Exchange Program Fund provides an opportunity for undergraduate students at the University of Washington to spend two semesters at the University of British Columbia or University of Victoria; and for students from the University of British Columbia and University of Victoria to spend three quarters at the University of Washington.
 
I don't think the https://github.com/mobius page is related, even though their website links to it. I think it's just a user named mobius. It includes things related to PlayStation emulators and Star Wars, and it has existed for at least 12 years.
 
I have been emailing with the second author, Derek Jacoby. He is in fact working with Piyush. He says they recently presented this paper at the BioSMART conference in Paris, and they hope a peer-reviewed version will be published within the next few weeks.

He confirmed that, despite what the linked website says, the model has not been used by any clinicians, and the model has not been probed for specific genes of interest. He'll look into the material presented on the website. He hopes future work will look into specific mechanisms, but this paper will focus only on classification.
 