Preprint BioMapAI: Artificial Intelligence Multi-Omics Framework Modeling of Myalgic Encephalomyelitis / Chronic Fatigue Syndrome, 2024, Xiong et al.

pooriepoor91

https://www.biorxiv.org/content/10.1101/2024.06.24.600378v1

BioMapAI: Artificial Intelligence Multi-Omics Framework Modeling of Myalgic Encephalomyelitis / Chronic Fatigue Syndrome

Ruoyun Xiong, Elizabeth Fleming, Ryan Caldwell, Suzanne D. Vernon, Lina Kozhaya, Courtney Gunter, Lucinda Bateman, Derya Unutmaz, Julia Oh

Abstract

Chronic diseases like MECFS and long COVID exhibit high heterogeneity with multifactorial etiology and progression, complicating diagnosis and treatment. To address this, we developed BioMapAI, an explainable Deep Learning framework using the richest longitudinal multi-'omics dataset for MECFS to date. This dataset includes gut metagenomics, plasma metabolome, immune profiling, blood labs, and clinical symptoms. By connecting multi-'omics to a symptom matrix, BioMapAI identified both disease- and symptom-specific biomarkers, reconstructed symptoms, and achieved state-of-the-art precision in disease classification. We also created the first connectivity map of these 'omics in both healthy and disease states and revealed how microbiome-immune-metabolome crosstalk shifted from healthy to MECFS. Thus, we proposed several innovative mechanistic hypotheses for MECFS: Disrupted microbial functions - SCFA (butyrate), BCAA (amino acid), tryptophan, benzoate - lost connection with plasma lipids and bile acids, and activated inflammatory and mucosal immune cells (MAIT, gdT cells) with INFγ and GzA secretion. These abnormal dynamics are linked to key disease symptoms, including gastrointestinal issues, fatigue, and sleep problems.
 
We generated a rich longitudinal, multi-‘omics dataset of 153 ME/CFS patients and 96 age-gender-matched healthy controls, comprised of gut metagenomics, plasma metabolome, immune cell profiling, activation, and cytokines, together with blood labs, detailed clinical symptoms, and lifestyle survey data. We aimed to: 1) identify new disease biomarkers - not only for ME/CFS but also to specify biomarkers that could explain the complex symptomatology, and 2) define interactions between microbiome, immune system, and metabolome – rather than studying single data types in isolation, we created the first connectivity map of these ‘omics. This map critically accounts for covariates such as age and gender, providing an important baseline in healthy individuals contrasted with aberrant connections identified in disease.

Here, applying to ME/CFS, it identifies both disease- and symptom-specific biomarkers, accurately reconstructing key clinical symptoms, achieves state-of-the-art precision in disease classification, and generates several innovative mechanistic hypotheses for disease.
That's a big claim for what is still a pretty small sample, and with their diagnostic model presumably not tested on another dataset.
 
I don’t see any suggestions for treatments here. I believe that Unutmaz said that they were going to list suggested treatments. If he/they have some potential treatments in mind based on this research, then they should put that in the paper. Maybe we need to wait another six years (according to Unutmaz, this paper took six years) for the treatment paper?

https://twitter.com/user/status/1805317670704718227
 
We tracked 249 participants over 3-4 years, including 153 ME/CFS patients (75 'short-term' with disease symptoms < 4 years and 78 'long-term' with disease symptoms > 10 years) and 96 healthy controls

Throughout the study, we collected detailed clinical metadata, blood samples, and fecal samples. In total, 1471 biological samples were collected across all participants at 515 timepoints

data on 443 immune cells and cytokines

958 metabolites

demographic documentation and questionnaires covering medication use, medical history, and key ME/CFS symptoms

whole-genome shotgun metagenomic sequencing of stool samples...
gut microbiome composition (1293 species detected) and KEGG gene function (9993 genes reconstructed).
 
While healthy individuals consistently presented low symptom scores (Supplemental Figure 1D), ME/CFS patients exhibited significant variability in symptom severity, with each individual showing different predominant symptoms (Figure 1B). Principal coordinates analysis (PCoA) of the ‘omics matrices highlighted the difficulty in distinguishing patients from controls, emphasizing the complex symptomatology of ME/CFS and the challenges in developing predictive models (Supplemental Figure 1E). Additionally, over time, in contrast to the stable patterns typical of healthy individuals (Supplemental Figure 1B), ME/CFS patients demonstrated distinctly varied patterns each year, as evidenced by the diversity in symptom severity and noticeable separation on the ‘omics PCoA (Figure 1B, Supplemental Figure 1C). Despite employing multiple longitudinal models (Methods), we found no consistent temporal signals, confirming the non-linear progression of ME/CFS.
It's not surprising that symptom severity varies significantly in the ME/CFS sample. I think for mild and mild/moderate ME/CFS, if you keep below an exertion threshold, you can have minimal symptoms. And life brings periods that demand more exertion. The symptom surveys were:
DePaul Symptom Questionnaire (DSQ), Post-Exertional Fatigue Questionnaire, RAND-36, Fibromyalgia Impact Questionnaire-R, ACR 2010 Fibromyalgia Criteria Symptom Questionnaire, Pittsburgh Sleep Quality Index (PSQI), Stanford Brief Activity Survey, Orthostatic Intolerance Daily Activity Scale, Orthostatic Intolerance Symptom Assessment, Brief Wellness Survey, Hours of Upright Activity (HUA), medical history and family history
I'm not sure what time period each survey covered (previous day(s), weeks, months?), but it's pretty complicated to match symptoms reported over a period of weeks or so to biochemical measures that may or may not change from moment to moment. That's particularly true if they amalgamated surveys with different time frames to make the 12 clinical scores.

Regarding the lack of any evidence for consistent worsening or improvement:
That may well be so, but we have previously discussed how reference points for assessments of severity can shift over time if a numerical scale is used. A 'severe' rating may not be the same in year 1 compared to year 4.
 
To determine the accuracy of reconstructed clinical scores by BioMapAI’s integration of ‘omics data, we compared their ability to discriminate ME/CFS patients from controls with the original clinical scores. We used one additional fully connected layer to regress the 12 predicted clinical scores Y(12, ) into a binary outcome of patient vs. control ŷ(1, ). Because the diagnosis of ME/CFS relies on clinical interpretation of key symptoms (i.e., the original clinical scores), the original clinical scores have near-perfect accuracy in classification as expected (AUC, Area Under the Curve >99%, Supplemental Figure 2C). Notably, BioMapAI’s predicted scores based on the ‘omics data achieved a 91% AUC, highlighting its leading-edge accuracy in disease vs. healthy classification (Figure 2D, Supplemental Figure 2D), which was also superior to the performance of three ML models - linear regression (LR), support vector machine (SVM), and gradient boosting (GDBT) - and one deep learning model (DNN) without the hidden Z3, ‘spread out’ layer (Supplemental Table 3). BioMapAI particularly excelled utilizing immune features (AUC = 80%), KEGG genes (78%), blood measure models (71%) and combined ‘omics (91%). GDBT, however, led in the microbial species (75%) and metabolome (74%) models, likely due to its emphasis on specific features.
I think that means that BioMapAI was able to distinguish people with ME/CFS from healthy controls using the (baseline? or all?) 'omics data, with an AUC of 91%.
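For anyone unsure how to read that 91% figure, here is a minimal sketch, assuming nothing about the authors' actual code. AUC is the probability that a randomly chosen patient gets a higher model score than a randomly chosen control, which is not quite the same thing as the percentage of people classified correctly. The function name `auc_from_scores` and the toy numbers are invented for illustration only.

```python
def auc_from_scores(patient_scores, control_scores):
    """Probability that a random patient outscores a random control (ties count half)."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Toy, invented model scores (not data from the paper).
patients = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
print(auc_from_scores(patients, controls))
```

With these made-up numbers the AUC is 0.875, while a fixed cut-off of 0.5 would classify 6 of the 8 people correctly (75%), so the two measures can differ noticeably.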

Finally, to assess the robustness of our BioMapAI model, we validated it with independent, published ME/CFS cohorts (Figure 2E, Supplemental Table 4). Using data from two microbiome cohorts, Guo, Cheng et al., 2023 (US)28 and Raijmakers, Ruud et al., 2020 (Netherlands)29, BioMapAI achieved 72% and 63% accuracy in species relative abundance and 58% and 60% accuracy in microbial KEGG gene abundance. When applied to two metabolome cohorts, Germain, Arnaud et al., 2022 (US)30 and Che, Xiaoyu et al., 2022 (US)31, BioMapAI attained 68% and 59% accuracy. These results were strong given that the metabolomic features only overlap by 79% and 19%, respectively, due to methodological variations.
So, they did validate their model. The accuracies in identifying people with ME/CFS are promising but not definitive.
 
‘Omics’ Strengths Varied in Symptom Prediction; Immune is the Most Predictive
A major innovation of BioMapAI is its ability to leverage different ‘omics data to predict individual clinical scores in addition to disease vs. healthy classification. We evaluated the predictive accuracy by calculating the mean squared error between actual (y) and predicted (ŷ) scores and observed that the different ‘omics showed varying strengths in predicting clinical scores (Figure 2C). Immune profiling consistently excelled in forecasting a wide range of symptoms, including pain, fatigue, orthostatic intolerance, and general health perception, underscoring the immune system's crucial role in health regulation. In contrast, blood measurements demonstrated limited predictive ability, except for cognitive efficiency, likely owing to their limited focus on 48 specific blood bioactives. Plasma metabolomics, which encompasses nearly a thousand measurements, performed significantly better with notable correlations with facets of physical health and social activity. These findings corroborate published metabolites and mortality35,36, longevity37,38, cognitive function39, and social interactions40,41,42. Microbiome profiles surpassed other ‘omics in predicting gastrointestinal abnormalities (as expected43,44), emotional well-being, and sleep problems, supporting recently established links in gut-brain health45,46,47.
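As a rough illustration of the evaluation described in that excerpt, here is a minimal sketch of a per-score mean squared error, assuming one row per sample and one column per clinical score. The helper name `per_score_mse` and the toy values are hypothetical and are not taken from the paper or its code.

```python
def per_score_mse(actual, predicted):
    """Mean squared error for each clinical score (column) across samples (rows)."""
    n_samples = len(actual)
    n_scores = len(actual[0])
    mse = []
    for j in range(n_scores):
        total = sum((actual[i][j] - predicted[i][j]) ** 2 for i in range(n_samples))
        mse.append(total / n_samples)
    return mse


# Two samples, two clinical scores (invented numbers).
actual = [[1.0, 0.0], [0.5, 1.0]]
predicted = [[0.5, 0.0], [0.5, 0.0]]
print(per_score_mse(actual, predicted))
```

A lower value for a given column means that score was reconstructed more closely from the 'omics input, which is how one data type can look strong for one symptom and weak for another.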

BioMapAI is Explainable, Identifying Disease- and Symptom-Specific Biomarkers
Deep learning (DL) models are often referred to as ‘black box’, with limited ability to identify and evaluate specific features that influence the model’s predictions. BioMapAI is made explainable by incorporating SHAP values, which quantify how each feature influenced the model's predictions. BioMapAI’s architecture – two shared layers (Z1 and Z2) for general disease pattern learning and one parallel layer for each clinical score (Z3 = [z3_1, z3_2, ... , z3_12]) – allowed us to identify both disease-specific biomarkers, which are shared across symptoms and models (Supplemental Figure 3, Supplemental Table 5), and symptom-specific biomarkers, which are tailored to each clinical symptom (Figure 3, Supplemental Figure 4-5, Supplemental Table 6).
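To make the idea behind SHAP values a little more concrete, here is a minimal sketch of exact Shapley values for a single sample, computed by brute force over every subset of features. It is an assumption-laden toy: `shapley_values`, `toy_model` and the all-zero baseline are invented for illustration, and this is not the explainer the authors used. Tools for real models generally have to estimate these values, because enumeration like this only works for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features):
    # Hypothetical linear "model": the third feature has no influence at all.
    return 2.0 * features[0] - 1.0 * features[1] + 0.0 * features[2]


phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

For a linear model each feature's contribution is just its weight times its distance from the baseline, and the contributions add up to the difference between the prediction for this sample and the prediction at the baseline.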

Disease-specific biomarkers are important features across symptoms and models (Methods, Supplemental Figure 3). Increased B cells (CD19+CD3-), CCR6+ CD8 memory T cells (mCD8+CCR6+CXCR3-), and CD4 naïve T cells (nCD4+FOXP3+) in patients were pivotal for most symptoms, indicating a systemic dysregulation of the adaptive immune response. The species model highlighted the importance of Dysosmobacteria welbionis, a gut microbe previously reported in obesity and diabetes, with a critical role in bile acid and butyrate metabolism48,49. The metabolome model categorized increased levels of glycodeoxycholate 3-sulfate, a bile acid, and decreased vanillylmandelate (VMA), a catecholamine breakdown product50. These critical features for all symptoms were consistently validated across ML and DL models, demonstrating the efficacy of BioMapAI (Supplemental Table 5).
 
BioMapAI’s specialized deep neuron network structure with two shared general layers and one outcome-focused parallel layer is moreover generalizable and scalable to other cohort studies that aim to utilize ‘omics data for a range of outputs (e.g., not just limited to clinical symptoms). For instance, researchers could employ our model to link whole genome sequencing data with blood or protein measurements. Constructed to automatically adapt to any input matrix X and any output matrix Y = [y1, y2, ..., yn], BioMapAI defaults to parallelly align specific layers for each output, y. Currently, the model treated all 12 studied symptoms, [y1, y2, ... , y12], with equal importance due to the unclear symptom prioritization in ME/CFS85.
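The structure described in these excerpts can be sketched roughly as follows: two shared layers, one small parallel branch per clinical score, and a final fully connected layer that turns the predicted scores into a patient vs. control probability. This is an untrained toy under stated assumptions, not the authors' implementation: the layer widths, the ReLU and sigmoid activations, the extra one-unit head per branch, and every function name here are my own choices for illustration.

```python
import math
import random


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: out[k] = act(sum_j inputs[j] * weights[j][k] + biases[k])."""
    outputs = []
    for k in range(len(biases)):
        z = sum(inputs[j] * weights[j][k] for j in range(len(inputs))) + biases[k]
        outputs.append(activation(z) if activation else z)
    return outputs


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_layer(rng, n_in, n_out):
    """Random weights and zero biases; a trained model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def build_model(n_features, n_scores, rng, shared_width=8, branch_width=4):
    """Sizes follow the input width and number of outputs, whatever they are."""
    return {
        "shared1": random_layer(rng, n_features, shared_width),
        "shared2": random_layer(rng, shared_width, shared_width),
        "branches": [random_layer(rng, shared_width, branch_width) for _ in range(n_scores)],
        "score_heads": [random_layer(rng, branch_width, 1) for _ in range(n_scores)],
        "classifier": random_layer(rng, n_scores, 1),
    }


def forward(model, x):
    # Shared layers: the same representation feeds every clinical score.
    h = dense(x, *model["shared1"], activation=relu)
    h = dense(h, *model["shared2"], activation=relu)
    # Parallel layer: one independent branch per clinical score.
    scores = []
    for branch, head in zip(model["branches"], model["score_heads"]):
        b = dense(h, *branch, activation=relu)
        scores.append(dense(b, *head)[0])
    # Extra fully connected layer: predicted scores -> probability of being a patient.
    prob = dense(scores, *model["classifier"], activation=sigmoid)[0]
    return scores, prob


rng = random.Random(0)
model = build_model(n_features=20, n_scores=12, rng=rng)
x = [rng.random() for _ in range(20)]  # stand-in for one sample's 'omics features
scores, prob = forward(model, x)
print(len(scores), prob)
```

Because the branches only share the first two layers, anything important to all 12 outputs has to be carried by the shared part, while the per-score branches are free to pick up symptom-specific patterns, which seems to be the intuition behind the disease-specific vs. symptom-specific split.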

We computed modules to assign different weights to symptoms to enhance diagnostic accuracy. While this approach was not particularly effective for ME/CFS, it may be more promising for diseases with more clearly defined symptom hierarchies86,87. In such cases, adjusting the weights of symptoms in the model’s final layer could improve performance and help pinpoint which symptoms are truly critical.

I've only skim read this, and a lot of it is over my head. This does seem like a promising way to unravel patterns in huge masses of data, to examine the relationships between datapoints. I can see why the study took a long time, there's a lot of work there.

I think there may be a problem of garbage in, garbage out though. I think the granularity of the sampling has to come right down, because we know symptoms change quickly. I think the next step might be to get sizeable cohorts of the same sex and age and approximate BMI and take measures of symptoms on good and bad days, as objectively as possible to avoid the problem of change of reference, along with blood and faecal samples. That is, I think something like what Chris Armstrong @MelbME is doing. And then apply BioMapAI to that data.

I hope the developers of BioMapAI are trying it on diseases that have diagnostic and symptom biomarkers. They started with a very hard disease, given the likely heterogeneity of pathology in the ME/CFS cohort.

I'm interested to hear what others think.
 
I don’t see any suggestions for treatments here. I believe that Unutmaz said that they were going to list suggested treatments. If he/they have some potential treatments in mind based on this research, then they should put that in the paper.

These nuanced insights, while still premature for actual treatment applications, lay the groundwork for more precise controlled experiments and interventional studies. For instance, personalized treatment options could include supplementation of butyrate and amino acids for patients suffering from severe gastrointestinal and emotional symptoms, or targeted treatments for chronic inflammation for those experiencing significant pain and fatigue.
 
I recall that @Jonathan Edwards wrote that the MRC group (he was part of) proposed funding a GWAS [common variant] genetic study [DecodeME] - the only other promising area was a sleep study - actimetry?
I'm also interested in sleep since my family member has poor sleep quality; therefore I searched the paper for "sleep" & noticed this-
"A notable strong connection increased in ME/CFS is benzoate transformation to hippurate, associated with emotional disturbances, sleep issues, and fatigue."
Has anyone found the papers/references that support this statement, i.e. a link between hippurate & sleep disorders?

While this may be all we have to go on, and therefore we should be grateful to these scientists, potentially the results from genetic studies like DecodeME [common genetic variant study] &/or rare genetic variant studies may prove a clearer route to underlying disease mechanism/pathology.
 
This is a highly ambitious paper. Big sample and some very sophisticated statistical analysis. I like it. This study is worth combining with all our other metabolomic studies to see what is consistent and what is not.

It seems to find a role for gut microbes and their products, especially butyrate. Why might the gut be unable to maintain a healthy population of microbes? Perhaps the cells lining the gut are compromised in their function. And/or they may be infected with viruses (which other recent studies have suggested), and the viruses might be affecting the environment for their own purposes.

I remain very interested in the link with lipids:

the gut microbiome was expectedly linked to gastrointestinal issues and unexpectedly, to pain, fatigue, and mental health problems, possibly due to disruptions in the gut-brain axis from abnormal microbial metabolic functions, such as lost network connections with key plasma metabolites, particularly lipids.


I think that even better studies than this one can be imagined though. It seems to lack exercise provocation and to me that's where you see this disease in all its glory.

I also doubt that this generation of AI is the best AI we can get. Doubtless in 20 years that technology will be much more refined. Although hopefully in 20 years we won't still be fishing for associations and will have a better grip on the core causal relationships!
 
It's currently at about the level of the first flip phones compared to modern smartphones.

It's almost funny seeing people whine about how it's looking like a flop, exactly like all the people who called the Internet a fad in the late 90s. Very few people are able to imagine a future different from the present. Hell even in sci-fi most futures are slight variations of the present, with almost no cultural change.
 
At the same time, you should not forget the state of scientific publishing in the AI field. A large amount of junk is churned out on a daily basis, methodologically often no different from the BPS stuff, with mathematical proofs, sometimes even erroneous ones, quite often relegated to the supplementary material.
 
2025

Abstract
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a chronic illness with a multifactorial etiology and heterogeneous symptomatology, posing major challenges for diagnosis and treatment. Here, we present BioMapAI, a supervised deep neural network trained on a four-year, longitudinal, multi-omics dataset from 249 participants, which integrates gut metagenomics, plasma metabolomics, immune cell profiling, blood laboratory data, and detailed clinical symptoms.

By simultaneously modeling these diverse data types to predict clinical severity, BioMapAI identifies disease- and symptom-specific biomarkers and robustly classifies ME/CFS in both held-out and independent external cohorts. Using an explainable AI approach, we construct the first connectivity map spanning the microbiome, immune system, and plasma metabolome in health and ME/CFS, adjusted for age, gender, and additional clinical factors.

This map uncovers disrupted associations between microbial metabolism (e.g., short-chain fatty acids, branched-chain amino acids, tryptophan, benzoate), plasma lipids and bile acids, and heightened inflammatory responses in mucosal and inflammatory T cell subsets (MAIT, γδT) secreting IFNγ and GzA. Overall, BioMapAI provides unprecedented systems-level insights into ME/CFS, refining existing hypotheses and hypothesizing new pathways associated to the disease heterogeneous symptoms.

LINK
 