Leveraging Explainable Automated Machine Learning (AutoML) and Metabolomics for Robust Diagnosis and Pathophysiological Insights in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS)
BACKGROUND/OBJECTIVES
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a debilitating, complex disease with an elusive etiology and no objective diagnostic biomarkers. This study applies Automated Machine Learning (AutoML) to plasma metabolomic and lipidomic profiles to detect ME/CFS.
METHODS
We used a publicly available dataset comprising 888 metabolic features from 106 ME/CFS patients and 91 matched controls. Three AutoML frameworks (TPOT, Auto-Sklearn, and H2O AutoML) were benchmarked under identical time constraints. Univariate ROC and PLS-DA analyses with cross-validation, permutation testing, and VIP-based feature selection were applied to standardized, log-transformed omics data to identify significant discriminatory metabolites/lipids and assess their intercorrelations.
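As an illustration of the univariate ROC step on log-transformed, standardized data, the following minimal sketch may help. It is not the authors' pipeline: the feature names and values are hypothetical toy data, the natural log is an assumed choice of transform, and the AUC is computed with the standard rank-based (Mann-Whitney) definition using only the Python standard library.

```python
import math
import statistics


def log_standardize(values):
    """Natural-log transform, then z-score, one feature.

    Assumes strictly positive intensities; the abstract does not state
    which log base or scaling variant was used.
    """
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


def univariate_auc(scores, labels):
    """AUC as the probability that a random case scores higher than a
    random control, with ties counted as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 1 = ME/CFS, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "feature_a": [8.0, 6.5, 7.2, 9.1, 3.0, 4.2, 2.8, 3.9],  # fully separated groups
    "feature_b": [5.0, 2.0, 6.0, 3.0, 4.0, 2.5, 5.5, 3.5],  # overlapping groups
}

# Score every feature after preprocessing.
aucs = {
    name: univariate_auc(log_standardize(vals), labels)
    for name, vals in features.items()
}

# Rank by distance from 0.5 so that features lower in cases also rank highly.
ranked = sorted(aucs, key=lambda name: abs(aucs[name] - 0.5), reverse=True)
```

Because the log transform and z-scoring are monotone, they leave a single feature's AUC unchanged; they matter for the multivariate PLS-DA step, where features on very different scales would otherwise dominate the latent components.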
RESULTS
TPOT significantly outperformed its counterparts, achieving an area under the curve (AUC) of 92.1%, accuracy of 87.3%, sensitivity of 85.8%, and specificity of 89.0%. The PLS-DA model revealed a moderate but statistically significant discrimination between ME/CFS and controls. Explainable artificial intelligence (XAI) via SHAP analysis of the optimal TPOT model identified key metabolites implicating dysregulated pathways in mitochondrial energy metabolism (succinic acid, pyruvic acid, leucine), chronic inflammation (prostaglandin D2, 11,12-EET), gut–brain axis communication (glycocholic acid), and cell membrane integrity (pc(35:2)a).
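For readers less familiar with the reported metrics, here is a minimal sketch of how accuracy, sensitivity, and specificity follow from a confusion matrix. The labels and predictions are hypothetical toy values, not the study's data, and the function name `diagnostic_metrics` is introduced only for illustration.

```python
def diagnostic_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels,
    where 1 = ME/CFS (positive class) and 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical toy example: 5 cases followed by 5 controls.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

metrics = diagnostic_metrics(y_true, y_pred)
```

In this toy example, 4 of 5 cases and 3 of 5 controls are classified correctly, giving a sensitivity of 0.8, a specificity of 0.6, and an accuracy of 0.7. The AUC differs from these three in that it is computed from the model's continuous scores rather than from thresholded predictions.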
CONCLUSIONS
Our results demonstrate that TPOT-derived models not only provide a highly accurate and robust diagnostic tool but also yield biologically interpretable insights into the pathophysiology of ME/CFS, highlighting their potential for clinical decision support and for elucidating novel therapeutic targets.
Web | DOI | PDF | Diagnostics | Open Access
Yagin, Fatma Hilal; Colak, Cemil; Al-Hashem, Fahaid; Alzakari, Sarah A; Alhussan, Amel Ali; Aghaei, Mohammadreza