Abstract

Clinical diagnoses rely on a wide variety of laboratory tests and imaging studies, interpreted alongside physical examination findings and the patient's history and symptoms. Currently, the tools of diagnosis make limited use of the immune system's internal record of specific disease exposures encoded by the antigen-specific receptors of memory B cells and T cells, and there has been little integration of the combined information from B cell and T cell receptor sequences. Here, we analyze extensive receptor sequence datasets with three different machine learning representations of immune receptor repertoires to develop an interpretive framework, MAchine Learning for Immunological Diagnosis (Mal-ID), that screens for multiple illnesses simultaneously. This approach is effective in identifying a variety of disease states, including acute and chronic infections and autoimmune disorders. It is able to do so even when there are other differences present in the immune repertoires, such as between pediatric or adult patient groups. Importantly, many features of the model of immune receptor sequences are human-interpretable. They independently recapitulate known biology of the responses to infection by SARS-CoV-2 and HIV, provide evidence of receptor antigen specificity, and reveal common features of autoreactive immune receptor repertoires, indicating that machine learning on immune repertoires can yield new immunological knowledge. This framework could be useful in identifying immune responses to new infectious diseases as they emerge.

https://www.biorxiv.org/content/10.1101/2022.04.26.489314v2
Extended to further datasets and clinical cohorts at population scale, this immune repertoire analysis approach offers a strategy for refining disease definitions and diagnosis, as well as for improving understanding of immune response features, such as autoreactivity, that are shared across different pathologies. We anticipate extending this approach to other autoimmune conditions, immunological treatment complications like transplantation rejection, and less well understood conditions suspected to have an immunological basis, like chronic fatigue syndrome. This analysis technique may be able to predict which patients respond to immuno-oncology checkpoint blockade therapy and illuminate the basis for low response rates.
"Disease diagnostics using machine learning of B cell and T cell receptor sequences" (Science): https://www.science.org/doi/10.1126/science.adp2407
Cort has written about it: https://www.healthrising.org/blog/2025/03/04/immunome-chronic-fatigue-long-covid/