Preprint Charting the Circulating Proteome in ME/CFS: Cross System Profiling and Mechanistic insights, 2025, Hoel, Fluge, Mella+

Discussion in 'ME/CFS research' started by SNT Gatchaman, May 31, 2025 at 11:38 PM.

  1. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,407
    Location:
    Belgium
    Also see no good solution.

    I kept them in, which is why there are two datapoints for leptin or for 'Fatty acid-binding protein, adipocyte'. For example, if there are two rows for leptin in the Hoel dataset and one in the Germain dataset, the merge duplicates the latter, giving two rows for leptin in the merged dataset.
    That's not entirely correct, as some datapoints are counted twice, but given that this only occurs for a limited number of proteins (I found it for 52 out of 605), I think the correlation still gives a crude estimate of how well the results of the two studies match?

    Was mostly interested to see which proteins are high or low in both datasets.
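
A minimal sketch of the duplication described above, assuming each dataset is just a list of (protein name, log2 fold change) rows. The `join_on_name` helper and the leptin values are made up for illustration; only the FABP4 fold changes are the ones quoted in this thread.

```python
def join_on_name(left, right):
    """Inner-join two lists of (name, log2fc) rows on the protein name.

    Each left row is paired with every right row of the same name, so a
    name that appears twice on one side repeats the other side's value.
    """
    by_name = {}
    for name, fc in right:
        by_name.setdefault(name, []).append(fc)
    return [(name, fc_left, fc_right)
            for name, fc_left in left
            for fc_right in by_name.get(name, [])]


# Toy rows: leptin values are invented, FABP4 values are quoted in this thread.
hoel = [("Leptin", 0.40), ("Leptin", 0.35), ("FABP4", 0.48)]
germain = [("Leptin", 0.50), ("FABP4", 0.56)]

merged = join_on_name(hoel, germain)
for row in merged:
    print(row)

# Names that occupy more than one row after the merge.
counts = {}
for name, _, _ in merged:
    counts[name] = counts.get(name, 0) + 1
duplicated = sorted(name for name, n in counts.items() if n > 1)
print("duplicated:", duplicated)
```

Listing the duplicated names this way makes it easy to check how many proteins are affected before trusting the correlation.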
     
    Kitty and Peter Trewhitt like this.
  2. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,407
    Location:
    Belgium
  3. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    Yeah, I think it's good enough for a rough idea.
     
  4. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,765
    Location:
    Aotearoa New Zealand
    IL26 is one of the top hits in this study.
    From gene cards:
    There is a recent review of IL26
    Immunobiology of IL-26
    On the relationship between IL-26 and interferon gamma:
    Relationship between IL-26 and IFN-γ (AI response)
    • Co-expression:
      IL-26 and IFN-γ are often co-expressed in various immune cells, particularly T lymphocytes, suggesting a coordinated role in immune responses.
    • Close Gene Location:
      The genes for IL-22, IL-26, and IFN-γ are located in close proximity on chromosome 12q15, suggesting they may be regulated together.
    • Similar Functions:
      Both IL-26 and IFN-γ can activate certain signaling pathways (STAT1 and STAT3), suggesting they may share some functional overlap.
    • Antiviral Synergy:
      IL-26 and IFN-γ can work synergistically to enhance antiviral responses, as IL-26 can stimulate the production of IFN-γ and vice versa.
    • Potential Role in Autoimmune Diseases:
      Polymorphisms in the IFN-γ/IL-26 gene region have been linked to sex bias in susceptibility to rheumatoid arthritis, suggesting a role for these cytokines in autoimmune diseases.
     
  5. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    @ME/CFS Skeptic My plot looks the same by the way.

    Here are the three genes that were changed in the same direction in both studies and had a q value less than .05 in both:
    Edit: Added links to GeneCards.
     
    Last edited: Jun 1, 2025 at 9:24 PM
    Murph, Peter Trewhitt, Kitty and 3 others like this.
  6. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,407
    Location:
    Belgium
    Thanks for checking! I hope we will see more studies like this that test large amounts of data and make it all available online. It offers so many more possibilities for us (and other researchers) to check particular results and compare them to other studies.

    Making data available online is a strong sign that researchers mean serious business (that they do science instead of academics). The Keller study, Hanson group and this Norwegian team all seem truly dedicated to unravelling ME/CFS.
     
  7. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,407
    Location:
    Belgium
    I wonder if increased leptin and fatty acid-binding proteins (FABPs) are simply due to ME/CFS patients having relatively more fat. Patients and controls have a similar BMI but the composition of their body mass might be different: more fat and less muscle for ME/CFS patients?
     
    Last edited: Jun 2, 2025 at 9:10 AM
    Murph, Peter Trewhitt, Kitty and 2 others like this.
  8. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,765
    Location:
    Aotearoa New Zealand
    Retinoic Acid in the Immune System
    So, as its name suggests, Cellular Retinoic Acid-binding Protein 2 binds retinoic acid in the cell and shuttles it into the cell nucleus. What it's doing at possibly higher-than-normal levels in ME/CFS blood, I'm not sure. But it does seem to increase when there is a viral infection. There are two forms of CRABP; the one that we are talking about is CRABP2 or CRABPII. I got a bit excited about that report above that CRABP2 has sex differences in expression (i.e. in ovaries and uterus), but the gene card suggests that lots of tissues express it.

    Downstream, assuming more CRABP2, it's looking pretty complicated. Assuming more CRABP2 means more retinoic acid affecting protein production: Retinoic acid seems to have different effects on macrophages and monocytes>dendritic cells:
    Link to that reference 39
     
    Peter Trewhitt, forestglip and Kitty like this.
  9. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    863
    Location:
    USA
    Sorry to be the same broken record: obligatory mention that JAK2 is not exclusive to interferon gamma
     
    Last edited: Jun 1, 2025 at 10:35 PM
  10. Hutan

    Hutan Moderator Staff Member

    Messages:
    32,765
    Location:
    Aotearoa New Zealand
    Sorry, I've missed this. Can you explain?
     
    Peter Trewhitt likes this.
  11. jnmaciuch

    jnmaciuch Senior Member (Voting Rights)

    Messages:
    863
    Location:
    USA
    Ah sorry it looks like I forgot to include the quote in my previous post. Updated with context!
     
    Peter Trewhitt and Hutan like this.
  12. DMissa

    DMissa Senior Member (Voting Rights)

    Messages:
    219
    Location:
    Australia
    Most journals mandate this these days (for omics data) so we will see pretty much everybody doing it in future
     
    Robert 1973, Utsikt, Trish and 8 others like this.
  13. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    282
    Should you filter for low p values before running your correlation? If the studies truly correspond, you'd find lots of things measured at 1:1 in both studies; there will be a cloud around the centre of the plot and it won't affect your r^2. I think running the correlation on the full dataset is a simple, unbiased way to do it that retains the full context.


    When I took this approach for some studies a few years ago I even found a negative correlation between studies in some cases (comparing Naviaux against some others made me wonder if Naviaux had accidentally got his numerator and denominator confused): https://www.s4me.info/threads/mecfs-data-analysis-thread.37775/ I also found a disappointing number of "deep, untargeted" metabolomic studies that had almost no crossover with other deep untargeted metabolomic studies in terms of line items. There's just so much a lab could measure!

    Nevertheless I love diving in on this. It seems like if a person made a meta dataset of all the metabolomics data (starting with a list of the thousands of molecules measured and seeing which ones have been measured more than once) we could then exploit that.

    That said it's a ton of work, as you guys found there's a lot of ways to annotate the name of a protein, metabolite or other molecule, and a dozen different numbering systems.

    I found it really hard to match every single line in two databases. Whether by KEGG or by looking at the names manually, you'd find so many things measured that may or may not be the same (chirality, isotopes, a category that has since been expanded into two subtypes, a name used in only one country, names that became outdated, and a million more examples besides).
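
One rough way to tame the naming problem is to normalise each name and then look it up in a hand-maintained alias table. The sketch below assumes a tiny `ALIASES` table built only from names that appear in this thread; the `canonical` helper is hypothetical, and a real table would need many more entries (or a proper ID-mapping resource).

```python
import re

# Hand-maintained alias table: normalised spelling -> chosen symbol.
# Only names mentioned in this thread are listed; everything else falls through.
ALIASES = {
    "fatty acid binding protein adipocyte": "FABP4",
    "cellular retinoic acid binding protein 2": "CRABP2",
    "crabpii": "CRABP2",
}


def canonical(name):
    """Map a free-text protein name to one symbol, as far as the table allows."""
    # Lowercase and turn any run of punctuation or spaces into a single space.
    key = re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()
    if key in ALIASES:
        return ALIASES[key]
    # Unknown names: drop separators and uppercase, so "IL-26" matches "IL26".
    return key.replace(" ", "").upper()


for raw in ["Fatty acid-binding protein, adipocyte", "fabp4", "IL-26", "IL26", "CRABPII"]:
    print(raw, "->", canonical(raw))
```

This only catches spelling and punctuation variants. Cases like chirality or a category later split into subtypes still need a human decision.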
     
    Utsikt, Hutan, ME/CFS Skeptic and 3 others like this.
  14. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    Since they did the validation portion, it seemed worth checking if there was overlap between the 57 proteins in that part with the thousands in the first part and in Germain. There were overlaps in 51 genes between all three cohorts. In 19 genes, the fold change was in the same direction. Only one had a q-value less than .5 (yes, .5 not .05) in all three:

    Fatty acid-binding protein, adipocyte - FABP4
    hoel aptamer: q=.0001, log2fc=0.48
    hoel antibody: q=.063, log2fc=0.51
    germain aptamer: q=.257, log2fc=0.56
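
The three-way check above can be sketched as follows, assuming each cohort is a dict mapping a gene to its (q-value, log2 fold change). The FABP4 numbers are the ones listed above; `GENE_X` and the `consistent_hits` helper are hypothetical and only there to show a gene being rejected.

```python
def consistent_hits(cohorts, q_max):
    """Genes present in every cohort, same fold-change sign, q < q_max in all."""
    shared = set.intersection(*(set(cohort) for cohort in cohorts))
    hits = []
    for gene in sorted(shared):
        stats = [cohort[gene] for cohort in cohorts]
        all_up = all(fc > 0 for _, fc in stats)
        all_down = all(fc < 0 for _, fc in stats)
        if (all_up or all_down) and all(q < q_max for q, _ in stats):
            hits.append(gene)
    return hits


# gene -> (q-value, log2 fold change); GENE_X is an invented counter-example.
hoel_aptamer = {"FABP4": (0.0001, 0.48), "GENE_X": (0.30, -0.20)}
hoel_antibody = {"FABP4": (0.063, 0.51), "GENE_X": (0.40, 0.10)}
germain_aptamer = {"FABP4": (0.257, 0.56), "GENE_X": (0.20, 0.15)}
cohorts = [hoel_aptamer, hoel_antibody, germain_aptamer]

print(consistent_hits(cohorts, q_max=0.5))
print(consistent_hits(cohorts, q_max=0.05))
```

With the stricter 0.05 cut-off FABP4 drops out too, because its q-value is above that in two of the three cohorts.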

    Maybe the following is something. Is "muscle sympathetic nerve activity" something that could be higher in ME/CFS, maybe more of an increase after exercise, and could be making FABP4 higher?

    The relationship between muscle sympathetic nerve activity and serum fatty acid binding protein 4 at rest and during isometric handgrip exercise (2024, Physiol Rep)
     
  15. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    Yes the fact that this is not a thing has been irritating me all day as I crudely try to match up genes from one study with a single other study. It basically feels like it would be free data if such a database existed. There are many 'omics studies that would be great to cross-reference against.

    I imagine there must also be good algorithms for dealing with summary statistics from different studies to find the most promising targets, instead of just kind of filtering by arbitrary p-values or looking at a chart for what looks high in two studies at once. More than that, it'd be useful if it could analyze related genes instead of just matching up genes that are exactly the same.
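
One textbook option for combining summary statistics is Stouffer's Z-score method. It is named here only as an example, not as what any of these studies used. The sketch assumes each study reports a two-sided p-value (not a q-value) plus the direction of the effect; the `stouffer` helper and the input values are made up.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def stouffer(results):
    """Combine (two_sided_p, direction) pairs with unweighted Stouffer's method.

    direction is +1 (higher in patients) or -1 (lower in patients).
    Each p must be above 0 and at most 1; values extremely close to 0
    make inv_cdf raise an error.
    """
    # Turn each two-sided p-value into a signed z-score.
    z_scores = [d * _STD_NORMAL.inv_cdf(1 - p / 2) for p, d in results]
    z = sum(z_scores) / sqrt(len(z_scores))
    p_combined = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p_combined


# Two studies agreeing on direction reinforce each other...
print(stouffer([(0.05, +1), (0.05, +1)]))
# ...while two studies disagreeing on direction cancel out.
print(stouffer([(0.05, +1), (0.05, -1)]))
```

Because the sign is kept, a protein that is borderline in several studies but always in the same direction can come out stronger than one with a single low p-value. Weighting studies by sample size is a common refinement that is left out here.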

    If every study would not only attach a supplementary data spreadsheet to their paper, but also submit the data in a nice format to a centralized database, which can also run these algorithms, I think we'd be dealing with much more statistical power to find proteins of interest.

    I think maybe mapMECFS was planning to do an actual standardized database? I'm not sure, I no longer have access as they started requiring institutional email addresses.

    I just can't believe it's not worth it. It would be even better for it to be all diseases, not just ME/CFS, so that one could compare between diseases and benefit from economies of scale. To instead run these huge panels in studies, list a few significant proteins in the text, and basically forget the rest of the not super significant proteins feels like a huge waste.
     
  16. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    282
    Peroxidasin is interesting to me; it might help explain POTS

    Mammalian Peroxidasin (PXDN): From Physiology to Pathology

    PXDN expresses in the endothelial cells and secretes into blood. PXDN exhibits with much higher concentration in plasma than MPO [20]. Therefore, it is reasonable to speculate that PXDN also plays an important role in vascular tone under physiological and pathological conditions.



    According to that review it also seems to be involved in extracellular matrix and fibronectin. I don't know too much about extracellular matrix and fibronectin but I do know that these are bits of biology that keep coming up! Collagen-associated functions suggest a possible link to Ehlers-Danlos or similar connective tissue issues.
     
    Hutan, forestglip and Peter Trewhitt like this.
  17. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    282
    A good spreadsheet for the human eye would have studies in rows and molecules in columns.

    But I think this data would be too big for the eye. You'd need what they call long data, or tidy data, where each row contains only one patient-to-control ratio, and many markers by which you can filter it.

    A good machine readable table would have perhaps the following columns:

    molecule common name |control to patient ratio | year | study | fluid studied (e.g. serum, plasma, saliva, urine) | target (e.g. amino acids, proteins, immune proteins) | total sample size | male sample size | female sample size | sex to which ratio applies (M/F/All) | molecule other name | KEGG | PubChem ID | OR minimum | OR max | sd | published p-value | published q-value | exercise provocation status | Pathway molecule is in (general) | Pathway molecule is in (detailed) | data source url

    If we could wrestle a few data sources into that format then we could put them together and filter the whole thing by molecule, sex, etc. to find what various studies said.
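
A small sketch of that long format, using only a handful of the proposed columns. The `Finding` class, the `select` helper, the study names and all the numbers are placeholders, and the ratio is assumed to be patient-to-control.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One row of long data: a single ratio for one molecule in one study."""
    molecule: str
    ratio: float            # assumed patient-to-control; pick one direction and stick to it
    study: str
    fluid: str              # e.g. serum, plasma, saliva, urine
    sex: str                # "M", "F" or "All"
    q_value: Optional[float] = None


def select(rows, **criteria):
    """Keep the rows whose fields equal every given criterion."""
    return [row for row in rows
            if all(getattr(row, field) == value for field, value in criteria.items())]


# Placeholder rows, not real results.
rows = [
    Finding("leptin", 1.30, "study_A", "plasma", "All", 0.01),
    Finding("leptin", 1.10, "study_B", "serum", "F", 0.20),
    Finding("FABP4", 1.40, "study_A", "plasma", "All", 0.0001),
]

for row in select(rows, molecule="leptin"):
    print(row.study, row.fluid, row.sex, row.ratio)
```

Appending a new finding is just one more row, and filtering on any combination of columns needs no change to the table layout.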

    What you'd lose is the richness in some of the Hanson datasets where she provides patient-by-patient data, which can be really useful to look at. But it'd be super valuable and easy to append new findings.

    Perhaps once you had 30,000 rows you could blast through some statistical noise and see patterns, or perhaps after some molecules had come up in 10 studies without ever showing any consistency, you could make the argument that metabolomics is a false friend and a waste of effort. Either way it's useful.
     
    Last edited: Jun 2, 2025 at 5:23 AM
  18. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    Here's with merging on all rows from both studies using TargetFullName. Spearman r of .23
    upload_2025-6-2_0-17-28.png
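
For reference, a Spearman correlation is just a Pearson correlation computed on ranks. The sketch below is not the code behind the plot above, only one way to compute it with the Python standard library; the helper names and the toy inputs are made up.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy log2 fold changes for the same proteins in two studies (invented).
study_1 = [0.48, -0.10, 0.05, 0.30, -0.25]
study_2 = [0.56, 0.02, -0.08, 0.10, -0.15]
print(round(spearman(study_1, study_2), 2))
```

Because only the ordering matters, a few extreme fold changes pull a Spearman correlation around less than they would a Pearson correlation.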
     
  19. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    282
  20. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    2,463
    R stresses me out because I'm not very familiar with it, so when I'm not in a patient mood I just use Python. Computer's away for the night, but I'll do a linear model tomorrow.
     
    Peter Trewhitt, hotblack and Murph like this.
