Comparing DNA Methylation Landscapes in Peripheral Blood from [ME/CFS] and Long COVID Patients, 2025, Peppercorn et al

Being mindful of confirmation bias and limitations of this study, I just realized that IRF2BPL is in the same family as IRF2BP2, through which malate/malic acid has been found to downregulate IL-1B response in macrophages under certain pH and ER stress conditions…like you’d also expect to find in active muscle.

Discussion here: https://www.s4me.info/threads/malat...lation-of-inflammation-2024-chen-et-al.43873/

Obviously I’d wait for a more robust study before running away with any conclusions, but just thought it was an interesting connection to note given how much malic acid helps me avoid PEM. The question I can’t answer, though, is why the effect of malic acid seems to be somewhat dependent on stimulants in me and the handful of others who reported an effect, if it is mediated through IRF2BPs.
 
Complaints can be sent to:
Research Integrity and Publication Ethics team
publication.ethics@mdpi.com

Well, I've sent a letter of concern about the data analysis issue and conclusions drawn from it, while expressing my appreciation of the investigation and effort. It's really difficult as I know Warren Tate does mean well and he is for the most part highly appreciated by people with ME/CFS in New Zealand. And of course we want younger researchers working on ME/CFS to have wonderful careers. There's no glory in criticising a small team valiantly trying to help people with ME/CFS with hardly any funding.

And yet, you know, truth and accuracy do matter a bit: for this paper, for later papers by this team, and for other papers that will be assessed by the same editorial team and peer reviewers.
 
Here is the reply from the authors to some of the points I made; it has been forwarded by the journal with a query as to whether my concerns have now been addressed.

It seems to me that my concerns have not been addressed. I've asked the journal what their next steps are.

I think there is some new information in the reply. I was not sure if they had selected the differentially methylated fragments based on two-way comparisons (i.e. LC and ME/CFS versus the controls) or three-way comparisons (LC versus ME/CFS versus controls). It turns out it was the latter. I think that makes it worse. They chose the 4.6% of fragments that showed the biggest differences between all three cohorts, and those are the ones they included in the PCA. It should therefore be no surprise that the PCA showed three clusters, with individuals tightly grouped within each cohort and the cohorts widely separated.

They suggest that the PCA was just 'a display tool'. They have not replied to my contention that if you take only the 4.6% of fragments that show the widest separation between the three groups, then any random set of 70,000 fragments could produce a PCA much like the one they showed.
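For anyone who wants to check this for themselves, here is a rough sketch of the kind of simulation I mean. It is purely illustrative, not a reanalysis of their data: three invented cohorts of five participants, ~73,000 fragments of pure noise, a per-fragment ANOVA at p < 0.05 standing in for their selection step, then PCA on whatever survives.

```python
# Illustrative sketch only, not a reanalysis of the paper's data:
# 15 simulated participants (3 cohorts of 5) and 73,000 fragments of pure
# noise. Fragments are kept only if a one-way ANOVA across the three
# cohorts gives p < 0.05, mimicking the selection step, and PCA is then
# run on the surviving ~5% of fragments.
import numpy as np
from scipy.stats import f_oneway
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_per_group, n_fragments = 5, 73_000
groups = np.repeat(["HC", "ME/CFS", "LC"], n_per_group)

# Random "methylation" values with no real group structure at all
data = rng.normal(size=(3 * n_per_group, n_fragments))

# Per-fragment ANOVA across the three cohorts; keep p < 0.05
pvals = np.array([
    f_oneway(data[:5, j], data[5:10, j], data[10:, j]).pvalue
    for j in range(n_fragments)
])
selected = data[:, pvals < 0.05]   # roughly 5% survive by chance alone
print("fragments kept:", selected.shape[1])

# PCA on the preselected fragments: the cohorts come out as well-separated,
# tight clusters even though the underlying data are random.
pcs = PCA(n_components=2).fit_transform(selected)
for g in ("HC", "ME/CFS", "LC"):
    centre = pcs[groups == g].mean(axis=0).round(1)
    spread = pcs[groups == g].std(axis=0).mean().round(1)
    print(g, "centre:", centre, "within-cohort spread:", spread)
```

Because the filter keeps exactly the fragments where the three groups happen to differ by chance, the PCA on the selected subset is more or less guaranteed to show three separated clusters, while a PCA on the full unfiltered matrix shows no structure at all, which is what the authors themselves report seeing when they ran it on all 73,239 fragments.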


Dear xxxx,

I hope this email finds you well.

I am forwarding you the authors' response. Can you check to see if all your concerns have been addressed?

**************************************
So first, to explain why we used the PCA, which we think was totally appropriate, as a VISUAL TOOL to display the data we had generated, and how that allowed us to highlight key questions to be further investigated in the study of how similar or different the ME/CFS and LC cohorts were in their methylomes.

Key steps.

1. We selected methylated fragments for study that were present in all 15 participants.
2. We did a p < 0.05 ANOVA to determine whether there were many fragments with three-way significant differences, i.e. between ME/CFS, LC and CONTROLS.
3. That gave a big reduction in fragments (73,239 down to 3,363), but 3,363 (or 4.6%) was a significant subfraction showing differences between ME/CFS and LC. For this reason we chose the PCA as a visual display of these differences.
4. We fully expected from the ANOVA data to see the three cohorts (Controls, ME/CFS, LC) separated in the PCA, but without preconceptions of how widely the ME/CFS and LC cohorts would be separated; the display showed wide separation. (It should be noted we also did a PCA on the 73,239 fragments with no selection for significant differences and just got a random scatter of the 15 participants; not shown in the paper.)
5. An important additional purpose of the PCA on the selected subfraction was to see the contribution of individual patients and whether in each cohort there was a scatter of the patients or whether they were tightly grouped. This would suggest EACH patient of each cohort was making a similar contribution to the significant differences of that cohort from the other cohorts, whereas conversely a wide scatter would suggest much individual patient variation. We found tight grouping.
6. The p < 0.05 ANOVA and PCA then suggested two avenues to investigate further in the study:
(a) What was the molecular explanation for the suggested significant difference between ME/CFS and LC? This further investigation is shown in Table 1, examining individual methylation sites; mostly it was the extent of change in methylation that was the difference (LC was greater), not different sites. Nevertheless, we did find examples of sites where there was methylation change in one cohort but not the other, and an example where the methylation levels had changed in opposite directions for the ME/CFS patients and LC patients, as shown in the box plots of Figure 4.
(b) Could we verify the suggested similar contributions of all patients in a cohort to the differences by examining individual genomic sites? This suggestion was validated in most cases in the box plots showing common individual patient responses at important genomic sites.
7. Our specific conclusions came from these further investigations.

So in conclusion:
All authors felt that the PCA had been used appropriately, even after further reflection following the contact from the complainant.

Specific responses to the comments of the complainant:
1. The paper claims that the Principal Components Analysis (PCA) shows that the Healthy Controls, Long Covid and ME/CFS cohorts are very distinct from each other, with a high level of similarity within each cohort, in terms of Differentially Methylated Fragments (DMFs). It goes on to suggest that the differences between the Long Covid and ME/CFS DMFs suggest that they may be different stages in the same disease.

No. It was the p < 0.05 ANOVA that showed there was a subfraction of the methylated fragments that were significantly different among the three cohorts, and that was displayed in the PCA, with relatively tight clustering of the individual patients in each cohort. Interestingly, there were two outliers: in the healthy control cohort, the much older healthy control (the methylome is known to change during ageing!), and the youngest ME/CFS patient, who has been shown to be somewhat of an outlier in most of our studies with her samples. The inference that stages of the disease might influence the methylome and explain why differences were found in the two cohorts was a suggestion that we recommended should be followed up when we have cohorts with the same disease duration.


2. It is therefore not surprising that the PCA chart showed the cohorts to be widely separated, with individuals tightly clustered within each cohort.

We agree, as we were wanting to display in the PCA what the ANOVA was suggesting, i.e. three cohorts with significant differences, but not with the comment about the tightness of the clustering; and indeed, as we indicated above, there were outlier patients that did not cluster so well.


3. It is not clear if the authors selected DMFs based on a significant difference between the LC and ME/CFS cohorts combined versus the healthy controls, or between each of the LC and ME/CFS cohorts compared to the healthy controls.

We did a three-way significance comparison with an ANOVA, so our subfraction was fragments where each of the three cohorts was significantly different from the other two, as appropriate for such a three-way analysis.


4. With that selection intensity, the differences could easily be the result of random variation. The paper does not specifically note this and instead draws firm conclusions about the diseases based on the identified “differences”.

We wanted to select in order to see if there were many fragments where there were significant differences between all three cohorts; that was what we were potentially interested in for the study. If it were random variation among the 3,363 fragments that allowed separation of the three cohorts neatly into their 5 patients, the subsequent analysis of individual methylation sites would not have shown the characteristics we derived. The trends and differences found in the detailed site studies were totally consistent with what the PCA had shown. It is important to emphasise that we were using the ANOVA and PCA as a guide to the further investigations.

5. The Discussion section comments on the PCA, incorrectly reporting that it shows that “the global DNA methylation patterns can separate the two disease groups from each other … and both are well separated from the HC group”.

The ANOVA of the global methylation found a significant subfraction with significant differences between the ME/CFS and LC cohorts, visually displayed in the PCA; that supports that conclusion. The subsequent data presented, where an extra limit of a minimum 10% methylation change is added, showed the differences vs healthy controls at the specific sites and highlighted a few sites where there are major differences in methylation between the ME/CFS and LC cohorts.


I hope this is a sufficient explanation of our different points of view and why we think the way we used the PCA, as a display tool for the ANOVA significance analysis, was totally appropriate as a guide to the later, more definitive analyses.

**************************************


Kind regards,
...
Section Managing Editor
 
Thanks for pushing back on their paper Hutan.
5. An important additional purpose of the PCA on the selected subfraction was to see the contribution of individual patients and whether in each cohort there was a scatter of the patients or whether they were tightly grouped. This would suggest EACH patient of each cohort was making a similar contribution to the significant differences of that cohort from the other cohorts, whereas conversely a wide scatter would suggest much individual patient variation. We found tight grouping.
The grouping looks even tighter in chillier's random data.
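To make "tighter" a bit more concrete (this is not chillier's actual code or data, just a sketch with three invented cohorts of five and 73,000 noise fragments): you can score the PCA clusters with a silhouette coefficient, with and without the p < 0.05 preselection.

```python
# Sketch only (not chillier's code or data): how "tight" do the cohort
# clusters look when pure noise is preselected with a p < 0.05 ANOVA
# before the PCA, versus no preselection at all?
import numpy as np
from scipy.stats import f_oneway
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 5)            # 3 invented cohorts of 5
data = rng.normal(size=(15, 73_000))        # no real group structure

pvals = np.array([f_oneway(data[:5, j], data[5:10, j], data[10:, j]).pvalue
                  for j in range(data.shape[1])])
pcs_selected = PCA(n_components=2).fit_transform(data[:, pvals < 0.05])
pcs_full = PCA(n_components=2).fit_transform(data)

# Silhouette near 1 = tight, well-separated clusters; near 0 = no structure.
print("preselected noise:", round(silhouette_score(pcs_selected, labels), 2))
print("full noise matrix:", round(silhouette_score(pcs_full, labels), 2))
```

On noise like this the preselected version should score well above zero while the unfiltered one sits around zero, i.e. the tight grouping is manufactured by the selection step rather than by any real difference between the cohorts.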
 
I have received the following from the journal:



Dear xxx,

I hope this email finds you well.

I have received the following comments from an editorial board member. He believes the authors have provided sufficient explanation and that no further revisions are necessary. Do you agree with him?

*********************************************
XXX has raised concerns regarding the use and interpretation of Principal Component Analysis (PCA) in Figure 2 of Int. J. Mol. Sci. 2025, 26(14), 6631. Specifically, the critique centers on the apparent use of PCA-DA (Principal Component Analysis followed by Discriminant Analysis), which, while described as PCA in the manuscript, appears to have involved prior selection of variables based on statistical significance.

The authors report identifying approximately 73,000 methylation fragments across all subjects, yet only around 3,300 were included in the PCA. These fragments were preselected based on ANOVA p-values, which introduces a bias that undermines PCA’s role as an unsupervised method for exploring group differences. Standard practice typically involves applying PCA-DA to the full, quality-filtered dataset to assess whether and how groups can be distinguished. Subsequent analyses may then focus on top candidates. In this case, such an approach would have revealed that the majority of methylation fragments did not differ between groups, with only about 5% showing variation—a conclusion that contrasts with the impression given by the PCA-DA plot.

While the visual separation between groups in the PCA-DA plot is striking, it is not supported by other data presentations, such as box plots, which do not reflect the same degree of distinction—particularly between LC and ME groups. Nonetheless, the authors have clearly outlined their analytical workflow in Figure 1, and the procedure is transparent. Their estimate that approximately 5% of methylation sites differ between groups is reasonable, though the interpretation might have been more balanced had it acknowledged that 95% of sites were not different and that the observed differences were relatively minor when considering the full dataset.

In their response, the authors note that they performed a PCA on all 73,239 fragments without preselection, which resulted in a random scatter of the 15 participants. However, this analysis was not included in the paper. Presenting such an unbiased PCA would have added valuable context and helped readers better assess the robustness of the findings.

Importantly, the full dataset is available in the GEO database (NCBI GSE297189), allowing readers to independently evaluate the data. While the PCA plot may have limited statistical value due to the preselection of variables, the transparency of the methodology enables informed judgment about the article’s conclusions.

The authors have responded to the complaint, though they do not consider it relevant. If the complaint and response are to be published, it may be advisable for the authors to revise their reply for clarity, as it currently contains numerous typographical and structural issues that hinder readability.

Therefore, while I find the complaint to be partially justified, the article demonstrates adequate transparency in its data analysis and provides access to the full dataset, allowing readers to form their own judgment. In light of this, I do not believe further action is necessary.

*********************************************

Kind regards,
xxxx
Section Managing Editor
 
I sent the following reply yesterday. It's not perfect; I should have taken some time before sending something off.



Thank you xxx. I appreciate your follow-up.

I found the analysis from the editorial board member useful and clear in its acknowledgement that there is an issue with the selection of the data for PCA. However, the board member's view seems to be that, since the authors have 1. documented their use of highly selected data for the PCA that maximises the differences between the cohorts and the similarities within the cohorts and 2. made the full dataset available for others to analyse, the use of the results to illustrate relationships between the three groups is acceptable. It seems to imply that readers of the paper will identify the problem and draw their own conclusions about the validity of the findings, and that therefore the journal does not need to take any action to correct the problem.

I am puzzled as to why the editorial board member finds the complaint only partially justified. If we agree that the methodology and the resulting finding were wrong, I cannot see how the fact that an observant reader could identify the flawed methodology from the information provided makes everything okay.

A point I believe the board member has not given sufficient weight to is that the paper refers multiple times to the results of the PCA as providing evidence of differences between the three cohorts and clustering within them, without noting that a random set of data would produce the same outcome, e.g.:
  • "A PCA plot of the fifteen participants’ data showed that the patients of three cohorts formed tightly clustered groups in each case, which were well separated. This implied all members of each cohort, including all the HCs, had similar methylation levels of the fragments to the other members of their cohorts. The separation of the ME/CFS and LC cohorts implied either that there was a difference in the degree of change in the methylation between the ME/CFS patients and LC patients at the same sites, or that there may be methylation changes at specific positions in one of the cohorts but not the other."
  • "Principal Component Analysis (PCA) of differentially methylated fragments identified in all patients of the three cohorts showed distinct clustering of the HC, LC, and ME/CFS cohorts, demonstrating that despite LC and ME/CFS having many changes in methylation at genomic sites in common, the global DNA methylation patterns can separate the two disease groups from each other as well, and both are well separated from the HC group."
  • In the abstract: "A principal component analysis (PCA) analysing significant methylation changes (p < 0.05) separated the ME/CFS, LC, and HC cohorts into three distinct clusters."
The caption on the figure does not make the nature of the data selection clear; the comparisons to which the p-value criterion applies are not defined.

I don't think it is safe to assume that all readers will see the problem and understand that randomly distributed data would produce a very similar PCA chart. It is my experience that incorrect findings are often referenced in other papers, and can take on a life of their own as peer-reviewed facts, with scarcely any readers ever referring back even to the abstract, much less the detail of the original paper or the dataset.

I am particularly concerned that even the authors do not seem to understand that the use of the highly selected data (less than 5% of the total data) for the PCA has led to conclusions about the differences between the groups that are not substantiated by the analysis. There is no indication from the authors, either in my discussions with them at the preprint stage or in their subsequent replies to the journal, that they will not preselect data for a PCA again. In communications with me at the preprint stage, an author noted that one of the authors was very experienced with data analysis and had previously used this approach in cancer research. I think it is very possible that these authors may go on to apply the same methodology to draw incorrect conclusions from other datasets. Researchers reading this paper may also not see the problem, and may apply the same approach themselves.

With respect to this paper, I will be concerned if the conclusions based on the PCA are allowed to stand. If most readers of the paper do in fact identify the problem, they will draw conclusions not only about the capability of the authors, but also about your journal. There are findings of value in this paper and good work has been done, but the error with the PCA is likely to result in those findings and even the team being dismissed.

I find myself in a difficult situation, having started the conversation with an author at the preprint stage in order to try to help ensure that their work was well founded. I do not want to diminish the ability of this team to do further work on ME/CFS. However, I also do not want errors to stand and potentially be replicated. I think the journal's review process has been part of the problem here. Perhaps it is missing other errors in the papers it publishes? I have not seen anything from the journal acknowledging that it made an error in not requiring revision of the paper before it was published.

I am not sure of the best way forward; I don't know what would normally happen. I think the paper contains objective errors and does require revision. I don't believe it is reasonable to conclude that the existing transparency around how the analysis was done will lead to most people exposed to the findings of this paper understanding that some conclusions are unfounded.

Warm regards,
 
While the PCA plot may have limited statistical value due to the preselection of variables, the transparency of the methodology enables informed judgment about the article’s conclusions.
I thought the purpose of the editorial process was to ensure that the articles are robust, not just «transparent».

This statement is essentially greenlighting making completely unsubstantiated claims with irrelevant data, as long as you show how you came to the conclusion. Whether the logic holds up is irrelevant. That isn’t science in my book.
 