Evidence of White Matter Neuroinflammation in [ME/CFS]: A Diffusion-Based Neuroinflammation Imaging Study (2026, Yu et al.)

This is because in the model you made, there is no relationship between ME/CFS and niiXX, so any association is due to chance. niiXX is just a function of age, so only about 5% of tests of the association with mecfs_status should come out significant, as false positives.
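That 5% expectation is easy to check with a quick null simulation (toy numbers of my choosing, not anything from the paper): here niiXX depends only on age, so any significant association with mecfs_status is a false positive.

```r
# Null sketch: niiXX is a function of age alone, with no ME/CFS effect
set.seed(1)
false_pos <- replicate(2000, {
  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
  niiXX <- age + rnorm(50, mean = 0, sd = 5)
  summary(lm(niiXX ~ mecfs_status))$coefficients["mecfs_status", "Pr(>|t|)"] < 0.05
})
mean(false_pos)  # close to 0.05
```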

If niiXX actually depends on mecfs_status, then adding the covariate can make the association more significant, and can take p from greater than 0.05 to less than 0.05. I added mecfs_status to the initial niiXX model in your code:
But that's the exact scenario I'm flagging. That's what they said in the paper: there was no association for any of the other NII-whatever variables (besides NII-RF) when tested individually, but they did become significant when a covariate was added. So what you're simulating here is not the issue I'm bringing up.

I'm saying that this is likely human error, because you can't force an initially unassociated variable to become significant (more often than by chance) by adding a covariate, even if the covariates are significantly associated with the outcome.
 
There was no association between all the other NII-whatever variables when tested individually, but they did become significant when a confounder was added. I'm saying that this is likely human error because you can't force that to happen even if the covariates are significantly associated with outcome.
We just did force it to happen. In the updated code I posted, in 92% of the trials where niiXX was not significantly associated with ME/CFS status (analogous to them showing no association for the univariate test in the paper), adding a covariate made it a significant association (significant with covariates in the paper).
 
We just did force it to happen. In the updated code I posted, in 92% of the trials where niiXX was not significantly associated with ME/CFS status (analogous to them showing no association for the univariate test in the paper), adding a covariate made it a significant association (significant with covariates in the paper).
when you coded the outcome variable to be significantly associated with mecfs_status. Which is not what was in the paper.
 
when you coded the outcome variable to be significantly associated with mecfs_status. Which is not what was in the paper.
What was in the paper? They were using real-life data, where we go in not knowing whether the outcome variable actually depends on mecfs_status. I showed that in the case where it is dependent on mecfs_status, the scenario we're discussing is possible.
 
Imagine a feature that increases by 1 every decade of life so 10 year olds have, on average, a level of 1, 20 year olds have a level of 2, 30 year olds have a level of 3 and 40 year olds have a level of 4 and 50 year olds have a level of 5.

But for ME/CFS people, although the value changes with age, it is always 1 less than the normal value.

If you have badly matched controls, you might have five 20 year olds and five 30 year olds and so the mean value is 2.5. And in your ME/CFS sample you might have five 30 year olds and five 40 year olds, so the mean value is also 2.5. It looks as though there is no difference.

But, if you adjust for age, with good information about population norms or a big enough sample, you would find that ME/CFS in fact lowers the level of the value by one.

It's possible. (sorry, just adjusted my maths)
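The toy example above can be written out directly in R (deterministic numbers, no noise, just to show the arithmetic):

```r
# Controls: five 20-year-olds and five 30-year-olds
# ME/CFS:   five 30-year-olds and five 40-year-olds
age   <- c(rep(20, 5), rep(30, 5), rep(30, 5), rep(40, 5))
mecfs <- rep(c(0, 1), each = 10)
value <- age / 10 - mecfs  # feature rises 1 per decade, minus 1 for ME/CFS

tapply(value, mecfs, mean)              # both groups average 2.5: "no difference"
coef(lm(value ~ mecfs + age))["mecfs"]  # age-adjusted ME/CFS effect: -1
```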

I agree though that the controlling for confounders that went on in the paper is something that we need answers about. I can't see why you would control for age and sex if you have matched controls. And there is no plausible reason to control for anxiety and depression, especially when using the HADS instrument that we know is bad at assessing anxiety and depression in ME/CFS.

As I said up thread, I'm surprised that controlling for anxiety and depression (which, with that measurement tool should correlate with physical function in the ME/CFS people) didn't reduce the difference between controls and ME/CFS people (because you would be making the ME/CFS people more like the physically able healthy ones). Instead it is reported to have magnified differences. That makes no sense to me, and so it makes me wonder what went on in the adjustments.
 
As I said up thread, I'm surprised that controlling for anxiety and depression (which, with that measurement tool should correlate with physical function in the ME/CFS people) didn't reduce the difference between controls and ME/CFS people (because you would be making the ME/CFS people more like the physically able healthy ones). Instead it is reported to have magnified differences.
Do we actually know if it increased differences? I don't see effect sizes or coefficients in the text. The predicted effect of ME/CFS status on the brain metric can decrease (which I agree, I think I would expect that with controlling for HADS), while the p-value still becomes even more significant (for example if age is very highly correlated to the brain metric relative to other variables, and so controlling for it removes a lot of noise, increasing significance.)
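A quick sketch of that distinction (all effect sizes here are invented for illustration: I assume HADS runs about 3 points higher with ME/CFS, and that age dominates the outcome's variance). On average, the adjusted ME/CFS coefficient comes out smaller than the univariate one, while its p-value is lower because controlling for age strips out most of the noise:

```r
set.seed(1)
res <- t(replicate(500, {
  n     <- 50
  age   <- rnorm(n, 45, 20)
  mecfs <- rbinom(n, 1, 0.5)
  hads  <- 8 + 3 * mecfs + rnorm(n, 0, 2)   # assumption: HADS higher in ME/CFS
  niiXX <- age + mecfs + 0.5 * hads + rnorm(n, 0, 3)

  uni <- summary(lm(niiXX ~ mecfs))$coefficients["mecfs", ]
  adj <- summary(lm(niiXX ~ mecfs + age + hads))$coefficients["mecfs", ]
  c(coef_uni = unname(uni["Estimate"]),  coef_adj = unname(adj["Estimate"]),
    p_uni    = unname(uni["Pr(>|t|)"]),  p_adj    = unname(adj["Pr(>|t|)"]))
}))
colMeans(res)  # the mecfs coefficient shrinks (~2.5 to ~1) but mean p is lower
```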
 
I showed that in the case where it is dependent on mecfs_status, the scenario we're discussing is possible.
I think I understand what's happening here. The simulation itself is breaking something. How do you get a variable that has a >0.9 correlation with the outcome variable at all times, but is significantly associated with the outcome variable only 5% of the time?

library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  # Create outcome that is more associated with age than ME/CFS
  niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05

  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX) > 0.9

  pvals <- c(pval1, pval2, pval3, cor)

  results[i, ] <- pvals
}

results %>%
  as.data.frame() %>%
  group_by(age_associated, age_cor_outcome) %>%
  count()

[Screenshot: results table, counts by age_associated and age_cor_outcome]
age_associated is whether the p-value of age in niiXX ~ age is below 0.05 (0 is p>0.05, 1 is p<0.05)
age_cor_outcome is whether the correlation coefficient between age and niiXX is > 0.9 (ended up being true in all cases)
 
I think I understand what's happening here. The simulation itself is breaking something. How do you get a variable that has a 99% correlation with the outcome variable at all times, but is significantly associated with the outcome variable only 5% of the time?
I haven't fully parsed this code yet, but I think there's an error here, calling model2 instead of model1:
# Double check association with age
model1 <- lm(niiXX ~ age)
pval1 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
library(dplyr)
library(car)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  # Create outcome that is more associated with age than ME/CFS
  niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  cor(mecfs_status, age)

  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05

  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05

  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX)

  pvals <- c(pval1, pval2, pval3, cor)

  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

results$age_cor_outcome %>% mean()

Okay, it was bugging me, so I did one more quick thing, which seems to confirm that the results @forestglip got depend on a near-perfect correlation between one of the covariates and the outcome variable.
Using forestglip's original code for deriving the outcome variable, the average correlation coefficient between age and niiXX across 1000 simulations is 0.998, and 88% of simulations land in the scenario of a non-significant univariate association and a significant multivariate association (0 in the first column, 1 in the second).
[Screenshot: results table]

When I change niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1) to sd = 10, the average correlation drops down to 0.89, and that scenario now happens 6% of the time. 0.89 is still an extraordinarily strong correlation.

[Screenshot: results table]

[Attached: example scatter plot with a 0.87 correlation coefficient]

^ an example with 0.87 correlation coefficient, for reference. That's better than most data I ever look at.

Even just going down from an average correlation of 0.998 to 0.994 (sd = 2) brings the incidence rate down from 88% to 37%.
[Screenshot: results table]

The variables they controlled for were sex, age, BMI, MET (task speed), depression and anxiety scores. I'd be shocked if any of these had a >0.99 correlation with any biological variable, but who's to say. I'd bet on human error, though.

Pretty sure I don't have a silly error in my code this time, but let me know if you spot something.
 
I may be missing some of the detail here, but what both of you said on pages 3 and 4 makes sense to me, @jnmaciuch and @forestglip. (And Hutan.)

From what I'm seeing online, the scenario @forestglip is describing is sometimes called a suppressor variable, as in: the third variable is secretly suppressing the relationship between your first two until you account for it (see, for example, this paper describing it, which I have only skimmed). It seems mathematically possible and maybe relevant in some scenarios.

But jnmaciuch just seems to be saying that even if this scenario is possible, it's not standard practice to check for it:
You’re supposed to start with the univariate analysis, seeing if ME/CFS on its own is associated. If it is, then you control for confounders.
I took this to mean you normally only add additional variables after you've confirmed the data shows a statistical relationship between the two variables of interest without them. That seems like a reasonable practice to me, because there's a lot of opportunity for p-hacking otherwise. While you could reveal an otherwise hidden relationship by adding a variable, some of the time that apparent relationship is just going to be a fluke, and if you test lots and lots of possible additional variables, eventually you'll get lucky.
 
When I change niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1) to sd = 10, the average correlation drops down to 0.89, and that scenario now happens 6% of the time. 0.89 is still an extraordinarily strong correlation.
I'm not at a PC currently, so can't test it.

But I think what's happening is that by increasing the influence of random noise, not only is the correlation of age with niiXX decreasing, but so is the correlation of mecfs_status with niiXX. In which case it's not surprising that mecfs_status is significant less often.

To test decreasing the correlation with age, without obscuring the real mecfs_status correlation with niiXX that was coded in, the true model can be altered by decreasing the influence of age on niiXX:

niiXX <- (0.5 * age) + mecfs_status + rnorm(50, mean = 0, sd = 1)
 
I'm not at a PC currently, so can't test it.

But I think what's happening is that by increasing the influence of random noise, not only is the correlation of age with niiXX decreasing, but so is the correlation of mecfs_status with niiXX. In which case it's not surprising that mecfs_status is significant less often.

To test decreasing the correlation with age, without obscuring the real mecfs_status correlation with niiXX that was coded in, the true model can be altered by decreasing the influence of age on niiXX:
Maybe, though that seems unlikely to be the issue: in my last example I only changed the sd from 1 to 2, and the incidence rate of 0,1 already plummeted.

Plus if you decrease the influence of age, you increase the influence of ME/CFS, which increases the frequency of 1,1 in that table rather than just 0,1. You'd need to maintain a high percentage of specifically 0,1 to be in line with so many NII measurements falling away. I think I need to finish coding for the day, but maybe I'll get around to it later this week to confirm.
 
Do we actually know if it increased differences? I don't see effect sizes or coefficients in the text. The predicted effect of ME/CFS status on the brain metric can decrease (which I agree, I think I would expect that with controlling for HADS), while the p-value still becomes even more significant (for example if age is very highly correlated to the brain metric relative to other variables, and so controlling for it removes a lot of noise, increasing significance.)
Yes, I was speaking in terms of significance, significant difference. Without the adjustment they made for the confounding variable, hardly anything was judged to be significantly different, just RF I think. After the adjustment for age, sex, BMI, anxiety and depression, and MET, a whole lot of measures became significant.
 
Maybe, though that seems unlikely to be the issue: in my last example I only changed the sd from 1 to 2, and the incidence rate of 0,1 already plummeted.
I had to multiply by a much smaller number than 0.5 to decrease the correlation meaningfully. But here is the result if multiplying age by 0.05:
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create outcome that is more associated with age than ME/CFS
  niiXX <- 0.05 * age + mecfs_status + rnorm(50, mean = 0, sd = 1)
 
  cor(mecfs_status, age)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX)
 
  pvals <- c(pval1, pval2, pval3, cor)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

results$age_cor_outcome %>% mean()
[Screenshot: results table]

The correlation is now 0.65, and still about a quarter of the time, it's only significant after adding covariates.

But okay, 0.65 is still a high correlation to expect. The thing is, though, if you include multiple covariates, each with a low correlation, controlling for all of them together can still reduce enough noise for mecfs_status to become significant, where without adjustment there is too much noise for significance.

Here is similar code, but with four variables instead of just age. All with relatively low correlations (~0.25 - 0.4).
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 7, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome",
                       "bmi_corr_outcome",
                       "met_corr_outcome",
                       "hads_corr_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  bmi <- rnorm(50, mean = 10, sd = 3)
  met <- rnorm(50, mean = 10, sd = 3)
  hads <- rnorm(50, mean = 10, sd = 3)
 
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create outcome dependent on mecfs_status and four other variables
  niiXX <- (0.02 * age) + (0.2 * bmi) + (0.2 * met) + (0.2 * hads) +
    mecfs_status +
    rnorm(50, mean = 0, sd = 1)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with all four covariates
  model3 <- lm(niiXX ~ mecfs_status + age + bmi + met + hads)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlations of the covariates with the outcome variable
  cor_age <- cor(age, niiXX)
  cor_bmi <- cor(bmi, niiXX)
  cor_met <- cor(met, niiXX)
  cor_hads <- cor(hads, niiXX)
 
 
  pvals <- c(pval1, pval2, pval3, cor_age, cor_bmi, cor_met, cor_hads)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

colMeans(results[, 4:7])

[Screenshot: results table]

Around a third of the time, we see that 0,1 scenario. And the study had even more covariates than this, so the correlations could have been even lower and we'd still expect to see this happen a large portion of the time.
 
Yes, I was speaking in terms of significance, significant difference. Without the adjustment they made for the confounding variable, hardly anything was judged to be significantly different, just RF I think. After the adjustment for age, sex, BMI, anxiety and depression, and MET, a whole lot of measures became significant.
Yes, so I think if the strongest correlation is with age, controlling for it can remove so much noise that the increase in significance from better precision outweighs the decrease in significance due to decreased effect size you might expect from controlling for HADS.
 
Yes, so I think if the strongest correlation is with age, controlling for it can remove so much noise that the increase in significance from better precision outweighs the decrease in significance due to decreased effect size you might expect from controlling for HADS.
Yes, but my concern was that I thought the study had controls that were already well matched for age and sex; I thought they claimed that. So I'd be surprised if controlling for those things made much difference. But I'd have to go look at the table with demographics to be sure.

I thought that most of the things that might make a difference, BMI, MET, "depression" and "anxiety" actually were probably somewhat well aligned with the physical effects of having ME/CFS. So, controlling for them should reduce difference between ME/CFS and the controls.
 
Yes, but my concern was that I thought the study had controls that were already well matched for age and sex; I thought they claimed that. So I'd be surprised if controlling for those things made much difference. But I'd have to go look at the table with demographics to be sure.
Hmm. I'd need to think about it more, but I think if they were well-matched for age, that would mean that without controlling for age, we can expect the predicted coefficient to be accurate (or at least not biased by age).

But the variance due to differing age among the cohort (e.g. if some individuals were 20 and some were 70) still adds noise to the model, making it more likely to not reach significance, while controlling for that variance can make it more significant.
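A minimal sketch of that point (toy numbers of my choosing: pairs perfectly matched on age, an outcome dominated by age, and an assumed ME/CFS effect of 0.6):

```r
set.seed(7)
hits <- t(replicate(1000, {
  pair_age <- rnorm(25, mean = 45, sd = 15)
  age   <- rep(pair_age, 2)        # both groups get identical ages (matched)
  mecfs <- rep(c(0, 1), each = 25)
  niiXX <- age + 0.6 * mecfs + rnorm(50, mean = 0, sd = 1)
  c(uni = summary(lm(niiXX ~ mecfs))$coefficients["mecfs", "Pr(>|t|)"] < 0.05,
    adj = summary(lm(niiXX ~ mecfs + age))$coefficients["mecfs", "Pr(>|t|)"] < 0.05)
}))
colMeans(hits)  # univariate rarely significant; age-adjusted much more often
```

So even with perfect matching, the between-subject age variance hides the group difference until age is in the model.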
 
I had to multiply by a much smaller number than 0.5 to decrease the correlation meaningfully. But here is the result if multiplying age by 0.05:
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create outcome that is more associated with age than ME/CFS
  niiXX <- 0.05 * age + mecfs_status + rnorm(50, mean = 0, sd = 1)
 
  cor(mecfs_status, age)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX)
 
  pvals <- c(pval1, pval2, pval3, cor)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

results$age_cor_outcome %>% mean()
[Screenshot: results table]

The correlation is now 0.65, and still about a quarter of the time, it's only significant after adding covariates.

But okay, 0.65 is still a high correlation to expect. The thing is, though, if you include multiple covariates, each with a low correlation, controlling for all of them together can still reduce enough noise for mecfs_status to become significant, where without adjustment there is too much noise for significance.

Here is similar code, but with four variables instead of just age. All with relatively low correlations (~0.25 - 0.4).
Code:
library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 7, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome",
                       "bmi_corr_outcome",
                       "met_corr_outcome",
                       "hads_corr_outcome")

for (i in 1:total_trials) {
 
  age <- rnorm(50, mean = 45, sd = 20)
  bmi <- rnorm(50, mean = 10, sd = 3)
  met <- rnorm(50, mean = 10, sd = 3)
  hads <- rnorm(50, mean = 10, sd = 3)
 
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  # Create outcome dependent on mecfs_status and four other variables
  niiXX <- (0.02 * age) + (0.2 * bmi) + (0.2 * met) + (0.2 * hads) +
    mecfs_status +
    rnorm(50, mean = 0, sd = 1)
 
  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05
 
  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
  # Check association of ME/CFS with all four covariates
  model3 <- lm(niiXX ~ mecfs_status + age + bmi + met + hads)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05
 
  # Check correlations of the covariates with the outcome variable
  cor_age <- cor(age, niiXX)
  cor_bmi <- cor(bmi, niiXX)
  cor_met <- cor(met, niiXX)
  cor_hads <- cor(hads, niiXX)
 
 
  pvals <- c(pval1, pval2, pval3, cor_age, cor_bmi, cor_met, cor_hads)
 
  results[i, ] <- pvals
}

results <- as.data.frame(results)

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

colMeans(results[, 4:7])

[Screenshot: results table]

Around a third of the time, we see that 0,1 scenario. And the study had even more covariates than this, so the correlations could have been even lower and we'd still expect to see this happen a large portion of the time.
All your changes still result in the largest frequency being 1,1, though. To be reasonably likely to get the results from this study, where only one association was significant in the univariate tests but nearly all were significant in the multivariate tests, you'd need 0,1 to generally dominate.
 