Evidence of White Matter Neuroinflammation in [ME/CFS]: A Diffusion-Based Neuroinflammation Imaging Study (2026, Yu et al.)

Here is some R code simulating this with random data where NII-RF is correlated to age and to ME/CFS status:
Code:
total_trials <- 1000
more_significant_count <- 0

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)
 
  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)
 
  model1 <- lm(niirf ~ 1) # Intercept only
  model2 <- lm(niirf ~ mecfs_status)
 
  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]
 
  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)
 
  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]
 
  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }

}

print(paste0("In ", more_significant_count, " out of ", total_trials, " total trials (", 100*more_significant_count/total_trials, "%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."))

anova_1_2
anova_3_4

To get an idea of what the p-values are, I looked at the ANOVA results for the last trial:
Code:
> anova_1_2
Analysis of Variance Table

Model 1: niirf ~ 1
Model 2: niirf ~ mecfs_status
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     49 3947.8                
2     48 3906.9  1    40.879 0.5022 0.4819

> anova_3_4
Analysis of Variance Table

Model 1: niirf ~ age
Model 2: niirf ~ mecfs_status + age
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     48 52.230                  
2     47 47.129  1    5.1011 5.0872 0.0288 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
When comparing the models without age, the p-value was 0.48. With age, it was 0.03.
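For intuition about why the p-value drops so much once age is in the model: age accounts for almost all of the variance in the simulated niirf, so conditioning on it shrinks the residual error against which the mecfs_status effect is judged. A minimal sketch with freshly simulated data (same generating model as above; the seed is arbitrary):

```r
set.seed(7)  # arbitrary seed for reproducibility
age <- rnorm(50, mean = 45, sd = 10)
mecfs_status <- rbinom(50, 1, prob = 0.5)
niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

# Residual standard error with and without the age covariate
summary(lm(niirf ~ mecfs_status))$sigma        # close to 10: unmodelled age variation dominates
summary(lm(niirf ~ mecfs_status + age))$sigma  # close to 1: only the true noise remains
```

A group effect of size 1 is easy to detect against a residual SD of about 1, and nearly invisible against a residual SD of about 10.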
Your intercept only comparison is not correct:
total_trials <- 1000
more_significant_count <- 0

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  model1 <- lm(niirf ~ 0) # Intercept only
  model2 <- lm(niirf ~ 0 + mecfs_status)

  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]

  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)

  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]

  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }

}

print(paste0("In ", more_significant_count, " out of ", total_trials, " total trials (", 100*more_significant_count/total_trials, "%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."))

anova_1_2
anova_3_4
Result:
"In 13 out of 1000 total trials (1.3%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."

or, allowing intercept fitting (the better comparison):
for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 10)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  niirf <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  model1 <- lm(niirf ~ 0) # Intercept only
  model2 <- lm(niirf ~ mecfs_status)

  anova_1_2 <- anova(model1, model2)
  anova_1_2_pval <- anova_1_2[['Pr(>F)']][[2]]

  model3 <- lm(niirf ~ age)
  model4 <- lm(niirf ~ mecfs_status + age)

  anova_3_4 <- anova(model3, model4)
  anova_3_4_pval <- anova_3_4[['Pr(>F)']][[2]]

  if (anova_1_2_pval > anova_3_4_pval) {
    more_significant_count <- more_significant_count + 1
  }

}
Result:
"In 0 out of 1000 total trials (0%), the p-value when adding ME/CFS status to an intercept-only model was higher (less significant) than the p-value when adding ME/CFS status to a model with an age covariate."

[note: sorry for all the editing, I was writing quickly. I don't think the intercept is the major problem here; I figured it would be faster to write code that tests the issue I brought up directly (see post #86)]
 
So the upshot is, I believe NII-RF and NII-HR are defined just using the parts of the signal they think are isotropic, according to their model.

That sounds right, although of course in neuroinflammation I suspect there will be a degree of anisotropy of water diffusion in white matter, even for extracellular free water, because of the linear architecture.

The separation of these various different parameters seems very clever. What I am a bit dubious about is the interpretation. Especially when looking at the colour pictures, as SNTG says.
 
Your intercept only comparison is not correct:

You replaced model1 <- lm(niirf ~ 1) with model1 <- lm(niirf ~ 0). 1 is the formula for an intercept. 0 means no intercept.

From the R docs:
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x.

So in the first updated code you posted, in the comparison of model1 and model2, it's comparing a model that just predicts 0 for every point with a model that predicts based on mecfs_status, but is forced through the point (0,0).

In the second code, it's comparing a model which predicts 0 for every point to a model that includes mecfs_status and an intercept.
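To make that concrete, a tiny sketch with made-up data showing what each formula actually fits:

```r
set.seed(1)  # arbitrary
y <- rnorm(20, mean = 5)

m_mean <- lm(y ~ 1)  # one parameter: the intercept, i.e. mean(y)
m_zero <- lm(y ~ 0)  # zero parameters: every prediction is 0

coef(m_mean)          # single coefficient equal to mean(y)
length(coef(m_zero))  # 0
fitted(m_zero)        # all zeros
```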
 
You replaced model1 <- lm(niirf ~ 1) with model1 <- lm(niirf ~ 0). 1 is the formula for an intercept. 0 means no intercept.

From the R docs:
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x.

So in the first updated code you posted, in the comparison of model1 and model2, it's comparing a model that just predicts 0 for every point with a model that predicts based on mecfs_status, but is forced through the point (0,0).

In the second code, it's comparing a model which predicts 0 for every point to a model that includes mecfs_status and an intercept.
Sorry, short on time today, otherwise I would write more. [edit: the important thing] to check is that the case I'm talking about is one where the original NII-XX ~ ME/CFS association wasn't significant initially. Even if you take mecfs_status out of the outcome variable, you might still generate a spurious association randomly. The issue I've been talking about is the measurements besides NII-RF that dropped out in the univariate analysis.
 
Okay here is code testing this specific scenario:
I'm not sure about that.

Say you have a feature that increases a lot by age but is reliably lower in ME/CFS at each age. If you had a badly matched control group, a lot younger, then there might not be a significant difference between the ME/CFS group and the control group for that feature. So, there would be no obvious association to start with. But then, if you did control for age, the difference would appear.

library(dplyr)
library(car)

total_trials <- 5000
null_univariate <- 0
null_univariate_sig_multivariate <- 0

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  # Create variable that is associated with age
  niiXX <- age + rnorm(50, mean = 0, sd = 1)

  # Check that mecfs_status is NOT significantly associated with outcome on its own
  model1 <- lm(niiXX ~ mecfs_status)
  pval <- car::Anova(model1)$`Pr(>F)`[1]

  # If univariate association is not significant
  if (pval > 0.05) {
    null_univariate <- null_univariate + 1

    # Determine if mecfs_status is significant in a model with age as a covariate
    model2 <- lm(niiXX ~ mecfs_status + age)
    pval_multi <- car::Anova(model2)$`Pr(>F)`[1]

    if (pval_multi < 0.05) {
      null_univariate_sig_multivariate <- null_univariate_sig_multivariate + 1
    }
  }
}

print(paste0("In ", null_univariate_sig_multivariate, " out of ", null_univariate, " total trials (", signif(100*null_univariate_sig_multivariate/null_univariate, 2), "%), the ME/CFS variable was significant with the covariate of age but not on its own."))

[1] "In 243 out of 4721 total trials (5.1%), the ME/CFS variable was significant with the covariate of age but not on its own."
I randomly generated values for an outcome variable (niiXX) that was deliberately coded to correlate with age. I then checked the association between niiXX and mecfs_status. If that univariate association was not significant (p > 0.05), I tested a second model including mecfs_status and age.

The situation I flagged, where a variable was not significant in a univariate association but is significant with age as a covariate, only occurred 5% of the time, so exactly what would be expected by chance. Meaning that a confounder associated with the outcome variable does not induce significance in the association between ME/CFS and the outcome variable when included as a covariate. Meanwhile, in the paper, 6/7 measured NII variables showed this trend.

@forestglip let me know if this makes sense
 
I randomly generated values for an outcome variable (niiXX) that was deliberately coded to correlate with age. I then checked the association between niiXX and mecfs_status. If that univariate association was not significant (p > 0.05), I tested a second model including mecfs_status and age.

The situation I flagged, where a variable was not significant in a univariate association but is significant with covariates, only occurred 5% of the time, so exactly what was expected by chance.
This is because in the model you made, there is no relationship between ME/CFS and niiXX, so it is all due to chance. niiXX is just a function of age, so only about 5% of tests of the association with mecfs_status should come up significant, as false positives.

If niiXX actually depends on mecfs_status, then adding the covariate can make it more significant, and can take p from greater than 0.05 to less than 0.05. I added mecfs_status to the initial niiXX model in your code:
niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

Resulting in:
In 4345 out of 4731 total trials (92%), the ME/CFS variable was significant with the covariate of age but not on its own.
 
This is because in the model you made, there is no relationship between ME/CFS and niiXX, so it is all due to chance. niiXX is just a function of age, so only about 5% of tests of the association with mecfs_status should come up significant, as false positives.

If niiXX actually depends on mecfs_status, then adding the covariate can make it more significant, and can take p from greater than 0.05 to less than 0.05. I added mecfs_status to the initial niiXX model in your code:
But that's the exact scenario I'm flagging. That's what they said in the paper: there was no association between all the other NII-whatever variables (besides NII-RF) when tested individually, but they did become significant when a covariate was added. So what you're simulating here is not the issue I'm bringing up.

I'm saying that this is likely human error because you can't force an initially unassociated variable to become significant with a covariate (more than by chance) even if the covariates are significantly associated with outcome.
 
There was no association between all the other NII-whatever variables when tested individually, but they did become significant when a confounder was added. I'm saying that this is likely human error because you can't force that to happen even if the covariates are significantly associated with outcome.
We just did force it to happen. In the updated code I posted, in 92% of the trials where niiXX was not significantly associated with ME/CFS status (analogous to them showing no association for the univariate test in the paper), adding a covariate made it a significant association (significant with covariates in the paper).
 
We just did force it to happen. In the updated code I posted, in 92% of the trials where niiXX was not significantly associated with ME/CFS status (analogous to them showing no association for the univariate test in the paper), adding a covariate made it a significant association (significant with covariates in the paper).
when you coded the outcome variable to be significantly associated with mecfs_status. Which is not what was in the paper.
 
when you coded the outcome variable to be significantly associated with mecfs_status. Which is not what was in the paper.
What was in the paper? They were using real-life data, where we go into it not knowing if the outcome variable is actually dependent on mecfs_status or not. I showed that in the case where it is dependent on mecfs_status, the scenario we're discussing is possible.
 
Imagine a feature that increases by 1 every decade of life so 10 year olds have, on average, a level of 1, 20 year olds have a level of 2, 30 year olds have a level of 3 and 40 year olds have a level of 4 and 50 year olds have a level of 5.

But for ME/CFS people, although the value changes with age, it is always 1 less than the normal value.

If you have badly matched controls, you might have five 20 year olds and five 30 year olds and so the mean value is 2.5. And in your ME/CFS sample you might have five 30 year olds and five 40 year olds, so the mean value is also 2.5. It looks as though there is no difference.

But, if you adjust for age, with good information about population norms or a big enough sample, you would find that ME/CFS in fact lowers the level of the value by one.

It's possible. (sorry, just adjusted my maths)
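The scenario above can be written out directly. A deterministic sketch using the hypothetical numbers from the post (five controls aged 20 and five aged 30; five patients aged 30 and five aged 40; feature = age/10, minus 1 for ME/CFS):

```r
age   <- c(rep(20, 5), rep(30, 5),  # controls
           rep(30, 5), rep(40, 5))  # ME/CFS
mecfs <- rep(c(0, 1), each = 10)
value <- age / 10 - mecfs

tapply(value, mecfs, mean)     # both group means are 2.5: no raw difference
coef(lm(value ~ mecfs + age))  # the age-adjusted model recovers the -1 ME/CFS effect
```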

I agree though that the controlling for confounders that went on in the paper is something that we need answers about. I can't see why you would control for age and sex if you have matched controls. And there is no plausible reason to control for anxiety and depression, especially when using the HADS instrument that we know is bad at assessing anxiety and depression in ME/CFS.

As I said up thread, I'm surprised that controlling for anxiety and depression (which, with that measurement tool should correlate with physical function in the ME/CFS people) didn't reduce the difference between controls and ME/CFS people (because you would be making the ME/CFS people more like the physically able healthy ones). Instead it is reported to have magnified differences. That makes no sense to me, and so it makes me wonder what went on in the adjustments.
 
As I said up thread, I'm surprised that controlling for anxiety and depression (which, with that measurement tool should correlate with physical function in the ME/CFS people) didn't reduce the difference between controls and ME/CFS people (because you would be making the ME/CFS people more like the physically able healthy ones). Instead it is reported to have magnified differences.
Do we actually know if it increased differences? I don't see effect sizes or coefficients in the text. The predicted effect of ME/CFS status on the brain metric can decrease (which I agree, I think I would expect that with controlling for HADS), while the p-value still becomes even more significant (for example if age is very highly correlated to the brain metric relative to other variables, and so controlling for it removes a lot of noise, increasing significance.)
 
I showed that in the case where it is dependent on mecfs_status, the scenario we're discussing is possible.
I think I understand what's happening here. The simulation itself is breaking something. How do you get a variable that has a >0.9 correlation with the outcome variable at all times, but is significantly associated with the outcome variable only 5% of the time?

library(dplyr)
library(car)
library(ggplot2)

total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  # Create outcome variable that is more associated with age than ME/CFS
  niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05

  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX) > 0.9

  pvals <- c(pval1, pval2, pval3, cor)

  results[i, ] <- pvals
}

results %>%
  as.data.frame() %>%
  group_by(age_associated, age_cor_outcome) %>%
  count()

[Screenshot: count() output grouped by age_associated and age_cor_outcome]
age_associated is p-value of age in niiXX ~ age (0 is p>0.05, 1 is p<0.05)
age_cor_outcome is whether the correlation coefficient between age and niiXX is > 0.9 (ended up being true in all cases)
 
I think I understand what's happening here. The simulation itself is breaking something. How do you get a variable that has a 99% correlation with the outcome variable at all times but is significantly associated with the outcome variable only 5% of the time?
I haven't fully parsed this code yet, but I think there's an error here, calling model2 instead of model1:
# Double check association with age
model1 <- lm(niiXX ~ age)
pval1 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05
 
Either way, if the p-values are spurious, it means that only 1 out of 7 neuroinflammation measures was actually significant.
 
total_trials <- 1000
results <- matrix(ncol = 4, nrow = total_trials)
colnames(results) <- c("age_associated",
                       "mecfs_associated",
                       "mecfs_associated_with_covariate",
                       "age_cor_outcome")

for (i in 1:total_trials) {

  age <- rnorm(50, mean = 45, sd = 20)
  mecfs_status <- rbinom(50, 1, prob = 0.5)

  # Create outcome variable that is more associated with age than ME/CFS
  niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1)

  cor(mecfs_status, age)

  # Double check association with age
  model1 <- lm(niiXX ~ age)
  pval1 <- car::Anova(model1)$`Pr(>F)`[1] < 0.05

  # Check association with ME/CFS
  model2 <- lm(niiXX ~ mecfs_status)
  pval2 <- car::Anova(model2)$`Pr(>F)`[1] < 0.05

  # Check association of ME/CFS with age as a covariate
  model3 <- lm(niiXX ~ mecfs_status + age)
  pval3 <- car::Anova(model3)$`Pr(>F)`[1] < 0.05

  # Check correlation of age with outcome variable
  cor <- cor(age, niiXX)

  pvals <- c(pval1, pval2, pval3, cor)

  results[i, ] <- pvals
}

results <- as.data.frame(results)  # base assignment; %<>% would need magrittr loaded

results %>%
  group_by(mecfs_associated,
           mecfs_associated_with_covariate) %>%
  count()

results$age_cor_outcome %>% mean()

Okay, it was bugging me, so I did one more quick thing, which seems to confirm that the results @forestglip got depend on a near-perfect correlation between one of the covariates and the outcome variable.
Using forestglip's original code for deriving the outcome variable, the average correlation coefficient between age and niiXX across 1000 simulations is 0.998, with 88% of simulations landing in the scenario of "non-significant univariate association and significant multivariate association" (0 in the first column, 1 in the second column).
[Screenshot: count() output grouped by mecfs_associated and mecfs_associated_with_covariate]

When I change niiXX <- age + mecfs_status + rnorm(50, mean = 0, sd = 1) to sd = 10, the average correlation drops to 0.89. Now, that scenario happens only 6% of the time. And 0.89 is still an extraordinarily strong correlation.

[Screenshot: same count() output with sd = 10]

[Scatter plot: example data with a 0.87 correlation coefficient]

^ An example with a 0.87 correlation coefficient, for reference. That's better than most data I ever look at.

Even just going down from an average correlation of 0.998 to 0.994 (sd = 2) brings the incidence rate down from 88% to 37%.
[Screenshot: same count() output with sd = 2]

The variables they controlled for were sex, age, BMI, MET (task speed), depression and anxiety scores. I'd be shocked if any of these had a >0.99 correlation with any biological variable, but who's to say. I'd bet on human error, though.

Pretty sure I don't have a silly error in my code this time but let me know if you spot something
 
I may be missing some of the detail here but both of what you two said on pages 3 and 4 makes sense to me. @jnmaciuch and @forestglip. (And Hutan)

From what I'm seeing online, the scenario @forestglip is describing is sometimes called a suppressor variable, as in the third variable is secretly suppressing the relationship between your first two until you account for it (see, for example, this paper describing it, which I have only skimmed). It seems mathematically possible and maybe relevant in some scenarios.
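A minimal sketch of suppression with made-up data: z is unrelated to group status but tracks the noise in y, so adjusting for it can unmask a real but buried group effect (all numbers here are arbitrary):

```r
set.seed(3)  # arbitrary
n <- 40
group <- rep(0:1, each = n / 2)
noise <- rnorm(n, sd = 3)
y <- 0.8 * group + noise         # small true effect, buried in noise
z <- noise + rnorm(n, sd = 0.5)  # noisy proxy for the noise; independent of group

p_without <- summary(lm(y ~ group))$coefficients["group", "Pr(>|t|)"]
p_with    <- summary(lm(y ~ group + z))$coefficients["group", "Pr(>|t|)"]
c(p_without, p_with)  # adjusting for the suppressor gives a much smaller p
```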

But jnmaciuch just seems to be saying that even if this scenario is possible, it's not standard practice to check for it:
You’re supposed to start with the univariate analysis, seeing if ME/CFS on its own is associated. If it is, then you control for confounders.
I took this to mean normally you only add additional variables after you've confirmed the data shows a statistical relationship between the two variables of interest without the additional variables. That seems like a reasonable practice to me, because there's a lot of opportunity for p-hacking with this. While you could reveal an otherwise hidden relationship by adding a variable, some of the time that apparent relationship is just going to be a fluke, and if you test lots and lots of possible additional variables, eventually you'll get lucky.
 