Evidence of White Matter Neuroinflammation in [ME/CFS]: A Diffusion-Based Neuroinflammation Imaging Study 2026 Yu et al

Yes. And if the recent literature on CSF flux that I have been reading is anything to go by it is quite plausible that the whole idea of these fractions and compartments is bad physical chemistry. The data will mean something but maybe not what these names imply. Mathematical modellers, in my experience, only too often fail to understand the dynamic geometry at fine grain.
 
Yes. And if the recent literature on CSF flux that I have been reading is anything to go by it is quite plausible that the whole idea of these fractions and compartments is bad physical chemistry. The data will mean something but maybe not what these names imply. Mathematical modellers, in my experience, only too often fail to understand the dynamic geometry at fine grain.
And it seems like it would take a massive amount of groundwork to be able to interpret correctly, if the findings aren't a proxy for neuroinflammation. Which seems more and more likely. If there is something here, my sense is that we'll make sense of it by working backwards once the mechanism of ME/CFS is already known, rather than being led to the mechanism by these findings.
 
I'm not that good at math, but I think the very free water (with diffusion > 2.5 μm²/ms) doesn't even enter their equation.
[Screenshot of the paper's signal equation]

The second part represents the isotropic components, an integral from a to b where "a and b are the low and high diffusivity limits for the isotropic diffusion spectrum f(D)." Given the further explanation in the text, one is tempted to assume that a = 0.3 and b = 2.5 μm²/ms. That would explain why the free fraction is never mentioned in the paper.
Technically I'm supposed to be good at math, they gave me a PhD. This is the first time in years I've been unfoggy enough to think about it though, so apologies for rambling :)

Anyway, I think the bounds on this integral have to be big enough to cover any water diffusion that could contribute to the signal the MRI machine sees for the math to work out**. So, I'd assume b goes up to at least 3 μm²/ms, unless they are confident a priori there's no free diffusion, I guess. But either way, something at least equivalent to what you're saying is true, because after they use this equation to figure out what the function f is, there is an additional step (which is kind of glossed over in the paper) where they use that function f to actually compute NII-RF (and the other values). And in that additional step they are ignoring the free fraction.

**Rough explanation of how I think the math works:
Suppose we knew there was no anisotropic diffusion, so the sum on the left is zero and we can ignore it. Then we can think of the integral on the right as (loosely speaking) adding up the amount of signal coming from each rate of diffusion D, and the sum of all that should equal the signal the MRI machine sees, S_k. For example, f(2.5) sort of represents how much water diffusing at D=2.5 μm²/ms is contributing to the signal. I'm also going to think of the integral as a sum so we can reason about it. Then that equation in the screenshot (really, the system of equations they are considering for k=1,...,K) becomes a system of equations of the form
MRI signal S_k = sum of f(D) × exp(−b_k·D), added up over all the possible diffusion rates D.
(where the measured signal S_k and the diffusion weighting b_k in the exponential differ for each k). Then they are doing some kind of Fourier transform to figure out what f is from that system of equations. If the sum (really the integral) excluded some diffusion rate D that actually did contribute to the MRI signal, the f they get as a result would be artificially large in other places to compensate.
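If it helps, here's a toy version of that compensation effect in Python. Everything is invented (the b-values, the two-component spectrum, and scipy's non-negative least squares standing in for whatever inversion method the paper actually uses), but it shows how excluding the fast free-water rates from the allowed range distorts the recovered f:

```python
import numpy as np
from scipy.optimize import nnls

# Toy isotropic-only version of the equation: S_k = sum_D f(D) * exp(-b_k * D).
# All numbers are invented; the paper's b-values and inversion differ.
b = np.linspace(0.05, 3.0, 30)                 # diffusion weightings b_k
D_true = np.array([0.2, 3.0])                  # um^2/ms: restricted + free water
f_true = np.array([0.7, 0.3])
S = np.exp(-np.outer(b, D_true)) @ f_true      # noiseless simulated signal

def fit_spectrum(D_grid):
    """Solve S ~ A f with f >= 0, where A's columns are exp(-b * D)."""
    A = np.exp(-np.outer(b, D_grid))
    f, _ = nnls(A, S)
    return f

D_full = np.linspace(0.05, 3.5, 40)            # grid that covers free diffusion
D_trunc = D_full[D_full <= 2.5]                # grid that excludes it

f_full = fit_spectrum(D_full)
f_trunc = fit_spectrum(D_trunc)

# The truncated fit cannot place weight at D = 3.0, so it piles extra weight
# onto the fastest rates it is allowed, distorting the recovered spectrum.
print(f_full[D_full > 2.0].sum(), f_trunc[D_trunc > 2.0].sum())
```

The truncated grid's dictionary is a subset of the full grid's, so its fit can never be better, and the missing free-water signal gets misattributed to whatever rates remain.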
 
because after they use this equation to figure out what the function f is, there is an additional step (which is kind of glossed over in the paper) where they use that function f to actually compute NII-RF (and the other values). And in that additional step they are ignoring the free fraction.
Thanks for the explanation, that's really helpful. It's been quite a long time since I've done diff eq, I'm sure a lot has leaked out of my brain by now.

So is the idea that once they get f from fourier transformation, they're computing values for everything in the 0 < D < 0.3 range and calling that NII-RF? Or rather, if it's a fraction, they're computing it for every "bucket" of diffusion coefficients within a voxel and then expressing each bucket as a fraction of the whole? I don't know how the latter would work if they're not taking the free fraction into account, but I'm also likely to be misunderstanding something here.
 
Yes, but my concern was that I thought the study had controls that were already well matched for age and sex; I think they claimed that. So, I'd be surprised if controlling for those things made much difference. But I'd have to go look at the table with demographics to be sure.
Hmm. I'd need to think about it more, but I think if they were well-matched for age, that would mean that without controlling for age, we can expect the predicted coefficient to be accurate (or at least not biased by age).

But the variance due to differing age among the cohort (e.g. if some individuals were 20 and some were 70) still adds noise to the model, making it more likely to not reach significance, while controlling for that variance can make it more significant.

Ok, I did a bit of visualization to confirm and demonstrate why even if the groups are matched by age, controlling for age can still make the effect of ME/CFS status more significant.

We can imagine that we have 20 controls and 20 cases, where each control has a matching case in terms of age. Let's say that we are looking at the effect of ME/CFS status on a "Brain Metric", and let's say the actual effect of having ME/CFS is to increase the brain metric by 1 unit. And for every 5 years of age, the brain metric increases by 1 unit.

I also added a bit of randomness.

We plot and do linear regression on our matched cohort, just based on ME/CFS status:

[Plot: Regression with only ME/CFS Status (Matched for age)]

This is essentially a t-test with a lot of variance. So even though the predicted slope (1.0019) is fairly close to the true effect of ME/CFS (1.0), the result is not very significant because the model is noisy (p=0.293). We can't be very sure about the true slope from the model because of how much variance there is in the brain metric in each group.

Now we add age as a covariate in the regression:

[Plot: Regression with ME/CFS Status and Age (Matched for age)]

The predicted slope is again accurate (1.000), but now that we control for age and have a model with very little variance unexplained by the included variables, the result is very significant (p<0.001). In the plot, one can see that the brain metric increases by 1 when ME/CFS status increases by 1, and there's very little noise in the data for the slope to change much.

What happens if we don't match for age? Here I made the ME/CFS group about 10 years older than the control group on average, meaning that the brain metric for the ME/CFS group will be about 1 unit higher from having ME/CFS, plus about 2 units higher from being older (1 unit per 5 years).

[Plot: Regression with only ME/CFS Status (Unmatched ages)]

With simple linear regression, now the predicted slope is 3.0072, close to the expected 3. The slope is biased by age, so we can't be sure what part of the effect is caused by age and what part by ME/CFS status.

And with a regression that controls for age in the unmatched cohort:

[Plot: Regression with ME/CFS Status and Age (Unmatched ages)]

Now the effect of ME/CFS status on the brain metric is again fairly accurate (0.9799). So controlling for age takes care of the age bias without having to match for age. There may still be reasons to match for age, even when also controlling for age in the regression, but I'm not sure exactly what they are.

Python:
#!/usr/bin/env python
# coding: utf-8

# In[118]:


import numpy as np
import pandas as pd
import statsmodels.api as sm
import plotly.express as px
import plotly.graph_objects as go


# In[180]:


base_age = np.random.normal(loc = 50, scale = 15, size = 20)
age = np.concatenate([base_age, base_age]) + np.random.normal(0, 0.5, 40)

mecfs_status = np.concatenate([np.full(20, 0), np.full(20, 1)])

brain_metric = (0.2 * age) + mecfs_status + np.random.normal(loc=0, scale=0.1, size=40)

df = pd.DataFrame({
    "Age": age,
    "ME/CFS Status": mecfs_status,
    "Brain Metric": brain_metric,
})
df.head()


# In[181]:


X = sm.add_constant(df['ME/CFS Status'])
y = df['Brain Metric']

model = sm.OLS(y, X)
results = model.fit()

fig = px.scatter(df, x='ME/CFS Status', y='Brain Metric', color="Age", opacity=0.65)
fig.update_traces(marker=dict(size=10))

fig.add_traces(go.Scatter(x=X['ME/CFS Status'], y=results.predict(X), showlegend=False))

fig.update_layout(
    width=500,
    height=500,
    title={
        'text': "Regression with only ME/CFS Status (Matched for age)",
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)
fig.show()

print(results.summary())


# In[182]:


X = sm.add_constant(df[['ME/CFS Status', 'Age']])
y = df['Brain Metric']

results = sm.OLS(y, X).fit()

xrange = [X['ME/CFS Status'].min(), X['ME/CFS Status'].max()]
yrange = [X['Age'].min(), X['Age'].max()]

xx, yy = np.meshgrid(xrange, yrange)

pred = results.predict(sm.add_constant(np.c_[xx.ravel(), yy.ravel()]))
pred = pred.reshape(xx.shape)

# Generate the plot
fig = px.scatter_3d(df, x='ME/CFS Status', y='Age', z='Brain Metric', color="Age")
fig.update_traces(marker=dict(size=3))
fig.add_traces(go.Surface(
    x=xrange,
    y=yrange,
    z=pred,
    opacity=0.5,
    colorscale=[[0, 'lightblue'], [1, 'lightblue']],
    showscale=False,
))
fig.update_layout(
    width=800,
    height=600,
    title={
        'text': "Regression with ME/CFS Status and Age (Matched for age)",
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    })
fig.show()

print(results.summary())


# In[184]:


base_age = np.random.normal(loc=50, scale=15, size=20)
age = np.concatenate([base_age, base_age + 10]) + np.random.normal(0, 0.5, 40)

mecfs_status = np.concatenate([np.full(20, 0), np.full(20, 1)])
brain_metric = (0.2 * age) + mecfs_status + np.random.normal(loc=0, scale=0.1, size=40)

df = pd.DataFrame({
    "Age": age,
    "ME/CFS Status": mecfs_status,
    "Brain Metric": brain_metric,
})
df.head()


# In[185]:


X = sm.add_constant(df['ME/CFS Status'])
y = df['Brain Metric']

model = sm.OLS(y, X)
results = model.fit()

fig = px.scatter(df, x='ME/CFS Status', y='Brain Metric', color="Age", opacity=0.65)
fig.update_traces(marker=dict(size=10))

fig.add_traces(go.Scatter(x=X['ME/CFS Status'], y=results.predict(X), showlegend=False))

fig.update_layout(
    width=500,
    height=500,
    title={
        'text': "Regression with only ME/CFS Status (Unmatched ages)",
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)
fig.show()

print(results.summary())


# In[186]:


X = sm.add_constant(df[['ME/CFS Status', 'Age']])
y = df['Brain Metric']

results = sm.OLS(y, X).fit()

xrange = [X['ME/CFS Status'].min(), X['ME/CFS Status'].max()]
yrange = [X['Age'].min(), X['Age'].max()]

xx, yy = np.meshgrid(xrange, yrange)

pred = results.predict(sm.add_constant(np.c_[xx.ravel(), yy.ravel()]))
pred = pred.reshape(xx.shape)

# Generate the plot
fig = px.scatter_3d(df, x='ME/CFS Status', y='Age', z='Brain Metric', color="Age")
fig.update_traces(marker=dict(size=3))
fig.add_traces(go.Surface(
    x=xrange,
    y=yrange,
    z=pred,
    opacity=0.5,
    colorscale=[[0, 'lightblue'], [1, 'lightblue']],
    showscale=False,
))
fig.update_layout(
    width=600,
    height=600,
    title={
        'text': "Regression with ME/CFS Status and Age (Unmatched ages)",
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    })
fig.show()

print(results.summary())
 
Lies, damned lies and statistics. Very nice FG.

I see that the samples in this study have quite a spread in age (19 to 65 for the controls; 24 to 65 for the ME/CFS group).

I guess the issue with controlling for age (or anything) with a statistical model is that usually we don't know exactly what the effect of the variable truly is. It's fine when we know for sure that age increases the variable by 1 each decade. But, when we are using our sample to calculate the effect that we will control for, it all becomes very circular, and possibly adjustments are spurious.

It makes me think that
1. ideally we want to control as much as possible in the sample selection, so, in this case, maybe try to get all females in the age range 35 to 45 - or get a big enough sample that some stratification can be done to see if the effect shows up in different subsets without adjustment of the data

2. be even more suspicious of studies that make extensive use of models that control for 'confounders' than I was already

Is that reasonable?
 
So is the idea that once they get f from fourier transformation, they're computing values for everything in the 0 < D < 0.3 range and calling that NII-RF? Or rather, if it's a fraction, they're computing it for every "bucket" of diffusion coefficients within a voxel and then expressing each bucket as a fraction of the whole?
Yes, that is what I think is going on, though they don't go into detail so I'm not certain. I assume this is somewhere in earlier papers and I just haven't found it yet.

I don't know how the latter would work if they're not taking the free fraction into account, but I'm also likely to be misunderstanding something here.
Sorry, "ignoring" was the wrong word. I just meant that for some reason they aren't calculating/reporting a value ("NII-Free") that corresponds to those larger values of D in the same way "NII-RF" corresponds to small D (other papers using the same methods/equation do report a free value). You're right (or at least, it would be my guess too) that they still need to take the free fraction into account, in the denominator, to report all these values as fractions of the total.
 
I guess the issue with controlling for age (or anything) with a statistical model is that usually we don't know exactly what the effect of the variable truly is. It's fine when we know for sure that age increases the variable by 1 each decade. But, when we are using our sample to calculate the effect that we will control for, it all becomes very circular, and possibly adjustments are spurious.
I don't think it's circular. When we give age to the model, it can predict how much the brain metric would change due to age if ME/CFS status was held steady. For example, you can look at the 2nd plot above (the 1st 3d plot) and imagine the model is predicting a line for age but only in controls (ME/CFS status = 0). It's not unreasonable to think it can fit a nice slope for age. It just does it for the whole sample at once to get one slope for everyone.

1. ideally we want to control as much as possible in the sample, so, in this case, maybe try to get all females in the age range 35 to 45 - or get a big enough sample that some stratification can be done to see if the effect shows up in different subsets
If the sample is restricted to a small age range, then it does decrease variance. And it also reduces the possibility of non-linear effects at extreme ages. So using a restricted age range increases the likelihood of detecting a true effect. But if a study fits a model where the age range is very wide, and still gets a significant result, I don't think it's inherently an issue.

2. be even more suspicious of studies that make extensive use of models that control for 'confounders' than I was already
I'm not sure why what I posted would make you more suspicious of controlling for confounders.

An issue is if a study keeps rerunning a model, controlling for more and more confounders each time until they get a significant result. This would be p-hacking. But if they had a pre-registered design for a model from the start and only reported that, I think it should normally be fine, statistically speaking. (Maybe the choice of confounders to control for may be up for debate, but for something like age, I don't see an issue here.)
 
I don't think it's circular. When we give age to the model, it can predict how much the brain metric would change due to age if ME/CFS status was held steady. For example, you can look at the 2nd plot above (the 1st 3d plot) and imagine the model is predicting a line for age but only in controls (ME/CFS status = 0). It's not unreasonable to think it can fit a nice slope for age. It just does it for the whole sample at once to get one slope for everyone.
What I mean by circular is that the effect of age is calculated from the sample. If by chance something weird is going on in our sample that skews the relationship between age and the signals for some individuals, that might result in us making an adjustment that doesn't reflect what is true in the wider population.

Garbage in, in terms of how age or sex or 'anxiety' affects the signals, results in garbage out.

If we take the depression and anxiety as calculated by the HADS survey, the controls will score relatively low except for maybe a few who actually do have depression and anxiety. The ME/CFS people will score relatively high, regardless of how anxious or depressed they are. If you then adjust the signals based on the depression and anxiety scores, it will be falsely skewing the data.
 
What I mean by circular is that the effect of age is calculated from the sample. If by chance something weird is going on in our sample that skews things, that might result in us making an adjustment that doesn't reflect what is true in the wider population.

Garbage in, in terms of how age or sex or 'anxiety' affects the signals, results in garbage out.
I'm not entirely sure what you mean. Yes, various things can also be affecting the brain metric apart from the variables in the model, leading to additional noise and confounding, but this is an issue whether age is added as another covariate with ME/CFS status or not. If we add age, we can at least then estimate what the slope would be if age were held constant. The resulting slope for ME/CFS status may still be confounded by other things, but at least we can be more assured that the slope isn't due to comparing 20 year olds to 40 year olds.

If we take the depression and anxiety as calculated by the HADS survey, the controls will score relatively low except for maybe a few who actually do have depression and anxiety. The ME/CFS people will score relatively high, regardless of how anxious or depressed they are. If you then adjust the signals based on the depression and anxiety scores, it will be falsely skewing the data.
I wouldn't describe it as "falsely skewing the data" to control for HADS. We can interpret the result as the effect of ME/CFS status on the brain metric if the HADS score is held constant. In other words, if we have a healthy person and a person with ME/CFS that have identical HADS, what would be the difference in the brain metric. I'd expect it to show a smaller effect of ME/CFS on the brain metric, since I think HADS would somewhat be a proxy for ME/CFS severity. So it may tend to somewhat obscure a true ME/CFS effect.
 
If we add age, we can at least then estimate what the slope would be if age were held constant.
I mean that the estimate of what the slope would be if age were held constant could be wrong, the assessed impact of age on the signals could be wrong, as in not reflective of a true population effect.
 
I mean that the estimate of what the slope would be if age were held constant could be wrong, the assessed impact of age on the signals could be wrong, as in not reflective of a true population effect.
Mm, I don't think the topic is intuitive enough for me to be able to respond in a way that actually helps, so I'll probably just leave it there. You may be describing a real issue, but I don't have the capacity to think about exactly what you're talking about right now.
 
The anxiety and depression adjustment is an example of issues with the correction of confounders with an extra twist.

In the healthy controls, the HADS is measuring anxiety and depression - probably not perfectly but maybe well enough, although I doubt there were people scoring high levels of anxiety and depression in the sample. So, you have a relationship between the HADS scores and the signals, but perhaps that relationship is pretty shaky at the higher levels of anxiety and depression due to lack of data points. Nevertheless, you have your straight line or whatever between the two variables and you extrapolate it out to the high scores.

In the ME/CFS group, the HADS is measuring something else - maybe something closer to physical function or specifically lack of physical function. And probably most will be scoring at least mid-level 'anxiety and depression' and some might have high levels. The data might be concentrated at a different end of the scale.

So, that extrapolated relationship between mental health in healthy people and the signals is applied to the relationship between ME/CFS physical health and the signals. And the relationship is extrapolated out beyond the range where the control data was solid.

That is getting pretty hairy.
 
I mean that the estimate of what the slope would be if age were held constant could be wrong, the assessed impact of age on the signals could be wrong, as in not reflective of a true population effect.
It wouldn’t be adding any problems that weren’t already there in your participant selection. What adding age as a covariate allows you to do is:
1) start with an association between ME/CFS and the outcome variable on its own
2) determine whether or not that already existing association can be better explained by a confounder in your data.

That's going to be the one and only use case of covariates in the vast majority of papers where they are employed.
 
I've received a very nice reply back from the author Qiang Yu. He confirms that there was an error with the RF and HR interpretations - they are both low in ME/CFS and so the interpretation of increased inflammatory cells that other authors have applied to high RF does not apply. The authors are talking to the journal to clarify things.

Qiang suggests that the finding might have something to do with having fewer microglia, or with them taking a different shape (an activated amoeboid shape rather than a branched resting shape). He also suggests that lower HR might be due to water moving out of extracellular spaces and into cells and axon fibres - he calls it cytotoxic swelling. I'm not quite understanding how either of those things would produce these signals, but it is quite possible that they do.

I'm impressed with Qiang Yu. We all make mistakes but we've definitely seen that not every researcher is open to acknowledging them and fixing them. And these are new techniques, I guess people are still getting their heads around how they work and what they mean.
 
Just wanted to quickly note that, while the percentage of the discussion here that I have actually been able to follow in any detail has been minuscule (though I am trying hard), I nevertheless hold this thread to be a perfect example of what makes S4ME so valuable. My thanks to all of you - and especially to @Hutan for getting in touch with the authors, what a fantastic result!
 
Thanks for reporting that Hutan, that's really great to hear!

It is interesting to ponder what low RF could mean in that case. Might be worth going back and reading some of the earlier papers where they established these techniques on known samples to see if there are any clues. (I'll take a peek if my energy holds up)

Of the suggestions he made, this one: "lower HR could be water moving from the extracellular space into axon fibers" at least aligns with the idea that axon fibers have more anisotropic diffusion and less isotropic diffusion. And I think JE was speculating earlier that there might be extracellular matrix stuff limiting the movement of water (correct me if I'm misremembering). If that were true, then indeed maybe some of the higher HR value in controls was water stuck in the extracellular matrix rather than cells. (?)
 