A crumb of a clue on epidemiology

As @Yann mentioned, the graphs sometimes seem to be driven to a large extent by things related to public awareness, which might work in all sorts of unusual ways.
I agree with you and Yann, and I think the most likely reason for the correlation is that, for some reason, there is more awareness of ME/CFS among people with British ancestry. But I think it's a fun idea that has a small chance of actually being about prevalence.
 
Maybe it's all the Guardian and George Monbiot's fault after all! I agree nothing is lost by playing around with this, as @Murph suggested.
 
The census data includes 108 ancestries. I ran the regression on all of them against the same Google Trends values. Here are the top 20 highest R^2 values:
1774240791839.png
This was showing lots of UK countries at the top, and I realized that it may be because people could report multiple ancestries.
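
For anyone who wants to try the same kind of ranking, here is a minimal sketch of the step described above: one simple regression per ancestry against the Trends values, sorted by R^2. It is not the code used for the plots. The data is synthetic, and the names `ancestry_shares`, `trends`, "Ancestry A" and "Ancestry B" are made up for illustration.

```python
import random


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column explains nothing
    return sxy ** 2 / (sxx * syy)


def rank_by_r_squared(ancestry_shares, trends, top=20):
    """Return the `top` (name, R^2) pairs, highest R^2 first."""
    scores = {name: r_squared(shares, trends)
              for name, shares in ancestry_shares.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Synthetic stand-in data for 50 hypothetical states (NOT real Census/Trends values).
rng = random.Random(0)
trends = [rng.uniform(20, 100) for _ in range(50)]
ancestry_shares = {
    # Built to track the trend values, plus noise.
    "Ancestry A": [t / 1000 + rng.gauss(0, 0.005) for t in trends],
    # Built to be unrelated to the trend values.
    "Ancestry B": [rng.uniform(0, 0.1) for _ in range(50)],
}

ranking = rank_by_r_squared(ancestry_shares, trends)
for name, r2 in ranking:
    print(f"{name}: R^2 = {r2:.2f}")
```

With real data, each value in `ancestry_shares` would be one ancestry column (as a share of state population) in the same state order as `trends`.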

From an overview of these datasets:
The ACS asks each respondent to write their ancestry or ethnic origin, and records up to two ancestries per person (the first two ancestries written by the respondent).
The table for People Reporting Single Ancestry (B04004) shows data for those who reported only one ancestry, while People Reporting Multiple Ancestry (B04005) shows data for those who reported more than one ancestry. People Reporting Ancestry (B04006) shows data for those who reported any ancestry, regardless of whether it was the only ancestry or part of multiple ancestries they reported.

Note: this means that values in B04005 and B04006 will not necessarily add up to match totals, because one person may be represented under more than one ancestry.

I (and I think Murph) used B04006, which counts up to two ancestries per person. So, for example, the correlations for Scottish and English might both be high because the same people reported both ancestries.

I tried again with B04004, which only includes people who reported one ancestry, to avoid double counting and allow better comparison between ancestries.
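
To make the double-counting point concrete, here is a toy sketch of the two counting rules. The respondents are invented, and the two functions only mimic how I understand B04006 (any reported ancestry) and B04004 (single ancestry only) to work. They are not the Census Bureau's actual tabulation code.

```python
from collections import Counter

# Hypothetical respondents: each tuple holds the (up to two) ancestries recorded.
respondents = [
    ("English",),
    ("English", "Scottish"),
    ("English", "Scottish"),
    ("Scottish",),
    ("Swedish",),
]


def count_any(people):
    """B04006-style: a person is counted under every ancestry they reported."""
    return Counter(ancestry for person in people for ancestry in person)


def count_single(people):
    """B04004-style: only people who reported exactly one ancestry are counted."""
    return Counter(person[0] for person in people if len(person) == 1)


any_counts = count_any(respondents)
single_counts = count_single(respondents)

print("any ancestry:   ", dict(any_counts))
print("single ancestry:", dict(single_counts))
print("people:", len(respondents), "| any-ancestry total:", sum(any_counts.values()))
```

The two people who wrote both English and Scottish push up both counts under the first rule, which is how two ancestries can end up correlated with the same thing. Under the second rule they drop out entirely, which is also why that dataset is smaller.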

In this case, the correlation with English is pretty much gone (the sample size is also much smaller for this dataset, so the ancestry values may be less precise):
1774308417525.png

There are still some large correlations in this analysis, and British is still near the top at #9, though less significant, with an R^2=0.26 and p=0.0018.
Screenshot from 2026-03-23 18-24-19.png

The top 5 correlations with Google Searches for ME/CFS are Northern European, European, Swedish, Icelander, and New Zealander.

Here for example is the plot of the top correlation, Northern European ancestry vs Google Trends for ME/CFS:
1774308463026.png

(Note that only 35 states were included as the rest had missing data.)

However, it looks like the correlation might be mainly driven by just the five states in the upper right. If I exclude those, the correlation is much less apparent:
1774308486210.png
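
A rough sketch of that kind of check, for anyone curious: fit once with all points, then drop the points with the largest x values and fit again. The data is synthetic (30 unrelated points plus 5 placed in the upper right, to mirror the 35 states), and `drop_largest_x` is just a helper name I made up.

```python
import random


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def drop_largest_x(x, y, k):
    """Return (x, y) without the k points that have the largest x values."""
    keep = sorted(range(len(x)), key=lambda i: x[i])[: len(x) - k]
    return [x[i] for i in keep], [y[i] for i in keep]


# Synthetic data, NOT the real values: 30 points with no built-in relationship,
# plus 5 points that are high on both axes.
rng = random.Random(1)
x = [rng.uniform(0.0, 1.0) for _ in range(30)] + [rng.uniform(3.0, 4.0) for _ in range(5)]
y = [rng.uniform(40, 60) for _ in range(30)] + [rng.uniform(85, 100) for _ in range(5)]

r2_all = r_squared(x, y)
x_trim, y_trim = drop_largest_x(x, y, 5)
r2_trim = r_squared(x_trim, y_trim)

print(f"R^2 with all {len(x)} points: {r2_all:.2f}")
print(f"R^2 without the 5 largest-x points: {r2_trim:.2f}")
```

A big drop in R^2 after removing a handful of points suggests the fit is mostly describing the gap between two clusters rather than a trend across all states.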

Edit: Added confidence bars to plots.

Edit: Maybe worth checking if the association is due to income level.
 
Are these searches in English only?
This page seems to suggest that if you are looking at a simple search term, then it will show trends just for that specific term in that language. But if you search by a "Topic", then it tracks trends across different translated versions of that topic.

- https://support.google.com/trends/answer/4359550?hl=en#zippy=,compare-terms-across-languages
if you enter ねこ, the Japanese characters for "cat," you won’t find much data for the US since many people in the US search for "cat."
Topics are a group of terms that share the same concept in any language. You can find topics below your search term.

For example, if you search London, and choose the corresponding topic, your search includes results for topics like "Capital of the UK" and "Londres," which is "London" in Spanish.

So I think the "Myalgic encephalomyelitis/chronic fatigue syndrome" I linked before is a "Topic" that should transcend language.
 
Hm. I really thought it might be that people with higher income search ME/CFS more, as they might have better access to resources that would make them aware of what ME/CFS is. And maybe states with larger proportions of Northern European or British ancestry would have higher average income.

But the correlation is very weak between search trends for ME/CFS and median household income in each state.

This is again the Trends data I used in post #4: https://trends.google.com/trends/explore?q=/m/0dctd&geo=US&legacy&hl=en
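
As a small illustration of the topic vs. search term distinction: as far as I can tell, the `q=/m/0dctd` part of that URL is the topic identifier, while a literal term would go in `q` as plain text. The sketch below just builds the two query strings with the standard library. The `explore_url` helper is my own naming, I left out the `legacy` flag, and whether Google treats the percent-encoded form exactly like the hand-written URL is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE = "https://trends.google.com/trends/explore"


def explore_url(query, geo="US", hl="en"):
    """Build an explore URL for a topic ID or a literal search term (hypothetical helper)."""
    return f"{BASE}?{urlencode({'q': query, 'geo': geo, 'hl': hl})}"


# Topic: the identifier taken from the URL above.
topic_url = explore_url("/m/0dctd")
# Search term: the literal English phrase, matched only as typed.
term_url = explore_url("chronic fatigue syndrome")

print(topic_url)
print(term_url)

# Decoding the query string gives back what was put in.
print(parse_qs(urlparse(topic_url).query)["q"])
print(parse_qs(urlparse(term_url).query)["q"])
```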

And I used income data from the following dataset, using the "Households" row, and the "Median income (dollars)" column for each state: https://data.census.gov/table/ACSST...ag-Grid-AutoColumn~(Margin+of+Errorundefined)

Screenshot from 2026-03-23 22-26-50.png

When controlling for income in the regression of search trend vs. English ancestry, ancestry still significantly predicts the trend value.
1774322202317.png

Edit: Replaced model summary with the regression results using original unscaled values. I had originally used scaled values because I thought the unscaled values were causing issues with the regression, since income and ancestry are around 7 orders of magnitude apart, but the results are the same.
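
In case it helps anyone follow along, here is a minimal sketch of what "controlling for income" means mechanically, and of why scaling did not change the conclusions. It fits trend = b0 + b1*ancestry + b2*income by ordinary least squares, once on raw values and once on z-scored predictors. This is not the model summary above. The data is synthetic, all variable names are illustrative, and it reports t-statistics rather than p-values because the standard library has no t-distribution.

```python
import math
import random
import statistics


def ols_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2, with t-statistics and R^2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    det = s11 * s22 - s12 ** 2           # determinant of the centred X'X matrix
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    sse = syy - b1 * s1y - b2 * s2y      # residual sum of squares
    sigma2 = sse / (n - 3)               # residual variance, 3 fitted parameters
    t1 = b1 / math.sqrt(sigma2 * s22 / det)
    t2 = b2 / math.sqrt(sigma2 * s11 / det)
    return {"b0": b0, "b1": b1, "b2": b2, "t1": t1, "t2": t2, "r2": 1 - sse / syy}


def zscore(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]


# Synthetic data for 50 hypothetical states (NOT the real Census or Trends values).
# Income is built to correlate with ancestry but to have no effect of its own on trend.
rng = random.Random(2)
ancestry = [rng.uniform(0.02, 0.15) for _ in range(50)]
income = [60_000 + 100_000 * a + rng.gauss(0, 8_000) for a in ancestry]
trend = [30 + 300 * a + rng.gauss(0, 5) for a in ancestry]

raw = ols_two_predictors(ancestry, income, trend)
scaled = ols_two_predictors(zscore(ancestry), zscore(income), trend)

for label, fit in (("raw", raw), ("z-scored", scaled)):
    print(f"{label:9s} b1={fit['b1']:.4g} b2={fit['b2']:.4g} "
          f"t1={fit['t1']:.2f} t2={fit['t2']:.2f} R^2={fit['r2']:.3f}")
```

Rescaling a predictor changes the size of its coefficient but not its t-statistic or the model's R^2, which fits with the scaled and unscaled runs giving the same results.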

"Households" is the income variable.
 
A Norwegian study found that higher education was correlated with getting an ME/CFS diagnosis, but lower income was also correlated.

It was speculated that high education = more access to knowledge and ability to pursue better medical care, and that lower income = higher need for a diagnosis for benefit purposes etc.
 
I like your thinking.

I used the 2024 5-year average S1501 "Educational Attainment" dataset, specifically the column for percent of those at least 25 years old that have a bachelor's degree.

Same 5-year English ancestry and 1-year Trends data as before.

There does seem to be a correlation of educational attainment with the Google Trends for ME/CFS:

1774375408831.png

But controlling for the proportion of people who have a bachelor's degree still doesn't make the English ancestry correlation less significant:

1774375998526.png

I also tried with a 5 year time frame from 2019 to 2024 for the Trends data to try to roughly match the time frames of the 5 year Census datasets I'm using, and both English ancestry and Educational attainment are still significantly associated with searching for ME/CFS.

1774376566671.png

Edit: Note that the variable for English is named "Estimate!!Total:!!English" because the dataset gives the total number of people with that ancestry, not a proportion. I converted it to a proportion of the total population in the state, so I should probably rename the variable after converting to make that clear.
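
Something like the following would do the conversion and the rename in one step. It is only a sketch: the rows are invented, and `total_population` and `english_share` are names I made up. Only "Estimate!!Total:!!English" comes from the dataset.

```python
# Hypothetical rows; the counts and populations are invented for illustration.
rows = [
    {"state": "State A", "Estimate!!Total:!!English": 150_000, "total_population": 3_000_000},
    {"state": "State B", "Estimate!!Total:!!English": 400_000, "total_population": 10_000_000},
]


def add_share(rows, count_key, total_key, share_key):
    """Return new rows where the raw count is replaced by a share under a clearer name."""
    converted = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != count_key}
        new_row[share_key] = row[count_key] / row[total_key]
        converted.append(new_row)
    return converted


converted = add_share(rows, "Estimate!!Total:!!English", "total_population", "english_share")
for row in converted:
    print(row["state"], row["english_share"])
```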
 
both English ancestry and Educational attainment are still significantly associated with searching for ME/CFS.

I wonder if there are associations with searching for 'chronic fatigue syndrome'?

(Sorry, that's an idle thought with no idea of the work involved—I don't understand maths and statistics!)
 
For comparison, here are the plots for association of 5 year search data for the ME/CFS topic with English ancestry and education:
1774377903898.png
1774377926565.png

Interestingly, when I instead use the specific search term "chronic fatigue syndrome" over the same time frame, the association with English ancestry gets stronger, but the association with education disappears:
1774378167436.png
1774378152253.png
 
Interesting! I suppose the thing we can't know is what people think 'English ancestry' means, and whether they were asked about other ancestries?
 
The specific question asked is “What is this person’s ancestry or ethnic origin?” and then it lists several examples like Italian, Jamaican, Korean, etc.

It’s a write-in question, so I guess people could technically write anything, and then it gets coded by the bureau staff as appropriate. It seems like they code the first two that someone lists.


That's why I'm pretty certain people are just guessing based on their last name unless their family recently emigrated. You do have the option of writing “I don’t know”, but the question doesn't really make that obvious.
 
That's why I'm pretty certain people are just guessing based on their last name unless their family recently emigrated.

That's what I suspected.

In parts of the US, 'English' tends to include Welsh, Scots, Irish and Cornish, which might make no difference at all or add another layer of meaninglessness. It depends how far back you want the definition of ancestry to go, or whether you really mean ancestral origins or cultural origins.

ETA: or just the origins you know about, which might only consist of 50%.
 