Running FLAMES on DecodeME data

Murph · May 21, 2026

Quick DecodeME question: did they look for WASF3 (aka WAVE3) and it wasn't significant?

Trying to dial in the extent to which I should hold out hope for Hwang's work. (which tested wasf3 based on an earlier genetic study. https://pmc.ncbi.nlm.nih.gov/articles/PMC3089886/ )

Andy · May 21, 2026

Murph said:
Quick DecodeME question: did they look for WASF3 (aka WAVE3) and it wasn't significant?

Trying to dial in the extent to which I should hold out hope for Hwang's work. (which tested wasf3 based on an earlier genetic study. https://pmc.ncbi.nlm.nih.gov/articles/PMC3089886/ )

We didn't look for any particular gene, and reported those that reached significance (or near to it) - WASF3 wasn't one of those.

ME/CFS Science Blog · 2026-05-27T20:49:53+0100

Looks like the FUMA website has been updated and now has a portal to do FLAMES.

Functional Mapping and Annotation of Genome-wide association studies

I've tried to give the correct input, but it would be great if others with more skills could try it out as well.

For the sample size they require, I gave the total DecodeME sample size of cases (15,579) + controls (259,909) = 275,488 rather than the effective sample size of 4 * 275,488 * 15,579/275,488 * (275,488 - 15,579)/275,488 = 58,792
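For anyone who wants to check the two numbers, here is a minimal sketch of the arithmetic in Python. The function name is just something I made up for illustration; the formula is the 4 * N * p * (1 - p) one written out above, which is the same as 4 / (1/cases + 1/controls).

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size of a case-control GWAS: 4 * N * p * (1 - p),
    where N is the total sample size and p is the fraction of cases."""
    n_total = n_cases + n_controls
    case_fraction = n_cases / n_total
    return 4 * n_total * case_fraction * (1 - case_fraction)


# DecodeME case and control counts as quoted in this post
n_cases, n_controls = 15_579, 259_909

n_total = n_cases + n_controls
n_eff = effective_sample_size(n_cases, n_controls)

print("total N:    ", n_total)
print("effective N:", round(n_eff))
```

With so few cases relative to controls, the effective sample size ends up at roughly a fifth of the total.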

ME/CFS Science Blog · 2026-05-27T22:19:04+0100

Here's what I got. Looks like it only does the 6 hits that reached significance for the main DecodeME analysis

Which are these:

For the first two it could not make a prediction (too many competitors). For the other 4 it gave the following answers. MMS22L for position 6:98432302:C:CA is new.

ChronicallyOverIt · 2026-05-27T22:52:02+0100

ME/CFS Science Blog said:
Here's what I got. Looks like it only does the 6 hits that reached significance for the main DecodeME analysis

Which are these:

View attachment 32556

For the first two it could not make a prediction (too many competitors). For the other 4 it gave the following answers. MMS22L for position 6:98432302:C:CA is new.
View attachment 32555

Are red blood cells back on the table:

MMS22L is a novel key actor of normal and pathological erythropoiesis - PMC
https://pmc.ncbi.nlm.nih.gov/articles/PMC12723439/

ChronicallyOverIt · 2026-05-27T23:52:02+0100

Hmm, doing a bit of googling, with AI, it sounds like if you have a bad MMS22L variant you could have oxidative stress build up in the cell. This could trigger cGAS-STING (something I just learned about), which in turn shoots out type I interferon. This could weave in CD38 overexpression on immune cells. (Maybe this is why daratumumab works?) Supposedly this is NAD+ heavy, which together with interferon could mess up the eMSNs and dump CRH, which then blunts the body's natural response. All sounds a bit too good to be true, Mr. Gemini....

forestglip · 2026-05-28T01:22:17+0100

ME/CFS Science Blog said:
Looks like the FUMA website has been updated and now has a portal to do FLAMES.
Functional Mapping and Annotation of Genome-wide association studies
I've tried to give the correct input, but it would be great if others with more skills could try it out as well.

For the sample size they require, I gave the total DecodeME sample size of cases (15,579) + controls (259,909) = 275,488 rather than the effective sample size of 4 * 275,488 * 15,579/275,488 * (275,488 - 15,579)/275,488 = 58,792

I just did it as well. I think probably effective sample size makes more sense, so that's what I used. For the LD reference, I used "UKB release2b 10k European". I don't know if that or "UKB release 2b 10k White British" more closely matches the DecodeME cohort.

I basically got the same results, but without MMS22L.

This seems to be because the main SNP2GENE task only identified 5 loci for me, as opposed to the 6 you got. I'm not sure why that is. The file I uploaded has the genome-wide significant SNP near that 6th MMS22L locus.

Could you maybe compare the parameters I used to the parameters you used to see what might be different that caused this? It's found under the Parameters tab on the Results page for SNP2GENE.

created_at	2026-05-28 01:23:53
title	decodeme_ukb-eur_3
FUMA	v1.8.2
MAGMA	v1.08
GWAScatalog	e0_r2022-11-29
ANNOVAR	2017-07-17
gwasfile	gwas_1_grch37_fuma_valid_chr.tsv.gz
keepinfiles	1
chrcol	CHROM
poscol	GENPOS
rsIDcol	NA
pcol	P
eacol	ALLELE1
neacol	ALLELE0
orcol	NA
becol	BETA
secol	SE
leadSNPsfile	NA
addleadSNPs	1
regionsfile	NA
GRCh38	0
N	58792
Ncol	NA
exMHC	1
MHCopt	annot
extMHC	NA
ensembl	v102
genetype	protein_coding
leadP	5e-8
gwasP	0.05
r2	0.6
r2_2	0.1
refpanel	UKB/release2b
pop	EUR_10k
MAF	0
refSNPs	1
mergeDist	250
magma	1
magma_window	0
magma_exp	GTEx/v8/gtex_v8_ts_avg_log2TPM, GTEx/v8/gtex_v8_ts_general_avg_log2TPM
posMap	1
posMapWindowSize	10
posMapAnnot	NA
posMapCADDth	0
posMapRDBth	NA
posMapChr15	NA
posMapChr15Max	NA
posMapChr15Meth	NA
posMapAnnoDs	NA
posMapAnnoMeth	NA
eqtlMap	0
eqtlMaptss	NA
eqtlMapSig	1
eqtlMapP	1
eqtlMapCADDth	0
eqtlMapRDBth	NA
eqtlMapChr15	NA
eqtlMapChr15Max	NA
eqtlMapChr15Meth	NA
eqtlMapAnnoDs	NA
eqtlMapAnnoMeth	NA
xqtlsMap	1
xqtlsMapdss	pQTL/1_suhre_2017/sig_pairs/Plasma_1_suhre_2017.txt.gz, sceQTL/bryois2022Brain/sig_pairs/Brain_bryois2022Brain_Excitatory.neurons.txt.gz
xqtlP	1e-3
ciMap	0
ciMapBuiltin	NA
ciMapFileN	0
ciMapFiles	NA
ciMapFDR	NA
ciMapPromWindow	NA
ciMapRoadmap	NA
ciMapEnhFilt	0
ciMapPromFilt	0
ciMapCADDth	0
ciMapRDBth	NA
ciMapChr15	NA
ciMapChr15Max	NA
ciMapChr15Meth	NA
ciMapAnnoDs	NA
ciMapAnnoMeth	NA

My guess is a different LD reference panel, where the one I used doesn't include that SNP.

Edit: As far as whether to use UKB European or UKB White British, I think European is probably right. The DecodeME paper says:

This association was robust to testing restricted to the genetically more homogeneous White British genetic ancestries subset

They did a sensitivity analysis only on White British, so I guess that means the full cohort was more diverse than that. Plus they say the cohort had European ancestry multiple times.

ME/CFS Science Blog · 2026-05-28T06:55:16+0100

forestglip said:
My guess is a different LD reference panel, where the one I used doesn't include that SNP

Yes I used 1000G Phase3 EUR, yours might have been a better choice.

Not sure if that explains the difference though because it's strange that you only got 5 hits. Couldn't it be due to the sample size given? My interpretation is that they want the total number of individuals in the GWAS as explained in the tutorial:

Functional Mapping and Annotation of Genome-wide association studies

created_at	2026-05-27 22:07:21
title	MECFS2
FUMA	v1.8.2
MAGMA	v1.08
GWAScatalog	e0_r2022-11-29
ANNOVAR	2017-07-17
gwasfile	fuma.txt.gz
keepinfiles	1
chrcol	CHR
poscol	BP
rsIDcol	NA
pcol	P
eacol	A1
neacol	A2
orcol	OR
becol	BETA
secol	SE
leadSNPsfile	NA
addleadSNPs	1
regionsfile	NA
GRCh38	0
N	275488
Ncol	NA
exMHC	1
MHCopt	annot
extMHC	NA
ensembl	v102
genetype	protein_coding
leadP	5e-8
gwasP	0.05
r2	0.6
r2_2	0.1
refpanel	1KG/Phase3
pop	EUR
MAF	0
refSNPs	1
mergeDist	250
magma	1
magma_window	0
magma_exp	GTEx/v8/gtex_v8_ts_avg_log2TPM, GTEx/v8/gtex_v8_ts_general_avg_log2TPM
posMap	1
posMapWindowSize	10
posMapAnnot	NA
posMapCADDth	0
posMapRDBth	NA
posMapChr15	NA
posMapChr15Max	NA
posMapChr15Meth	NA
posMapAnnoDs	NA
posMapAnnoMeth	NA
eqtlMap	0
eqtlMaptss	NA
eqtlMapSig	1
eqtlMapP	1
eqtlMapCADDth	0
eqtlMapRDBth	NA
eqtlMapChr15	NA
eqtlMapChr15Max	NA
eqtlMapChr15Meth	NA
eqtlMapAnnoDs	NA
eqtlMapAnnoMeth	NA
xqtlsMap	1
xqtlsMapdss	eQTL/metabrain/sig_pairs/Brain_basalganglia.txt.gz, eQTL/metabrain/sig_pairs/Brain_cerebellum.txt.gz, eQTL/metabrain/sig_pairs/Brain_cortex.txt.gz, eQTL/metabrain/sig_pairs/Brain_hippocampus.txt.gz, eQTL/metabrain/sig_pairs/Brain_spinalcord.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Amygdala.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Anterior_cingulate_cortex_BA24.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Caudate_basal_ganglia.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Cerebellar_Hemisphere.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Cerebellum.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Cortex.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Frontal_Cortex_BA9.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Hippocampus.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Hypothalamus.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Nucleus_accumbens_basal_ganglia.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Putamen_basal_ganglia.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Spinal_cord_cervical_c1.txt.gz, eQTL/gtex_v10/sig_pairs/Brain_Substantia_nigra.txt.gz
xqtlP	1e-3
ciMap	0
ciMapBuiltin	NA
ciMapFileN	0
ciMapFiles	NA
ciMapFDR	NA
ciMapPromWindow	NA
ciMapRoadmap	NA
ciMapEnhFilt	0
ciMapPromFilt	0
ciMapCADDth	0
ciMapRDBth	NA
ciMapChr15	NA
ciMapChr15Max	NA
ciMapChr15Meth	NA
ciMapAnnoDs	NA
ciMapAnnoMeth	NA
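To make the comparison of the two parameter lists less error-prone, here is a rough Python sketch that diffs two tab-separated key/value dumps like the ones copied from the Parameters tab. It only uses a short excerpt of each run (values taken from the two lists in this thread); `parse_params` and `diff_params` are hypothetical helper names, not anything FUMA provides. For the full picture you would paste in the complete lists.

```python
def parse_params(text: str) -> dict[str, str]:
    """Turn 'key<TAB>value' lines into a dict."""
    params = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("\t")
        params[key.strip()] = value.strip()
    return params


def diff_params(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Excerpt of the run with 5 loci (decodeme_ukb-eur_3)
run_ukb = parse_params(
    "N\t58792\norcol\tNA\nleadP\t5e-8\nr2\t0.6\nr2_2\t0.1\n"
    "refpanel\tUKB/release2b\npop\tEUR_10k\nmergeDist\t250"
)

# Excerpt of the run with 6 loci (MECFS2)
run_1kg = parse_params(
    "N\t275488\norcol\tOR\nleadP\t5e-8\nr2\t0.6\nr2_2\t0.1\n"
    "refpanel\t1KG/Phase3\npop\tEUR\nmergeDist\t250"
)

differences = diff_params(run_ukb, run_1kg)
for key, (ukb_value, kg_value) in differences.items():
    print(f"{key}: {ukb_value} vs {kg_value}")
```

In this excerpt the locus-defining settings (leadP, r2, r2_2, mergeDist) are identical, so what is left is the sample size, the odds-ratio column, and the LD reference panel and population.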

ME/CFS Science Blog · 2026-05-28T07:14:32+0100

Here are the FLAMES results I got (FLAMES_scores.raw), put into an Excel file with the chromosome positions added.

mariovitali · 2026-05-28T07:44:22+0100

I found the following study on pediatric POTS which mentions both ARFGEF2 and UNC13C

https://pmc.ncbi.nlm.nih.gov/articles/PMC12137463/

ME/CFS Science Blog · 2026-05-28T13:41:46+0100

In SNP2GENE you can relax the p-value threshold so that FLAMES will also try to predict genes for loci that didn't reach the genome-wide 5*10^-8 threshold. I've tried it with a more lenient threshold of 5*10^-6 to get a higher number of potential genes. It resulted in 56 independent regions with a hit.

I've used the 1000G Phase3 EUR LD reference panel, will try to use the UK biobank LD panel later. The sample size (effective or total) didn't seem to make much of a difference from what I can tell.

For 32 out of 56 regions, FLAMES was confident enough to predict a gene. Here's the list.

ARFGEF2
CA10
UNC13C
MMS22L
SHISA6
SOX6
OLFM4
PEBP1
ZNF644
LRRC7
DCC
RPP40
PLCL1
CACNA1E
VRK2
ALK
VRK2
MICALL2
KIAA1239
NEURL1
NEK1
VPS54
STT3B
RIMS1
PTPRE
NR2F1
PTBP2
RP11-147C23.1
SMCHD1
ADARB2
LAMA2

More info about the results in the attached file. It contains the FLAMES results but I also added the info about the genomic regions from FUMA itself.

forestglip · 2026-05-28T14:07:38+0100

ME/CFS Science Blog said:
Not sure if that explains the difference though because it's strange that you only got 5 hits. Couldn't it be due to the sample size given? My interpretation is that they want the total number of individuals in the GWAS as explained in the tutorial:

I think you're right about them wanting total sample size. Someone asked them about it, and they said refer to the MAGMA manual since that's the only part of FUMA that uses it. (Though this was before FLAMES was added, so it's possible FLAMES uses it for other reasons as well.)

The manual (linked at the top of the MAGMA webpage) says this:

The N modifier is used to specify the sample size directly (the total sample size, also when using case-control analysis results).

I don't think that should affect whether a locus exists there for SNP2GENE, though. I think that's just calculated based on whether a SNP has a p-value below the given threshold.

Nightsong · 2026-05-28T14:13:58+0100

In the SNP2GENE section of the guide -

Sample size (N) Mandatory
The total number of individuals in the GWAS or the number of individuals per SNP. This is only used for MAGMA to compute the gene-based P-values. For total sample size, input should be an integer. When the input file of GWAS summary statistics contains a column of sample size per SNP, the column name can be provided in the second text box.

and in the FUMA quick start guide -

In this section, the only mandatory parameter is the sample size (N). You can specify the sample size in 2 ways:
Put in an integer represent the same size. For example: 50000 if there were 50000 individuals total (cases and controls) in your GWAS. Do not put in 50000.0 or 50000,0

If sample size is a column in your input GWAS summary statistics, you can specify the name of the column that represent the sample size.

forestglip · 2026-05-28T15:58:10+0100

ME/CFS Science Blog said:
For 32 out of 56 regions, FLAMES was confident enough to predict a gene. Here's the list.

ARFGEF2
CA10
UNC13C
MMS22L
SHISA6
SOX6
OLFM4
PEBP1
ZNF644
LRRC7
DCC
RPP40
PLCL1
CACNA1E
VRK2
ALK
VRK2
MICALL2
KIAA1239
NEURL1
NEK1
VPS54
STT3B
RIMS1
PTPRE
NR2F1
PTBP2
RP11-147C23.1
SMCHD1
ADARB2
LAMA2

I put these into an online functional gene enrichment tool called g:GOSt. Here is a link directly to the results: https://biit.cs.ut.ee/gplink/l/awfBEst46QG

Here are the four enriched gene sets based on testing many different gene set sources at once:

term_id	term_name	adjusted_p_value	term_size	query_size	intersection_size	effective_domain_size	intersections
GO:0007268	chemical synaptic transmission	0.000416	765	27	9	20972	UNC13C,SHISA6,DCC,PLCL1,CACNA1E,NEURL1,VPS54,RIMS1,LAMA2
GO:0043197	dendritic spine	0.00855	161	27	4	22155	ARFGEF2,SHISA6,NEURL1,LAMA2
GO:0045202	synapse	0.0159	1608	27	9	22155	ARFGEF2,UNC13C,SHISA6,DCC,CACNA1E,NEURL1,VPS54,RIMS1,LAMA2
GO:0031175	neuron projection development	0.0405	1010	27	8	20972	DCC,ALK,MICALL2,NEURL1,VPS54,RIMS1,NR2F1,LAMA2

* There are actually 11 significantly enriched gene sets, but g:GOSt uses a method to try to narrow it down when several significant gene sets are closely related and thus might be significant just because another one is significant. It's explained in their documentation.
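For a feel of how unusual an overlap like 9 out of 27 is, here is a small Python sketch of the plain hypergeometric (over-representation) test that this kind of enrichment is based on, using the numbers from the first row of the table. This is my own illustration and not g:GOSt's code: the function name is made up, and the result is a raw, unadjusted p-value, so it will not match the adjusted_p_value column, which includes a multiple-testing correction.

```python
from math import comb


def hypergeom_tail(domain_size: int, term_size: int,
                   query_size: int, overlap: int) -> float:
    """P(X >= overlap) when query_size genes are drawn without replacement
    from domain_size genes, of which term_size belong to the gene set."""
    total = comb(domain_size, query_size)
    upper = min(term_size, query_size)
    favourable = sum(
        comb(term_size, k) * comb(domain_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# GO:0007268 (chemical synaptic transmission), values from the table above
domain_size, term_size, query_size, overlap = 20972, 765, 27, 9

expected_overlap = query_size * term_size / domain_size
p_raw = hypergeom_tail(domain_size, term_size, query_size, overlap)

print(f"expected overlap by chance: {expected_overlap:.2f}")
print(f"raw hypergeometric p-value: {p_raw:.2e}")
```

By chance you would expect about one of the 27 recognised genes to fall in this gene set, so finding 9 is a big excess even before any correction.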

forestglip · 2026-05-28T16:04:08+0100

ME/CFS Science Blog said:
For 32 out of 56 regions, FLAMES was confident enough to predict a gene. Here's the list.

ARFGEF2
CA10
UNC13C
MMS22L
SHISA6
SOX6
OLFM4
PEBP1
ZNF644
LRRC7
DCC
RPP40
PLCL1
CACNA1E
VRK2
ALK
VRK2
MICALL2
KIAA1239
NEURL1
NEK1
VPS54
STT3B
RIMS1
PTPRE
NR2F1
PTBP2
RP11-147C23.1
SMCHD1
ADARB2
LAMA2

Interestingly, FLAMES pointed to VRK2 from two different loci. Looking at the spreadsheet you provided, the locations are 2:58035555:A:G and 2:58862232:C:T (these are GRCh37 while the plot is GRCh38). These two loci can be seen here:

ME/CFS Science Blog · 2026-05-28T16:14:52+0100

forestglip said:
I put these into an online functional gene enrichment tool called g:GOSt. Here is a link directly to the results: https://biit.cs.ut.ee/gplink/l/awfBEst46QG

Thanks, seems to confirm what we were already thinking. With the additional tissue enrichment in the brain, cell types pointing to eMSN, and functional gene categories, I think we now have strong evidence that DecodeME points to neural communication as key to the pathology of ME/CFS.

This might be the common pathway for many different subgroups within the ME/CFS label and many different causes for those brain signals. These subgroups and their causes might only show up with much more statistical power.
