Running FLAMES on DecodeME data

ChronicallyOverIt

Senior Member (Voting Rights)
Hi all,

Something that has been in the back of my head since the DecodeME data was released was running the newer FLAMES analysis:

Somewhat discussed here: https://www.s4me.info/threads/initi...2025-decodeme-collaboration.45490/post-636145

My intention with this thread is that if enough of us get our heads together, I think we may be able to run this. I wanted to make a separate thread because, if we start working on this, I expect it will be less a scientific discussion of the data and more of a Stack Overflow-style troubleshooting thread. Previously I was able to get the repository up and running with the example, but got stuck at the fine-mapping part.

I guess to start off: did anyone else get further? What are the bottlenecks? Am I being naive in thinking we can do this? Is there key data we need from the DecodeME team to accomplish this? Is this something that they are working on?
 
Update on my work tonight: I got a new PC, so I had to rebuild the environment. Rebuilt it and got the example data working.


Interesting: I was looking to accomplish steps 1 & 2 and found someone has already done most of the work!


Anyone in contact with this guy? I think he commented on his data on a blog post by @ME/CFS Science Blog. I also believe these scripts solve the fine-mapping issues I ran into before...

Edit, it was this blog: https://mecfsscience.org/genes-pointing-to-the-brain-decodeme-part-ii/
 
Thanks for trying this. I gave up because my coding skills are rather amateurish. There was also the issue that I wouldn't be entirely sure if it worked or if I had misapplied it somehow.

I was hoping that either researchers would use this in future publications (e.g. the official DecodeME paper) or that the FLAMES authors would integrate this tool into an easier user interface.

But it would be really great if you (with the help of others) could make it work!
 
Thanks for trying this. I gave up because my coding skills are rather amateurish. There was also the issue that I wouldn't be entirely sure if it worked or if I had misapplied it somehow.

My fear as well. I’m hoping by documenting my steps extremely clearly I can at least have others replicate and potentially check my work.

It’s more the biology decisions of the data and lift overs that worry me. Dropping something without having the correct translation. I’ll try to outline my next steps tonight.

How far did you get, @ME/CFS Science Blog?
 
I spoke with Paolo and he said that he should re run the analysis shown on his blog. I will PM him again.

If you want any help with coding, I may be able to help as I have the time now!
I read some of Paolo’s blog; it sounds like he is worse in winter :( If it’s too much, I would rather not bother him.

I’ll start a Google Colab tonight. I’m happy to coordinate on here or start a small Discord server, up to y’all.
 
Hmm, I’m more interested in running FLAMES on only the DecodeME data. I’m not fully up to date, but I believe this is the highest-quality GWAS data, and I want to see whether the FLAMES analysis provides more clues.

Still interesting work! Does he have any interest in joining this forum or discussion? If you have the chance, please let him know about this thread.
 
Has anyone compared the genes he has highlighted here to the genes in the DecodeME MAGMA analysis?

From what I saw, the following genes from Paolo's list are new :

OTX1
WDPCP
UGP2
VPS54
PREX1
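A comparison like this can be scripted so that others can repeat it. The following is only a rough Python sketch: the function name `new_genes` and the placeholder gene names (`GENE_A`, `GENE_B`) are made up for illustration, and the real inputs would be Paolo's gene list and the DecodeME MAGMA gene list.

```python
def new_genes(candidate_genes, reference_genes):
    """Return the candidates that do not appear in the reference list.

    Gene symbols are compared after stripping whitespace and upper-casing,
    so "otx1 " and "OTX1" count as the same gene.
    """
    reference = {g.strip().upper() for g in reference_genes}
    return [g.strip() for g in candidate_genes if g.strip().upper() not in reference]


# Placeholder inputs (assumptions, not real results): swap in the full lists.
candidate_list = ["OTX1", "GENE_A", "gene_b "]
reference_list = ["GENE_A", "GENE_B"]

print(new_genes(candidate_list, reference_list))
```

Normalising the symbols first avoids false "new" hits that are really just differences in case or stray spaces between the two lists.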

There are several reasons why these may be important. More specifically (not meant to be a complete list of related functions):

OTX1 : Neuronal involvement and could be important for susceptibility to neuronal excitability (study in mice) :

https://onlinelibrary.wiley.com/doi/abs/10.1046/j.0953-816x.2001.01723.x

WDPCP : Related to cilia signalling. With a bit of searching, a hypothesis here is that we may be looking at a functional ciliopathy rather than a congenital one.

UGP2 : Very important for N-Linked glycosylation (a target identified in 2015 via machine learning).

VPS54 : Points to vesicular trafficking

PREX1 : GPCR-related signalling. Could be a signal amplifier.
 
Sorry, I have not had as much free time to work on this.

I think my next step is to tackle step 1 & step 2, which can be done through the online tool: https://fuma.ctglab.nl/

Interestingly, I see Paolo has already done this with "munged" data (lifted over from the GRCh38 reference genome, which the original DecodeME data uses, to GRCh37):


I am a bit weirded out after seeing this... After running FUMA on the lifted-over data, the candidate genes do not match those in the DecodeME paper? Shouldn't it still give the same analysis results, unless the lift over drops a significant amount of data? [S]I will be submitting my own job without the data lifted over.[/S]

1765602288527.png
 
Alright, next problem: I see why Paolo has munged/lifted over.

This tool only uses hg19, whereas DecodeME is hg38. We need hg19, so the liftover was necessary:
https://groups.google.com/g/fuma-gwas-users/c/E4BeXe8vox0

"Of course there will be some loss of variants due to the conversion not being perfect."

Can anyone give me insight into whether we can use the munged/lifted data from Paolo's upload? His description says it "munges allele columns and lifts coordinates from GRCh38 to GRCh37", which is hg38 -> hg19.

Am I missing something with the genome-wide plots not looking the same? Is this the expected variant loss?
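One way to put a number on the variant loss is to compare the variant IDs in the summary statistics before and after the lift over. Below is a minimal Python sketch under some assumptions: both files are tab-separated, share an ID column (called `ID` here, which is a guess), and the tiny inline tables are made up purely to show the mechanics.

```python
import csv
import io


def variant_ids(handle, id_column):
    """Collect the set of variant IDs from a tab-separated summary-statistics file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column] for row in reader}


def liftover_loss(before_ids, after_ids):
    """Return (number of variants lost, fraction lost) relative to the original set."""
    lost = before_ids - after_ids
    fraction = len(lost) / len(before_ids) if before_ids else 0.0
    return len(lost), fraction


# Made-up stand-ins for the original (hg38) and lifted-over (hg19) files.
before = io.StringIO("ID\tP\nrs1\t0.5\nrs2\t0.01\nrs3\t0.2\nrs4\t0.9\n")
after = io.StringIO("ID\tP\nrs1\t0.5\nrs2\t0.01\nrs4\t0.9\n")

n_lost, frac = liftover_loss(variant_ids(before, "ID"), variant_ids(after, "ID"))
print(f"lost {n_lost} variants ({frac:.1%})")
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")`. If the IDs themselves are position-based and change with the genome build, a build-independent key such as the rsID would be needed instead.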
 
After running FUMA on the lifted-over data, the candidate genes do not match those in the DecodeME paper? Shouldn't it still give the same analysis results, unless the lift over drops a significant amount of data?
This is based on a meta-analysis of different studies, right? So I don't think it should match exactly.

Anyway, that plot shows the output of the MAGMA gene-based analysis, not the main analysis DecodeME did that turned up all the candidate genes like RABGAP1L and CA10. In theory, it should match the genes listed in Supplementary Table 4, also listed here. It still doesn't exactly match, but some genes do.

When I previously ran FUMA after liftOver, I got the same 13 genes as the paper, so it's possible to do. In case you want to try using the gene Z scores from my FUMA/MAGMA run, I'll upload them here. Or I can try to help you do the liftover yourself. (Note, these scores are from when I ran it a second time using UK Biobank as a reference instead, which gave better results than 1000 Genomes.)
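To check a gene-based output against a published gene list, one option is to apply a Bonferroni cut-off to the gene P values and list whatever passes. The Python sketch below is only an illustration: the column names `SYMBOL` and `P` are assumptions about the output file, the threshold of 0.05 divided by the number of genes tested is a common convention rather than something confirmed in this thread, and the rows are invented.

```python
import io


def significant_genes(handle, gene_col="SYMBOL", p_col="P", alpha=0.05):
    """Return (hits, threshold) using a Bonferroni cut-off of alpha / genes tested.

    The file is read as whitespace-delimited with a header row; hits are
    (gene, p-value) pairs sorted from most to least significant.
    """
    header = handle.readline().split()
    rows = [dict(zip(header, line.split())) for line in handle if line.strip()]
    if not rows:
        return [], alpha
    threshold = alpha / len(rows)
    hits = [(r[gene_col], float(r[p_col])) for r in rows if float(r[p_col]) < threshold]
    return sorted(hits, key=lambda hit: hit[1]), threshold


# Invented rows, only to show the mechanics (not real MAGMA output).
example = io.StringIO(
    "SYMBOL\tZSTAT\tP\n"
    "GENE_A\t5.6\t1e-8\n"
    "GENE_B\t2.3\t0.01\n"
    "GENE_C\t2.0\t0.02\n"
    "GENE_D\t0.0\t0.5\n"
)
hits, threshold = significant_genes(example)
print(threshold, [gene for gene, _ in hits])
```

The resulting gene names could then be compared as sets against the genes reported in the paper to see exactly which ones differ between runs.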
 


This is based on a meta-analysis of different studies, right? So I don't think it should match exactly.
Looking at his GitHub, he has two repositories doing two things:
  1. DecodeMe - to me this looks like additional statistics that do fine mapping on only the lifted-over DecodeME data, which leads me to believe something went wrong with his lift over/munging
    • "This analysis generates two supplementary loci (on chr 10 and chr 15) that are not present in the results by DecodeME. This may be due to the lift-over from GRCh38 to GRCh37 that was performed on the summary statistics to allow for the use of available UKB LD matrices. I manually removed these two loci from the results reported below."
    • Plot from run: https://fuma.ctglab.nl/browse/663962
  2. MetaMe - This is what you are referencing: a meta-analysis with more GWAS data than just the DecodeME cohort, which yes, does add additional hits
The announcement suggested using this for liftover? Bad idea?

Edit: I see in that post you answer a lot of my questions. You did use that tool and used the QC'd data which was going to be my next question :thumbup:
 
@forestglip Hey, if you have a chance, could you see if there is a magma.genes.raw as well? PoPS needs this file too. MAGMA should have created it.
Sure. It's too big to upload here, but I uploaded it to a file sharing site. The link will expire in a week: https://limewire.com/d/gyXn2#CsV6DOeqGh

The link isn't working for me.

to me this looks like additional statistics that do fine mapping on only the lifted-over DecodeME data, which leads me to believe something went wrong with his lift over/munging
I think something was different. @ME/CFS Science Blog pointed out that there seemed to be fewer variants than expected after liftover in Paolo's analysis in the comments here: https://mecfsscience.org/genes-pointing-to-the-brain-decodeme-part-ii/
ME/CFS Science Blog said:
“In order to use them, I had to lift over the DecodeME summary statistics from GRCh38 to GRCh37. This is not a perfect approach because there is a loss of about half of the variants.”

This is curious. I used the GenomicRanges and rtracklayer packages in R and only lost about 25,000 variants out of almost 9 million. FUMA/MAGMA report the same in the log file: “25262 positions did not align with the GRCh37 reference.”
Paolo Maccallini said:
The number of variants was drastically reduced because during harmonization of the summary statistics I removed the non-biallelic SNPs. This is a default feature of MungeSumstats, the R package I used for harmonization and lift over. The problem seems to be that if one position needs allele flipping, the frequency of the alternative allele at that position cannot be calculated as 1-FRQ for non-biallelic SNPs. So the reduction in number was not due to lift-over, but to the elimination of non-biallelic alleles. I am now doing again the analysis, this time including non-biallelic alleles.

The documentation of MungeSumstats: https://www.bioconductor.org/packages/devel/bioc/vignettes/MungeSumstats/inst/doc/MungeSumstats.html
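Paolo's point about allele flipping can be illustrated with a small Python sketch. This is not how MungeSumstats implements it; it is a simplified illustration that assumes one row per variant with hypothetical field names (`chrom`, `pos`, `a1`, `a2`, `frq`), treats a position as non-biallelic when more than two distinct alleles are seen there across rows, and uses made-up rows.

```python
from collections import defaultdict


def split_by_allele_count(variants):
    """Split variants into (biallelic, multiallelic) by alleles seen per position."""
    variants = list(variants)
    alleles_at = defaultdict(set)
    for v in variants:
        alleles_at[(v["chrom"], v["pos"])].update((v["a1"], v["a2"]))
    biallelic, multiallelic = [], []
    for v in variants:
        n_alleles = len(alleles_at[(v["chrom"], v["pos"])])
        (biallelic if n_alleles == 2 else multiallelic).append(v)
    return biallelic, multiallelic


def flip(variant):
    """Swap the alleles of a biallelic variant; the other allele's frequency is 1 - frq.

    This only holds when exactly two alleles exist at the position. With three
    or more alleles the frequencies sum to 1 across all of them, so 1 - frq is
    not the frequency of any single other allele.
    """
    return {**variant, "a1": variant["a2"], "a2": variant["a1"], "frq": 1.0 - variant["frq"]}


# Made-up rows: one biallelic position and one position with three alleles.
rows = [
    {"chrom": "1", "pos": 100, "a1": "A", "a2": "G", "frq": 0.3},
    {"chrom": "1", "pos": 200, "a1": "A", "a2": "C", "frq": 0.2},
    {"chrom": "1", "pos": 200, "a1": "A", "a2": "T", "frq": 0.1},
]
biallelic, multiallelic = split_by_allele_count(rows)
flipped = [flip(v) for v in biallelic]
print(len(biallelic), len(multiallelic), flipped[0]["frq"])
```

Counting the rows in each group before and after harmonization would show how much of the drop in variants comes from this filter rather than from the lift over itself.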
 