Genetics: CA10

I've been wondering: if CA10 is involved in ME/CFS (or pain disorders), could it (or whatever mechanistic process the CA10 gene finding represents) be directly modulated with drugs to stop whatever signals are causing PEM, pain etc.? Or is it more of a pointer to the general pathology?
 
I’ve been learning more about AlphaGenome and, inspired by that, am interested in looking at promoters and enhancers for some of these candidate genes.

Enhancers and promoters are gene-regulatory elements. They are stretches of DNA that help in both eukaryotic and prokaryotic transcription. The promoters are known to initiate transcription, and the enhancers increase the level of transcription
Source

If you look at the GeneCards page for CA10, the genomics section has GeneHancer info on these. There’s lots of interesting stuff in there.

For instance pick the top one, a promoter/enhancer with a high score (it has a little star by it to show this too)
Expand the info and see the location of it is chr17:52158438-52159092
You can zoom in to that location on the DecodeME LocusZoom data and see lots of the hits tie in with this area
The Ensembl info has which tissues these are active in too
Looking at details of all of them a number match with the changes on DecodeME
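
The comparison in the steps above comes down to an interval check: does a SNP position fall inside the element's coordinates? A minimal Python sketch, assuming the SNP positions have been copied out of LocusZoom by hand. The SNP names and positions here are made-up placeholders (not DecodeME results), and `parse_region` and `snps_in_region` are hypothetical helper names.

```python
def parse_region(region):
    """Split a 'chr17:52158438-52159092' style string into (chrom, start, end)."""
    chrom, span = region.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)


def snps_in_region(snps, region):
    """Return names of SNPs whose position lies inside the region (inclusive)."""
    chrom, start, end = parse_region(region)
    return [
        name
        for name, (snp_chrom, pos) in snps.items()
        if snp_chrom == chrom and start <= pos <= end
    ]


# Placeholder positions for illustration only, not real DecodeME hits.
example_snps = {
    "snp_a": ("chr17", 52158700),  # inside the element
    "snp_b": ("chr17", 52150000),  # same chromosome, outside the element
    "snp_c": ("chr1", 52158700),   # same position, different chromosome
}

hits = snps_in_region(example_snps, "chr17:52158438-52159092")
print(hits)
```

Both sets of coordinates need to be on the same genome assembly for the check to mean anything, so it is worth confirming which build each site reports before comparing.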

If I’m interpreting this right, it says the changes people with ME/CFS are more likely to have are in these promoters and enhancers in those tissues? So they would affect expression of CA10 in those tissues? Is that right?

Sorry if this is covering known territory, it’s all new to me. Given the location, at the start of the transcription site, it may be obvious to people that these are promoters?

I wonder if it’s worth digging through all the GeneHancer info for all the genes, especially as some enhancers can be a long way away from the transcription site for the gene.

The EPDnew info will look familiar to anyone who has been looking at AlphaGenome outputs too… I have lots more to learn here.
 
@hotblack you might like to explore some of the tracks on UCSC genome browser--it compiles a lot of this information visually so you can do some exploring.

Here I have it centered on the top DecodeME SNP in the region:

You can "highlight" a region by dragging a box over a region and clicking "add highlight" on the box that comes up, that way you can keep track of where a SNP overlaps with features on other tracks
1770839351753.png

These tracks have a lot of promoter/enhancer/regulatory region info you can play around with (click "show" and then "refresh" to add them).
1770839478359.png

The GENCODE track under "Genes and Gene Predictions" will mark known genes with their exons and introns

If you want to highlight a lot of SNPs at once there are ways you can add a custom track in a BED file format, though it'll take more effort to figure out
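
As a rough starting point for the BED route, here is a minimal Python sketch that turns a list of SNPs into custom-track text. BED coordinates are 0-based with an exclusive end, so a SNP at 1-based position `p` becomes the interval `p-1` to `p`. The SNP names and positions are placeholders, and `snps_to_bed` and the track name are hypothetical.

```python
def snps_to_bed(snps, track_name="my_snps"):
    """Build UCSC custom-track text (BED4) from (name, chrom, 1-based position) tuples."""
    lines = [
        f'track name="{track_name}" description="SNPs of interest" visibility=pack'
    ]
    for name, chrom, pos in snps:
        # BED start is 0-based, end is exclusive: a single base at pos is [pos-1, pos)
        lines.append(f"{chrom}\t{pos - 1}\t{pos}\t{name}")
    return "\n".join(lines) + "\n"


# Placeholder SNPs for illustration only, not real DecodeME hits.
example = [
    ("snp_a", "chr17", 52158700),
    ("snp_b", "chr17", 52150000),
]

bed_text = snps_to_bed(example)
print(bed_text)
```

The printed text can be pasted into the browser's custom tracks box, or saved to a file and uploaded there.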
 
For instance pick the top one, a promoter/enhancer with a high score (it has a little star by it to show this too)
Expand the info and see the location of it is chr17:52158438-52159092
You can zoom in to that location on the DecodeME LocusZoom data and see lots of the hits tie in with this area

This is more or less exactly what my genetics friends at UCL did when they gave me a presentation of why they thought it was worth picking CA10 for a basic biology project.
 
If you start zoomed in on a SNP, you can also hit the "highlight" button down here which will highlight everything in your viewer
Thanks for the tip!
This is more or less exactly what my genetics friends at UCL did when they gave me a presentation of why they thought it was worth picking CA10 for a basic biology project.
Good to know I’m not talking nonsense! And nice to start to understand it a bit more.
 
I made a script to pull out some data to make it easy to compare with LocusZoom, then took the top 15 elements and manually deleted those that weren’t clearly significant hits. What’s left seems to be mainly the promoter locations.

Gene: CA10 - carbonic anhydrase 10
Location: chr17:51630313-52160017

Found 33 GeneHancer elements:
[GH17J052158] *Promoter/Enhancer | Score: 237 | chr17:52158674-52158830
Sources: ENCODE(Z-Lab),EPDnew
External: ENSR17_9QNZ6, ENSR17_9QNZ9, ENSR17_5G98JK, CA10_2
TFBSs: EZH2, ASH2L, RNF2, RBBP5, ZFX, CTCF, RAD21, SMC3, KDM5A, ZEB1, TCF12, ZNF263, EGR1, REST

[GH17J052146] *Enhancer | Score: 192 | chr17:52146767-52150085
Sources: ENCODE(Z-Lab),FANTOM5
External: ENSR17_9QNT2, ENSR17_9QNT6, ENSR17_83JDPW
TFBSs: ATF7, RUNX3, EP300, SPI1, FOS, FOS, GABPA, POLR2A, MAX, ZBTB33, YY1, REST, SP1, RXRA, MYC, STAT3, HNF4A, JUND, FOXA2, FOXA1, ATF3, FOS, EGR1

[GH17J052160] Promoter | Score: 149 | chr17:52158121-52158181
Sources: EPDnew
External: CA10_1
TFBSs: EZH2, SUZ12, ASH2L, MXI1, RNF2, GATA2, RBBP5

[GH17J052159] Promoter | Score: 94 | chr17:52159997-52160057
Sources: EPDnew
External: CA10_3
TFBSs: EZH2, SUZ12

The scores and stars (a star means ‘Elite’) are GeneHancer info. Links to LocusZoom and the EPD or Ensembl info are included, and TFBSs are the Transcription Factor Binding Sites.
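
For anyone wanting to work with a listing like the one above, here is a minimal Python sketch that parses the element header lines into records so they can be filtered by type, score or Elite status. It is not the script that produced the listing; it only assumes the text layout shown in this post, and `parse_elements` is a hypothetical name.

```python
import re

# Matches header lines such as:
# [GH17J052158] *Promoter/Enhancer | Score: 237 | chr17:52158674-52158830
HEADER = re.compile(
    r"^\[(?P<id>GH\w+)\]\s+(?P<elite>\*?)(?P<type>[\w/]+)\s+\|\s+"
    r"Score:\s+(?P<score>\d+)\s+\|\s+"
    r"(?P<chrom>chr\w+):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_elements(text):
    """Return one dict per element header line; other lines are ignored."""
    elements = []
    for line in text.splitlines():
        m = HEADER.match(line.strip())
        if m:
            elements.append({
                "id": m["id"],
                "elite": m["elite"] == "*",
                "type": m["type"],
                "score": int(m["score"]),
                "chrom": m["chrom"],
                "start": int(m["start"]),
                "end": int(m["end"]),
            })
    return elements


sample = """\
[GH17J052158] *Promoter/Enhancer | Score: 237 | chr17:52158674-52158830
Sources: ENCODE(Z-Lab),EPDnew
[GH17J052146] *Enhancer | Score: 192 | chr17:52146767-52150085
[GH17J052160] Promoter | Score: 149 | chr17:52158121-52158181
[GH17J052159] Promoter | Score: 94 | chr17:52159997-52160057
"""

elements = parse_elements(sample)
elite_ids = [e["id"] for e in elements if e["elite"]]
promoter_only = [e["id"] for e in elements if e["type"] == "Promoter"]
print(elite_ids, promoter_only)
```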

Does anyone know what the logic is for the circles/triangle and colouring on LocusZoom? The latter seems to change dynamically as you move around so maybe it’s just highlighting most significant in the current view?
 
Wasn't kept in GRCh38, yes--unfortunately Ensembl doesn't really have detailed annotation for why certain genes get dropped in the newest release. Sometimes it's because the gene mapping is suspect, sometimes it's for some other logistical reason. I think that happens a lot to snoRNAs and miRNAs in particular just because of the sheer number of them. But that was the reasoning for creating the Archive--the current version is curated with the best intentions, but shouldn't be considered the end all be all.
Ensembl pipelines are very complex and it would be impossible to provide explanations for all changes between the releases. Sometimes it's possible to figure out or speculate the reasons by looking into intermediate outputs that people who ran the pipelines have but even that is quite challenging. The sheer amount of data and analyses that go into each release... It takes 3-4 or more months and something like 80-100 people working full-time to get a release out. The documentation on the website could be better, though, but it still wouldn't be enough to determine the exact reason behind each change.

I see snoZ178 was last present in Ensembl release 75 on a lower quality assembly which was 12 years ago. Assemblies, in this case GRCh37 and GRCh38, are imported into Ensembl, so if the sequence annotated as snoZ178 on GRCh37 was not present in the better quality assembly (GRCh38), it wouldn't have been annotated. If it is in GRCh38, it's possible that it wasn't predicted due to not passing some threshold somewhere or due to changes in the annotation pipeline. If we think snoZ178 or anything missing from Ensembl might be important, it's possible to contact Ensembl and the relevant team will hopefully have a look at it. There is a team which does manual annotation on the human genome.


I googled "snoZ178" and one of the results was https://humanpaingeneticsdb.ca/ where I found this:

[
{
"Loci ": "CA10; snoZ178",
"Publication Loci ": "CA10; snoZ178",
"Variants ": "rs11079993",
"allele 1 ": "G",
"allele 2 ": "T",
"direction ": "down",
"Phenotype ": "Other Clinical Pain",
"PMID ": "PMID:33830993",
"comments ": "Significantly associated with multisite chronic pain in female"
},
{
"Loci ": "snoZ178",
"Publication Loci ": "snoZ178",
"Variants ": "rs967823",
"allele 1 ": "G",
"allele 2 ": "A",
"direction ": "no direction reported",
"Phenotype ": "Pain",
"PMID ": "PMID:37844115",
"comments ": "Significantly associated with pain"
}
]
The variants are in the current Ensembl release: rs11079993, rs967823. When I clicked on "Phenotype data", an additional piece of info (after a table of associations) said "This variant has not been mapped to any Ensembl genes." for both of them.
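
One practical wrinkle with the export above: the keys carry a trailing space (`"Loci "` rather than `"Loci"`), so a plain lookup by name fails. A minimal Python sketch that strips the keys and groups variants by locus, using a trimmed copy of the two records quoted above (only three fields kept for brevity):

```python
import json

# Trimmed copy of the two records above; note the trailing spaces in the keys.
raw = """[
  {"Loci ": "CA10; snoZ178", "Variants ": "rs11079993", "PMID ": "PMID:33830993"},
  {"Loci ": "snoZ178", "Variants ": "rs967823", "PMID ": "PMID:37844115"}
]"""

# Strip whitespace from every key so fields can be looked up by their plain name.
records = [{key.strip(): value for key, value in rec.items()} for rec in json.loads(raw)]

# A record can name several loci separated by ';', so index each one separately.
variants_by_locus = {}
for rec in records:
    for locus in rec["Loci"].split(";"):
        variants_by_locus.setdefault(locus.strip(), []).append(rec["Variants"])

print(variants_by_locus)
```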


I found snoZ178 in the current release of Ensembl Plants in Oryza meridionalis and Oryza longistaminata but the annotation is based on Rfam and a different team is in charge of plants annotation.

I don't know if any of this is relevant any more and I see the conversation has moved on, so apologies if irrelevant.
 