Phenome-wide analysis of copy number variants in 470,727 UK Biobank genomes, 2026, Zou et al

forestglip

Phenome-wide analysis of copy number variants in 470,727 UK Biobank genomes

Zou, Xueqing Zoe; Hu, Fengyuan; Lou, Haiyi; Burren, Oliver S.; Li, Xiaoyin; Megy, Karyn; Wheeler, Eleanor; Wu, Qiang; Atanur, Santosh S.; Karpinski, Marcin; Loesch, Douglas; Fairhurst-Hunter, Zammy; Deevi, Sri V. V.; Oerton, Erin; Wen, Sean; Jiang, Xiao; Salvoro, Cecilia; Mitchell, Jonathan; Nag, Abhishek; Hollis, Ben; O’Neill, Amanda; Anderson-Dring, Lauren; Bohlooly-Y, Mohammad; Buvall, Lisa; Cameron-Christie, Sophia; Prins, Bram; Cohen, Suzanne; Danielson, Regina F.; Davis, Andrew; Ding, Wei; Dougherty, Brian; Garg, Manik; Georgi, Benjamin; Harper, Andrew; Haefliger, Carolina; Hammar, Mårten; Hanna, Richard N.; Henry, Ian; Kundu, Kousik; Lai, Zhongwu; Lal, Mark; Lassi, Glenda; Liang, Yupu; Lopes, Margarida; Lythgow, Kieren; Maisuria-Armer, Meeta; March, Ruth; Matelska, Dorota; Menzies, Rob; Michaëlsson, Erik; Mowrey, Bill; Muthas, Daniel; Ohne, Yoichiro; Pullman, Benjamin; Hess, Sonja; Raies, Arwa; Reznichenko, Anna; Romero Ros, Xavier; Stevens, Helen; Tachmazidou, Ioanna; Viollet, Coralie; Vitsios, Dimitrios; Walentinsson, Anna; Wang, Lily; Wang, Qing-Dong; Cuomo, Anna; Martin Herranz, Daniel Elias; O’Connell, Jared; Del-Aguila, Jorge L.; Konkar, Anish; Challis, Benjamin; Platt, Adam; Ort, Tatiana; Garnett, James; Peng, Xiao-Rong; Baumberg, Gabrielle; Frydrych, Natalia; Stefanucci, Luca; Szymaniak, Anna; Tsakiroglou, Anna Maria; Sharma, Rahul; Harrow, Jen; MacArthur, Stewart; Wasilewski, Sebastian; O’Dell, Sean; Tian, Lifeng; Smith, Katherine R.; del Angel, Guillermo; Fabre, Margarete; Dhindsa, Ryan S.; Wang, Quanli; Petrovski, Slavé; Carss, Keren

Abstract
Copy number variants (CNVs) are key drivers of human diversity and disease risk [1]. Here we evaluate the role of CNVs across a broad range of human phenotypes and diseases by analysing CNVs from 470,727 UK Biobank whole-genome sequences and conducting a variant- and gene-level phenome-wide association study (PheWAS) with 2,941 plasma protein abundance measurements, 13,336 binary clinical phenotypes and 1,911 quantitative traits.

Proteomic analyses validated functional associations of CNVs with nearby genes (cis-protein quantitative trait loci; cis-pQTLs)—with deletions and duplications typically associated with reduced and increased protein levels, respectively—and uncovered previously unknown protein–protein interactions (trans-pQTLs). Our PheWAS recapitulated known associations and uncovered associations in both coding and non-coding regions.

Notably, we identified a rare deletion in ZNF451 associated with increased leukocyte telomere length and a non-coding deletion of a SLC2A9 enhancer associated with reduced gout risk. In addition, by combining CNVs with protein-coding single nucleotide variants and indels, we enhanced the power of our study to detect gene–disease associations. Finally, we leveraged this multiomics dataset to identify several pQTLs that constitute candidate biomarkers, including TMPRSS5 for Charcot–Marie–Tooth disease type 1A.

This multiancestry whole-genome-sequence CNV PheWAS offers insights into the roles of CNVs in human health outcomes and could serve as a valuable resource for therapeutic development.

Web | DOI | PMC | PDF | Nature | Open Access
 
I've placed this in ME/CFS Research because ME/CFS was one of the many UK Biobank phenotypes tested for association with copy number variants in this study.

Nightsong pointed out that there might be some interesting data to explore from this study:
Tangentially: while looking for information on the AZ dataset, I also noticed that raw summary statistics for UK Biobank CNV & WGS PheWAS performed by AZ are available for public download - Link - Bulk Data Downloads.
 
From the page linked above, I downloaded the cnv-all gene-level results for European/NFE binary phenotypes.
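
For anyone who wants to repeat this, here is a rough sketch of how the rows for one phenotype could be pulled out of a results file and sorted by p-value. It assumes a tab-separated file with the column names Phenotype, Gene, CollapsingModel and pValue; the real download may use a different delimiter or layout. The helper name top_hits and the last sample row (GENE_X) are made up for illustration; the other three rows are taken from the table in this post.

```python
import csv
import io

PHENO = "120010#Ever had chronic Fatigue Syndrome or Myalgic Encephalomyelitis (M.E.)"

# Tiny in-memory stand-in for the downloaded file (assumed tab-separated).
# The GENE_X row is hypothetical and only there to show the filter working.
SAMPLE = (
    "Phenotype\tGene\tCollapsingModel\tpValue\n"
    f"{PHENO}\tTECPR2\tLoF_UTR_promoter_unknown_dom\t0.0006834\n"
    f"{PHENO}\tITGAE\tLoF_dom\t8.20E-05\n"
    f"{PHENO}\tENSG00000287125\tLoF_UTR_promoter_unknown_dom\t5.07E-05\n"
    "HYPOTHETICAL_OTHER_PHENOTYPE\tGENE_X\tLoF_dom\t1.0E-09\n"
)


def top_hits(handle, phenotype, n=10):
    """Return the n rows with the smallest p-value for one phenotype."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = [row for row in reader if row["Phenotype"] == phenotype]
    # float() copes with both plain decimals and scientific notation.
    rows.sort(key=lambda row: float(row["pValue"]))
    return rows[:n]


if __name__ == "__main__":
    for row in top_hits(io.StringIO(SAMPLE), PHENO, n=3):
        print(row["Gene"], row["CollapsingModel"], row["pValue"])
```

With a real file, the io.StringIO(SAMPLE) part would be swapped for an open file handle.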

Here are the most significant genes for the phenotype "120010#Ever had chronic Fatigue Syndrome or Myalgic Encephalomyelitis (M.E.)", ranked by p-value for the difference in frequency of copy number variants in the gene between cases and controls. The size limit for posts meant I had to limit the table to 59 results, but I'll attach a spreadsheet with all of the results from that file for this phenotype.

For all rows:
Phenotype = 120010#Ever had chronic Fatigue Syndrome or Myalgic Encephalomyelitis (M.E.)
Type = Binary
Category = Chapter VI Diseases of the nervous system
# | Gene | CollapsingModel | pValue | nSamples | BinNcases | BinQVcases | BinNcontrols | BinQVcontrols | BinCaseFreq | BinCtrlFreq | BinOddsRatio | BinOddsRatioLCI | BinOddsRatioUCI
1 | ENSG00000287125 | LoF_UTR_promoter_unknown_dom | 5.07E-05 | 121732 | 2481 | 4 | 119251 | 7 | 0.0016122531237404 | 5.87E-05 | 27.5089 | 8.0476 | 94.0331
2 | ITGAE | LoF_UTR_promoter_unknown_dom | 6.72E-05 | 121732 | 2481 | 7 | 119251 | 43 | 0.0028214429665457 | 0.0003605839783314 | 7.844 | 3.5251 | 17.454
3 | ITGAE | LoF_dom | 8.20E-05 | 121718 | 2480 | 6 | 119238 | 30 | 0.0024193548387096 | 0.000251597645046 | 9.6369 | 4.0074 | 23.1742
4 | ITGAE | LoF_UTR_promoter_dom | 8.20E-05 | 121718 | 2480 | 6 | 119238 | 30 | 0.0024193548387096 | 0.000251597645046 | 9.6369 | 4.0074 | 23.1742
5 | MIR6082 | LoF_dom | 0.0004385 | 121732 | 2481 | 3 | 119251 | 5 | 0.0012091898428053 | 4.19E-05 | 28.8731 | 6.8962 | 120.8857
6 | MIR6082 | LoF_UTR_promoter_unknown_dom | 0.0004385 | 121732 | 2481 | 3 | 119251 | 5 | 0.0012091898428053 | 4.19E-05 | 28.8731 | 6.8962 | 120.8857
7 | MIR6082 | LoF_UTR_promoter_dom | 0.0004385 | 121732 | 2481 | 3 | 119251 | 5 | 0.0012091898428053 | 4.19E-05 | 28.8731 | 6.8962 | 120.8857
8 | ENSG00000266602 | LoF_UTR_promoter_unknown_dom | 0.0004684 | 121732 | 2481 | 34 | 119251 | 848 | 0.0137041515517936 | 0.0071110514796521 | 1.94 | 1.3738 | 2.7397
9 | SMIM15 | CG_dom | 0.0005222 | 121732 | 2481 | 4 | 119251 | 15 | 0.0016122531237404 | 0.0001257851087202 | 12.8366 | 4.2573 | 38.7051
10 | ENSG00000273299 | LoF_UTR_promoter_unknown_dom | 0.0005364 | 121732 | 2481 | 10 | 119251 | 127 | 0.004030632809351 | 0.0010649805871648 | 3.796 | 1.9916 | 7.2349
11 | ENSG00000266450 | LoF_dom | 0.000546 | 121723 | 2481 | 33 | 119242 | 817 | 0.0133010882708585 | 0.0068516126868049 | 1.954 | 1.3765 | 2.7737
12 | ENSG00000266450 | LoF_UTR_promoter_dom | 0.000546 | 121723 | 2481 | 33 | 119242 | 817 | 0.0133010882708585 | 0.0068516126868049 | 1.954 | 1.3765 | 2.7737
13 | CLDN12 | LoF_UTR_promoter_unknown_dom | 0.0005678 | 121732 | 2481 | 10 | 119251 | 128 | 0.004030632809351 | 0.0010733662610795 | 3.7663 | 1.9764 | 7.177
14 | ENSG00000266450 | LoF_UTR_promoter_unknown_dom | 0.0006119 | 121732 | 2481 | 33 | 119251 | 826 | 0.0133010882708585 | 0.0069265666535291 | 1.9327 | 1.3616 | 2.7433
15 | ENSG00000266602 | LoF_dom | 0.0006271 | 121711 | 2480 | 33 | 119231 | 828 | 0.0133064516129032 | 0.0069445026880593 | 1.9285 | 1.3587 | 2.7372
16 | ENSG00000266602 | LoF_UTR_promoter_dom | 0.0006271 | 121711 | 2480 | 33 | 119231 | 828 | 0.0133064516129032 | 0.0069445026880593 | 1.9285 | 1.3587 | 2.7372
17 | LINC02027 | LoF_UTR_promoter_unknown_dom | 0.0006422 | 121732 | 2481 | 4 | 119251 | 16 | 0.0016122531237404 | 0.0001341707826349 | 12.0342 | 4.0203 | 36.0227
18 | TECPR2 | LoF_UTR_promoter_unknown_dom | 0.0006834 | 121732 | 2481 | 5 | 119251 | 30 | 0.0020153164046755 | 0.0002515702174405 | 8.0251 | 3.1111 | 20.701
19 | ENSG00000287125 | LoF_UTR_promoter_dom | 0.0009104 | 121731 | 2480 | 3 | 119251 | 7 | 0.0012096774193548 | 5.87E-05 | 20.6316 | 5.3319 | 79.8332
20 | ENSG00000287125 | LoF_dom | 0.0009104 | 121731 | 2480 | 3 | 119251 | 7 | 0.0012096774193548 | 5.87E-05 | 20.6316 | 5.3319 | 79.8332
21 | GINS2 | LoF_dom | 0.0009114 | 121731 | 2481 | 3 | 119250 | 7 | 0.0012091898428053 | 5.87E-05 | 20.6231 | 5.3297 | 79.8003
22 | GINS2 | LoF_UTR_promoter_dom | 0.0009114 | 121731 | 2481 | 3 | 119250 | 7 | 0.0012091898428053 | 5.87E-05 | 20.6231 | 5.3297 | 79.8003
23 | ENSG00000275393 | LoF_dom | 0.0009114 | 121732 | 2481 | 3 | 119251 | 7 | 0.0012091898428053 | 5.87E-05 | 20.6233 | 5.3298 | 79.801
24 | ENSG00000275393 | LoF_UTR_promoter_dom | 0.0009114 | 121732 | 2481 | 3 | 119251 | 7 | 0.0012091898428053 | 5.87E-05 | 20.6233 | 5.3298 | 79.801
25 | ENSG00000275393 | LoF_UTR_promoter_unknown_dom | 0.0009114 | 121732 | 2481 | 3 | 119251 | 7 | 0.0012091898428053 | 5.87E-05 | 20.6233 | 5.3298 | 79.801
26 | ENSG00000289509 | LoF_UTR_promoter_unknown_dom | 0.0011 | 121732 | 2481 | 4 | 119251 | 19 | 0.0016122531237404 | 0.0001593278043789 | 10.1338 | 3.445 | 29.8101
27 | ENSG00000259376 | LoF_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
28 | LINC02726 | LoF_UTR_promoter_dom | 0.0012 | 121627 | 2479 | 2 | 119148 | 1 | 0.0008067769261799 | 8.39E-06 | 96.2027 | 8.7201 | 1061.3391
29 | SNAI3 | LoF_dom | 0.0012 | 121731 | 2481 | 2 | 119250 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2073 | 8.7205 | 1061.3905
30 | LINC02726 | LoF_dom | 0.0012 | 121627 | 2479 | 2 | 119148 | 1 | 0.0008067769261799 | 8.39E-06 | 96.2027 | 8.7201 | 1061.3391
31 | ENSG00000259376 | LoF_UTR_promoter_unknown_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
32 | ENSG00000259376 | LoF_UTR_promoter_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
33 | ENSG00000273102 | CG_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
34 | CERK | LoF_dom | 0.0012 | 121722 | 2481 | 2 | 119241 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2001 | 8.7198 | 1061.3104
35 | SNAI3 | LoF_UTR_promoter_dom | 0.0012 | 121731 | 2481 | 2 | 119250 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2073 | 8.7205 | 1061.3905
36 | TBC1D22A-DT | LoF_UTR_promoter_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
37 | LATS2 | LoF_UTR_promoter_unknown_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
38 | SNAI3-AS1 | LoF_UTR_promoter_dom | 0.0012 | 121730 | 2481 | 2 | 119249 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2065 | 8.7204 | 1061.3816
39 | TBC1D22A-DT | LoF_UTR_promoter_unknown_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
40 | TBC1D22A-DT | LoF_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
41 | GINS2 | LoF_UTR_promoter_unknown_dom | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.0012091898428053 | 6.71E-05 | 18.0452 | 4.7844 | 68.061
42 | SNAI3-AS1 | LoF_dom | 0.0012 | 121730 | 2481 | 2 | 119249 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2065 | 8.7204 | 1061.3816
43 | CERK | LoF_UTR_promoter_dom | 0.0012 | 121722 | 2481 | 2 | 119241 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2001 | 8.7198 | 1061.3104
44 | ENSG00000220256 | LoF_UTR_promoter_unknown_dom | 0.0012 | 121732 | 2481 | 2 | 119251 | 1 | 0.0008061265618702 | 8.39E-06 | 96.2081 | 8.7206 | 1061.3994
45 | C16orf74 | LoF_UTR_promoter_unknown_dom | 0.0013 | 121732 | 2481 | 4 | 119251 | 20 | 0.0016122531237404 | 0.0001677134782936 | 9.627 | 3.2881 | 28.1867
46 | NAGK | LoF_UTR_promoter_dom | 0.0013 | 121732 | 2481 | 4 | 119251 | 20 | 0.0016122531237404 | 0.0001677134782936 | 9.627 | 3.2881 | 28.1867
47 | NAGK | LoF_UTR_promoter_unknown_dom | 0.0013 | 121732 | 2481 | 4 | 119251 | 20 | 0.0016122531237404 | 0.0001677134782936 | 9.627 | 3.2881 | 28.1867
48 | LINC02241 | LoF_UTR_promoter_unknown_dom | 0.0013 | 121732 | 2481 | 8 | 119251 | 95 | 0.0032245062474808 | 0.0007966390218949 | 4.0575 | 1.9698 | 8.358
49 | ENSG00000251088 | LoF_UTR_promoter_unknown_dom | 0.0015 | 121732 | 2481 | 4 | 119251 | 21 | 0.0016122531237404 | 0.0001760991522083 | 9.1685 | 3.1449 | 26.73
50 | TSPAN13 | LoF_dom | 0.0016 | 121731 | 2481 | 3 | 119250 | 9 | 0.0012091898428053 | 7.55E-05 | 16.04 | 4.3397 | 59.2849
51 | TSPAN13 | LoF_UTR_promoter_dom | 0.0016 | 121731 | 2481 | 3 | 119250 | 9 | 0.0012091898428053 | 7.55E-05 | 16.04 | 4.3397 | 59.2849
52 | ENSG00000264080 | LoF_dom | 0.0018 | 121732 | 2481 | 31 | 119251 | 797 | 0.0124949617089883 | 0.0066833821100032 | 1.8806 | 1.3107 | 2.6982
53 | ENSG00000264080 | LoF_UTR_promoter_dom | 0.0018 | 121732 | 2481 | 31 | 119251 | 797 | 0.0124949617089883 | 0.0066833821100032 | 1.8806 | 1.3107 | 2.6982
54 | ENSG00000264080 | LoF_UTR_promoter_unknown_dom | 0.0018 | 121732 | 2481 | 31 | 119251 | 797 | 0.0124949617089883 | 0.0066833821100032 | 1.8806 | 1.3107 | 2.6982
55 | MCEE | CG_dom | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.0016122531237404 | 0.0001928705000377 | 8.3711 | 2.8928 | 24.2239
56 | DNAL1 | LoF_UTR_promoter_unknown_dom | 0.0021 | 121732 | 2481 | 3 | 119251 | 10 | 0.0012091898428053 | 8.39E-05 | 14.436 | 3.9704 | 52.487
57 | SGK3 | LoF_UTR_promoter_unknown_dom | 0.0021 | 121732 | 2481 | 3 | 119251 | 10 | 0.0012091898428053 | 8.39E-05 | 14.436 | 3.9704 | 52.487
58 | AGR2 | LoF_UTR_promoter_dom | 0.0021 | 121705 | 2480 | 3 | 119225 | 10 | 0.0012096774193548 | 8.39E-05 | 14.4386 | 3.9712 | 52.4967
59 | AGR2 | LoF_dom | 0.0021 | 121705 | 2480 | 3 | 119225 | 10 | 0.0012096774193548 | 8.39E-05 | 14.4386 | 3.9712 | 52.4967
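
As a sanity check on how to read the columns, the frequencies and odds ratios above can be recomputed from the counts. In this sketch BinQVcases and BinQVcontrols are taken to be the number of cases and controls carrying a qualifying variant. The odds ratio is the plain 2x2 odds ratio, and the interval is a Woolf (log odds ratio) 95% interval with z = 1.96. That choice of interval is my assumption: it reproduces the first row of the table, but I haven't seen it stated on the download page. The function name is made up.

```python
import math


def odds_ratio_ci(qv_cases, n_cases, qv_controls, n_controls, z=1.96):
    """2x2 odds ratio with a Woolf (log-scale) confidence interval.

    Assumes every cell of the 2x2 table is non-zero; a zero cell would
    need a continuity correction, which is not handled here.
    """
    a = qv_cases                   # carriers among cases
    b = n_cases - qv_cases         # non-carriers among cases
    c = qv_controls                # carriers among controls
    d = n_controls - qv_controls   # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Counts from the first row (ENSG00000287125): 4 of 2481 cases,
    # 7 of 119251 controls.
    or_, lci, uci = odds_ratio_ci(4, 2481, 7, 119251)
    print(f"case freq = {4 / 2481:.6f}, control freq = {7 / 119251:.3e}")
    print(f"OR = {or_:.4f}, 95% CI = {lci:.4f} to {uci:.4f}")
```

The very wide intervals on rows with only 2 or 3 carriers among cases follow directly from the 1/a and 1/c terms in the standard error.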
 

Attachments

Some more genes to ponder.

I also downloaded the "Gene-level results (cnv001-snv)" file ("CNVs with site frequency < 0.01 with protein-truncating SNVs"), which combines copy number variants with protein-truncating single nucleotide variants, i.e. variants predicted to cut the protein short so that the gene loses its function.
Our data indicated that most CNV [copy number variant] deletions have LoF [loss of function] effects similar to PTVs [protein-truncating variants], motivating integration of these variant types to increase power to detect genetic signals (Methods). Here we introduced a CNV + PTV gene-level collapsing model that integrates CNV deletions with qualifying PTVs per gene
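
To make the idea of a combined collapsing model concrete, here is a toy sketch. It assumes a dominant model where a person counts as a carrier for a gene if they have at least one qualifying CNV deletion or at least one qualifying PTV in it, and that someone with both is counted once. The sample IDs and the function name are made up, and the paper's actual qualifying criteria (frequency cut-offs, the different model variants) are more involved than this.

```python
def count_carriers(cnv_del_carriers, ptv_carriers, cases, controls):
    """Count qualifying carriers for one gene in cases and controls.

    All four arguments are sets of sample IDs. A sample qualifies if it
    carries a CNV deletion OR a PTV in the gene (set union), so a sample
    with both is only counted once.
    """
    qualifying = cnv_del_carriers | ptv_carriers
    return len(qualifying & cases), len(qualifying & controls)


if __name__ == "__main__":
    # Hypothetical sample IDs, for illustration only.
    cases = {"s1", "s2", "s3"}
    controls = {"s4", "s5", "s6", "s7"}
    cnv_del_carriers = {"s1", "s4"}
    ptv_carriers = {"s1", "s2", "s5"}

    qv_cases, qv_controls = count_carriers(
        cnv_del_carriers, ptv_carriers, cases, controls
    )
    print(qv_cases, qv_controls)
```

Pooling the two variant types can only keep or raise the carrier counts per gene, which is where the extra power mentioned in the quote would come from.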

Here are the most significant genes, with all results for this phenotype attached as a spreadsheet.
For all rows:
Phenotype = 120010#Ever had chronic Fatigue Syndrome or Myalgic Encephalomyelitis (M.E.)
Type = Binary
Category = Chapter VI Diseases of the nervous system
# | Gene | CollapsingModel | pValue | nSamples | BinNcases | BinQVcases | BinNcontrols | BinQVcontrols | BinCaseFreq | BinCtrlFreq | BinOddsRatio | BinOddsRatioLCI | BinOddsRatioUCI
1 | NAGK | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.00005579 | 121732 | 2481 | 10 | 119251 | 94 | 0.00403063280935107 | 0.00078825334798031 | 5.13 | 2.6697 | 9.8578
2 | NAGK | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.00005579 | 121732 | 2481 | 10 | 119251 | 94 | 0.00403063280935107 | 0.00078825334798031 | 5.13 | 2.6697 | 9.8578
3 | HERC1 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0001572 | 121732 | 2481 | 7 | 119251 | 50 | 0.00282144296654575 | 0.000419283695734208 | 6.7454 | 3.0553 | 14.8921
4 | ANO10 | PTV_CNV_LoF_dom_rare | 0.0001718 | 121732 | 2481 | 11 | 119251 | 131 | 0.00443369609028618 | 0.00109852328282362 | 4.0496 | 2.1859 | 7.502
5 | ANO10 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0001828 | 121732 | 2481 | 11 | 119251 | 132 | 0.00443369609028618 | 0.00110690895673831 | 4.0189 | 2.1698 | 7.4438
6 | ANO10 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0002191 | 121732 | 2481 | 11 | 119251 | 135 | 0.00443369609028618 | 0.00113206597848236 | 3.9295 | 2.1226 | 7.2744
7 | RPAP2 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0002325 | 121732 | 2481 | 11 | 119251 | 136 | 0.00443369609028618 | 0.00114045165239705 | 3.9005 | 2.1073 | 7.2196
8 | RPAP2 | PTV_CNV_LoF_dom_rare | 0.0002325 | 121732 | 2481 | 11 | 119251 | 136 | 0.00443369609028618 | 0.00114045165239705 | 3.9005 | 2.1073 | 7.2196
9 | USP3 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0002575 | 121732 | 2481 | 4 | 119251 | 12 | 0.00161225312374043 | 0.00010062808697621 | 16.0462 | 5.1715 | 49.7881
10 | CHPF | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0003284 | 121732 | 2481 | 11 | 119251 | 142 | 0.00443369609028618 | 0.00119076569588515 | 3.7355 | 2.0201 | 6.9075
11 | CHPF | PTV_CNV_LoF_dom_rare | 0.0003284 | 121732 | 2481 | 11 | 119251 | 142 | 0.00443369609028618 | 0.00119076569588515 | 3.7355 | 2.0201 | 6.9075
12 | CHPF | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0004319 | 121732 | 2481 | 11 | 119251 | 147 | 0.00443369609028618 | 0.00123269406545857 | 3.6083 | 1.9528 | 6.6673
13 | TPO | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0007322 | 121732 | 2481 | 18 | 119251 | 353 | 0.00725513905683192 | 0.00296014289188351 | 2.4615 | 1.5304 | 3.9593
14 | CLDN12 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0008337 | 121732 | 2481 | 10 | 119251 | 135 | 0.00403063280935107 | 0.00113206597848236 | 3.5708 | 1.8761 | 6.7963
15 | BCL2L2 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0009114 | 121732 | 2481 | 3 | 119251 | 7 | 0.00120918984280532 | 5.86997174027891E-05 | 20.6233 | 5.3298 | 79.801
16 | BCL2L2 | PTV_CNV_LoF_dom_rare | 0.0009114 | 121732 | 2481 | 3 | 119251 | 7 | 0.00120918984280532 | 5.86997174027891E-05 | 20.6233 | 5.3298 | 79.801
17 | BCL2L2 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
18 | BCL2L2-PABPN1 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
19 | BCL2L2-PABPN1 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
20 | BCL2L2-PABPN1 | PTV_CNV_LoF_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
21 | LCE3D | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
22 | LCE3D | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
23 | LCE3D | PTV_CNV_LoF_dom_rare | 0.0012 | 121732 | 2481 | 3 | 119251 | 8 | 0.00120918984280532 | 6.70853913174732E-05 | 18.0452 | 4.7844 | 68.061
24 | MOK | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0013 | 121732 | 2481 | 13 | 119251 | 225 | 0.00523982265215639 | 0.00188677663080393 | 2.7865 | 1.5909 | 4.8807
25 | RHPN1 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0013 | 121732 | 2481 | 13 | 119251 | 224 | 0.00523982265215639 | 0.00187839095688925 | 2.799 | 1.5979 | 4.9029
26 | RHPN1 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0013 | 121732 | 2481 | 13 | 119251 | 224 | 0.00523982265215639 | 0.00187839095688925 | 2.799 | 1.5979 | 4.9029
27 | RHPN1 | PTV_CNV_LoF_dom_rare | 0.0013 | 121732 | 2481 | 13 | 119251 | 224 | 0.00523982265215639 | 0.00187839095688925 | 2.799 | 1.5979 | 4.9029
28 | AOC3 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0014 | 121732 | 2481 | 13 | 119251 | 226 | 0.00523982265215639 | 0.00189516230471862 | 2.7741 | 1.5839 | 4.8588
29 | AOC3 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0014 | 121732 | 2481 | 13 | 119251 | 226 | 0.00523982265215639 | 0.00189516230471862 | 2.7741 | 1.5839 | 4.8588
30 | AOC3 | PTV_CNV_LoF_dom_rare | 0.0014 | 121732 | 2481 | 13 | 119251 | 226 | 0.00523982265215639 | 0.00189516230471862 | 2.7741 | 1.5839 | 4.8588
31 | C16orf74 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0015 | 121732 | 2481 | 4 | 119251 | 21 | 0.00161225312374043 | 0.000176099152208367 | 9.1685 | 3.1449 | 26.73
32 | EXOC8 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0015 | 121732 | 2481 | 4 | 119251 | 21 | 0.00161225312374043 | 0.000176099152208367 | 9.1685 | 3.1449 | 26.73
33 | EXOC8 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0015 | 121732 | 2481 | 4 | 119251 | 21 | 0.00161225312374043 | 0.000176099152208367 | 9.1685 | 3.1449 | 26.73
34 | EXOC8 | PTV_CNV_LoF_dom_rare | 0.0015 | 121732 | 2481 | 4 | 119251 | 21 | 0.00161225312374043 | 0.000176099152208367 | 9.1685 | 3.1449 | 26.73
35 | GNRH2 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0017 | 121732 | 2481 | 13 | 119251 | 232 | 0.00523982265215639 | 0.00194547634820672 | 2.7023 | 1.5435 | 4.731
36 | GNRH2 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0017 | 121732 | 2481 | 13 | 119251 | 232 | 0.00523982265215639 | 0.00194547634820672 | 2.7023 | 1.5435 | 4.731
37 | GNRH2 | PTV_CNV_LoF_dom_rare | 0.0017 | 121732 | 2481 | 13 | 119251 | 232 | 0.00523982265215639 | 0.00194547634820672 | 2.7023 | 1.5435 | 4.731
38 | ACSL6 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0018 | 121732 | 2481 | 6 | 119251 | 57 | 0.00241837968561064 | 0.000477983413136997 | 5.0694 | 2.1838 | 11.7681
39 | ACSL6 | PTV_CNV_LoF_dom_rare | 0.0018 | 121732 | 2481 | 6 | 119251 | 57 | 0.00241837968561064 | 0.000477983413136997 | 5.0694 | 2.1838 | 11.7681
40 | ZNF804B | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0018 | 121732 | 2481 | 13 | 119251 | 233 | 0.00523982265215639 | 0.00195386202212141 | 2.6906 | 1.5369 | 4.7104
41 | ACSL6 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0019 | 121732 | 2481 | 6 | 119251 | 58 | 0.00241837968561064 | 0.000486369087051681 | 4.9819 | 2.1476 | 11.5571
42 | OCSTAMP | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.00161225312374043 | 0.000192870500037736 | 8.3711 | 2.8928 | 24.2239
43 | OCSTAMP | PTV_CNV_LoF_dom_rare | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.00161225312374043 | 0.000192870500037736 | 8.3711 | 2.8928 | 24.2239
44 | SPATA25 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.00161225312374043 | 0.000192870500037736 | 8.3711 | 2.8928 | 24.2239
45 | SPATA25 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.00161225312374043 | 0.000192870500037736 | 8.3711 | 2.8928 | 24.2239
46 | SPATA25 | PTV_CNV_LoF_dom_rare | 0.0021 | 121732 | 2481 | 4 | 119251 | 23 | 0.00161225312374043 | 0.000192870500037736 | 8.3711 | 2.8928 | 24.2239
47 | MEGF10 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0022 | 121732 | 2481 | 7 | 119251 | 81 | 0.00282144296654575 | 0.000679239587089417 | 4.1627 | 1.9213 | 9.0192
48 | MEGF10 | PTV_CNV_LoF_dom_rare | 0.0022 | 121732 | 2481 | 7 | 119251 | 81 | 0.00282144296654575 | 0.000679239587089417 | 4.1627 | 1.9213 | 9.0192
49 | RPAP2 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0022 | 121732 | 2481 | 12 | 119251 | 210 | 0.00483675937122128 | 0.00176099152208367 | 2.7551 | 1.5378 | 4.936
50 | BCAR3 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0023 | 121732 | 2481 | 6 | 119251 | 60 | 0.00241837968561064 | 0.000503140434881049 | 4.8158 | 2.0787 | 11.157
51 | CGGBP1 | PTV_CNV_LoF_UTR_promoter_dom_rare | 0.0024 | 121732 | 2481 | 2 | 119251 | 2 | 0.000806126561870214 | 1.67713478293683E-05 | 48.1037 | 6.7731 | 341.6415
52 | CGGBP1 | PTV_CNV_LoF_dom_rare | 0.0024 | 121732 | 2481 | 2 | 119251 | 2 | 0.000806126561870214 | 1.67713478293683E-05 | 48.1037 | 6.7731 | 341.6415
53 | OCSTAMP | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0024 | 121732 | 2481 | 4 | 119251 | 24 | 0.00161225312374043 | 0.00020125617395242 | 8.0223 | 2.7814 | 23.1383
54 | CHID1 | PTV_CNV_LoF_UTR_promoter_unknown_dom_rare | 0.0028 | 121732 | 2481 | 9 | 119251 | 134 | 0.00362756952841596 | 0.00112368030456768 | 3.2364 | 1.646 | 6.3633

HERC1 was the most significant gene for predicted loss-of-function variants in a different UK Biobank study (direct link to results), and it's the second most significant gene here. That's somewhat expected, since both studies use UK Biobank data and, I think, both include protein-truncating variants in their tests. Still, it's good to see the results align.
 

Attachments
