Abstract
Prolonged detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA and recurrence of PCR-positive tests have been widely reported in patients after recovery from COVID-19, but some of these patients do not appear to shed infectious virus. We investigated the possibility that SARS-CoV-2 RNAs can be reverse-transcribed and integrated into the DNA of human cells in culture and that transcription of the integrated sequences might account for some of the positive PCR tests seen in patients. In support of this hypothesis, we found that DNA copies of SARS-CoV-2 sequences can be integrated into the genome of infected human cells. We found target site duplications flanking the viral sequences and consensus LINE1 endonuclease recognition sequences at the integration sites, consistent with a LINE1 retrotransposon-mediated, target-primed reverse transcription and retroposition mechanism. We also found, in some patient-derived tissues, evidence suggesting that a large fraction of the viral sequences is transcribed from integrated DNA copies of viral sequences, generating viral–host chimeric transcripts. The integration and transcription of viral sequences may thus contribute to the detection of viral RNA by PCR in patients after infection and clinical recovery. Because we have detected only subgenomic sequences derived mainly from the 3′ end of the viral genome integrated into the DNA of the host cell, infectious virus cannot be produced from the integrated subgenomic SARS-CoV-2 sequences.
Continuous or recurrent positive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) PCR tests have been reported in samples taken from patients weeks or months after recovery from an initial infection (1–17). Although bona fide reinfection with SARS-CoV-2 after recovery has recently been reported (18), cohort-based studies with subjects held in strict quarantine after they recovered from COVID-19 suggested that at least some “re-positive” cases were not caused by reinfection (19, 20). Furthermore, no replication-competent virus was isolated or spread from these PCR-positive patients (1–3, 5, 6, 12, 16), and the cause for the prolonged and recurrent production of viral RNA remains unknown. SARS-CoV-2 is a positive-stranded RNA virus. Like other beta-coronaviruses (SARS-CoV-1 and Middle East respiratory syndrome-related coronavirus), SARS-CoV-2 employs an RNA-dependent RNA polymerase to replicate its genomic RNA and transcribe subgenomic RNAs (21–24). One possible explanation for the continued detection of SARS-CoV-2 viral RNA in the absence of virus reproduction is that, in some cases, DNA copies of viral subgenomic RNAs may integrate into the DNA of the host cell by a reverse transcription mechanism. Transcription of the integrated DNA copies could be responsible for positive PCR tests long after the initial infection was cleared. Indeed, nonretroviral RNA virus sequences have been detected in the genomes of many vertebrate species (25, 26), with several integrations exhibiting signals consistent with the integration of DNA copies of viral mRNAs into the germline via ancient long interspersed nuclear element (LINE) retrotransposons (reviewed in ref. 27).
Furthermore, nonretroviral RNA viruses such as vesicular stomatitis virus or lymphocytic choriomeningitis virus (LCMV) can be reverse-transcribed into DNA copies by an endogenous reverse transcriptase (RT), and DNA copies of the viral sequences have been shown to integrate into the DNA of host cells (28–30). In addition, cellular RNAs, for example the human APP transcripts, have been shown to be reverse-transcribed by endogenous RT in neurons, with the resultant APP fragments integrated into the genome and expressed (31). Human LINE1 elements (∼17% of the human genome), a type of autonomous retrotransposon able to retrotranspose themselves and other nonautonomous elements such as Alu, are a source of cellular endogenous RT (32–34). Endogenous LINE1 elements have been shown to be expressed in aged human tissues (35), and LINE1-mediated somatic retrotransposition is common in cancer patients (36, 37). Moreover, expression of endogenous LINE1 and other retrotransposons in host cells is commonly up-regulated upon viral infection, including SARS-CoV-2 infection (38–40).
In this study, we show that SARS-CoV-2 sequences can integrate into the host cell genome by a LINE1-mediated retroposition mechanism. We provide evidence that the integrated viral sequences can be transcribed and that, in some patient samples, the majority of viral transcripts appear to be derived from integrated viral sequences.
https://www.pnas.org/content/118/21/e2105968118/tab-article-info