Long-read genome sequence assembly provides insight into ongoing retroviral invasion of the koala germline
Matthew Hobbs1, Andrew King1, Ryan Salinas2, Zhiliang Chen2, Kyriakos Tsangaras3,8, Alex D. Greenwood3,4, Rebecca N. Johnson1, Katherine Belov5, Marc R. Wilkins2,6 & Peter Timms7
1Australian Museum Research Institute, Australian Museum, 1 William Street, Sydney, NSW, 2010, Australia.
2Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW, 2052, Australia.
3Department of Wildlife Diseases, Leibniz Institute for Zoo and Wildlife Research, Berlin, Germany.
4Department of Veterinary Medicine, Freie Universität Berlin, Berlin, Germany.
5School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia.
6Ramaciotti Centre for Genomics, University of New South Wales, NSW, 2052, Australia.
7Faculty of Science, Health, Education & Engineering, University of the Sunshine Coast, Locked Bag 4, Maroochydore DC, Qld, 4558, Australia.
8Department of Translational Genetics, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus.
Matthew Hobbs and Andrew King contributed equally to this work. Marc R. Wilkins and Peter Timms jointly supervised this work. Correspondence and requests for materials should be addressed to P.T.
The koala retrovirus (KoRV) is implicated in several diseases affecting the koala (Phascolarctos cinereus). KoRV provirus can be present in the koala genome as an endogenous retrovirus (present in all cells via germline integration) or as an exogenous retrovirus responsible for somatic integrations of proviral KoRV (present in a limited number of cells). This ongoing invasion of the koala germline by KoRV provides a powerful opportunity to assess the viral strategies used by KoRV in an individual. Analysis of a high-quality genome sequence of a single koala revealed 133 KoRV integration sites. Most integrations contain full-length endogenous provirus of the KoRV-A subtype. The second most frequent class of integrations contains an endogenous recombinant element (recKoRV) in which most of the KoRV protein-coding region has been replaced with an ancient, endogenous retroelement. A third set of integrations, with very low sequence coverage, may represent somatic cell integrations of KoRV-A, KoRV-B and two recently designated additional subgroups, KoRV-D and KoRV-E. KoRV-D and KoRV-E are missing several genes required for viral processing, suggesting they have been transmitted as defective viruses. Our results represent the first comprehensive analysis of KoRV integration and variation in a single animal and provide further insights into the process of retrovirus-host species interactions.