Enhanced genome assembly and a new official gene set for Tribolium castaneum
Herndon, Nicolae ; Shelton, Jennifer ; Gerischer, Lizzy ; Ioannidis, Panos ; Ninova, Maria ; Dönitz, Jürgen ; Waterhouse, Robert M ; Liang, Chun ; Damm, Carsten ; Siemanowski, Janna ; Kitzmann, Peter ; Ulrich, Julia ; Dippel, Stefan ; Oberhofer, Georg ; Hu, Yonggang ; Schwirz, Jonas ; Schacht, Magdalena ; Lehmann, Sabrina ; Montino, Alice ; Posnien, Nico ; Gurska, Daniela ; Horn, Thorsten ; Seibert, Jan ; Vargas Jentzsch, Iris M ; Panfilio, Kristen A ; Li, Jianwei ; Wimmer, Ernst A ; Stappert, Dominik ; Roth, Siegfried ; Schröder, Reinhard ; Park, Yoonseong ; Schoppmeier, Michael ; Chung, Ho-Ryun ; Klingler, Martin ; Kittelmann, Sebastian ; Friedrich, Markus ; Chen, Rui ; Altincicek, Boran ; Vilcinskas, Andreas ; Zdobnov, Evgeny ; Griffiths-Jones, Sam ; Ronshaugen, Matthew ; Stanke, Mario ; Brown, Sue J ; Bucher, Gregor
Citable Link (URL): http://resolver.sub.uni-goettingen.de/purl?gs-1/17122
Background: The red flour beetle Tribolium castaneum has emerged as an important model organism for the study of gene function in development and physiology, for ecological and evolutionary genomics, for pest control, and for many other topics. RNA interference (RNAi), transgenesis and genome editing are well established, and resources for genome-wide RNAi screening have become available in this model. All of these techniques depend on a high-quality genome assembly and precise gene models. However, the first version of the genome assembly was generated by Sanger sequencing, and only a small set of RNA sequence data was available, which limited annotation quality.

Results: Here, we present an improved genome assembly (Tcas5.2) and an enhanced genome annotation resulting in a new official gene set (OGS3) for Tribolium castaneum, which together significantly increase the quality of the genomic resources. Large-distance jumping library DNA sequencing was added to join scaffolds and fill small gaps, which reduced the gaps in the genome assembly and increased the N50 to 4753 kbp. The precision of the gene models was enhanced by the use of a large body of RNA-Seq reads from different life history stages and tissue types, leading to the discovery of 1452 novel gene sequences. We also added new features such as alternative splicing, well-defined UTRs and microRNA target predictions. For quality control, 399 gene models were evaluated by manual inspection. The current gene set was submitted to GenBank and accepted as a RefSeq genome by NCBI.

Conclusions: The new genome assembly (Tcas5.2) and the official gene set (OGS3) provide enhanced genomic resources for genetic work in Tribolium castaneum. The much improved information on transcription start sites supports transgenic and gene editing approaches. Further, novel types of information such as splice variants and microRNA target genes open additional possibilities for analysis.
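The N50 value quoted in the Results is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that calculation, assuming scaffold lengths are available as a plain list of integers. The function name `n50` and the toy lengths are illustrative only; they are not taken from the Tcas5.2 assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total,
    taken from the longest scaffold downwards, first reaches at
    least half of the summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in kbp), not real data.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

In this toy case the total is 290 kbp, and the two longest scaffolds (80 + 70 = 150 kbp) already cover more than half of it, so the N50 is 70 kbp. Joining scaffolds, as described above, shifts length into fewer and longer pieces and therefore raises this value.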