Rosa chinensis Whole Genome v1.0 Assembly & Annotation
Hibrand, L., et al. (2018). "A high-quality sequence of Rosa chinensis to elucidate genome structure and ornamental traits."
bioRxiv 254102; doi: https://doi.org/10.1101/254102
Genome annotation facts and statistics
Rose is the world's most important ornamental plant with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Rose has a complex genome with high heterozygosity and various ploidy levels. Our objectives were (i) to develop the first high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short read sequencing, and anchoring to a high-density genetic map and (ii) to study the genome structure and the genetic basis of major ornamental traits. We produced a haploid rose line from R. chinensis "Old Blush" and generated the first rose genome sequence at the pseudo-molecule scale (512 Mbp with N50 of 3.4 Mb and L75 of 97). The sequence was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features including the pericentromeric regions through annotation of TE families and positioned centromeric repeats using FISH. Genetic diversity was analysed by resequencing eight Rosa species. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and number of flower petals. A rose APETALA2 homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidisation, meiosis and developmental processes as we demonstrated for flower and prickle development. This reference sequence will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
Homology of the Rosa chinensis v1.0 transcripts was determined by pairwise sequence comparison using the blastx algorithm against various protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr (Release 2017-07) database, and a cutoff of less than 1e-6 for the Arabidopsis proteins (TAIR10), UniProt SwissProt (Release 2017-11), and UniProt TrEMBL (Release 2017-11) databases. The best hit reports are available for download in Excel format.
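The cutoff and best-hit selection described above can be illustrated with a minimal Python sketch. It assumes the blastx results are in the standard 12-column BLAST tabular format (-outfmt 6) and that "best hit" means the highest bit score per transcript; neither is stated on this page, and the reports distributed here are in Excel format. The function name and the transcript and protein identifiers are hypothetical.

```python
import csv
import io

# Column order of standard BLAST tabular output (-outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue):
    """Return {transcript: (subject, evalue, bitscore)} for the best hit per query.

    Hits with an e-value greater than or equal to max_evalue are discarded.
    "Best" is taken to mean highest bit score (an assumption).
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_FIELDS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue >= max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best


# Hypothetical example rows: two hits for one transcript, one weak hit for another.
example = (
    "transcript_1\tprot_A\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-20\t200\n"
    "transcript_1\tprot_B\t95.0\t100\t3\t0\t1\t300\t1\t100\t1e-30\t250\n"
    "transcript_2\tprot_C\t40.0\t60\t30\t2\t1\t180\t1\t60\t1e-5\t50\n"
)
swissprot_hits = best_hits(io.StringIO(example), max_evalue=1e-6)
```

With the 1e-6 cutoff used for TAIR10, SwissProt, and TrEMBL, the weak hit for transcript_2 is dropped; passing 1e-9 instead would mirror the stricter threshold used for NCBI nr.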
All annotation files are available for download by selecting the desired data type in the left-hand "Resources" sidebar. Each data type page provides a description of the available files and links to download them. Alternatively, you can use the FTP repository for bulk download.
Functional annotations for the Rosa_chinensis v1.0 genome are available for download below. The Rosa_chinensis transcripts were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).
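As a rough illustration of how GO terms assigned by InterProScan might be collected per transcript, the following Python sketch assumes raw InterProScan tab-separated output generated with GO term lookup enabled, where the first column holds the sequence identifier and the fourteenth column holds GO identifiers. The files offered for download here may be laid out differently, and the function name and example rows are hypothetical.

```python
import re
from collections import defaultdict

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_by_transcript(lines):
    """Collect the unique GO terms found for each transcript.

    Assumes InterProScan TSV layout: column 1 is the sequence ID and
    column 14 lists GO terms. Rows with fewer than 14 columns carry no
    GO annotation and are skipped.
    """
    terms = defaultdict(set)
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 14:
            continue
        terms[cols[0]].update(GO_ID.findall(cols[13]))
    # Drop transcripts whose GO column was present but empty.
    return {tid: sorted(found) for tid, found in terms.items() if found}


# Hypothetical example rows (values are placeholders, not real results).
base = ["md5", "100", "Pfam", "PF00069", "Protein kinase domain",
        "1", "90", "1e-10", "T", "01-01-2018", "IPR000719",
        "Protein kinase domain"]
rows = [
    "\t".join(["transcript_1"] + base + ["GO:0004672|GO:0005524"]),
    "\t".join(["transcript_1"] + base + ["GO:0005524|GO:0006468"]),
    "\t".join(["transcript_2"] + base),  # no GO column
]
go_map = go_terms_by_transcript(rows)
```

Matching identifiers with a regular expression, rather than splitting on a fixed separator, keeps the sketch tolerant of differences in how the GO column is delimited.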