Rosa chinensis Whole Genome v1.0 Assembly & Annotation

Analysis Name: Rosa chinensis Whole Genome v1.0 Assembly & Annotation
Method: CANU assembler (v1.4)
Source: Pacific Biosciences reads
Date performed: 2018-02-20


Hibrand, L., et al. (2018). "A high-quality sequence of Rosa chinensis to elucidate genome structure and ornamental traits." 
bioRxiv 254102; doi:

Genome annotation facts and statistics

Rose is the world's most important ornamental plant with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Rose has a complex genome with high heterozygosity and various ploidy levels. Our objectives were (i) to develop the first high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short read sequencing, and anchoring to a high-density genetic map and (ii) to study the genome structure and the genetic basis of major ornamental traits. We produced a haploid rose line from R. chinensis "Old Blush" and generated the first rose genome sequence at the pseudo-molecule scale (512 Mbp with N50 of 3.4 Mb and L75 of 97). The sequence was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features including the pericentromeric regions through annotation of TE families and positioned centromeric repeats using FISH. Genetic diversity was analysed by resequencing eight Rosa species. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and number of flower petals. A rose APETALA2 homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidisation, meiosis and developmental processes as we demonstrated for flower and prickle development. This reference sequence will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
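The N50 and L75 figures quoted in the abstract are summary statistics over a list of sequence lengths. The following is a minimal sketch of how such values are usually computed, assuming the common definitions: Lx is the smallest number of longest sequences whose combined length reaches x% of the assembly, and Nx is the length of the shortest sequence in that set. The function name `nx_lx` and the toy lengths are illustrative only and are not taken from the Rosa chinensis assembly.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of sequence lengths.

    fraction is the share of the total assembly to cover, e.g. 0.5 for
    N50/L50 or 0.75 for N75/L75.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk from the longest sequence downwards, accumulating length
    # until the requested share of the assembly is covered.
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Not reachable for valid input, kept as a safeguard.
    raise RuntimeError("threshold was never reached")


# Toy example (hypothetical lengths, total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.50)
n75, l75 = nx_lx(toy_lengths, 0.75)
print(f"N50={n50} L50={l50} N75={n75} L75={l75}")
```

Applied to real data, `lengths` would be the per-sequence lengths read from the assembly FASTA file. Whether the published values refer to contigs or scaffolds is not stated on this page.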


Homology of the Rosa chinensis v1.0 transcripts was determined by pairwise sequence comparison using the blastx algorithm against various protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr database (Release 2017-07), and a cutoff of less than 1e-6 for the Arabidopsis proteins (TAIR10), UniProt SwissProt (Release 2017-11), and UniProt TrEMBL (Release 2017-11) databases. The best hit reports are available for download in Excel format.
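The cutoff-and-best-hit step described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline that produced the downloadable reports: it assumes blastx results in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, e-value in column 11, bit score in column 12) and picks the best hit per transcript by highest bit score. The dictionary keys, the function name `best_hits`, and the transcript and protein IDs are hypothetical.

```python
import csv
import io

# E-value cutoffs as stated in the text; the short keys are hypothetical labels.
CUTOFFS = {"nr": 1e-9, "tair": 1e-6, "swissprot": 1e-6, "trembl": 1e-6}


def best_hits(tabular_text, database):
    """Return {query_id: (subject_id, evalue, bitscore)} for the best passing hit."""
    cutoff = CUTOFFS[database]
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # keep only hits strictly below the cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical blastx rows: two hits for t1, one weak hit for t2.
example = "\n".join([
    "t1\tP1\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-50\t200",
    "t1\tP2\t80.0\t100\t10\t0\t1\t300\t1\t100\t1e-20\t120",
    "t2\tP3\t40.0\t50\t30\t0\t1\t150\t1\t50\t1e-7\t45",
])

print(best_hits(example, "nr"))         # t2 fails the 1e-9 cutoff
print(best_hits(example, "swissprot"))  # t2 passes the 1e-6 cutoff
```

The same hit can therefore be reported for one database and dropped for another, which is why the nr results use a stricter threshold than the other three.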


Protein Homologs

Rosa chinensis v1.0 transcripts with NCBI nr homologs (EXCEL file) Rosa_chinensis_v1.0_vs_nr.xlsx
Rosa chinensis v1.0 transcripts with NCBI nr (FASTA file) Rosa_chinensis_v1.0_vs_nr_hit.fasta
Rosa chinensis v1.0 transcripts without NCBI nr (FASTA file) Rosa_chinensis_v1.0_vs_nr_noHit.fasta
Rosa chinensis v1.0 transcripts with Arabidopsis (TAIR10) homologs (EXCEL file) Rosa_chinensis_v1.0_vs_tair.xlsx
Rosa chinensis v1.0 transcripts with Arabidopsis (TAIR10) (FASTA file) Rosa_chinensis_v1.0_vs_tair_hit.fasta
Rosa chinensis v1.0 transcripts without Arabidopsis (TAIR10) (FASTA file) Rosa_chinensis_v1.0_vs_tair_noHit.fasta
Rosa chinensis v1.0 transcripts with ExPASy SwissProt homologs (EXCEL file) Rosa_chinensis_v1.0_vs_swissprot.xlsx
Rosa chinensis v1.0 transcripts with ExPASy SwissProt (FASTA file) Rosa_chinensis_v1.0_vs_swissprot_hit.fasta
Rosa chinensis v1.0 transcripts without ExPASy SwissProt (FASTA file) Rosa_chinensis_v1.0_vs_swissprot_noHit.fasta
Rosa chinensis v1.0 transcripts with ExPASy TrEMBL homologs (EXCEL file) Rosa_chinensis_v1.0_vs_trembl.xlsx
Rosa chinensis v1.0 transcripts with ExPASy TrEMBL (FASTA file) Rosa_chinensis_v1.0_vs_trembl_hit.fasta
Rosa chinensis v1.0 transcripts without ExPASy TrEMBL (FASTA file) Rosa_chinensis_v1.0_vs_trembl_noHit.fasta
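
Each database in the list above comes with a pair of FASTA files, one for transcripts with a hit and one for transcripts without. The following is a minimal sketch of how such a split could be made from a transcript FASTA file and a set of transcript IDs that had a hit. It is an assumption about the general approach, not a description of how these files were generated; the function names and the sequence IDs are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into {sequence_id: sequence}, using the first word of each header as ID."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def split_by_hit(records, hit_ids):
    """Split records into (with_hit, without_hit) dictionaries."""
    with_hit = {k: v for k, v in records.items() if k in hit_ids}
    without_hit = {k: v for k, v in records.items() if k not in hit_ids}
    return with_hit, without_hit


def to_fasta(records, width=60):
    """Format records as FASTA text with wrapped sequence lines."""
    lines = []
    for seq_id, seq in records.items():
        lines.append(f">{seq_id}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical transcripts; only t1 and t3 are assumed to have a database hit.
fasta_text = ">t1 example\nATGGCC\nTTAA\n>t2\nATGAAA\n>t3\nATGCCC\n"
records = read_fasta(fasta_text)
hit, no_hit = split_by_hit(records, {"t1", "t3"})
print(to_fasta(hit), end="")
print(to_fasta(no_hit), end="")
```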



All annotation files are available for download by selecting the desired data type in the left-hand sidebar. Each data type page provides a description of the available files and links to download them.



Pseudomolecule (FASTA file) Rosa chinensis v1.0.fasta.gz


Gene Predictions


mRNA sequences (FASTA file) Rosa chinensis v1.0_mRNA.fasta.gz
CDS sequences (FASTA file) Rosa chinensis v1.0_CDs.fasta.gz
Protein sequences  (FASTA file) Rosa chinensis v1.0_prot.fasta.gz
Transposable element sequences (GFF3 file) Rosa chinensis v1.0_TE.gff3.gz
ncRNA sequences (FASTA file) Rosa chinensis v1.0_ncrna.fasta.gz
Gene sequences (FASTA file) Rosa chinensis v1.0_gene.fasta.gz
Gene models (GFF3 file) Rosa chinensis v1.0_gene_models.gff3.gz


Functional Analysis

Rosa_chinensis v1.0

Functional annotations for the Rosa_chinensis v1.0 genome are available for download below. The Rosa_chinensis transcripts were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).


InterPro Domains for Rosa_chinensis v1.0 transcripts (EXCEL file) Rosa_chinensis_v1.0_IRP.xlsx
Gene Ontology annotations for Rosa_chinensis v1.0 transcripts  (EXCEL file) Rosa_chinensis_v1.0_GO.xlsx
Rosa_chinensis v1.0 transcripts mapped to KEGG Pathways (EXCEL file) Rosa_chinensis_v1.0_KEGG_pathway.xlsx
Rosa_chinensis v1.0 transcripts mapped to KEGG Orthologs (EXCEL file) Rosa_chinensis_v1.0_KEGG_ortholog.xlsx