Prunus mongolica Whole Genome v1.0 Assembly & Annotation
Chromosome-level genome assembly of an endangered plant Prunus mongolica using PacBio and Hi-C technologies
Prunus mongolica is an ecologically and economically important xerophytic tree native to Northwest China. Here, we report a high-quality, chromosome-level P. mongolica genome assembly integrating PacBio high-fidelity sequencing and Hi-C technology. The assembled genome was 233.17 Mb in size, with 98.89% assigned to eight pseudochromosomes. The genome had contig and scaffold N50s of 24.33 Mb and 26.54 Mb, respectively, a BUSCO completeness score of 98.76%, and CEGMA indicated that 98.47% of the assembled genome was reliably annotated. The genome contained a total of 88.54 Mb (37.97%) of repetitive sequences and 23,798 protein-coding genes. We found that P. mongolica experienced two whole-genome duplications, with the most recent event occurring ~3.57 million years ago. Phylogenetic and chromosome syntenic analyses revealed that P. mongolica was closely related to P. persica and P. dulcis. Furthermore, we identified a number of candidate genes involved in drought tolerance and fatty acid biosynthesis. These candidate genes are likely to prove useful in studies of drought tolerance and fatty acid biosynthesis in P. mongolica, and will provide important genetic resources for molecular breeding and improvement experiments in Prunus species. This high-quality reference genome will also accelerate the study of the adaptation of xerophytic plants to drought.
Table 1. Global statistics of Prunus mongolica genome assembly and annotation
Homology of the Prunus mongolica Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff less than 1e-6 for the Arabidoposis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
Functional annotation for the Prunus mongolica Genome v1.0 are available for download below. The Prunus mongolica Genome v1.0 proteins were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathways analysis was performed using the KEGG Automatic Annotation Server (KAAS).
Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Prunus mongolica genome assembly. Alignments with an alignment length of 97% and 97% identify were preserved. The available files are in GFF3 format.