The apricot (Prunus armeniaca L.) genome elucidates Rosaceae evolution and beta-carotenoid synthesis
Apricots, scientifically known as Prunus armeniaca L, are drupes that resemble and are closely related to peaches or plums. As one of the top consumed fruits, apricots are widely grown worldwide except in Antarctica. A high-quality reference genome for apricot is still unavailable, which has become a handicap that has dramatically limited the elucidation of the associations of phenotypes with the genetic background, evolutionary diversity, and population diversity in apricot. DNA from P. armeniaca was used to generate a standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on Sequel SMRT Cells, generating a total of 16.54 Gb of PacBio subreads (N50 = 13.55 kb). The high-quality P. armeniaca reference genome presented here was assembled using long-read single-molecule sequencing at approximately 70× coverage and 171× Illumina reads (40.46 Gb), combined with a genetic map for chromosome scaffolding. The assembled genome size was 221.9 Mb, with a contig NG50 size of 1.02 Mb. Scaffolds covering 92.88% of the assembled genome were anchored on eight chromosomes. Benchmarking Universal Single-Copy Orthologs analysis showed 98.0% complete genes. We predicted 30,436 protein-coding genes, and 38.28% of the genome was predicted to be repetitive. We found 981 contracted gene families, 1324 expanded gene families and 2300 apricot-specific genes. The differentially expressed gene (DEG) analysis indicated that a change in the expression of the 9-cis-epoxycarotenoid dioxygenase (NCED) gene but not lycopene beta-cyclase (LcyB) gene results in a low β-carotenoid content in the white cultivar "Dabaixing". This complete and highly contiguous P. armeniaca reference genome will be of help for future studies of resistance to plum pox virus (PPV) and the identification and characterization of important agronomic genes and breeding strategies in apricot.