Prunus persica Whole Genome v1.0 Assembly & Annotation
For use in publications, please CITE the original paper in Nature Genetics:
The International Peach Genome Initiative (2013). The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet 45, 487-494. doi:10.1038/ng.2586
Peach (Prunus persica) is considered one of the best genetically characterized species in the Rosaceae, and it has distinct advantages that make it suitable as a model genome species for Prunus as well as for other species in the Rosaceae. While some Prunus species, such as cultivated plums and sour cherries, are polyploid, peach is a diploid with n = 8 and has a comparatively small genome, currently estimated at ~220-230 Mbp based on the peach v1.0 assembly. Peach has a relatively short juvenility period of 2-3 years, compared to the 6-10 years required by most other fruit tree species. In addition, a number of genes for fundamentally important traits have been genetically described in peach, including genes controlling flower and fruit development, tree growth habit, dormancy, cold hardiness, and disease and pest resistance.
Genome facts and statistics
Peach v1.0 was generated from DNA of the doubled haploid cultivar ‘Lovell’, which means that the genes and intervening DNA are “fixed”, or identical, for all alleles and both chromosomal copies of the genome. This doubled haploid nature was confirmed by the evaluation of >200 SSRs, and has facilitated a highly accurate and consistent assembly of the peach genome.
The Prunus persica v1.0 genome homology files are available for download in Excel format with links to GBrowse and to external databases for matched homologs. All homology data were determined using the predicted peach gene transcripts (28,692 sequences) and NCBI blastx against various protein databases. An expectation value cutoff of 1e-6 was used. For EST alignments the NCBI Rosaceae and Genera EST databases were downloaded and filtered for quality before blasting.
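The 1e-6 expectation value cutoff described above can be applied to tabular BLAST output along the following lines. This is a minimal sketch, not the pipeline used for the peach annotation. It assumes the standard 12-column tabular format (`-outfmt 6` in BLAST+), in which the e-value is the 11th column, and it treats the cutoff as inclusive. The function name `filter_hits` and the example rows are made up for illustration.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the text; inclusive comparison is an assumption


def filter_hits(handle, cutoff=EVALUE_CUTOFF):
    """Yield rows of 12-column tabular BLAST output whose e-value is <= cutoff."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (present in some tabular variants).
        if not row or row[0].startswith("#"):
            continue
        # Column 11 (index 10) holds the e-value in the standard tabular layout.
        if float(row[10]) <= cutoff:
            yield row


# Hypothetical example rows, not real peach results.
example = (
    "gene_a\tprot_1\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300\n"
    "gene_b\tprot_2\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30.1\n"
)
kept = list(filter_hits(io.StringIO(example)))
kept_queries = [row[0] for row in kept]
print(kept_queries)
```

With a real file, `filter_hits(open(path))` would replace the in-memory example; the path is left to the reader because the download file names are not given here.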
Rosaceae EST alignments
All assembly and annotation files are available for download by selecting the desired data type in the left-hand sidebar. Each data type page provides a description of the available files and links to download them.
The Prunus persica v1.0 genome assembly files are available in FASTA and GFF3 formats. There are a total of 202 scaffolds in this assembly of peach. The pseudomolecules corresponding to the eight chromosomes of peach are the first eight scaffolds of the assembly. In future releases these pseudomolecules will most likely be renamed, but for now they are named scaffold_1, scaffold_2, scaffold_3, etc.
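Because the eight pseudomolecules are identified only by their scaffold names, they can be separated from the remaining scaffolds by name. The sketch below uses a small hand-written FASTA reader and assumes that each FASTA header begins with the scaffold name (for example `>scaffold_1`), which has not been checked against the actual download. The helper `read_fasta` and the toy sequences are hypothetical.

```python
import io

# Names of the eight pseudomolecules as given in the text: scaffold_1 .. scaffold_8.
PSEUDOMOLECULES = {f"scaffold_{i}" for i in range(1, 9)}


def read_fasta(handle):
    """Yield (name, sequence) pairs; name is the first word of each header line."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            words = line[1:].split()
            name, chunks = (words[0] if words else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Toy input standing in for the assembly FASTA file (sequences are made up).
toy_fasta = io.StringIO(
    ">scaffold_1 example\nACGTACGT\nACGT\n"
    ">scaffold_8\nGGCC\n"
    ">scaffold_9\nTTTTTT\n"
)
lengths = {
    name: len(seq)
    for name, seq in read_fasta(toy_fasta)
    if name in PSEUDOMOLECULES
}
print(lengths)
```

Matching on an explicit set of names, rather than on file order, keeps the selection correct even if the scaffolds are reordered, though it would need updating if the pseudomolecules are renamed in a later release.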
The Prunus persica v1.0 genome gene prediction files are available in FASTA and GFF3 formats. An update on May 16, 2012 added Phytozome PACids to the genes GFF3 file.
The Prunus persica v1.0 genome genes were mapped to KEGG pathways and orthologs using the KEGG Automatic Annotation Server (KAAS). Resulting files are available for download below.
The Prunus persica v1.0 genome repeat files are available in GFF3 format. Repeats were predicted using the Repbase database and the LTR Finder and ReAS prediction tools. A consensus file contains repeats from all three methods.
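One way to inspect a repeat GFF3 file is to tally features by the source column (the second of the nine tab-separated GFF3 columns). The sketch below assumes that the source column records which prediction method produced each repeat; that is an assumption about these particular files, and the source labels and coordinates in the toy input are invented.

```python
import io
from collections import Counter


def count_by_source(handle):
    """Count GFF3 feature lines per value of the source column (column 2)."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        # An embedded FASTA section, if present, ends the feature table.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[fields[1]] += 1
    return counts


# Toy GFF3 content; labels such as "LTR_Finder" are hypothetical.
toy_gff3 = io.StringIO(
    "##gff-version 3\n"
    "scaffold_1\tLTR_Finder\trepeat_region\t100\t500\t.\t+\t.\tID=r1\n"
    "scaffold_1\tRepbase\trepeat_region\t900\t1200\t.\t-\t.\tID=r2\n"
    "scaffold_2\tLTR_Finder\trepeat_region\t50\t400\t.\t+\t.\tID=r3\n"
)
counts = count_by_source(toy_gff3)
print(dict(counts))
```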
SNPs have been provided by several different groups.
RosBREED Resequencing Alignments
A total of 23 different peach accessions were resequenced using Illumina short-read technology. The reads were trimmed and aligned to the Peach v1.0 genome and are available in BAM alignment files.
It is not necessary to download the BAM alignment files. Some are very large and multiple downloads may oversubscribe the network bandwidth. Rather, use the following instructions for viewing the alignments.
To view the peach resequencing alignments please follow these instructions:
Prunus persica tools available on GDR
Additional Prunus persica tools
After the initial assembly of the peach v1.0 genome, some large scaffolds lacked the markers needed for their proper orientation and placement within the pseudomolecules. Further analysis was performed to locate markers on 10 scaffolds greater than 300 kb in order to place and orient them within the assembly. A refined assembly was then generated. This refined assembly will be coming in a future release of the peach genome, and is reported in the upcoming peach genome publication. For reference, the following JBrowse viewer is available to visualize changes to the assembly.
The Prunus persica v1.0 genome marker files are available in FASTA and Excel formats with links to GBrowse.