Prunus persica 124 Pan Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Prunus persica 124 Pan Genome v1.0 Assembly & Annotation
Method: MaSuRCA
Source: PacBio and Illumina Prunus persica 124 Pan
Date performed: 2022-03-28

Publication

Zhang A, Zhou H, Jiang X, Han Y, Zhang X. The Draft Genome of a Flat Peach (Prunus persica L. cv. '124 Pan') Provides Insights into Its Good Fruit Flavor Traits. Plants (Basel, Switzerland). 2021 Mar 12; 10(3).

Abstract

The flat peach has become more and more popular worldwide for its fruit quality with relatively low acidity, high sugar content and rich flavor. However, the draft genome assembly of flat peach is still unavailable and the genetic basis for its fruit flavor remains unclear. In this study, the draft genome of a flat peach cultivar '124 Pan' was assembled by using a hybrid assembly algorithm. The final assembly resulted in a total size of 206 Mb with a N50 of 26.3 Mb containing eight chromosomes and seven scaffolds. Genome annotation revealed that a total of 25,233 protein-coding genes were predicted with comparable gene abundance among the sequenced peach species. The phylogenetic tree and divergence times inferred from 572 single copy genes of 13 plant species confirmed that Prunus ferganensis was the ancestor of the domesticated peach. By comparing with the genomes of Prunus persica (Lovell) and Prunus ferganensis, the expansion of genes encoding enzymes involved in terpene biosynthesis was found, which might contribute to the good fruit flavor traits of '124 Pan'. The flat peach draft genome assembly obtained in this study will provide a valuable genomic resource for peach improvement and molecular breeding.

Homology

Homology of the Prunus persica 124 Pan Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr database (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
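The cutoff rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the reports. It assumes the hits are available as 12-column BLAST tabular text (e-value in column 11), reads "less than" as a strict inequality, and takes the "best hit" to be the hit with the lowest e-value per query. The names EVALUE_CUTOFFS and best_hits are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical lookup of cutoffs per database; the values mirror the text above.
EVALUE_CUTOFFS = {
    "nr": 1e-9,
    "arabidopsis": 1e-6,
    "swissprot": 1e-6,
    "trembl": 1e-6,
}


def best_hits(rows: Iterable[str], cutoff: float) -> Dict[str, Tuple[str, float]]:
    """Return the lowest-e-value hit per query among hits below the cutoff.

    Each row is assumed to be one line of 12-column BLAST tabular output:
    query ID in column 1, subject ID in column 2, e-value in column 11.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # assumption: the cutoff is a strict "less than"
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

For example, `best_hits(open("hits.tsv"), EVALUE_CUTOFFS["nr"])` would keep one hit per protein for a hypothetical tabular file named hits.tsv. Proteins absent from the returned dictionary correspond to the "noHit" sets.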

 

Protein Homologs

Prunus persica 124 Pan v1.0 proteins with NCBI nr homologs (EXCEL file) ppersica_124_Pan_v1.0_vs_nr.xlsx.gz
Prunus persica 124 Pan v1.0 proteins with NCBI nr homologs (FASTA file) ppersica_124_Pan_v1.0_vs_nr_hit.fasta.gz
Prunus persica 124 Pan v1.0 proteins without NCBI nr homologs (FASTA file) ppersica_124_Pan_v1.0_vs_nr_noHit.fasta.gz
Prunus persica 124 Pan v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) ppersica_124_Pan_v1.0_vs_arabidopsis.xlsx.gz
Prunus persica 124 Pan v1.0 proteins with Arabidopsis (Araport11) homologs (FASTA file) ppersica_124_Pan_v1.0_vs_arabidopsis_hit.fasta.gz
Prunus persica 124 Pan v1.0 proteins without Arabidopsis (Araport11) homologs (FASTA file) ppersica_124_Pan_v1.0_vs_arabidopsis_noHit.fasta.gz
Prunus persica 124 Pan v1.0 proteins with SwissProt homologs (EXCEL file) ppersica_124_Pan_v1.0_vs_swissprot.xlsx.gz
Prunus persica 124 Pan v1.0 proteins with SwissProt homologs (FASTA file) ppersica_124_Pan_v1.0_vs_swissprot_hit.fasta.gz
Prunus persica 124 Pan v1.0 proteins without SwissProt homologs (FASTA file) ppersica_124_Pan_v1.0_vs_swissprot_noHit.fasta.gz
Prunus persica 124 Pan v1.0 proteins with TrEMBL homologs (EXCEL file) ppersica_124_Pan_v1.0_vs_trembl.xlsx.gz
Prunus persica 124 Pan v1.0 proteins with TrEMBL homologs (FASTA file) ppersica_124_Pan_v1.0_vs_trembl_hit.fasta.gz
Prunus persica 124 Pan v1.0 proteins without TrEMBL homologs (FASTA file) ppersica_124_Pan_v1.0_vs_trembl_noHit.fasta.gz

 

Assembly

The Prunus persica 124 Pan Genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Ppersica_124Pan_v1.0.fasta.gz
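
The abstract reports an N50 of 26.3 Mb for this assembly. As a quick sanity check on a downloaded copy, the N50 can be recomputed from the sequence lengths. The following is a minimal Python sketch, assuming the file is a standard gzip-compressed FASTA. The function names are illustrative and not part of any GDR tooling.

```python
import gzip
from typing import Iterable, Iterator, List


def sequence_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each record in FASTA-formatted text lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # emit the previous record before starting a new one
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: List[int]) -> int:
    """Return the length L such that sequences of length >= L
    cover at least half of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def fasta_n50(path: str) -> int:
    """Compute the N50 of a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return n50(list(sequence_lengths(handle)))
```

Calling `fasta_n50("Ppersica_124Pan_v1.0.fasta.gz")` on the downloaded file returns the N50 in base pairs, which can then be compared with the published figure.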

 

Gene Predictions

The Prunus persica 124 Pan v1.0 genome gene prediction files are available in FASTA and GFF3 formats.

Downloads

Protein sequences  (FASTA file) Ppersica_124Pan_v1.0.proteins.fasta.gz
CDS  (FASTA file) Ppersica_124Pan_v1.0.cds.fasta.gz
Genes (GFF3 file) Ppersica_124Pan_v1.0.genes.gff3.gz
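
To get a first overview of the GFF3 gene prediction file, one option is to tally the feature types in its third column. The sketch below is a minimal Python example, not an official GDR script. It assumes a standard nine-column GFF3 file, and the function names are hypothetical. Whether the "gene" count matches the 25,233 protein-coding genes reported in the abstract depends on how features are typed in the file.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 feature types (column 3), skipping comments and directives."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:  # GFF3 feature lines have nine tab-separated columns
            counts[fields[2]] += 1
    return counts


def count_in_file(path: str) -> Counter:
    """Count feature types in a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

For example, `count_in_file("Ppersica_124Pan_v1.0.genes.gff3.gz")` returns a Counter keyed by feature type.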

 

Functional Analysis

Functional annotations for the Prunus persica 124 Pan Genome v1.0 are available for download below. The Prunus persica 124 Pan Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan ppersica_124_Pan_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan ppersica_124_Pan_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs ppersica_124_Pan_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways ppersica_124_Pan_v1.0_KEGG-pathways.xlsx.gz

 

Transcript Alignments

Transcript alignments were performed by the GDR Team of the Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map transcripts to the Prunus persica 124 Pan genome assembly. Alignments with an alignment length of at least 97% and at least 97% identity were retained. The available files are in GFF3 format.

Fragaria x ananassa GDR RefTrans v1 ppersica_124_Pan_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 ppersica_124_Pan_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 ppersica_124_Pan_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 ppersica_124_Pan_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 ppersica_124_Pan_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 ppersica_124_Pan_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 ppersica_124_Pan_v1.0_pyrus_GDR_reftransV1
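
The 97% alignment length and 97% identity thresholds used for the transcript alignments can be illustrated with a small Python sketch that filters BLAT PSL output. This is an illustration under stated assumptions, not the GDR procedure. The source does not say how the two percentages were computed, so the definitions here are assumptions: coverage is the aligned query span divided by the transcript length, and identity is the number of matching bases divided by the number of aligned bases. The function names are hypothetical.

```python
from typing import Iterable, Iterator

MIN_COVERAGE = 0.97  # assumed: fraction of the transcript covered by the alignment
MIN_IDENTITY = 0.97  # assumed: fraction of aligned bases that match


def passes_thresholds(matches: int, mismatches: int, rep_matches: int,
                      q_size: int, q_start: int, q_end: int) -> bool:
    """Apply the identity and coverage thresholds to one alignment."""
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    identity = (matches + rep_matches) / aligned
    coverage = (q_end - q_start) / q_size
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def filter_psl(lines: Iterable[str]) -> Iterator[str]:
    """Yield PSL alignment lines that pass both thresholds."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[0].isdigit():
            continue  # skip the PSL header and malformed lines
        # PSL columns: 0 matches, 1 misMatches, 2 repMatches,
        # 10 qSize, 11 qStart, 12 qEnd
        if passes_thresholds(int(fields[0]), int(fields[1]), int(fields[2]),
                             int(fields[10]), int(fields[11]), int(fields[12])):
            yield line
```

Converting the retained alignments to GFF3, the format of the files listed above, would be a separate step that is not shown here.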