Prunus armeniaca Yinxiangbai Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Prunus armeniaca Yinxiangbai Genome v1.0 Assembly & Annotation
Method: Canu (version 1.7) and SMARTdenovo (version 1.0)
Source: PacBio reads and Nanopore reads
Date performed: 2024-02-28

Publication

Zhang Q, Zhang D, Yu K, Ji J, Liu N, Zhang Y, Xu M, Zhang YJ, Ma X, Liu S, Sun WH, Yu X, Hu W, Lan SR, Liu ZJ, Liu W. Frequent germplasm exchanges drive the high genetic diversity of Chinese-cultivated common apricot germplasm. Hortic Res. 2021 Oct 1;8(1):215. doi: 10.1038/s41438-021-00650-8

Supplementary Table 2: Assembly statistics of the P. armeniaca ‘Yinxiangbai’ genome.

Contig statistic       Value          Number
Max length (bp)        15,297,900     -
N50 (bp)               4,038,258      17
N60 (bp)               3,487,594      24
N70 (bp)               2,572,382      33
N80 (bp)               1,881,471      44
N90 (bp)               988,841        63
Total length (bp)      251,190,293    -
Number >= 1,000 bp     256            -
Number >= 10,000 bp    239            -
GC rate (%)            37.55          -

 

Supplementary Table 3: Chromosome lengths in the Hi-C assembly of the P. armeniaca ‘Yinxiangbai’ genome.

Chromosome    Length (bp)    No. of contigs
Chr01         48,757,601     26
Chr02         30,979,433     34
Chr03         27,445,780     48
Chr04         31,253,906     32
Chr05         19,750,602     32
Chr06         33,056,764     49
Chr07         30,082,201     41
Chr08         22,582,249     25

Summary statistic                    Value
Contig N50 Length (bp)               3,166,142
Scaffold N90 Length (bp)             19,750,602
Scaffold N50 Length (bp)             30,979,433
Total Size (bp)                      251,329,793
Total length of contigs (bp)         251,190,293
Length of unanchored contigs (bp)    7,421,257
Anchored contig rate (%)             97.04
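
The scaffold N50 and N90 values in Supplementary Table 3 can be reproduced from the chromosome lengths alone. The following is a minimal sketch, not the authors' pipeline: the function name nx and the variable names are illustrative. It assumes that every unanchored contig is shorter than the smallest chromosome, which is consistent with the maximum contig length of 15,297,900 bp in Supplementary Table 2, so the unanchored sequences only enter through the total size.

```python
# Chromosome lengths (bp) copied from Supplementary Table 3.
chrom_lengths = {
    "Chr01": 48_757_601,
    "Chr02": 30_979_433,
    "Chr03": 27_445_780,
    "Chr04": 31_253_906,
    "Chr05": 19_750_602,
    "Chr06": 33_056_764,
    "Chr07": 30_082_201,
    "Chr08": 22_582_249,
}
TOTAL_SIZE_BP = 251_329_793  # "Total Size (bp)" in the table
UNANCHORED_BP = 7_421_257    # "Length of unanchored contigs (bp)"


def nx(lengths, total, fraction):
    """Return the length at which the cumulative sum of sequences,
    taken from longest to shortest, first reaches fraction * total."""
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total * fraction:
            return length
    return None  # threshold not reached by the sequences supplied


scaffold_n50 = nx(chrom_lengths.values(), TOTAL_SIZE_BP, 0.5)
scaffold_n90 = nx(chrom_lengths.values(), TOTAL_SIZE_BP, 0.9)
anchored_bp = sum(chrom_lengths.values())

print(scaffold_n50)                 # 30979433
print(scaffold_n90)                 # 19750602
print(anchored_bp + UNANCHORED_BP)  # 251329793
```

The last line shows that the eight chromosome lengths plus the unanchored contig length add up to the reported total size.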

Supplementary Table 4: BUSCO assessment of the gene annotation of the P. armeniaca ‘Yinxiangbai’ genome.

Type                                   Number    Percentage (%)
Complete BUSCOs (C)                    1,323     96.2
Complete and single-copy BUSCOs (S)    1,250     90.9
Complete and duplicated BUSCOs (D)     73        5.3
Fragmented BUSCOs (F)                  7         0.5
Missing BUSCOs (M)                     45        3.3
Total BUSCO groups searched            1,375     -
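
The percentages in Supplementary Table 4 follow directly from the counts. As a small consistency check (a sketch with illustrative variable names, not part of the original analysis), the counts can be converted to percentages and written in the usual BUSCO short-summary notation:

```python
# BUSCO counts copied from Supplementary Table 4.
counts = {"S": 1250, "D": 73, "F": 7, "M": 45}
total = 1375

# Complete BUSCOs are the single-copy plus the duplicated ones.
counts["C"] = counts["S"] + counts["D"]

# Complete, fragmented and missing groups should account for every group searched.
assert counts["C"] + counts["F"] + counts["M"] == total

# Percentages rounded to one decimal place, as in the table.
pct = {key: round(100 * value / total, 1) for key, value in counts.items()}

summary = (
    f"C:{pct['C']}%[S:{pct['S']}%,D:{pct['D']}%],"
    f"F:{pct['F']}%,M:{pct['M']}%,n:{total}"
)
print(summary)  # C:96.2%[S:90.9%,D:5.3%],F:0.5%,M:3.3%,n:1375
```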

 

 

Homology Analysis

Homology of the Prunus armeniaca Yinxiangbai genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against several protein databases. An expectation value (E-value) cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11, 2022-09), UniProtKB/SwissProt (Release 2023-07), and UniProtKB/TrEMBL (Release 2023-07) databases. The best hit reports are available for download in Excel format.
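
For readers who want to repeat this kind of filtering on their own blastp results, the sketch below keeps the best hit per query under the same E-value cutoff. It assumes BLAST+ tabular output with the 12 default columns (-outfmt 6), where column 11 is the E-value and column 12 the bit score. The page does not say how the best hit was chosen, so ranking by bit score is an assumption, and the sequence identifiers in the sample are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # hits must have an E-value below this value


def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring
    hit of each query among hits with E-value < cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # fails the E-value cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical tabular rows for illustration only.
sample = "\n".join([
    "geneA.1\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180",
    "geneA.1\tAT2G02020.1\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-20\t90",
    "geneB.1\tAT3G03030.1\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t40",
])

hits = best_hits(io.StringIO(sample))
print(hits)  # only geneA.1 is kept, with its higher-scoring hit
```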

Protein Homologs

P. armeniaca Yinxiangbai v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) Parmeniaca_Yinxiangbai_v1.0_vs_arabidopsis.xlsx.gz
P. armeniaca Yinxiangbai v1.0 proteins with Arabidopsis (Araport11) (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_arabidopsis_hit.fasta.gz
P. armeniaca Yinxiangbai v1.0 proteins without Arabidopsis (Araport11) (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_arabidopsis_noHit.fasta.gz
P. armeniaca Yinxiangbai v1.0 proteins with SwissProt homologs (EXCEL file) Parmeniaca_Yinxiangbai_v1.0_vs_swissprot.xlsx.gz
P. armeniaca Yinxiangbai v1.0 proteins with SwissProt (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_swissprot_hit.fasta.gz
P. armeniaca Yinxiangbai v1.0 proteins without SwissProt (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_swissprot_noHit.fasta.gz
P. armeniaca Yinxiangbai v1.0 proteins with TrEMBL homologs (EXCEL file) Parmeniaca_Yinxiangbai_v1.0_vs_trembl.xlsx.gz
P. armeniaca Yinxiangbai v1.0 proteins with TrEMBL (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_trembl_hit.fasta.gz
P. armeniaca Yinxiangbai v1.0 proteins without TrEMBL (FASTA file) Parmeniaca_Yinxiangbai_v1.0_vs_trembl_noHit.fasta.gz
Assembly

The Prunus armeniaca Yinxiangbai genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) parmeniaca_Yinxiangbai_v1.0.fasta.gz
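
Once downloaded, the assembly FASTA can be summarised with a few lines of code. The sketch below reports the length and G+C count of each sequence. The helper names are illustrative, the in-memory sample is made up, and how the GC rate in Supplementary Table 2 treats N bases is not stated on this page, so any percentage derived from these counts is only an approximation of that figure.

```python
import gzip


def read_lines(path):
    """Yield text lines from a plain or gzip-compressed file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        yield from handle


def fasta_stats(lines):
    """Yield (name, length, gc_count) for every FASTA record."""
    name, length, gc = None, 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length, gc
            header = line[1:].split()
            name, length, gc = (header[0] if header else ""), 0, 0
        else:
            seq = line.upper()
            length += len(seq)
            gc += seq.count("G") + seq.count("C")
    if name is not None:
        yield name, length, gc


# Made-up records for illustration only.
sample = [">Chr01 demo\n", "ACGTNN\n", "GGCC\n", ">Chr02\n", "ATAT\n"]
records = list(fasta_stats(sample))
print(records)  # [('Chr01', 10, 6), ('Chr02', 4, 0)]

# On the real download (file name from the list above):
# for name, length, gc in fasta_stats(read_lines("parmeniaca_Yinxiangbai_v1.0.fasta.gz")):
#     print(name, length, round(100 * gc / length, 2))
```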
Gene Predictions

The Prunus armeniaca Yinxiangbai genome v1.0.a1 gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) parmeniaca_Yinxiangbai_v1.0.a1.genes.gff3.gz
Protein sequences (FASTA file) parmeniaca_Yinxiangbai_v1.0.a1.pep.fasta.gz
CDS sequences (FASTA file) parmeniaca_Yinxiangbai_v1.0.a1.cds.fasta.gz
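
A quick way to inspect the gene prediction file is to count gene features per sequence. The sketch below relies only on the GFF3 layout of nine tab-separated columns, with the sequence ID in column 1 and the feature type in column 3. That the file labels its gene models with the type "gene" is an assumption, and the sample records are made up.

```python
import gzip
from collections import Counter


def read_lines(path):
    """Yield text lines from a plain or gzip-compressed file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        yield from handle


def count_genes(lines, feature_type="gene"):
    """Count features of the given type per sequence ID in GFF3 lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9 and fields[2] == feature_type:
            counts[fields[0]] += 1
    return counts


# Made-up GFF3 records for illustration only.
sample = [
    "##gff-version 3\n",
    "Chr01\t.\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "Chr01\t.\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n",
    "Chr02\t.\tgene\t50\t400\t.\t-\t.\tID=g2\n",
]
demo = count_genes(sample)
print(dict(demo))  # {'Chr01': 1, 'Chr02': 1}

# On the real download (file name from the list above):
# print(count_genes(read_lines("parmeniaca_Yinxiangbai_v1.0.a1.genes.gff3.gz")))
```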
Functional Analysis

Functional annotations for the Prunus armeniaca Yinxiangbai genome v1.0 are available for download below. The P. armeniaca Yinxiangbai genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan Parmeniaca_Yinxiangbai_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan Parmeniaca_Yinxiangbai_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs Parmeniaca_Yinxiangbai_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways Parmeniaca_Yinxiangbai_v1.0_KEGG-pathways.xlsx.gz
Transcript Alignments
Transcript alignments were performed by the GDR team of the Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map transcripts to the Prunus armeniaca Yinxiangbai genome assembly. Alignments with an alignment length of at least 97% and at least 97% identity were preserved. The available files are in GFF3 format.

 

Fragaria x ananassa GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 Parmeniaca_Yinxiangbai_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 Parmeniaca_Yinxiangbai_v1.0_pyrus_GDR_reftransV1
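
The 97% filter described for the BLAT alignments can be illustrated on raw BLAT output in PSL format (21 tab-separated columns). The page does not state how alignment length and identity were calculated, and the files offered here are GFF3 rather than PSL, so the definitions below are assumptions: coverage is the number of aligned bases divided by the transcript length, and identity is the number of matching bases divided by the number of aligned bases. The helper demo_line and its values are hypothetical.

```python
def passes_filter(psl_line, min_coverage=0.97, min_identity=0.97):
    """Return True if a PSL alignment meets both thresholds.

    PSL columns used: 0 matches, 1 misMatches, 2 repMatches, 10 qSize.
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = (int(x) for x in fields[:3])
    q_size = int(fields[10])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    coverage = aligned / q_size                   # share of the transcript aligned
    identity = (matches + rep_matches) / aligned  # share of aligned bases matching
    return coverage >= min_coverage and identity >= min_identity


def demo_line(matches, mismatches, q_size):
    """Build a made-up single-block PSL line for illustration."""
    span = matches + mismatches
    fields = [
        matches, mismatches, 0, 0,        # matches, misMatches, repMatches, nCount
        0, 0, 0, 0,                       # query/target insert counts and bases
        "+", "transcript1", q_size, 0, span,  # strand, qName, qSize, qStart, qEnd
        "Chr01", 48757601, 0, span,       # tName, tSize, tStart, tEnd
        1, f"{span},", "0,", "0,",        # blockCount, blockSizes, qStarts, tStarts
    ]
    return "\t".join(str(f) for f in fields)


kept = [
    passes_filter(demo_line(980, 10, 1000)),  # coverage 0.99, identity about 0.99
    passes_filter(demo_line(900, 10, 1000)),  # coverage 0.91
    passes_filter(demo_line(940, 50, 1000)),  # identity about 0.95
]
print(kept)  # [True, False, False]
```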