Prunus davidiana ZXST Haplotype Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Prunus davidiana ZXST Haplotype Genome v1.0 Assembly & Annotation
Method: ALLPATHS-LG, Hifiasm (v0.16.1-r375)
Source: Prunus davidiana Illumina, PacBio HiFi and Hi-C Reads
Date performed: 2023-10-04

Publication

Wang J, Li Y, Wang X, Cao K, Chen C, Wu J, Fang W, Zhu G, Wang J, Zhao Y, Fan J, Liu S, Chen X, Li W, Bie H, Guo D, Xu Q, Wang L. Haplotype-resolved genome of a heterozygous wild peach reveals the PdaWRKY4-PdaCYP716A1 module mediates resistance to aphids by regulating betulin biosynthesis. https://doi.org/10.1111/jipb.13782

Abstract

Wild species of domesticated crops provide valuable genetic resources for resistance breeding. Prunus davidiana, a wild relative of peach with high heterozygosity and diverse stress tolerance, shows high resistance against aphids. Here, we present the 501.7 Mb haplotype-resolved genome assembly of P. davidiana. Genomic comparisons of the two haplotypes revealed 18,152 structural variations, 2699 Pda_hap1 specific and 2702 Pda_hap2 specific genes, and 1118 allele specific expressed genes. Genome composition indicated 4.1% of P. davidiana genome was non-peach origin, of which 94.5% is derived from almond. PdaWRKY4 is identified to confer aphid resistance, with a 9-bp deletion in its promoter of the resistant phenotype, which directly promotes synthesis of the anti-aphid metabolite betulin. We employ a genome design to develop a breeding workflow for rapidly and precisely producing aphid-resistant peaches. This study identifies a novel aphid resistance gene and provides insights into genome design for the development of resistant fruit cultivars.

Materials and Methods

In this study, “Zhouxingshantao 1” (ZXST, Prunus davidiana) from the National Germplasm Resource Repository of Peach at the Zhengzhou Fruit Research Institute, CAAS, China, was used to construct haplotype genomes. Sequencing of ZXST yielded a total of 29 Gb of HiFi long reads, 24 Gb of ONT long reads, and 40 Gb of Hi-C reads. The k-mer distribution showed two distinct peaks, indicating a highly heterozygous ZXST genome, with a heterozygosity rate of 1.12%. The genome was initially phased and assembled using HiFi and Hi-C reads, and the remaining gaps were filled with ONT reads. The result was a 501.7 Mb chromosome-level haplotype genome of ZXST with 16 pseudochromosomes, which could be divided into two haplotypes with sizes of 257.1 and 244.6 Mb, respectively. A total of 22,479 and 21,784 protein-coding genes were predicted in the Pda_hap1 and Pda_hap2 genomes, respectively, based on an integrative pipeline of de novo prediction, homology-based search, and RNA-seq evidence.

 

Table S1. Summary of features of the assembled haplotype genome of ZXST.

Assembly feature            Pda_hap1    Pda_hap2    Pda_diploid (Cao et al., 2022)
Genome size (Mb)            257.1       244.6       259.3
N50 (Mb)                    27.7        28.5        28.5
Longest scaffolds (Mb)      46.1        46.9        47.2
Number of scaffolds         375         123         564
Complete BUSCOs (%)         98.9        99          99
Fragmented BUSCOs (%)       0.6         0.5         0.5
Missing BUSCOs (%)          0.5         0.5         0.5
LTR assembly index (LAI)    11.81       14.71       12.8

Homology

Homology of the Prunus davidiana Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11, 2022-09), UniProtKB/SwissProt (Release 2023-07), and UniProtKB/TrEMBL (Release 2023-07) databases. The best hit reports are available for download in Excel format.
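The filtering described above can be sketched in Python as follows. This is an illustration only, not the pipeline that produced the downloadable reports: it assumes blastp results in the 12-column tabular layout (-outfmt 6) and assumes that "best hit" means the highest bit score per query, neither of which is stated on this page. The record IDs in the example are invented.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the text above


def best_hits(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Keep, for each query protein, the highest-scoring hit below the cutoff.

    Assumes the 12-column BLAST tabular layout (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # the text requires an E-value below 1e-6
        # Assumption: "best" is taken to mean the highest bit score.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Invented records for illustration only; the IDs are hypothetical.
example = io.StringIO(
    "protA\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300\n"
    "protA\tAT2G02020.1\t60.0\t180\t72\t0\t1\t180\t1\t180\t1e-20\t150\n"
    "protB\tAT3G03030.1\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t40\n"
)
hits = best_hits(example)
```

In this example only protA is retained, because the single protB hit has an E-value above the cutoff. Proteins absent from the result would correspond to the "noHit" FASTA files listed below.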

Protein Homologs

Prunus davidiana v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) Pdavidiana_Haplotype_v1.0_vs_arabidopsis.xlsx.gz
Prunus davidiana v1.0 proteins with Arabidopsis (Araport11) (FASTA file) Pdavidiana_Haplotype_v1.0_vs_arabidopsis_hit.fasta.gz
Prunus davidiana v1.0 proteins without Arabidopsis (Araport11) (FASTA file) Pdavidiana_Haplotype_v1.0_vs_arabidopsis_noHit.fasta.gz
Prunus davidiana v1.0 proteins with SwissProt homologs (EXCEL file) Pdavidiana_Haplotype_v1.0_vs_swissprot.xlsx.gz
Prunus davidiana v1.0 proteins with SwissProt (FASTA file) Pdavidiana_Haplotype_v1.0_vs_swissprot_hit.fasta.gz
Prunus davidiana v1.0 proteins without SwissProt (FASTA file) Pdavidiana_Haplotype_v1.0_vs_swissprot_noHit.fasta.gz
Prunus davidiana v1.0 proteins with TrEMBL homologs (EXCEL file) Pdavidiana_Haplotype_v1.0_vs_trembl.xlsx.gz
Prunus davidiana v1.0 proteins with TrEMBL (FASTA file) Pdavidiana_Haplotype_v1.0_vs_trembl_hit.fasta.gz
Prunus davidiana v1.0 proteins without TrEMBL (FASTA file) Pdavidiana_Haplotype_v1.0_vs_trembl_noHit.fasta.gz
Assembly

The Prunus davidiana ZXST Haplotype genome v1.0 assembly files are available in GFF3 and FASTA format.

Downloads

Chromosomes (masked HAP 1) (FASTA file) Pdavidiana_Hap1.masked_V1.0.a1.fasta.gz
Chromosomes (masked HAP 2) (FASTA file) Pdavidiana_Hap2.masked_V1.0.a1.fasta.gz
Chromosomes (HAP 1) (FASTA file) Pdavidiana_Hap1_V1.0.a1.fasta.gz
Chromosomes (HAP 2) (FASTA file) Pdavidiana_Hap2_V1.0.a1.fasta.gz
Repeats (HAP 1) (GFF3 file) Pdavidiana_Hap1.repeats.gff.gz
Repeats (HAP 2) (GFF3 file) Pdavidiana_Hap2.repeats.gff.gz
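
As a quick check on a downloaded assembly file, summary statistics such as total size and N50 can be recomputed from the FASTA records. The following is a minimal sketch, assuming a standard FASTA file with one header line per sequence; it is not the method used to produce Table S1, and the function names are illustrative.

```python
def sequence_lengths(fasta_lines):
    """Return {sequence_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0] if len(line) > 1 else "unnamed"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Tiny invented example; real input would be one of the files listed above.
toy = [">s1 example\n", "ACGT\n", "AC\n", ">s2\n", "ACGTACGT\n"]
toy_lengths = sequence_lengths(toy)
toy_n50 = n50(toy_lengths.values())

# Usage with a locally downloaded file (path is an assumption):
# import gzip
# with gzip.open("Pdavidiana_Hap1_V1.0.a1.fasta.gz", "rt") as handle:
#     hap1 = sequence_lengths(handle)
#     print(sum(hap1.values()) / 1e6, n50(hap1.values()) / 1e6)
```

Values computed this way may differ slightly from the published figures if the published statistics were calculated on a different scaffold set.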
Gene Predictions

The Prunus davidiana ZXST Haplotype genome v1.0 gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (HAP 1) (GFF3 file) Pdavidiana_Hap1_V1.0.a1.genes.gff3.gz
Genes (HAP 2) (GFF3 file) Pdavidiana_Hap2_V1.0.a1.genes.gff3.gz
Protein sequences (HAP 1) (FASTA file) Pdavidiana_Hap1_V1.0.a1.pep.fasta.gz
Protein sequences (HAP 2) (FASTA file) Pdavidiana_Hap2_V1.0.a1.pep.fasta.gz
CDS sequences (HAP 1) (FASTA file) Pdavidiana_Hap1_V1.0.a1.cds.fasta.gz
CDS sequences (HAP 2) (FASTA file) Pdavidiana_Hap2_V1.0.a1.cds.fasta.gz
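
The gene counts reported in Materials and Methods (22,479 for Pda_hap1 and 21,784 for Pda_hap2) can be compared against a downloaded GFF3 file by tallying feature types. The sketch below assumes that gene models use the feature type "gene" in column 3, which is conventional for GFF3 but not confirmed on this page; the example records are invented.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Tally the feature type (column 3) of every record in a GFF3 stream."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the annotation records
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 9:  # GFF3 records have exactly nine columns
            counts[columns[2]] += 1
    return counts


# Invented records for illustration only.
example_gff = [
    "##gff-version 3\n",
    "Chr1\tsource\tgene\t100\t900\t.\t+\t.\tID=geneA\n",
    "Chr1\tsource\tmRNA\t100\t900\t.\t+\t.\tID=geneA.1;Parent=geneA\n",
    "Chr1\tsource\tgene\t1500\t2400\t.\t-\t.\tID=geneB\n",
]
feature_counts = count_feature_types(example_gff)

# Usage with a locally downloaded file (path is an assumption):
# import gzip
# with gzip.open("Pdavidiana_Hap1_V1.0.a1.genes.gff3.gz", "rt") as handle:
#     print(count_feature_types(handle)["gene"])
```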
Functional Analysis

Functional annotations for the Prunus davidiana Genome v1.0 are available for download below. The Prunus davidiana Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan Pdavidiana_Haplotype_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan Pdavidiana_Haplotype_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs Pdavidiana_Haplotype_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways Pdavidiana_Haplotype_v1.0_KEGG-pathways.xlsx.gz
Transcript Alignments

Transcript alignments were performed by the GDR Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Prunus davidiana genome assembly. Alignments with at least 97% alignment length and at least 97% identity were retained. The available files are in GFF3 format.

Fragaria x ananassa GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 Pdavidiana_Haplotype_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 Pdavidiana_Haplotype_v1.0_pyrus_GDR_reftransV1
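
The 97% thresholds applied to the BLAT alignments can be illustrated with a short Python sketch over PSL records, the native BLAT output format. The formulas here are assumptions made for illustration: identity is taken as matching bases over aligned bases, and alignment length as the aligned fraction of the transcript. The exact definitions used by the GDR Team are not given on this page, and the sequence names in the example are invented.

```python
def parse_psl_line(line):
    """Extract the fields needed for filtering from one 21-column PSL record."""
    fields = line.rstrip("\n").split("\t")
    return {
        "matches": int(fields[0]),
        "mismatches": int(fields[1]),
        "rep_matches": int(fields[2]),
        "q_name": fields[9],
        "q_size": int(fields[10]),
        "q_start": int(fields[11]),
        "q_end": int(fields[12]),
    }


def passes_thresholds(hit, min_identity=0.97, min_coverage=0.97):
    """Apply the two 97% thresholds using the assumed definitions."""
    aligned = hit["matches"] + hit["rep_matches"] + hit["mismatches"]
    if aligned == 0 or hit["q_size"] == 0:
        return False
    # Assumed definition: matching bases over all aligned bases.
    identity = (hit["matches"] + hit["rep_matches"]) / aligned
    # Assumed definition: aligned span of the transcript over its full length.
    coverage = (hit["q_end"] - hit["q_start"]) / hit["q_size"]
    return identity >= min_identity and coverage >= min_coverage


# Invented PSL records for illustration only.
kept_line = "\t".join([
    "980", "10", "0", "0", "0", "0", "0", "0", "+", "tx1", "1000", "0", "990",
    "Chr1", "1000000", "5000", "5990", "1", "990,", "0,", "5000,",
])
dropped_line = "\t".join([
    "900", "50", "0", "0", "0", "0", "0", "0", "+", "tx2", "1000", "0", "950",
    "Chr1", "1000000", "7000", "7950", "1", "950,", "0,", "7000,",
])
kept = passes_thresholds(parse_psl_line(kept_line))
dropped = passes_thresholds(parse_psl_line(dropped_line))
```

The first record passes both thresholds, while the second fails on identity (900 of 950 aligned bases match) and on alignment length (950 of 1000 transcript bases aligned).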