Pyrus communis d'Anjou draft Genome v1.0 Assembly & Annotation

Method: Canu (2.1.1)
Source: PacBio reads, Illumina reads (pcommunis d'Anjou draft Genome v1.0)
Date performed: 2022-10-12


Huiting Zhang, Eric K. Wafula, Jon Eilers, Alex Harkess, Paula E. Ralph, Prakash R. Timilsena, Claude W. DePamphilis, Jessica M. Waite and Loren A. Honaas, Building a foundation for gene family analysis in Rosaceae genomes with a novel workflow: a case study in Pyrus architecture genes, Front. Plant Sci., 2022, doi: 10.3389/fpls.2022.975942


The rapid development of sequencing technologies has led to a deeper understanding of plant genomes. However, direct experimental evidence connecting genes to important agronomic traits is still lacking in most non-model plants. For instance, the genetic mechanisms underlying plant architecture are poorly understood in pome fruit trees, creating a major hurdle in developing new cultivars with desirable architecture, such as dwarfing rootstocks in European pear (Pyrus communis). An efficient way to identify genetic factors for important traits in non-model organisms can be to transfer knowledge across genomes. However, major obstacles exist, including complex evolutionary histories and variable quality and content of publicly available plant genomes. As researchers aim to link genes to traits of interest, these challenges can impede the transfer of experimental evidence across plant species, namely in the curation of high-quality, high-confidence gene models in an evolutionary context. Here we present a workflow using a collection of bioinformatic tools for the curation of deeply conserved gene families of interest across plant genomes. To study gene families involved in tree architecture in European pear and other rosaceous species, we used our workflow, plus a draft genome assembly and high-quality annotation of a second P. communis cultivar, ‘d’Anjou.’ Our comparative gene family approach revealed significant issues with the most recent ‘Bartlett’ genome - primarily thousands of missing genes due to methodological bias. After correcting assembly errors on a global scale in the ‘Bartlett’ genome, we used our workflow for targeted improvement of our genes of interest in both P. communis genomes, thus laying the groundwork for future functional studies in pear tree architecture. Further, our global gene family classification of 15 genomes across 6 genera provides a valuable and previously unavailable resource for the Rosaceae research community. 
With it, orthologs and other gene family members can be easily identified across any of the classified genomes. Importantly, our workflow can be easily adopted for any other plant genomes and gene families of interest.


Homology of the Pyrus communis d'Anjou Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against several protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for NCBI nr (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best-hit reports are available for download in Excel format.


Protein Homologs

Pyrus communis v1.0 proteins with NCBI nr homologs (EXCEL file) pcommunis_DAnjou_v1.0_vs_nr.xlsx.gz
Pyrus communis v1.0 proteins with NCBI nr (FASTA file) pcommunis_DAnjou_v1.0_vs_nr_hit.fasta.gz
Pyrus communis v1.0 proteins without NCBI nr (FASTA file) pcommunis_DAnjou_v1.0_vs_nr_noHit.fasta.gz
Pyrus communis v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) pcommunis_DAnjou_v1.0_vs_arabidopsis.xlsx.gz
Pyrus communis v1.0 proteins with Arabidopsis (Araport11) (FASTA file) pcommunis_DAnjou_v1.0_vs_arabidopsis_hit.fasta.gz
Pyrus communis v1.0 proteins without Arabidopsis (Araport11) (FASTA file) pcommunis_DAnjou_v1.0_vs_arabidopsis_noHit.fasta.gz
Pyrus communis v1.0 proteins with SwissProt homologs (EXCEL file) pcommunis_DAnjou_v1.0_vs_swissprot.xlsx.gz
Pyrus communis v1.0 proteins with SwissProt (FASTA file) pcommunis_DAnjou_v1.0_vs_swissprot_hit.fasta.gz
Pyrus communis v1.0 proteins without SwissProt (FASTA file) pcommunis_DAnjou_v1.0_vs_swissprot_noHit.fasta.gz
Pyrus communis v1.0 proteins with TrEMBL homologs (EXCEL file) pcommunis_DAnjou_v1.0_vs_trembl.xlsx.gz
Pyrus communis v1.0 proteins with TrEMBL (FASTA file) pcommunis_DAnjou_v1.0_vs_trembl_hit.fasta.gz
Pyrus communis v1.0 proteins without TrEMBL (FASTA file) pcommunis_DAnjou_v1.0_vs_trembl_noHit.fasta.gz
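
The E-value cutoffs and best-hit selection described above can be illustrated with a minimal Python sketch. It assumes BLAST tabular output (-outfmt 6), which is not one of the files provided here, and it picks the best hit by highest bit score, which is an assumption rather than the documented GDR procedure. The database keys and the function name best_hits are hypothetical.

```python
# Per-database E-value cutoffs as stated in the text above.
# The dictionary keys are illustrative labels, not official names.
EVALUE_CUTOFFS = {
    "nr": 1e-9,
    "araport11": 1e-6,
    "swissprot": 1e-6,
    "trembl": 1e-6,
}


def best_hits(tabular_lines, database):
    """Return {query_id: (subject_id, evalue, bitscore)} with one best hit per query.

    Expects BLAST tabular output (-outfmt 6): 12 tab-separated columns,
    with the E-value in column 11 and the bit score in column 12.
    Hits are kept only if their E-value is below the database cutoff.
    """
    cutoff = EVALUE_CUTOFFS[database]
    best = {}
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= cutoff:
            continue  # fails the "less than cutoff" requirement
        # Assumption: the best hit is the one with the highest bit score.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Calling best_hits(open("hits.tsv"), "nr") on a hypothetical tabular file would return one entry per query protein that has at least one hit passing the nr cutoff; queries absent from the result correspond to the "noHit" category.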



The Pyrus communis d'Anjou v1.0 assembly files are available in FASTA format.


Chromosomes (FASTA file) pcommunis_DAnjou_v1.0.fasta.gz


Gene Predictions

The Pyrus communis d'Anjou v1.0 genome gene prediction files are available in GFF3 and FASTA formats.


Genes (GFF3 file) pcommunis_DAnjou_v1.0.genes.gff3.gz
CDS sequences (FASTA file) pcommunis_DAnjou_v1.0.cds.fasta.gz
Protein sequences (FASTA file) pcommunis_DAnjou_v1.0.protein.fasta.gz
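
As a quick sanity check after downloading the gene predictions, the gzipped GFF3 file can be summarized by feature type. The following is a minimal sketch using only the Python standard library; the function name count_feature_types is hypothetical, and which feature types appear (gene, mRNA, exon, CDS, and so on) depends on the file itself.

```python
import gzip
from collections import Counter


def count_feature_types(gff3_path):
    """Count GFF3 records by feature type (column 3) in a gzipped file."""
    counts = Counter()
    with gzip.open(gff3_path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section follows; stop reading
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            columns = line.rstrip("\n").split("\t")
            if len(columns) != 9:
                continue  # not a well-formed GFF3 feature line
            counts[columns[2]] += 1
    return counts
```

For example, count_feature_types("pcommunis_DAnjou_v1.0.genes.gff3.gz")["gene"] would give the number of gene records in the downloaded file.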


Functional Analysis

Functional annotations for the Pyrus communis d'Anjou Genome v1.0 are available for download below. The Pyrus communis d'Anjou Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).


GO assignments from InterProScan pcommunis_DAnjou_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan pcommunis_DAnjou_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs pcommunis_DAnjou_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways pcommunis_DAnjou_v1.0_KEGG-pathways.xlsx.gz


Transcript Alignments
Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Pyrus communis d'Anjou genome assembly. Only alignments with an alignment length of 97% and 97% identity were retained. The available files are in GFF3 format.


Fragaria x ananassa GDR RefTrans v1 pcommunis_DAnjou_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 pcommunis_DAnjou_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 pcommunis_DAnjou_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 pcommunis_DAnjou_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 pcommunis_DAnjou_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 pcommunis_DAnjou_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 pcommunis_DAnjou_v1.0_pyrus_GDR_reftransV1
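
The 97% alignment-length and 97% identity filter applied to the BLAT alignments could be sketched as follows in Python, operating on BLAT's native PSL output. This is an illustration under stated assumptions, not the GDR pipeline itself: the text does not say how the two percentages were computed, so coverage is taken here as aligned bases divided by transcript length, identity as matching bases divided by aligned bases, and both thresholds are treated as minimums. The function name passes_filter is hypothetical.

```python
def passes_filter(psl_line, min_coverage=0.97, min_identity=0.97):
    """Return True if a PSL alignment line meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize (query/transcript length).
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_size = int(fields[10])

    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False  # nothing aligned, or malformed record

    # Assumed definitions; the source does not specify the exact formulas.
    identity = (matches + rep_matches) / aligned
    coverage = aligned / q_size
    return coverage >= min_coverage and identity >= min_identity
```

Applied line by line to a PSL file (after skipping its header lines, if present), this keeps only transcripts that align nearly full-length and nearly exactly, which is the kind of stringent mapping the filter described above is meant to enforce.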