Crataegus pinnatifida var. major Genome v1.0 Assembly & Annotation

Analysis Name: Crataegus pinnatifida var. major Genome v1.0 Assembly & Annotation
Method: Nanopore PromethION, NextDenovo (2.4)
Source: Crataegus pinnatifida var. major Genome v1.0
Date performed: 2022-05-23


Zhang Ticao, Qiao Qin, Du Xiao, Zhang Xiao, Hou Yali, Wei Xin, Sun Chao, Zhang Rengang, Yun Quanzheng, Crabbe M. James, Van De Peer Yves, Dong Wenxuan. The cultivated hawthorn (Crataegus pinnatifida var. major) genome sheds light on the evolution of Maleae (apple tribe). Accepted to Journal of Integrative Plant Biology. 2022.


Cultivated hawthorn (Crataegus pinnatifida var. major) is an important medicinal and edible plant that has a long history of use for health protection in China. Herein, we provide a de novo chromosome-level genome sequence of the hawthorn cultivar ‘Qiu Jinxing’. We assembled an 823.41 Mb genome encoding 40,571 genes and further anchored 779.24 Mb of sequence into 17 pseudo-chromosomes, which accounts for 94.64% of the assembled genome. Phylogenomic analyses revealed that cultivated hawthorn diverged from the combined clades of Malus and Pyrus at approximately 11.8 Mya. Notably, the genes involved in flavonoid and triterpenoid biosynthetic pathways have been significantly amplified in the hawthorn genome. In addition, our results indicated that the Maleae (apple tribe) share a unique ancient tetraploid event; however, no recent independent whole-genome duplication event was specifically detected in hawthorn. The amplification of long terminal repeat retrotransposons (e.g., Ty3/gypsy) contributed the most to the expansion of the hawthorn genome. Furthermore, we identified two paleo-sub-genomes in extant species of Maleae and found that these two sub-genomes showed different rearrangement mechanisms. The ancestral chromosomes of Rosaceae were reconstructed and the paleo-polyploid origin of Maleae is discussed. Overall, our study provides an improved context for understanding the evolution of Maleae, and this new high-quality reference genome provides a useful resource for horticultural improvement of hawthorn.


Homology of the Crataegus pinnatifida var. major Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for the NCBI nr database (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
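The E-value filtering and best-hit selection described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used to build the reports: it assumes the blastp results are in the standard 12-column tabular layout (`-outfmt 6`), which the page does not state, and the function name `best_hits` and the example rows are hypothetical.

```python
# Column positions in BLAST tabular output (-outfmt 6); assumed, not stated on this page.
QUERY, SUBJECT, EVALUE, BITSCORE = 0, 1, 10, 11


def best_hits(tabular_text, evalue_cutoff):
    """Map each query ID to (subject, evalue, bitscore) for its best hit.

    Hits with an E-value at or above the cutoff are discarded. Among the
    remaining hits, the lowest E-value wins and bit score breaks ties.
    """
    best = {}
    for line in tabular_text.splitlines():
        row = line.split("\t")
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        evalue, bitscore = float(row[EVALUE]), float(row[BITSCORE])
        if evalue >= evalue_cutoff:
            continue
        kept = best.get(row[QUERY])
        if kept is None or (evalue, -bitscore) < (kept[1], -kept[2]):
            best[row[QUERY]] = (row[SUBJECT], evalue, bitscore)
    return best


# Hypothetical rows; the IDs and scores are made up for illustration.
EXAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["geneA", "subj1", "80.0", "100", "20", "0", "1", "100", "1", "100", "1e-20", "100"],
        ["geneA", "subj2", "95.0", "100", "5", "0", "1", "100", "1", "100", "1e-50", "200"],
        ["geneB", "subj3", "60.0", "100", "40", "0", "1", "100", "1", "100", "1e-7", "50"],
    ]
)

nr_style = best_hits(EXAMPLE, 1e-9)     # cutoff quoted for NCBI nr
other_style = best_hits(EXAMPLE, 1e-6)  # cutoff quoted for the other databases
```

With the stricter 1e-9 cutoff the hypothetical `geneB` hit (E-value 1e-7) is dropped, while the 1e-6 cutoff keeps it. Proteins that end up with no entry correspond to the "without" (noHit) FASTA files listed below.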


Protein Homologs

Crataegus pinnatifida v1.0 proteins with NCBI nr homologs (EXCEL file) crataegus_pinnatifida_v1.0_vs_nr.xlsx.gz
Crataegus pinnatifida v1.0 proteins with NCBI nr (FASTA file) crataegus_pinnatifida_v1.0_vs_nr_hit.fasta.gz
Crataegus pinnatifida v1.0 proteins without NCBI nr (FASTA file) crataegus_pinnatifida_v1.0_vs_nr_noHit.fasta.gz
Crataegus pinnatifida v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) crataegus_pinnatifida_v1.0_vs_arabidopsis.xlsx.gz
Crataegus pinnatifida v1.0 proteins with Arabidopsis (Araport11) (FASTA file) crataegus_pinnatifida_v1.0_vs_arabidopsis_hit.fasta.gz
Crataegus pinnatifida v1.0 proteins without Arabidopsis (Araport11) (FASTA file) crataegus_pinnatifida_v1.0_vs_arabidopsis_noHit.fasta.gz
Crataegus pinnatifida v1.0 proteins with SwissProt homologs (EXCEL file) crataegus_pinnatifida_v1.0_vs_swissprot.xlsx.gz
Crataegus pinnatifida v1.0 proteins with SwissProt (FASTA file) crataegus_pinnatifida_v1.0_vs_swissprot_hit.fasta.gz
Crataegus pinnatifida v1.0 proteins without SwissProt (FASTA file) crataegus_pinnatifida_v1.0_vs_swissprot_noHit.fasta.gz
Crataegus pinnatifida v1.0 proteins with TrEMBL homologs (EXCEL file) crataegus_pinnatifida_v1.0_vs_trembl.xlsx.gz
Crataegus pinnatifida v1.0 proteins with TrEMBL (FASTA file) crataegus_pinnatifida_v1.0_vs_trembl_hit.fasta.gz
Crataegus pinnatifida v1.0 proteins without TrEMBL (FASTA file) crataegus_pinnatifida_v1.0_vs_trembl_noHit.fasta.gz



The Crataegus pinnatifida var. major Genome v1.0 assembly file is available in FASTA format.


Chromosomes (FASTA file) Cpinnatifida_major_v1.0.fasta.gz
Repeats (GFF3 file) Cpinnatifida_major_v1.0.repeats.gff3.gz
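As a quick sanity check after downloading the assembly, the sequence names and lengths can be tallied with the Python standard library. The sketch below is an assumption-laden example rather than part of the release: the helper names `fasta_lengths` and `fasta_lengths_from_gz` are hypothetical, and it assumes a plain gzip-compressed FASTA file.

```python
import gzip


def fasta_lengths(lines):
    """Return {sequence_name: length} from an iterable of FASTA text lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def fasta_lengths_from_gz(path):
    """Same as fasta_lengths, reading a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)
```

For example, `fasta_lengths_from_gz("Cpinnatifida_major_v1.0.fasta.gz")` would return one entry per sequence in the file, and summing the values gives the total number of bases in the download.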


Gene Predictions

The Crataegus pinnatifida var. major v1.0 genome gene prediction files are available in FASTA and GFF3 formats.


Protein sequences (FASTA file) Cpinnatifida_major_v1.0.proteins.fasta.gz
CDS (FASTA file) Cpinnatifida_major_v1.0.cds.fasta.gz
Genes (GFF3 file) Cpinnatifida_major_v1.0.genes.gff3.gz


Functional Analysis

Functional annotation files for the Crataegus pinnatifida var. major Genome v1.0 are available for download below. The Crataegus pinnatifida Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).


GO assignments from InterProScan crataegus_pinnatifida_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan crataegus_pinnatifida_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs crataegus_pinnatifida_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways crataegus_pinnatifida_v1.0_KEGG-pathways.xlsx.gz


Transcript Alignments
Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map transcripts to the Crataegus pinnatifida genome assembly. Alignments with an alignment length of 97% and 97% identity were retained. The available files are in GFF3 format.
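The 97%/97% filter can be illustrated on BLAT's native PSL output. This sketch rests on several assumptions that the page does not confirm: that "alignment length" means the aligned fraction of the transcript (query), that identity is computed as matching bases over aligned bases, and that the filter was applied before conversion to GFF3. The function names `psl_passes` and `example_line` are hypothetical.

```python
def psl_passes(psl_line, min_coverage=0.97, min_identity=0.97):
    """Return True if a PSL record meets both thresholds.

    PSL columns used: 0 = matches, 1 = misMatches, 2 = repMatches, 10 = qSize.
    Coverage and identity are defined as described in the lead-in (assumed).
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = (int(fields[i]) for i in range(3))
    q_size = int(fields[10])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    coverage = aligned / q_size                   # fraction of transcript aligned
    identity = (matches + rep_matches) / aligned  # fraction of aligned bases matching
    return coverage >= min_coverage and identity >= min_identity


def example_line(matches, mismatches, q_size):
    """Build a hypothetical single-block, 21-column PSL record for illustration."""
    aligned = matches + mismatches
    fields = [matches, mismatches, 0, 0, 0, 0, 0, 0, "+", "transcript1", q_size,
              0, aligned, "chr1", 1000000, 5000, 5000 + aligned, 1,
              f"{aligned},", "0,", "5000,"]
    return "\t".join(str(f) for f in fields)
```

Under these definitions, a 1,000-base transcript with 980 matches and 10 mismatches passes (99% coverage, about 99% identity), whereas one with 950 matches and 40 mismatches fails on identity (about 96%).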


Fragaria x ananassa GDR RefTrans v1 crataegus_pinnatifida_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 crataegus_pinnatifida_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 crataegus_pinnatifida_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 crataegus_pinnatifida_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 crataegus_pinnatifida_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 crataegus_pinnatifida_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 crataegus_pinnatifida_v1.0_pyrus_GDR_reftransV1