Malus x domestica HFTH1 Whole Genome v1.0

Analysis Name: Malus x domestica HFTH1 Whole Genome v1.0
Method: Illumina PE, Pacific Biosciences SMRT and Hi-C
Source: (v1.0.a1)
Date performed: 2019-04-09


Liyi Zhang, Jiang Hu, Xiaolei Han, Jingjing Li, Yuan Gao, Christopher M. Richards, Caixia Zhang, Yi Tian, Guiming Liu, Hera Gul, Dajiang Wang, Yu Tian, Chuanxin Yang, Minghui Meng, Gaopeng Yuan, Guodong Kang, Yonglong Wu, Kun Wang, Hengtao Zhang, Depeng Wang & Peihua Cong. A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour. Nature Communications. 2019 April 02.


About the Assembly


The authors assembled a high-quality genome (contig N50 of 6.99Mb) of the apple anther-derived homozygous line HFTH1 and revealed that the extensive genomic variations are largely attributable to the activity of transposable elements.

Sequencing, Assembly, and Annotation

The genome was sequenced using PacBio single-molecule long reads (77Gb with an average length of 13.1kb), 66-fold Illumina paired-end short reads (43.3Gb), 224-fold optical map data (147.8Gb with an average length of 178.9kb) and 145-fold Hi-C data. The assembly was performed in a stepwise fashion. The initial assembly of the PacBio-only data generated a 656.52Mb genome with a contig N50 of 4.63Mb. The initial contigs were polished with PacBio long reads and Illumina short reads. The polished contigs were then scaffolded using optical map data. During this step, four contigs containing conflicting connections were identified and split to resolve the conflicts, and 58.5% of the gaps introduced by scaffolding were closed by a subsequent gap-filling procedure. Finally, scaffolding with Hi-C data allowed the accurate clustering and ordering of 17 pseudo-chromosomes covering the 658.90Mb assembly, with a contig N50 of 6.99Mb and a maximum contig length of 18.01Mb. The assembly size was close to the estimated genome size of GDDH13, but represented 92.99% of the genome size estimated for HFTH1 by k-mer analysis (708.54Mb), and ~97.89% of the HFTH1 Illumina reads could be mapped to the assembly. In addition, the 160,068bp chloroplast genome and the 396,939bp mitochondrial genome were each assembled into a single complete contig.
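The contig N50 and the "92.99% of the estimated genome size" figure quoted above are both simple summary statistics. The following is a minimal sketch of how such values are commonly computed; the function names and the toy contig lengths are illustrative assumptions and are not taken from the authors' pipeline. Only the two sizes passed to `percent_of_estimate` (658.90 and 708.54 Mb) come from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def percent_of_estimate(assembly_mb, estimate_mb):
    """Assembly size as a percentage of an estimated genome size."""
    return 100 * assembly_mb / estimate_mb


# Toy lengths (made up): total = 20, half = 10; 6 + 5 = 11 >= 10, so N50 = 5.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))

# Sizes from the text: 658.90 Mb assembly vs. 708.54 Mb k-mer estimate.
print(round(percent_of_estimate(658.90, 708.54), 2))  # 92.99
```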


Homology of the Malus x domestica HFTH1 genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for NCBI nr (Release 2018-05), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (TAIR10), UniProtKB/SwissProt (Release 2019-01), and UniProtKB/TrEMBL (Release 2019-01) databases. The best hit reports are available for download in Excel format.
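As an illustration of the cutoff-and-best-hit logic described above, here is a minimal sketch that filters blastp results and keeps one best hit per query. It assumes the results are in the standard 12-column tabular BLAST format (query ID, subject ID, ..., E-value in column 11, bit score in column 12); the page does not state which output format was used, and the protein and subject IDs below are made up. Ranking hits by bit score is also an assumption.

```python
def best_hits(lines, max_evalue):
    """Keep, for each query, the highest-scoring hit with E-value < max_evalue.

    `lines` is an iterable of tab-separated rows in 12-column tabular
    BLAST format. Returns {query_id: (subject_id, evalue, bitscore)}.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # the cutoff is "less than", so equal values are dropped
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: two hits for protA, one weak hit for protB.
rows = [
    "protA\tsubj1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "protA\tsubj2\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-80\t410",
    "protB\tsubj3\t40.0\t80\t48\t0\t1\t80\t1\t80\t1e-7\t55",
]
nr_like = best_hits(rows, max_evalue=1e-9)    # cutoff used for NCBI nr
tair_like = best_hits(rows, max_evalue=1e-6)  # cutoff used for the other databases
print(sorted(nr_like))    # ['protA']
print(sorted(tair_like))  # ['protA', 'protB']
```

Queries missing from the returned dictionary correspond to the "noHit" category of the files listed on this page.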


Protein Homologs

Malus x domestica HFTH1 v1.0 proteins with NCBI nr homologs (EXCEL file) Mxd_HFTH1_v1.0_vs_nr.xlsx.gz
Malus x domestica HFTH1 v1.0 proteins with NCBI nr (FASTA file) Mxd_HFTH1_v1.0_vs_nr_hit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins without NCBI nr (FASTA file) Mxd_HFTH1_v1.0_vs_nr_noHit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins with Arabidopsis (TAIR10) homologs (EXCEL file) Mxd_HFTH1_v1.0_vs_arabidopsis.xlsx.gz
Malus x domestica HFTH1 v1.0 proteins with Arabidopsis (TAIR10) (FASTA file) Mxd_HFTH1_v1.0_vs_arabidopsis_hit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins without Arabidopsis (TAIR10) (FASTA file) Mxd_HFTH1_v1.0_vs_arabidopsis_noHit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins with SwissProt homologs (EXCEL file) Mxd_HFTH1_v1.0_vs_swissprot.xlsx.gz
Malus x domestica HFTH1 v1.0 proteins with SwissProt (FASTA file) Mxd_HFTH1_v1.0_vs_swissprot_hit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins without SwissProt (FASTA file) Mxd_HFTH1_v1.0_vs_swissprot_noHit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins with TrEMBL homologs (EXCEL file) Mxd_HFTH1_v1.0_vs_trembl.xlsx.gz
Malus x domestica HFTH1 v1.0 proteins with TrEMBL (FASTA file) Mxd_HFTH1_v1.0_vs_trembl_hit.fasta.gz
Malus x domestica HFTH1 v1.0 proteins without TrEMBL (FASTA file) Mxd_HFTH1_v1.0_vs_trembl_noHit.fasta.gz



All assembly and annotation files are available for download by selecting the desired data type in the left-hand sidebar. Each data type page provides a description of the available files and links to download them.



Chromosome (FASTA file) Malus_x_domestica_HFTH1_v1.0.a1.fasta.gz


Gene Predictions

The Malus x domestica HFTH1 v1.0 genome gene prediction files are available in FASTA and GFF3 formats.


Transcript CDS sequences (FASTA file) Malus_x_domestica_HFTH1_v1.0.a1.transcripts.fasta.gz
Protein sequences (FASTA file) Malus_x_domestica_HFTH1_v1.0.a1.proteins.fasta.gz
Genes (GFF3 file) Malus_x_domestica_HFTH1_v1.0.genes.a1.gff3.gz
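Because the gene prediction files are gzip-compressed, they can be read directly without unpacking them first. The sketch below counts GFF3 features by type (column 3) using only the Python standard library. The helper name, the demo records, and the `Chr01` sequence name are hypothetical; the feature types actually present in the HFTH1 GFF3 file are not stated on this page.

```python
import gzip
import os
import tempfile
from collections import Counter


def count_feature_types(gff3_path):
    """Count features per type (column 3) in a gzip-compressed GFF3 file."""
    counts = Counter()
    with gzip.open(gff3_path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            columns = line.rstrip("\n").split("\t")
            if len(columns) == 9:  # a GFF3 feature line has nine columns
                counts[columns[2]] += 1
    return counts


# Demo on a tiny made-up GFF3 file (hypothetical IDs and coordinates).
demo_lines = [
    "##gff-version 3",
    "Chr01\texample\tgene\t100\t900\t.\t+\t.\tID=geneA",
    "Chr01\texample\tmRNA\t100\t900\t.\t+\t.\tID=geneA.t1;Parent=geneA",
    "Chr01\texample\tCDS\t150\t800\t.\t+\t0\tID=geneA.t1.cds;Parent=geneA.t1",
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.gff3.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("\n".join(demo_lines) + "\n")
    demo_counts = count_feature_types(demo_path)
print(dict(demo_counts))
```

After downloading, the same function could be pointed at the real file, for example `count_feature_types("Malus_x_domestica_HFTH1_v1.0.genes.a1.gff3.gz")`.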


Functional Analysis

Functional annotations for the Malus x domestica HFTH1 genome v1.0 are available for download below. The Malus x domestica HFTH1 genome v1.0 proteins were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).


GO assignments from InterProScan Malus_x_domestica_HFTH1_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan Malus_x_domestica_HFTH1_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs Malus_x_domestica_HFTH1_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways Malus_x_domestica_HFTH1_v1.0_KEGG-pathways.xlsx.gz


Transcript Alignments
Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Malus x domestica genome assembly. Alignments with an alignment length of 97% and 97% identity were preserved. The available files are in GFF3 format.


Fragaria x ananassa GDR RefTrans v1 Malus_x_domestica_HFTH1_v1.0.a1_f.x.ananassa_GDR_reftransV1
Malus x domestica GDR RefTrans v1 Malus_x_domestica_HFTH1_v1.0.a1_m.x.domestica_GDR_reftransV1
Prunus avium GDR RefTrans v1 Malus_x_domestica_HFTH1_v1.0.a1_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 Malus_x_domestica_HFTH1_v1.0.a1_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 Malus_x_domestica_HFTH1_v1.0.a1_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 Malus_x_domestica_HFTH1_v1.0.a1_rubus_GDR_reftransV2
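The 97% alignment length and 97% identity filter described for these transcript alignments can be sketched as follows. This is a simplified illustration, not the GDR team's actual procedure: it assumes both thresholds are minimums, defines alignment length as aligned bases divided by transcript length, and defines identity as matching bases divided by aligned bases. The function name and the example numbers are hypothetical.

```python
def passes_filter(matches, mismatches, query_size,
                  min_coverage=0.97, min_identity=0.97):
    """Return True if an alignment meets both thresholds.

    matches     -- aligned bases that are identical
    mismatches  -- aligned bases that differ
    query_size  -- full length of the transcript
    """
    aligned = matches + mismatches
    if aligned == 0 or query_size <= 0:
        return False
    coverage = aligned / query_size  # fraction of the transcript aligned
    identity = matches / aligned     # fraction of aligned bases that match
    return coverage >= min_coverage and identity >= min_identity


# Made-up alignments for a 1,000-base transcript.
print(passes_filter(980, 10, 1000))  # True: 99% aligned, ~99% identical
print(passes_filter(900, 5, 1000))   # False: only 90.5% of the transcript aligned
print(passes_filter(940, 50, 1000))  # False: identity is ~94.9%
```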