Malus x domestica Whole Genome v1.0 Assembly & Annotation

Genome Overview
Analysis Name: Malus x domestica Whole Genome v1.0 Assembly & Annotation
Method: Constructed using a gene-centered strategy
Source: Sanger reads
Date performed: 2012-07-26

From this page you can browse and download the whole genome sequence of the apple genome assembly v1.0 published by Velasco et al., 2010, together with the predicted gene transcripts, their locations and their putative functions based on homology to known genes. You can BLAST your sequences against the genome sequence, the predicted genes and associated transcripts. You can search and browse the chromosomes, predicted genes and markers in GBrowse and view the evidence for each prediction and feature. There are two assemblies currently available:

  1. The original contig-based assembly from the Velasco et al., 2010 publication
  2. Pseudo haplotype assemblies derived from the original contigs

Use the resources side-bar to view details and download files for these assemblies.


Velasco et al. The genome of the domesticated apple (Malus × domestica Borkh). Nature Genetics 42, 833–839 (2010)

Additional information about this analysis:
Property Name    Value
JBrowse URL
Analysis Type    whole_genome

The Malus x domestica v1.0 genome assembly files are available in FASTA and GFF3 formats.  There are no pseudomolecule (chromosome) sequence files available for this assembly because of the multiple haplotypes present in the assembly.   Please see the Malus x domestica v1.0 pseudo haplotype (primary assembly) for a pseudomolecule sequence.


Contigs (FASTA file) Malus_x_domestica.v1.0.contigs.fa.gz
Contigs aligned to chromosomal coordinates (GFF3 file) Malus_x_domestica.v1.0.contigs.gff.gz
Chloroplast (FASTA file) Malus_x_domestica.v1.0.chloroplast.fa.gz
Mitochondria (FASTA file) Malus_x_domestica.v1.0.mitochondria.fa.gz
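
After downloading the contig FASTA file, basic assembly statistics can be checked locally. The following is a minimal Python sketch, assuming the download is a standard gzip-compressed FASTA file; the function names `read_fasta_lengths` and `n50` are illustrative and are not part of any tooling provided with this assembly.

```python
import gzip


def read_fasta_lengths(path):
    """Return {sequence_id: length} for a gzip-compressed FASTA file."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the sequence ID.
                parts = line[1:].split()
                current = parts[0] if parts else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def n50(lengths):
    """Return the N50 of an iterable of sequence lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total reaches half is the N50.
        if running >= half:
            return length
    return 0
```

With the contig file in the working directory, `read_fasta_lengths("Malus_x_domestica.v1.0.contigs.fa.gz")` would return the per-contig lengths, and passing its values to `n50` would give the contig N50.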








Gene Predictions

Gene models were predicted on the assembly contigs using several gene prediction tools, including Glimmer, FGENESH, Genewise, GMAP and Twinscan. A final consensus gene set was constructed using evidence derived from all of these tools. Both the consensus sequences and alternate gene models are available for download below. The consensus genes have been mapped to chromosomal positions; however, this mapping is provided for convenience only, as there is no chromosomal sequence for this assembly.

5' and 3' UTR regions are currently not available for gene models.


Consensus gene model mRNA (FASTA) Malus_x_domestica.v1.0.consensus_mRNA.fa.gz
Consensus gene model CDSs (FASTA) Malus_x_domestica.v1.0.consensus_CDS.fa.gz
Consensus gene model CDSs with 300bp flanking (FASTA) Malus_x_domestica_v1.0.consensus_CDS300flanking.fa.gz
Consensus gene model proteins (FASTA) Malus_x_domestica.v1.0.consensus_peptide.fa.gz
Alternate gene model CDSs (FASTA) Malus_x_domestica.v1.0.other_CDS.fa.gz
Alternate gene model proteins (FASTA) Malus_x_domestica.v1.0.other_peptide.fa.gz
Consensus genes located on chromosomes (GFF3) Malus_x_domestica.v1.0.consensus.gff.gz
Consensus genes located on contigs (GFF3) Malus_x_domestica.v1.0.consensus2contigs.gff.gz
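
The GFF3 files above can be inspected with a few lines of Python. This is a minimal sketch, assuming the files follow the standard nine-column, tab-delimited GFF3 layout; the function names are illustrative, and which feature types actually occur in these files is not stated on this page.

```python
import gzip
from collections import Counter


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for other lines."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None
    # Column 9 holds semicolon-separated key=value attribute pairs.
    attributes = {}
    for pair in cols[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return {
        "seqid": cols[0],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
        "attributes": attributes,
    }


def count_feature_types(path):
    """Count features by type (column 3) in a gzip-compressed GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section, if any, follows the features
            feature = parse_gff3_line(line)
            if feature is not None:
                counts[feature["type"]] += 1
    return counts
```

Calling `count_feature_types("Malus_x_domestica.v1.0.consensus.gff.gz")` on a downloaded copy would give a quick summary of what the file contains before loading it into a genome browser.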


Functional Annotations

The following functional annotation files were derived from processing all predicted consensus genes through InterProScan and the KEGG KAAS services.  Genes are mapped to InterPro domains, GO terms, KEGG pathways and orthologs.


Consensus genes mapped to InterPro domains Malus_x_domestica_v1.0.genes2IPR.txt
Consensus genes mapped to GO terms Malus_x_domestica_v1.0.genes2GO.txt
Consensus genes mapped to KEGG pathways Malus_x_domestica.v1.0.genes2KEGG_pathways.txt
Consensus genes mapped to KEGG orthologs Malus_x_domestica.v1.0.genes2KEGG_orthologs.txt



Excel Reports
Best-hit reports from blastp searches of the Malus x domestica genome v1.0 proteins against various protein databases are available below in Excel format.


Non-redundant best-match proteins (Excel file) Malus_x_domestica.v1.0_gene_pep_function_101210.formated.xls
TAIR (Arabidopsis) (Excel file) Malus_x_domestica.v1.0_gene_pep_tair_blasp_100510.formated.xls
ExPASy TrEMBL (Excel file) Malus_x_domestica.v1.0_gene_pep_uniprot_trembl_blastp_100610.formated.xls
ExPASy SwissProt (Excel file) Malus_x_domestica.v1.0_gene_pep_uniprot_sprot_blastp_100610.formated.xls
Prunus persica (peach) v1.0 proteins (Excel file) Malus_x_domestica.v1.0_gene_pep_peach_blastp_100610.formated.xls
Populus trichocarpa (poplar) v2.0 proteins (Excel file) Malus_x_domestica.v1.0_gene_pep_poplar_blastp_100610.formated.xls
Vitis vinifera (grape) proteins (Excel file) Malus_x_domestica.v1.0_gene_pep_Vitis_tblastn_100610.formated.xls


Markers / SNPs

Markers such as SNPs, SSRs and ESTs are available for download in Excel and tab-delimited formats.


Apple markers: SNPs, SSRs, ESTs (Excel) Malus_x_domestica.v1.0.markers.xls
IRSC 9K SNPs with links to chromosomes in GBrowse (Excel) IRSC_9K_apple_SNP_array.xls
IRSC 9K SNPs with links to contigs in GBrowse (Excel) IRSC_9K_apple_SNP_array.2contigs.xls
IRSC candidate SNPs for Chr 1 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 2 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 3 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 4 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 5 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 6 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 7 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 8 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 9 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 10 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 11 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 12 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 13 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 14 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 15 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 16 (Excel and tab-delimited)
IRSC candidate SNPs for Chr 17 (Excel and tab-delimited)




Repeats were predicted using read depth information from the genome assembly and then mapped, for convenience, to chromosomal positions. Both mappings are available in the GFF file below.


RosBREED Resequencing Alignments

A total of 31 different apple accessions were resequenced using Illumina short-read technology.  The reads were trimmed and aligned to the Malus x domestica v1.0 contig assembly and are available in BAM alignment files.

It is not necessary to download the BAM alignment files. Some are very large and multiple downloads may oversubscribe the network bandwidth. Rather, use the following instructions for viewing the alignments.

To view the apple resequencing alignments, please follow these instructions:

  1. Download the file. This file contains the reference sequence and gene models. After downloading, unzip this file in your working directory.
  2. Launch the Integrative Genomics Viewer (IGV). Launch the version appropriate for the amount of memory you have available on your computer.
  3. After IGV starts, load the genome file downloaded in the first step by clicking the menu item Genomes > Load Genome From File. Navigate to the folder where you unpacked the zip file from step 1 and select the file named Malus_x_domestica_v1.0.contigs.genome.
  4. Select an alignment file you wish to view by right-clicking on a file with a .bam extension and selecting the option Copy link location (in Chrome and Firefox), Copy shortcut (in Internet Explorer) or Copy link (in Safari).
  5. Add the alignment as a track in IGV by clicking the menu item File > Load from URL. Paste the URL copied in the previous step into the box.
  6. You may load as many alignment files as you want.
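
When many alignment files need to be loaded, the manual steps above can be scripted as an IGV batch file. The following Python sketch only generates the batch text, using the IGV batch commands `new`, `genome`, `load` and `goto`; the function name `build_igv_batch` and the example URLs are hypothetical, and the real BAM URLs must be copied from the download links as described in step 4.

```python
def build_igv_batch(genome_path, alignment_urls, locus=None):
    """Return IGV batch-script text that loads a genome and alignment tracks."""
    # "new" starts a fresh session; "genome" loads the .genome file by path.
    lines = ["new", f"genome {genome_path}"]
    for url in alignment_urls:
        # One "load" command per alignment file or URL.
        lines.append(f"load {url}")
    if locus:
        # Optionally jump to a region once the tracks are loaded.
        lines.append(f"goto {locus}")
    return "\n".join(lines) + "\n"


# Hypothetical usage: these URLs are placeholders, not real download links.
example = build_igv_batch(
    "Malus_x_domestica_v1.0.contigs.genome",
    ["https://example.org/accession_A.bam", "https://example.org/accession_B.bam"],
)
```

The resulting text can be saved to a file and run from within IGV via Tools > Run Batch Script. Paths that contain spaces may need quoting, which this sketch does not handle.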

All assembly and annotation files are available for download by selecting the desired data type in the right-hand "Resources" side bar. Each data type page provides a description of the available files and links to download them.