Malus x domestica Whole Genome v3.0.a1 Assembly & Annotation

Overview
Analysis Name: Malus x domestica Whole Genome v3.0.a1 Assembly & Annotation
Method: Bfast, in-house developed software
Source: Sanger reads, 454 reads, SOLiD reads
Date performed: 2015-04-03

The apple genome assembly version 3.0 mainly involved anchoring contigs with a manually curated consensus genetic map derived from 21 genetic maps, together with a new scaffold structure. Minor modifications were also made to the set of contigs, reducing them to 94,174.

New scaffolds were built from 30X coverage of SOLiD mate-pair reads with a single insert size of 5 kb. Reads were mapped against the contigs in single-end mode with the Bfast aligner (http://sourceforge.net/projects/bfast/), and the physical relationships between contigs were then reconstructed and filtered with in-house developed software.
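The in-house scaffolding software is not described here, so the following is only a minimal sketch of the general idea: mates are aligned independently, pairs whose two mates land on different contigs are counted as links, and weakly supported links are filtered out. The function name, the input dictionaries and the min_support threshold are all hypothetical and are not taken from the actual pipeline.

```python
from collections import Counter


def contig_links(hits_r1, hits_r2, min_support=3):
    """Count mate pairs that connect two different contigs.

    hits_r1 / hits_r2 map a pair identifier to the contig its first / second
    mate aligned to (each mate aligned on its own, i.e. single-end mode).
    Only contig pairs supported by at least min_support mate pairs are kept.
    min_support is an illustrative threshold, not a value from the source.
    """
    links = Counter()
    for pair_id, contig_a in hits_r1.items():
        contig_b = hits_r2.get(pair_id)
        # A pair is informative only if both mates mapped and they landed
        # on two different contigs.
        if contig_b is None or contig_b == contig_a:
            continue
        # Sort the names so (a, b) and (b, a) count as the same link.
        links[tuple(sorted((contig_a, contig_b)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_support}
```

A real scaffolder would also use mate orientation and the 5 kb insert size to order and orient contigs; this sketch covers only the link-counting and filtering step.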

Contigs and scaffolds were then anchored to chromosomes (linkage groups, LG) using a consensus of 21 genetic maps built from about 16,000 genetic markers from the apple Illumina 20K SNP chip array (Bianco et al. 2014, PLoS ONE 9(10): e110377). This provided anchoring information for 66,124 contigs (589,070 Kbp).
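As a rough illustration of marker-based anchoring, the sketch below assigns each contig to the linkage group named by the majority of its markers and places it at the median map position of those markers. This is an assumed, simplified rule for illustration only. The actual consensus map was manually curated, and the function name and input layout are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import median


def anchor_contigs(markers):
    """Assign each contig to a linkage group by majority vote of its markers.

    markers is an iterable of (contig, linkage_group, map_position_cM)
    tuples. Returns {contig: (linkage_group, median_position_cM)}.
    """
    by_contig = defaultdict(list)
    for contig, lg, cm in markers:
        by_contig[contig].append((lg, cm))

    placements = {}
    for contig, hits in by_contig.items():
        # The linkage group supported by the most markers wins.
        best_lg, _ = Counter(lg for lg, _ in hits).most_common(1)[0]
        # Markers pointing to other linkage groups are ignored for placement.
        positions = [cm for lg, cm in hits if lg == best_lg]
        placements[contig] = (best_lg, median(positions))
    return placements
```

Contigs with conflicting markers are exactly the cases where manual curation matters, so a majority vote should be read as a first pass rather than a final placement.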

Assembly 3.0 is provided by the Riccardo Velasco group as GFF3 files with contig coordinates on the LGs and, for backward compatibility with the previous version, also as a primary assembly and three alternative assemblies (see the description of Malus x domestica genome v2.0 for details).

Genes from v1.0 have been mapped to the v3.0 scaffolds (the combined current assembly and annotation is named v3.0.a1). Further annotation is underway and will be made available in the near future.

Note: Gene alignments to contigs can be viewed in GBrowse Malus x domestica v3.0.a1 contigs and contig alignments to LG can be viewed in GBrowse Malus x domestica v3.0.a1 pseudomolecules.

Download
All assembly and annotation files are available for download by selecting the desired data type in the right-hand sidebar. Each data type page provides a description of the available files and links to download them.
Assembly

The Malus x domestica v3.0.a1 genome assembly files are available in FASTA and GFF3 formats. This apple assembly contains a total of 17 pseudomolecules.

Downloads

Pseudomolecules & Scaffolds (GFF3 file)  Malus_x_domestica.v3.0.a1_pseudomolecules.gff3.gz
Contigs (GFF3 file)  Malus_x_domestica.v3.0.a1_contigs.gff3.gz
Contigs (FASTA file)  Malus_x_domestica.v3.0.a1_contigs.fasta.gz
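A quick way to inspect one of the gzipped GFF3 files after download is to count features per sequence and feature type. The sketch below relies only on the standard nine-column GFF3 layout and makes no assumption about which feature types these particular files contain. The function names are illustrative.

```python
import gzip
from collections import Counter


def count_gff3_features(lines):
    """Count GFF3 features by (seqid, feature type) from an iterable of lines."""
    counts = Counter()
    for line in lines:
        # An embedded ##FASTA section, if present, ends the feature table.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        seqid, _source, feature_type = cols[:3]
        counts[(seqid, feature_type)] += 1
    return counts


def count_gff3_file(path):
    """Open a gzipped GFF3 file in text mode and count its features."""
    with gzip.open(path, "rt") as handle:
        return count_gff3_features(handle)
```

For example, count_gff3_file("Malus_x_domestica.v3.0.a1_contigs.gff3.gz") would return a Counter keyed by (seqid, type), assuming the file has been downloaded to the working directory.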

 

Primary and three alternative assemblies

Primary haplotype pseudomolecules (AGP file)  Malus_x_domestica.v3.0.a1_lg-pht_scaffold.agp.tar.gz
Primary haplotype pseudomolecules (FASTA file)  Malus_x_domestica.v3.0.a1_lg-pht_scaffold.fasta.tar.gz
Three alternative haplotype assemblies (TSV file)  Malus_x_domestica.v3.0.a1_lg-aht.tsv.tar.gz
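The AGP file describes how each pseudomolecule is built from components and gaps. Assuming the file follows the standard tab-delimited AGP layout (object, start, end, part number, component type, then component or gap fields), which is not confirmed here, a per-object summary could be sketched as follows. The archive is a tar.gz, so it needs to be extracted first. The function name is illustrative.

```python
from collections import defaultdict


def summarize_agp(lines):
    """Summarize an AGP file per object (e.g. per pseudomolecule).

    Returns {object: {"components": n, "gap_bp": total gap length,
    "length": highest end coordinate seen}}.
    """
    summary = defaultdict(lambda: {"components": 0, "gap_bp": 0, "length": 0})
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.rstrip("\n").split("\t")
        obj, obj_end, comp_type = cols[0], int(cols[2]), cols[4]
        entry = summary[obj]
        entry["length"] = max(entry["length"], obj_end)
        if comp_type in ("N", "U"):
            # Gap lines carry the gap length in column 6.
            entry["gap_bp"] += int(cols[5])
        else:
            # Any other type is a sequence component such as a contig.
            entry["components"] += 1
    return dict(summary)
```

Comparing the component count and gap total per object against the GFF3 coordinates is a simple consistency check between the two representations.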

 

Other Files

Positions of ambiguous nucleotides in primary haplotype assembly all_iupac.zip

 

Repeats

Repeats were predicted on the contigs and are provided below as tab-delimited and FASTA files.

Downloads

Predicted repeats on contigs (Tab-delimited file) Malus_x_domestica.v3.0.a1_repeats.txt.gz
Predicted repeats on contigs (FASTA file) Malus_x_domestica.v3.0.a1_repeats.fasta.gz

 

Genes

The Malus x domestica v1.0 genes were aligned to the v3.0.a1 genome and are provided for download below:

V1 Gene Alignments (GFF3 file) Malus_x_domestica.v3.0.a1_v1_gene_alignment.gff3.gz
Gene Set - CDS (FASTA file) Malus_x_domestica.v3.0.a1_gene_set_cds.fasta.gz
Gene Set - Peptides (FASTA file) Malus_x_domestica.v3.0.a1_gene_set_pep.fasta.gz
Mapping between MDP genes and NCBI LOC genes MDP_vs_ncbi_by_cds_with_loc.xlsx
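After downloading the CDS gene set, a simple sanity check is to read the FASTA records and flag sequences whose length is not a multiple of three. The sketch below is a plain-Python FASTA reader with illustrative function names. It assumes nothing about the identifier format in the actual file beyond the usual ">" header convention.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token as the identifier.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def cds_not_multiple_of_three(lines):
    """Return identifiers of CDS records whose length is not divisible by 3."""
    return [name for name, seq in read_fasta(lines) if len(seq) % 3 != 0]
```

For the gzipped download, the lines can be supplied with gzip.open(path, "rt") from the Python standard library.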