Genome Assembly of F. x ananassa and four wild species v1.0
Fragaria x ananassa Genome v1.0 (FAN_r1.1)
Fragaria x ananassa Reference Genome v1.0 (FANhybrid_r1.2)
Fragaria iinumae Genome v1.0 (FII_r1.1)
Fragaria nipponica Genome v1.0 (FNI_r1.1)
Fragaria nubicola Genome v1.0 (FNU_r1.1)
Fragaria orientalis Genome v1.0 (FOR_r1.1)
The cultivated strawberry (Fragaria x ananassa) is one of the most popular and globally consumed fruit crops.
F. x ananassa is an octoploid (2n = 8x = 56) species that originated from a natural hybridization between F. virginiana and F. chiloensis. The genus Fragaria belongs to the family Rosaceae and comprises one cultivated species (F. x ananassa) and 21 wild species, including 12 diploids, five tetraploids, one hexaploid, two octoploids, and one decaploid.
De novo whole-genome sequencing of the octoploid strawberry, F. x ananassa, was performed using the Illumina and Roche 454 sequencing platforms. A Japanese variety bred in Chiba prefecture, 'Reikou', was used for the analysis.
A virtual 'reference genome', which integrates the genome sequences of homeologous chromosomes, was constructed by eliminating heterozygous bases during sequence assembly (FANhybrid_r1.2). In parallel, four wild Fragaria species that represent the genetic diversity of the genus were selected based on simple sequence repeat (SSR) markers and subjected to whole-genome sequencing on an Illumina platform. The assembled contigs of the wild species, along with the F. x ananassa contigs, were designated as follows:
FAN (F. x ananassa), FII (F. iinumae), FNI (F. nipponica), FNU (F. nubicola) and FOR (F. orientalis).
The sequence IDs were named according to the following criteria.
Reference genome: The sequences derived from the 454 scaffolds were prefixed 'FANhyb_rscf', followed by eight sequence-specific digits. The sequences derived from the Illumina scaffolds and unassembled contigs were prefixed 'FANhyb_icon' and suffixed '_a' after the eight digits. The Illumina singlet sequences were prefixed 'FANhyb_iscf' or 'FANhyb_icon'; the former was used for sequences derived from Illumina scaffolds and the latter for sequences derived from unassembled contigs, both generated by SOAPdenovo v1.05. The suffixes '_r' and '_o' were appended after the eight digits for repeat and outlier singlets, respectively, and all other singlets were suffixed with '_s'.
Illumina-assembled genome sequences: Each Illumina scaffold and unassembled contig was named with three capital letters representing the designated genome name (FAN, FII, FNI, FNU, or FOR), followed by 'iscf' or 'icon' and eight digits. 'iscf' and 'icon' were used for scaffolds and unassembled contigs, respectively.
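The naming criteria above can be turned into a small ID parser. The following is a minimal sketch in Python, built only from the description on this page: the function name parse_sequence_id, the dictionary keys, and the example IDs are hypothetical, and the optional underscore after the three-letter genome code is an assumption because the description does not state a separator. The pattern is also deliberately permissive about which prefix and suffix combinations occur, so IDs in the distributed files may need a stricter or adjusted pattern.

```python
import re
from typing import Dict, Optional

# Genome codes and names as designated on this page.
GENOMES = {
    "FAN": "F. x ananassa",
    "FII": "F. iinumae",
    "FNI": "F. nipponica",
    "FNU": "F. nubicola",
    "FOR": "F. orientalis",
}
SOURCES = {
    "rscf": "454 scaffold",
    "iscf": "Illumina scaffold",
    "icon": "Illumina unassembled contig",
}
# Suffix letters used in the reference genome (FANhybrid_r1.2).
SUFFIXES = {
    "a": "Illumina-derived sequence",
    "r": "repeat singlet",
    "o": "outlier singlet",
    "s": "other singlet",
}

# Reference genome IDs: 'FANhyb_' + type + eight digits + optional suffix.
REFERENCE_ID = re.compile(r"^FANhyb_(rscf|iscf|icon)(\d{8})(?:_([aros]))?$")
# Illumina-assembled IDs: genome code + type + eight digits.
# Assumption: the underscore after the genome code is optional.
ILLUMINA_ID = re.compile(r"^(FAN|FII|FNI|FNU|FOR)_?(iscf|icon)(\d{8})$")


def parse_sequence_id(seq_id: str) -> Optional[Dict[str, object]]:
    """Return the fields encoded in a sequence ID, or None if it does not match."""
    m = REFERENCE_ID.match(seq_id)
    if m:
        kind, digits, suffix = m.groups()
        return {
            "assembly": "FANhybrid_r1.2",
            "source": SOURCES[kind],
            "number": int(digits),
            "class": SUFFIXES.get(suffix),  # None when no suffix is present
        }
    m = ILLUMINA_ID.match(seq_id)
    if m:
        genome, kind, digits = m.groups()
        return {
            "assembly": genome + "_r1.1",
            "genome": GENOMES[genome],
            "source": SOURCES[kind],
            "number": int(digits),
        }
    return None
```

For example, parse_sequence_id("FII_icon00000042") would report an unassembled contig of F. iinumae, while an ID that follows neither scheme yields None. The example ID is made up for illustration.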
Hirakawa H, Shirasawa K, Kosugi S, Tashiro K, Nakayama S, Yamada M, Kohara M, Watanabe A, Kishida Y, Fujishiro T, Tsuruoka H, Minami C, Sasamoto S, Kato M, Nanri K, Komaki A, Yanagi T, Guoxin Q, Maeda F, Ishikawa M, Kuhara S, Sato S, Tabata S, Isobe SN. (2013)
Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species. DNA Res. doi:10.1093/dnares/dst049
Project Page: Strawberry GARDEN
Assembly of the F. x ananassa reference genome
De novo whole-genome sequencing of the octoploid strawberry, F. x ananassa, was performed using the Illumina and Roche 454 sequencing platforms. A virtual 'reference genome', which integrates the genome sequences of homeologous chromosomes, was constructed by eliminating heterozygous bases during sequence assembly (FANhybrid_r1.2). The 454 reads were assembled using Newbler 2.7 (Roche Diagnostics) in heterozygotic mode. In parallel, all the F. x ananassa Illumina reads were assembled using SOAPdenovo v1.05 with k-mer = 75.
Assembly of Illumina reads from F. x ananassa and wild species
The Illumina reads from F. x ananassa were assembled along with those from the four wild species, F. iinumae, F. nipponica, F. nubicola, and F. orientalis (Fig. 2). The Illumina reads of each species were assembled using SOAPdenovo v1.05 with k-mer = 75.
FANhybrid_r1.2 and the five Illumina-assembled genomes were subjected to gene prediction and modelling using Augustus 2.7 with a training set from A. thaliana and the following parameters: gene model = partial; protein, introns, start, stop, cds, coding seq, gff3, and UTR = on; and alternatives-from-evidence and alternatives-from-sampling = true.
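The Augustus settings listed above correspond to standard Augustus command-line options. The following Python sketch shows how such a command line might be put together; it is an illustration, not the exact invocation used for this release. The helper name build_augustus_command and the input file name are hypothetical, and 'arabidopsis' is assumed to be the species identifier matching the A. thaliana training set.

```python
from typing import List


def build_augustus_command(fasta_path: str) -> List[str]:
    """Build an Augustus argument list mirroring the parameters described above."""
    return [
        "augustus",
        "--species=arabidopsis",  # assumed identifier for the A. thaliana training set
        "--genemodel=partial",
        "--protein=on",
        "--introns=on",
        "--start=on",
        "--stop=on",
        "--cds=on",
        "--codingseq=on",
        "--gff3=on",
        "--UTR=on",
        "--alternatives-from-evidence=true",
        "--alternatives-from-sampling=true",
        fasta_path,  # query sequence file comes last
    ]


if __name__ == "__main__":
    # Hypothetical input file name, used only for illustration.
    print(" ".join(build_augustus_command("FANhybrid_r1.2.fasta")))
```

Returning the arguments as a list keeps them ready for subprocess.run without shell quoting; the command is only printed here, so the sketch runs without Augustus installed.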
All assembly and annotation files are available for download by selecting the desired data type in the right-hand "Resources" sidebar. Each data type page provides a description of the available files and links to download them.
Alternatively, you may explore the files or perform a bulk download using the FTP site.