Fragaria vesca Whole Genome v1.1.a2 (Re-annotation of v1.1)

Overview
Analysis Name: Fragaria vesca Whole Genome v1.1.a2 (Re-annotation of v1.1)
Method: MAKER2 annotation pipeline
Source: Fragaria vesca Whole Genome v1.1 Assembly & Annotation
Date performed: 2014-02-01

Reference

Darwish, O., Shahan, R., Liu, Z., Slovin, J. P., & Alkharouf, N. W. (2015). Re-annotation of the woodland strawberry (Fragaria vesca) genome. BMC Genomics, 16(1), 29.

Background

A draft of the F. vesca genome sequence (v1.0) was published in 2010, and the new assembly (v1.1) was built in 2011. Gene models visible within GBrowse are those from v1.0 superimposed onto v1.1; no new gene predictions were made for v1.1. The original v1.0 annotation was developed using GeneMark-ES+, a self-training gene prediction tool that relies primarily on combining ab initio predictions with the mapping of high-confidence ESTs, in addition to mapping gene deserts from transposable elements. Based on over 25 different tissue transcriptomes, the F. vesca genome annotation has been revised, providing several improvements over the previous annotation.

Genome annotation facts and statistics

The MAKER annotation pipeline was used to generate the revised F. vesca annotation. The following classes of evidence data were passed into the MAKER pipeline to generate the annotations: 1) 754,400 de novo assembled transcripts from 25 RNA-Seq samples, each with two biological replicates; 2) trained ab initio predictions from the SNAP gene prediction tool; 3) Augustus trained datasets of Arabidopsis thaliana and Solanum lycopersicum (tomato) transcriptomes; 4) first-generation F. vesca gene predictions obtained from the GDR; 5) reference-based assemblies (1,302,739) obtained by aligning all RNA-Seq samples to the F. vesca genome version 1.1 using Cufflinks; and 6) plant reference proteins downloaded from the UniProt database.

The re-annotation pipeline improved the accuracy of the existing annotations by drawing on extensive transcriptome data. It uncovered new genes, added exons to existing genes, and extended or merged exons.

Downloads

All assembly and annotation files are available for download by selecting the desired data type in the right-hand sidebar. Each data type page provides a description of the available files and links to download them.

Gene Predictions

The Fragaria vesca v1.1.a2 genome gene prediction files are available in GFF3 format.

Downloads

Genes and transcripts (GFF3 file): Fragaria_vesca_v1.1.a2.gff3.gz
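
As a quick sanity check after downloading, the GFF3 file can be summarized by feature type. The following is a minimal sketch, not part of the published pipeline: it assumes the standard nine-column, tab-separated GFF3 layout, and the function names count_feature_types and count_features_in_file are hypothetical helpers introduced here for illustration.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by type (column 3).

    Assumes standard GFF3: nine tab-separated columns per feature line,
    '#' for comments and directives, and an optional '##FASTA' section.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip malformed lines rather than guessing
        counts[fields[2]] += 1
    return counts


def count_features_in_file(path):
    """Open a plain or gzip-compressed GFF3 file and count its feature types."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return count_feature_types(handle)


# Example usage (assumes the file has been downloaded to the working directory):
# for feature_type, n in count_features_in_file("Fragaria_vesca_v1.1.a2.gff3.gz").most_common():
#     print(feature_type, n)
```

Which feature types appear (for example gene, mRNA, exon, CDS) depends on how the file was written, so the output is best treated as a description of the file rather than checked against a fixed list.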