Fragaria x ananassa Camarosa Genome v1.0.a2 (Re-annotation of v1.0.a1)

Overview
Analysis Name: Fragaria x ananassa Camarosa Genome v1.0.a2 (Re-annotation of v1.0.a1)
Method: An optimized annotation pipeline mainly using BRAKER2
Source: Fragaria ananassa Whole Genome v1.0.a2
Date performed: 2021-02-16

Reference

Tianjia Liu, Muzi Li, Zhongchi Liu, Xianyan Ai and Yongping Li (2021). Reannotation of the cultivated strawberry genome and establishment of a strawberry genome database. Horticulture Research 8, 41.

Genome annotation background and statistics

Cultivated strawberry (Fragaria × ananassa) is an important fruit crop species whose fruits are enjoyed by many worldwide. An octoploid of hybrid origin, the complex genome of this species was recently sequenced, serving as a key reference genome for cultivated strawberry and related species of the Rosaceae family. The current annotation of the F. ananassa genome mainly relies on ab initio predictions and, to a lesser extent, transcriptome data. Here, we present the structural and functional reannotation of the F. ananassa genome based on one PacBio full-length RNA library and ninety-two Illumina RNA-Seq libraries. This improved annotation of the F. ananassa genome, v1.0.a2, comprises a total of 108,447 gene models, with 97.85% complete BUSCOs. The models of 19,174 genes were modified, 360 new genes were identified, and 11,044 genes were found to have alternatively spliced isoforms. Additionally, we constructed a strawberry genome database (SGD) for strawberry gene homolog searching and annotation downloading. Finally, the transcriptomes of the receptacles and achenes of F. ananassa at four developmental stages were reanalyzed and quantified, and the expression profiles of all the genes in this annotation are also provided. Together, this study provides an updated annotation of the F. ananassa genome, which will facilitate genomic analyses across the Rosaceae family and gene functional studies in cultivated strawberry.

Homology

Homology of the Fragaria ananassa genome v1.0.a2 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for NCBI nr (Release 2018-05), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2019-01), and UniProtKB/TrEMBL (Release 2019-01) databases. The best-hit reports are available for download in Excel format.
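
The cutoff-and-best-hit logic described above can be sketched in Python. This is an illustration only, not the pipeline used to build the reports: it assumes BLAST tabular output with the twelve standard columns (E-value in column 11, bit score in column 12), which the page does not state, and the function name `best_hits` and all sequence IDs are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# (subject ID, E-value, bit score)
Hit = Tuple[str, float, float]


def best_hits(rows: Iterable[str], evalue_cutoff: float) -> Dict[str, Hit]:
    """Return the highest-bit-score hit per query among hits whose
    E-value is strictly below the cutoff.

    Assumes tab-separated rows in the standard 12-column BLAST layout.
    """
    best: Dict[str, Hit] = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= evalue_cutoff:
            continue  # "less than" the cutoff, so equality is rejected
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Invented rows for illustration; these IDs do not come from the real files.
example_rows = [
    "geneA.t1\tsubj1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180",
    "geneA.t1\tsubj2\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-80\t310",
    "geneB.t1\tsubj3\t40.0\t90\t54\t0\t1\t90\t1\t90\t2e-07\t48.5",
]

strict = best_hits(example_rows, 1e-9)   # cutoff used for NCBI nr
relaxed = best_hits(example_rows, 1e-6)  # cutoff used for the other databases
```

With these invented rows, the weak `geneB.t1` hit passes only the 1e-6 cutoff, which shows how the stricter nr threshold can leave a protein in the "without hits" set.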

Protein Homologs

Fragaria ananassa v1.0.a2 proteins with NCBI nr homologs (EXCEL file): Fragaria_x_ananassa_v1.a2_vs_nr.xlsx.gz
Fragaria ananassa v1.0.a2 proteins with NCBI nr hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_nr_hit.fasta.gz
Fragaria ananassa v1.0.a2 proteins without NCBI nr hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_nr_noHit.fasta.gz
Fragaria ananassa v1.0.a2 proteins with Arabidopsis (Araport11) homologs (EXCEL file): Fragaria_x_ananassa_v1.a2_vs_arabidopsis.xlsx.gz
Fragaria ananassa v1.0.a2 proteins with Arabidopsis (Araport11) hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_arabidopsis_hit.fasta.gz
Fragaria ananassa v1.0.a2 proteins without Arabidopsis (Araport11) hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_arabidopsis_noHit.fasta.gz
Fragaria ananassa v1.0.a2 proteins with SwissProt homologs (EXCEL file): Fragaria_x_ananassa_v1.a2_vs_swissprot.xlsx.gz
Fragaria ananassa v1.0.a2 proteins with SwissProt hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_swissprot_hit.fasta.gz
Fragaria ananassa v1.0.a2 proteins without SwissProt hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_swissprot_noHit.fasta.gz
Fragaria ananassa v1.0.a2 proteins with TrEMBL homologs (EXCEL file): Fragaria_x_ananassa_v1.a2_vs_trembl.xlsx.gz
Fragaria ananassa v1.0.a2 proteins with TrEMBL hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_trembl_hit.fasta.gz
Fragaria ananassa v1.0.a2 proteins without TrEMBL hits (FASTA file): Fragaria_x_ananassa_v1.a2_vs_trembl_noHit.fasta.gz

Download

All assembly and annotation files are available for download by selecting the desired data type in the left-hand sidebar. Each data type page provides a description of the available files and links to download them.

Gene Predictions

The Fragaria ananassa v1.0.a2 gene prediction files are available in FASTA and GFF3 formats.

Downloads

Protein sequences (FASTA file): Fragaria_x_ananassa_v1.a2.proteins.fasta.gz
Genes (GFF3 file): Fragaria_x_ananassa_v1.a2.genes.gff3.gz
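
Once downloaded, the gzipped FASTA and GFF3 files can be read directly with the Python standard library. The following is a minimal sketch, assuming conventional FASTA records and nine-column tab-separated GFF3 feature lines; the helper names `read_fasta` and `count_gff3_features` are hypothetical, and whether gene models in this file use the feature type `gene` is an assumption to check against the file itself.

```python
import gzip
from collections import Counter
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence ID, sequence) pairs from FASTA-formatted lines.

    The ID is the first whitespace-delimited token after '>'.
    """
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = (line[1:].split() or [""])[0]
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def count_gff3_features(lines: Iterable[str]) -> Counter:
    """Count GFF3 feature types (column 3), skipping comments and
    any line that does not have exactly nine tab-separated columns."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9:
            counts[cols[2]] += 1
    return counts


# Usage with the downloaded files (paths assume the current directory):
#
# with gzip.open("Fragaria_x_ananassa_v1.a2.proteins.fasta.gz", "rt") as fh:
#     n_proteins = sum(1 for _ in read_fasta(fh))
#
# with gzip.open("Fragaria_x_ananassa_v1.a2.genes.gff3.gz", "rt") as fh:
#     feature_counts = count_gff3_features(fh)
```

Both helpers stream line by line, so the files do not need to be decompressed to disk or held fully in memory. The per-type counts offer a quick sanity check against the gene model total reported in the annotation statistics.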

Functional Analysis

Functional annotations for the Fragaria ananassa Genome v1.0.a2 are available for download below. The Fragaria ananassa genome v1.0.a2 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan: Fragaria_x_ananassa_v1.a2_genes2GO.xlsx.gz
IPR assignments from InterProScan: Fragaria_x_ananassa_v1.a2_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs: Fragaria_x_ananassa_v1.a2_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways: Fragaria_x_ananassa_v1.a2_KEGG-pathways.xlsx.gz