Source: GenBank Fragaria ESTs (July 2012)
Date performed: 2012-12-19
Number of Reads: 55,513
Number of Contigs: 6,226
Number of Singlets: 12,008
Organisms: Fragaria vesca

This is the fifth version of the Fragaria unigene. Many sequencing projects around the world are depositing ESTs from the genus Fragaria in the NCBI dbEST database. The Fragaria ESTs included in this assembly were downloaded on July 1, 2012.

Not all of the Fragaria ESTs are of high quality. To filter them, we crossmatched the public sequences against NCBI's UniVec database and used the BLAST sequence similarity algorithm to remove species-specific chloroplast, mitochondrial, tRNA, and rRNA sequences. To reduce redundancy and create longer transcripts, we assembled these ESTs using the CAP3 [1] program. The final assembly has been annotated by BLAST sequence similarity searching against Swiss-Prot [2], TrEMBL [3], TAIR [4] Arabidopsis proteins, Prunus persica [5], Populus trichocarpa [6], and Vitis vinifera [7] proteins.

Processing Summary
Number of ESTs available: 58,295
Number of ESTs available after filtering: 55,513
Average Length: 627 bp
Number of Contigs (CAP3 Assembly, -p 90): 6,226
Average Length of Contigs: 952 bp
Number of Singlets: 12,008
Number of Putative Unigenes: 18,234
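
The counts in this summary are linked by simple arithmetic: every contig and every singlet is counted as one putative unigene, and the filtering step accounts for the gap between the two EST totals. The following is a minimal Python check that uses only the figures reported above; the variable names are illustrative and not part of the original pipeline.

```python
# Figures copied from the Processing Summary above.
ests_available = 58_295
ests_after_filtering = 55_513
contigs = 6_226
singlets = 12_008

# ESTs discarded by the vector / organelle / tRNA / rRNA filtering step.
ests_removed = ests_available - ests_after_filtering

# Each contig and each singlet counts as one putative unigene.
putative_unigenes = contigs + singlets

print(f"ESTs removed by filtering: {ests_removed:,}")
print(f"Putative unigenes: {putative_unigenes:,}")
```

The second value should match the "Number of Putative Unigenes" row.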


Library Information
The Fragaria ESTs used for this assembly were downloaded on July 1, 2012.


EST Libraries
Number of ESTs available: 58,295
# of Species: 4
# of Libraries: 54
# of Tissues: 20
# of Development Stages: 14

Fragaria chiloensis: 137
Fragaria vesca: 44,979
Fragaria vesca subsp. vesca: 2,824
Fragaria x ananassa: 10,354



Homology was determined using the BLASTx algorithm for the Fragaria contigs and singlets vs. the Swiss-Prot, TrEMBL, TAIR Arabidopsis, Prunus persica, Populus trichocarpa, and Vitis vinifera proteins. Only matches with an E-value of 1.0e-6 or better were recorded. Swiss-Prot is a curated protein database with a high level of annotation and a minimal level of redundancy, and TrEMBL is a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot. The Fragaria homology results can be downloaded as an Excel spreadsheet from the Downloads page.
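
As an illustration of the E-value cutoff, the sketch below keeps only hits at or below 1.0e-6. It assumes the BLAST results are in the standard 12-column tab-separated layout, with the E-value in column 11; the source does not state which output format was used. The function name and the example rows are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1.0e-6  # threshold stated in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) for rows at or below the cutoff.

    Assumes 12-column tab-separated BLAST output, where index 10
    holds the E-value. Comment lines starting with '#' are skipped.
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue <= cutoff:
            kept.append((row[0], row[1], evalue))
    return kept


# Made-up rows for illustration only; the identifiers are hypothetical.
example = (
    "contig1\tP12345\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t180\n"
    "contig2\tQ67890\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.002\t35\n"
    "contig3\tA0A000\t60.0\t120\t48\t0\t1\t360\t1\t120\t1e-06\t60\n"
)
hits = filter_hits(example)
for query, subject, evalue in hits:
    print(query, subject, evalue)
```

Here "or better" is read as less than or equal to the cutoff, so a hit at exactly 1e-06 is kept.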


KEGG analysis of Fragaria unigene v5.0 contigs

All Fragaria unigene v5.0 contigs were uploaded to the KEGG / KAAS server. The SBH (single-directional best hit) method was selected under the category "Assignment method". All other settings were left at their defaults. Results were downloaded as the heir.tar.gz hierarchy file and uploaded to the website.



Microsatellite Analysis

The type and frequency of simple sequence repeats (SSRs) in Fragaria unigene v5.0 contigs were determined with an SSR search program. For these searches, SSRs are defined as dinucleotides repeated at least 5 times, trinucleotides repeated at least 4 times, tetranucleotides repeated at least 3 times, or pentanucleotides repeated at least 3 times. The SSRs found in the Fragaria unigene v5.0 contigs can be downloaded from the Downloads page.
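
The SSR definition above can be expressed as a small regular-expression search. This is a simplified sketch, not the program used for this build: the function name `find_ssrs` is hypothetical, motifs that are themselves repeats of a shorter unit (for example "AA" or "ATAT") are skipped as an assumption, and compound or overlapping repeats are not handled.

```python
import re

# Minimum repeat counts per motif length, as defined in the text.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for SSRs in seq."""
    seq = seq.upper()
    found = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs built from a shorter repeating unit.
            if any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k)):
                continue
            found.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(found)


# Synthetic test sequence: five AT repeats and four GAA repeats.
demo = "GGC" + "AT" * 5 + "CC" + "GAA" * 4 + "TT"
print(find_ssrs(demo))
```

On the synthetic sequence, the search reports the dinucleotide run starting at position 3 and the trinucleotide run starting at position 15 (0-based).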

Sequence information
Number of Sequences: 6,226
Number of Sequences Having One Or More SSRs: 1,853
Percentage of Sequences Having One Or More SSRs: 29.76%
Total Number of SSRs Found: 2,556
Number of Motifs: 251

Frequency of Motif Type

Motif Length    Frequency    Percentage Frequency
2bp             801          31.33%
3bp             1,379        53.95%
4bp             281          10.99%
5bp             95           3.72%
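
The percentage columns follow directly from the counts: each motif-length frequency is divided by the total number of SSRs, and the share of SSR-containing sequences is the ratio of the two sequence counts. The following short Python sketch uses only the figures reported above; values rounded to two decimals may differ from the table in the last digit.

```python
# Motif-length counts from the table above.
ssr_counts = {2: 801, 3: 1379, 4: 281, 5: 95}
total_ssrs = sum(ssr_counts.values())

for motif_len, count in ssr_counts.items():
    share = 100 * count / total_ssrs
    print(f"{motif_len}bp  {count:>5,}  {share:.2f}%")

# Share of contigs carrying at least one SSR.
contigs_with_ssr, contigs_total = 1_853, 6_226
pct_with_ssr = 100 * contigs_with_ssr / contigs_total
print(f"Sequences with SSRs: {pct_with_ssr:.2f}%")
```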


