Rubus idaeus Anitra Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Rubus idaeus Anitra Genome v1.0 Assembly & Annotation
Method: Canu
Source: PacBio
Date performed: 2022-03-23

Publication

Davik J, Røen D, Lysøe E, Buti M, Rossman S, Alsheikh M, Aiden EL, Dudchenko O, Sargent DJ. A chromosome-level genome sequence assembly of the red raspberry (Rubus idaeus L.). PLoS ONE. 2022; 17(3):e0265096.

Abstract

Rubus idaeus L. (red raspberry), is a perennial woody plant species of the Rosaceae family that is widely cultivated in the temperate regions of world and is thus an economically important soft fruit species. It is prized for its flavour and aroma, as well as a high content of healthful compounds such as vitamins and antioxidants. Breeding programs exist globally for red raspberry, but variety development is a long and challenging process. Genomic and molecular tools for red raspberry are valuable resources for breeding. Here, a chromosome-length genome sequence assembly and related gene predictions for the red raspberry cultivar 'Anitra' are presented, comprising PacBio long read sequencing scaffolded using Hi-C sequence data. The assembled genome sequence totalled 291.7 Mbp, with 247.5 Mbp (84.8%) incorporated into seven sequencing scaffolds with an average length of 35.4 Mbp. A total of 39,448 protein-coding genes were predicted, 75% of which were functionally annotated. The seven chromosome scaffolds were anchored to a previously published genetic linkage map with a high degree of synteny and comparisons to genomes of closely related species within the Rosoideae revealed chromosome-scale rearrangements that have occurred over relatively short evolutionary periods. A chromosome-level genomic sequence of R. idaeus will be a valuable resource for the knowledge of its genome structure and function in red raspberry and will be a useful and important resource for researchers and plant breeders.

Homology

Homology of the Rubus idaeus Anitra Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against several protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr database (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
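The cutoff-and-best-hit step described above can be sketched in Python. This is a minimal illustration, not the pipeline that produced the reports: it assumes blastp results in the standard 12-column tabular layout (e-value in column 11, bit score in column 12), treats "best hit" as the highest bit score per query, and uses hypothetical dictionary keys and gene identifiers.

```python
import csv
import io

# Cutoffs from the text above; the dictionary keys are hypothetical labels.
EVALUE_CUTOFFS = {"nr": 1e-9, "araport11": 1e-6, "swissprot": 1e-6, "trembl": 1e-6}


def best_hits(tabular_text, cutoff):
    """Return {query_id: (subject_id, evalue, bitscore)} for hits below the cutoff.

    Assumes 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # the text specifies a cutoff of "less than" the threshold
        # Assumption: the best hit is the one with the highest bit score.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Calling `best_hits(text, EVALUE_CUTOFFS["nr"])` on tabular output would give one retained hit per query protein; queries absent from the result correspond to the "without homologs" files listed below.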

 

Protein Homologs

Rubus idaeus v1.0 proteins with NCBI nr homologs (EXCEL file) ridaeus_Anitra_v1.0_vs_nr.xlsx.gz
Rubus idaeus v1.0 proteins with NCBI nr homologs (FASTA file) ridaeus_Anitra_v1.0_vs_nr_hit.fasta.gz
Rubus idaeus v1.0 proteins without NCBI nr homologs (FASTA file) ridaeus_Anitra_v1.0_vs_nr_noHit.fasta.gz
Rubus idaeus v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) ridaeus_Anitra_v1.0_vs_arabidopsis.xlsx.gz
Rubus idaeus v1.0 proteins with Arabidopsis (Araport11) homologs (FASTA file) ridaeus_Anitra_v1.0_vs_arabidopsis_hit.fasta.gz
Rubus idaeus v1.0 proteins without Arabidopsis (Araport11) homologs (FASTA file) ridaeus_Anitra_v1.0_vs_arabidopsis_noHit.fasta.gz
Rubus idaeus v1.0 proteins with SwissProt homologs (EXCEL file) ridaeus_Anitra_v1.0_vs_swissprot.xlsx.gz
Rubus idaeus v1.0 proteins with SwissProt homologs (FASTA file) ridaeus_Anitra_v1.0_vs_swissprot_hit.fasta.gz
Rubus idaeus v1.0 proteins without SwissProt homologs (FASTA file) ridaeus_Anitra_v1.0_vs_swissprot_noHit.fasta.gz
Rubus idaeus v1.0 proteins with TrEMBL homologs (EXCEL file) ridaeus_Anitra_v1.0_vs_trembl.xlsx.gz
Rubus idaeus v1.0 proteins with TrEMBL homologs (FASTA file) ridaeus_Anitra_v1.0_vs_trembl_hit.fasta.gz
Rubus idaeus v1.0 proteins without TrEMBL homologs (FASTA file) ridaeus_Anitra_v1.0_vs_trembl_noHit.fasta.gz

 

Assembly

The Rubus idaeus Anitra Genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Ridaeus_Anitra_v1.0.fasta.gz
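After downloading the assembly, a quick sanity check is to tally sequence lengths and compare them with the totals reported in the abstract. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the file is an ordinary gzip-compressed FASTA saved locally.

```python
import gzip


def fasta_lengths(path):
    """Return {sequence_id: length_in_bp} for a gzip-compressed FASTA file."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the ID.
                header = line[1:].split()
                current = header[0] if header else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def summarize(lengths):
    """Return the sequence count, total length, and mean length."""
    total = sum(lengths.values())
    mean = total / len(lengths) if lengths else 0.0
    return {"sequences": len(lengths), "total_bp": total, "mean_bp": mean}
```

For example, `summarize(fasta_lengths("Ridaeus_Anitra_v1.0.fasta.gz"))` would report how many sequences the file contains and their combined length, assuming the file sits in the working directory.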

 

Gene Predictions

The Rubus idaeus Anitra v1.0 genome gene prediction files are available in FASTA and GFF3 formats.

Downloads

Protein sequences (FASTA file) Ridaeus_Anitra_v1.0.proteins.fasta.gz
Genes (GFF3 file) Ridaeus_Anitra_v1.0.genes.gff3.gz
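The gene prediction file can be inspected by counting how many features of each type it holds. The sketch below assumes the standard nine-column, tab-delimited GFF3 layout with the feature type in column 3; which type labels actually occur in this file (for example "gene" or "mRNA") is not stated on this page, so none are assumed.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count features by type (column 3) in a gzip-compressed GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:  # ignore anything that is not a feature row
                counts[fields[2]] += 1
    return counts
```

If the file labels gene models as "gene", that count could be compared with the 39,448 protein-coding genes reported in the abstract.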

 

Functional Analysis

Functional annotations for the Rubus idaeus Anitra Genome v1.0 are available for download below. The Rubus idaeus Anitra Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan ridaeus_Anitra_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan ridaeus_Anitra_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs ridaeus_Anitra_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways ridaeus_Anitra_v1.0_KEGG-pathways.xlsx.gz

 

Transcript Alignments
Transcript alignments were performed by the GDR team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Rubus idaeus genome assembly. Alignments with an alignment length of 97% and 97% identity were preserved. The available files are in GFF3 format.

 

Fragaria x ananassa GDR RefTrans v1 ridaeus_Anitra_v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 ridaeus_Anitra_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 ridaeus_Anitra_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 ridaeus_Anitra_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 ridaeus_Anitra_v1.0_rubus_GDR_reftransV2
Malus x domestica GDR RefTrans v1 ridaeus_Anitra_v1.0_m.x.domestica_GDR_reftransV1
Pyrus GDR RefTrans v1 ridaeus_Anitra_v1.0_pyrus_GDR_reftransV1
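
The 97% alignment-length and 97% identity filter applied to the BLAT transcript alignments can be sketched as follows. This is an illustration under stated assumptions, not the GDR team's actual procedure: it assumes BLAT's tab-delimited PSL output, takes alignment length as the aligned fraction of the query transcript, computes identity as matching bases over aligned bases (ignoring gaps), and treats both thresholds as inclusive. The function name and defaults are hypothetical.

```python
def passes_thresholds(psl_line, min_coverage=0.97, min_identity=0.97):
    """Return True if a PSL alignment row meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize, 11 qStart, 12 qEnd.
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = (int(fields[i]) for i in (0, 1, 2))
    q_size, q_start, q_end = (int(fields[i]) for i in (10, 11, 12))
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    # Assumption: identity counts repeat matches as matches and ignores gaps.
    identity = (matches + rep_matches) / aligned
    # Assumption: "alignment length" means the aligned span of the query.
    coverage = (q_end - q_start) / q_size
    return coverage >= min_coverage and identity >= min_identity
```

Rows passing the filter would then be converted to GFF3; that conversion step is not shown here.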