Rubus idaeus cv. 'Autumn Bliss' NIAB Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Rubus idaeus cv. 'Autumn Bliss' NIAB Genome v1.0 Assembly & Annotation
Method: NECAT (v0.0.1_update20200803)
Source: Nanopore reads used for the ‘Autumn Bliss’
Date performed: 2023-07-07

Publication

Price, R. J., Davik, J., Fernandéz Fernandéz, F., Bates, H. J., Lynn, S., Nellist, C. F., Buti, M., Røen, D., Šurbanovski, N., Alsheikh, M., Harrison, R. J., & Sargent, D. J. (2023). Chromosome-scale genome sequence assemblies of the ‘Autumn Bliss’ and ‘Malling Jewel’ cultivars of the highly heterozygous red raspberry (Rubus idaeus L.) derived from long-read Oxford Nanopore sequence data. PLoS ONE, 18(5), e0285756. https://doi.org/10.1371/journal.pone.0285756

Abstract
Red raspberry (Rubus idaeus L.) is an economically valuable soft-fruit species with a relatively small (~300 Mb) but highly heterozygous diploid (2n = 2x = 14) genome. Chromosome-scale genome sequences are a vital tool in unravelling the genetic complexity controlling traits of interest in crop plants such as red raspberry, as well as for functional genomics, evolutionary studies, and pan-genomics diversity studies. In this study, we developed genome sequences of a primocane fruiting variety (‘Autumn Bliss’) and a floricane variety (‘Malling Jewel’). The use of long-read Oxford Nanopore Technologies sequencing data yielded long read lengths that permitted well resolved genome sequences for the two cultivars to be assembled. The de novo assemblies of ‘Malling Jewel’ and ‘Autumn Bliss’ contained 79 and 136 contigs respectively, and 263.0 Mb of the ‘Autumn Bliss’ and 265.5 Mb of the ‘Malling Jewel’ assembly could be anchored unambiguously to a previously published red raspberry genome sequence of the cultivar ‘Anitra’. Single copy ortholog analysis (BUSCO) revealed high levels of completeness in both genomes sequenced, with 97.4% of sequences identified in ‘Autumn Bliss’ and 97.7% in ‘Malling Jewel’. The density of repetitive sequence contained in the ‘Autumn Bliss’ and ‘Malling Jewel’ assemblies was significantly higher than in the previously published assembly and centromeric and telomeric regions were identified in both assemblies. A total of 42,823 protein coding regions were identified in the ‘Autumn Bliss’ assembly, whilst 43,027 were identified in the ‘Malling Jewel’ assembly. These chromosome-scale genome sequences represent an excellent genomics resource for red raspberry, particularly around the highly repetitive centromeric and telomeric regions of the genome that are less complete in the previously published ‘Anitra’ genome sequence.

Table 2. Assembly statistics for the de novo assemblies of the Rubus idaeus ‘Malling Jewel’ and ‘Autumn Bliss’ genome sequences.

Statistic    ‘Autumn Bliss’    ‘Malling Jewel’
Total sequence length (bp)    268,668,490    265,724,784
Number of contigs    146    83
Longest contig (bp)    9,660,097    40,892,532
Shortest contig (bp)    3,465    3,226
GC content (%)    37.8    37.85
Contig N50 (bp)    3,253,875    9,899,270
Contig L50    25    7
Gap (%)    0    0
Complete BUSCOs (%)    97.7    97.6
Single copy BUSCOs (%)    89    93.6
Duplicated BUSCOs (%)    8.7    4
Fragmented BUSCOs (%)    0.4    0.4
Missing BUSCOs (%)    1.8    2
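
For readers unfamiliar with the contig N50 and L50 rows in Table 2, the following is a minimal Python sketch of how these two statistics are conventionally defined. The function name `n50_l50` and the toy lengths are illustrative assumptions and are not taken from either assembly or from the authors' pipeline.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the contig at which the running total of
    lengths, sorted longest first, reaches half of the assembly size.
    L50 is the number of contigs needed to reach that point.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy contig lengths for illustration only (hypothetical values).
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (40, 2)
```

A higher N50 together with a lower L50, as reported for ‘Malling Jewel’, indicates that fewer and longer contigs account for half of the assembly.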

Homology

Homology of the Rubus idaeus 'Autumn Bliss' genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11 2022-09), UniProtKB/SwissProt (Release 2023-06), and UniProtKB/TrEMBL (Release 2023-06) databases. The best hit reports are available for download in Excel format.
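
As an illustration of the cutoff and best-hit selection described above, here is a minimal Python sketch. It assumes blastp results in the standard 12-column tabular format (BLAST+ `-outfmt 6`), and it assumes "best hit" means the highest bit score per query; the page does not state the output format or the exact selection rule used. The protein and subject identifiers in the example are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated on this page


def best_hits(handle, evalue_cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Assumes tab-separated columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= evalue_cutoff:
            continue  # keep only hits with an e-value below the cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical example rows, not real results.
example = (
    "protA\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300\n"
    "protA\tAT1G01020.1\t60.0\t180\t72\t0\t1\t180\t1\t180\t1e-20\t150\n"
    "protB\tAT2G02020.1\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
)
hits = best_hits(io.StringIO(example))
print(hits)
```

In this toy input, `protA` keeps its higher-scoring hit and `protB` has no hit below the cutoff, which mirrors the split between the "with" and "without" homolog FASTA files listed below.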

Protein Homologs

Rubus idaeus 'Autumn Bliss' v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) RiAB_ragtag_HiC_v1.0_vs_arabidopsis.xlsx.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins with Arabidopsis (Araport11) (FASTA file) RiAB_ragtag_HiC_v1.0_vs_arabidopsis_hit.fasta.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins without Arabidopsis (Araport11) (FASTA file) RiAB_ragtag_HiC_v1.0_vs_arabidopsis_noHit.fasta.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins with SwissProt homologs (EXCEL file) RiAB_ragtag_HiC_v1.0_vs_swissprot.xlsx.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins with SwissProt (FASTA file) RiAB_ragtag_HiC_v1.0_vs_swissprot_hit.fasta.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins without SwissProt (FASTA file) RiAB_ragtag_HiC_v1.0_vs_swissprot_noHit.fasta.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins with TrEMBL homologs (EXCEL file) RiAB_ragtag_HiC_v1.0_vs_trembl.xlsx.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins with TrEMBL (FASTA file) RiAB_ragtag_HiC_v1.0_vs_trembl_hit.fasta.gz
Rubus idaeus 'Autumn Bliss' v1.0 proteins without TrEMBL (FASTA file) RiAB_ragtag_HiC_v1.0_vs_trembl_noHit.fasta.gz

Assembly

The Rubus idaeus 'Autumn Bliss' NIAB Genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Ridaeus_AutumnBliss_NIAB_V1.0.a1.fasta.gz

Gene Predictions

The Rubus idaeus 'Autumn Bliss' NIAB V1.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Ridaeus_AutumnBliss_NIAB_V1.0.a1.genes.gff3.gz
Protein sequences (FASTA file) Ridaeus_AutumnBliss_NIAB_V1.0.a1.pep.fasta.gz
CDS sequences (FASTA file) Ridaeus_AutumnBliss_NIAB_V1.0.a1.cds.fasta.gz
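
The GFF3 gene prediction file can be summarised with a few lines of Python. The sketch below counts features by type (column 3 of the nine tab-separated GFF3 columns). The function name `count_feature_types` and the example records are hypothetical, and which feature types the real file contains is an assumption that should be checked against the download itself.

```python
from collections import Counter
import io


def count_feature_types(handle):
    """Count GFF3 features by type (third tab-separated column)."""
    counts = Counter()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # an optional embedded sequence section follows; stop here
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 9:
            counts[columns[2]] += 1
    return counts


# Hypothetical GFF3 records for illustration only.
example_gff3 = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\texample\tCDS\t150\t800\t.\t+\t0\tID=cds1;Parent=mrna1\n"
    "chr1\texample\tgene\t2000\t2600\t.\t-\t.\tID=gene2\n"
)
feature_counts = count_feature_types(io.StringIO(example_gff3))
print(feature_counts["gene"], feature_counts["mRNA"], feature_counts["CDS"])  # 2 1 1
```

To run this on the real download, the handle could be opened with `gzip.open("Ridaeus_AutumnBliss_NIAB_V1.0.a1.genes.gff3.gz", "rt")` from the standard library.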

Functional Analysis

Functional annotation files for the Rubus idaeus 'Autumn Bliss' genome v1.0 are available for download below. The Rubus idaeus 'Autumn Bliss' genome v1.0 proteins were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan RiAB_ragtag_HiC_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan RiAB_ragtag_HiC_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs RiAB_ragtag_HiC_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways RiAB_ragtag_HiC_v1.0_KEGG-pathways.xlsx.gz

Transcript Alignments

Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Rubus idaeus 'Autumn Bliss' genome assembly. Alignments with an alignment length of 97% and 97% identity were preserved. The available files are in GFF3 format.

Fragaria x ananassa GDR RefTrans v1 RiAB_ragtag_HiC_v1.0_f.x.ananassa_GDR_reftransV1
Pyrus GDR RefTrans v1 RiAB_ragtag_HiC_v1.0_pyrus_GDR_reftransV1
Prunus persica GDR RefTrans v1 RiAB_ragtag_HiC_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 RiAB_ragtag_HiC_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 RiAB_ragtag_HiC_v1.0_rubus_GDR_reftransV2
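
The 97% alignment length and 97% identity filter applied to the transcript alignments can be sketched as follows. This is a hedged illustration, not the GDR pipeline: it assumes BLAT output in PSL format, treats the thresholds as inclusive, defines identity as matching bases over aligned bases (ignoring gaps), and defines alignment length as the aligned fraction of the transcript. The function name `psl_passes` and the example records are hypothetical.

```python
MIN_IDENTITY = 0.97  # threshold stated on this page
MIN_COVERAGE = 0.97  # threshold stated on this page


def psl_passes(line, min_identity=MIN_IDENTITY, min_coverage=MIN_COVERAGE):
    """Return True if one PSL record meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize, 11 qStart, 12 qEnd.
    """
    fields = line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = (int(fields[i]) for i in (0, 1, 2))
    q_size, q_start, q_end = (int(fields[i]) for i in (10, 11, 12))
    aligned_bases = matches + mismatches + rep_matches
    if aligned_bases == 0 or q_size == 0:
        return False
    identity = (matches + rep_matches) / aligned_bases
    coverage = (q_end - q_start) / q_size
    return identity >= min_identity and coverage >= min_coverage


# Hypothetical PSL records: a good alignment, one with low identity,
# and one that covers only half of the transcript.
good = "\t".join(["980", "10", "0", "0", "0", "0", "0", "0", "+", "t1",
                  "1000", "0", "990", "chr1", "5000", "100", "1090",
                  "1", "990,", "0,", "100,"])
low_identity = "\t".join(["900", "100", "0", "0", "0", "0", "0", "0", "+", "t2",
                          "1000", "0", "1000", "chr1", "5000", "100", "1100",
                          "1", "1000,", "0,", "100,"])
short = "\t".join(["500", "0", "0", "0", "0", "0", "0", "0", "+", "t3",
                   "1000", "0", "500", "chr1", "5000", "100", "600",
                   "1", "500,", "0,", "100,"])
print([psl_passes(record) for record in (good, low_identity, short)])
```

Only the first record passes both filters; the second fails on identity and the third on alignment length.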