Rubus idaeus cv. 'Malling Jewel' NIAB Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Rubus idaeus cv. 'Malling Jewel' NIAB Genome v1.0 Assembly & Annotation
Method: NECAT (v0.0.1_update20200803)
Source: Nanopore reads used for the ‘Malling Jewel’
Date performed: 2023-07-07

Publication

Price, R. J., Davik, J., Fernandéz Fernandéz, F., Bates, H. J., Lynn, S., Nellist, C. F., Buti, M., Røen, D., Šurbanovski, N., Alsheikh, M., Harrison, R. J., & Sargent, D. J. (2023). Chromosome-scale genome sequence assemblies of the ‘Autumn Bliss’ and ‘Malling Jewel’ cultivars of the highly heterozygous red raspberry (Rubus idaeus L.) derived from long-read Oxford Nanopore sequence data. PLoS ONE, 18(5), e0285756. https://doi.org/10.1371/journal.pone.0285756.

Abstract
Red raspberry (Rubus idaeus L.) is an economically valuable soft-fruit species with a relatively small (~300 Mb) but highly heterozygous diploid (2n = 2x = 14) genome. Chromosome-scale genome sequences are a vital tool in unravelling the genetic complexity controlling traits of interest in crop plants such as red raspberry, as well as for functional genomics, evolutionary studies, and pan-genomics diversity studies. In this study, we developed genome sequences of a primocane fruiting variety (‘Autumn Bliss’) and a floricane variety (‘Malling Jewel’). The use of long-read Oxford Nanopore Technologies sequencing data yielded long read lengths that permitted well resolved genome sequences for the two cultivars to be assembled. The de novo assemblies of ‘Malling Jewel’ and ‘Autumn Bliss’ contained 79 and 136 contigs respectively, and 263.0 Mb of the ‘Autumn Bliss’ and 265.5 Mb of the ‘Malling Jewel’ assembly could be anchored unambiguously to a previously published red raspberry genome sequence of the cultivar ‘Anitra’. Single copy ortholog analysis (BUSCO) revealed high levels of completeness in both genomes sequenced, with 97.4% of sequences identified in ‘Autumn Bliss’ and 97.7% in ‘Malling Jewel’. The density of repetitive sequence contained in the ‘Autumn Bliss’ and ‘Malling Jewel’ assemblies was significantly higher than in the previously published assembly and centromeric and telomeric regions were identified in both assemblies. A total of 42,823 protein coding regions were identified in the ‘Autumn Bliss’ assembly, whilst 43,027 were identified in the ‘Malling Jewel’ assembly. These chromosome-scale genome sequences represent an excellent genomics resource for red raspberry, particularly around the highly repetitive centromeric and telomeric regions of the genome that are less complete in the previously published ‘Anitra’ genome sequence.

Table 2. Assembly statistics for the de novo assemblies of the Rubus idaeus ‘Malling Jewel’ and ‘Autumn Bliss’ genome sequences.

Statistic    ‘Autumn Bliss’    ‘Malling Jewel’
Total sequence length (bp)    268,668,490    265,724,784
Number of contigs    146    83
Longest contig (bp)    9,660,097    40,892,532
Shortest contig (bp)    3,465    3,226
GC content (%)    37.8    37.85
Contig N50 (bp)    3,253,875    9,899,270
Contig L50    25    7
Gap (%)    0    0
Complete BUSCOs (%)    97.7    97.6
Single copy BUSCOs (%)    89    93.6
Duplicated BUSCOs (%)    8.7    4
Fragmented BUSCOs (%)    0.4    0.4
Missing BUSCOs (%)    1.8    2
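
The contig N50 and L50 values in the table summarise assembly contiguity. As a minimal sketch of how such values are conventionally computed (this is not the authors' pipeline, and the function name `n50_l50` and the toy lengths are hypothetical), contigs are sorted from longest to shortest and accumulated until half of the total assembly length is reached:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total, taken from
    the longest contig downwards, first reaches half of the assembly length.
    L50 is the number of contigs needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy contig lengths for illustration only; not taken from either assembly.
toy_lengths = [10, 8, 6, 4, 2]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints: 8 2
```

A higher N50 together with a lower L50, as reported for ‘Malling Jewel’ relative to ‘Autumn Bliss’, indicates that fewer, longer contigs make up half of the assembly.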

Homology

Homology of the Rubus idaeus 'Malling Jewel' genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11 2022-09), UniProtKB/SwissProt (Release 2023-06), and UniProtKB/TrEMBL (Release 2023-06) databases. The best hit reports are available for download in Excel format.
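
The filtering step described above can be sketched as follows. This is an illustrative example rather than the procedure actually used to build the reports: it assumes blastp results in the standard 12-column tabular layout (query ID, subject ID, ..., e-value in column 11, bit score in column 12), and it assumes that "best hit" means the highest bit score per query. The function name `best_hits` and all identifiers in the example data are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the text


def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Map each query ID to its highest-scoring hit below the e-value cutoff.

    Assumes tab-separated rows with the e-value in column 11 and the
    bit score in column 12 (1-based), as in BLAST tabular output.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # hit does not pass the cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Made-up rows for illustration; the IDs are not from the real data set.
example = io.StringIO(
    "geneA\tAT1G01010.1\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "geneA\tAT2G02020.1\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t90\n"
    "geneB\tAT3G03030.1\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t35\n"
)
hits = best_hits(example)
for query, (subject, evalue, bitscore) in hits.items():
    print(query, subject, evalue, bitscore)
```

In this toy input, geneA keeps its higher-scoring hit and geneB is dropped because its only hit has an e-value above the cutoff, which corresponds to the split between the "hit" and "noHit" files listed below.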

Protein Homologs

Rubus idaeus v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) RiMJ_ragtag_HiC_v1.0_vs_arabidopsis.xlsx.gz
Rubus idaeus v1.0 proteins with Arabidopsis (Araport11) homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_arabidopsis_hit.fasta.gz
Rubus idaeus v1.0 proteins without Arabidopsis (Araport11) homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_arabidopsis_noHit.fasta.gz
Rubus idaeus v1.0 proteins with SwissProt homologs (EXCEL file) RiMJ_ragtag_HiC_v1.0_vs_swissprot.xlsx.gz
Rubus idaeus v1.0 proteins with SwissProt homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_swissprot_hit.fasta.gz
Rubus idaeus v1.0 proteins without SwissProt homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_swissprot_noHit.fasta.gz
Rubus idaeus v1.0 proteins with TrEMBL homologs (EXCEL file) RiMJ_ragtag_HiC_v1.0_vs_trembl.xlsx.gz
Rubus idaeus v1.0 proteins with TrEMBL homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_trembl_hit.fasta.gz
Rubus idaeus v1.0 proteins without TrEMBL homologs (FASTA file) RiMJ_ragtag_HiC_v1.0_vs_trembl_noHit.fasta.gz

Assembly

The Rubus idaeus 'Malling Jewel' NIAB genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Ridaeus_MallingJewel_NIAB_V1.0.a1.fasta.gz

Gene Predictions

The Rubus idaeus 'Malling Jewel' NIAB V1.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Ridaeus_MallingJewel_NIAB_V1.0.a1.genes.gff3.gz
Protein sequences (FASTA file) Ridaeus_MallingJewel_NIAB_V1.0.a1.pep.fasta.gz
CDS sequences (FASTA file) Ridaeus_MallingJewel_NIAB_V1.0.a1.cds.fasta.gz
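
As a rough sketch of how the GFF3 download might be inspected, the example below tallies features by type directly from the gzip-compressed file. It relies only on the general GFF3 layout (nine tab-separated columns, with the feature type in column 3); the function name `count_feature_types` is hypothetical, and the feature type names actually present in this file (for example "gene" or "mRNA") are an assumption that should be checked against the data.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count GFF3 features by type (column 3) in a gzip-compressed file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip directives, comments, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # Only well-formed feature lines have exactly nine columns.
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

Assuming the file has been downloaded to the working directory, calling `count_feature_types("Ridaeus_MallingJewel_NIAB_V1.0.a1.genes.gff3.gz")` returns a `Counter` keyed by feature type, whose totals can then be compared with the number of protein coding regions reported in the abstract.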

Functional Analysis

The functional annotation files for the Rubus idaeus 'Malling Jewel' genome are available for download below. The Rubus idaeus Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan RiMJ_ragtag_HiC_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan RiMJ_ragtag_HiC_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs RiMJ_ragtag_HiC_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways RiMJ_ragtag_HiC_v1.0_KEGG-pathways.xlsx.gz