Oxford Nanopore Fine-Tuned Crop Basecalling Models

Reference: Gottschalk, C., Burchard, E., Whitt, L., Vann, C., Dardick, C., Jung, S., Main, D., Waite, J. M., Honaas, L., & Harkess, A. (2025). Development of Rosaceae crop-specific Nanopore models for community use through the Genome Database for Rosaceae. bioRxiv. https://doi.org/10.1101/2025.06.23.660393

Description: Download Dorado basecalling models for apple, peach, pear, plum, and sweet cherry. The models are available as compressed folders (~30 MB) and must be extracted after downloading (see the tutorial below for instructions).

These crop-specific models were developed to improve basecalling read quality for Rosaceae crop datasets generated on Oxford Nanopore Technologies platforms. At present, the models are for DNA sequencing only, are compatible with R10.4.1 flow cells and V114 library preparations, and are based on a high-accuracy model.

Each model was trained using a custom pipeline based on the PyTorch basecaller Bonito (https://github.com/nanoporetech/bonito) and exported for use with the Dorado basecaller (https://github.com/nanoporetech/dorado). The models were produced by fine-tuning the dna_r10.4.1_e8.2_400bps_hac@v5.0.0 model. For a detailed explanation of the training and validation methodology, please refer to the preprint cited above (Gottschalk et al., 2025). Questions and correspondence can be sent to Chris Gottschalk at christopher.gottschalk@usda.gov.

These models perform best on data from the species they were trained on. They can also be applied across species and genera, for example by using the apple (Malus) model to basecall a Pyrus dataset, but they have only been validated within Rosaceae.

Version 1.0 DNA Models

Apple (Malus domestica) model – drMalDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0

Peach (Prunus persica) model – drPruPer_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0

Pear (Pyrus communis) model – drPryCom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0

Plum (Prunus domestica) model – drPruDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0

Sweet Cherry (Prunus avium) model – drPruAvi_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0
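
The model names above share one pattern: a crop prefix, the version tag, and the name of the base model. As a minimal sketch, the helper functions below build a model name and a download URL from a crop prefix. The function names are hypothetical, and the URL pattern is an assumption extrapolated from the apple example in the tutorial; check the download links on this page before relying on it.

```shell
#!/usr/bin/env bash
# Base model that the crop-specific models were fine-tuned from (from this page).
BASE_MODEL="dna_r10.4.1_e8.2_400bps_hac@v5.0.0"

# Assumption: every archive lives under the same folder as the apple example.
DOWNLOAD_ROOT="https://www.rosaceae.org/rosaceae_downloads/ONTmodels"

# model_name PREFIX -> e.g. drPruPer_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0
model_name() {
  printf '%s_v1.0_%s\n' "$1" "$BASE_MODEL"
}

# model_url PREFIX -> assumed URL of the .tar.gz archive for that model
model_url() {
  printf '%s/%s.tar.gz\n' "$DOWNLOAD_ROOT" "$(model_name "$1")"
}

# fetch_model PREFIX: download and extract one model (hypothetical helper).
fetch_model() {
  wget "$(model_url "$1")" && tar -xzf "$(model_name "$1").tar.gz"
}
```

For example, `fetch_model drPruPer` would attempt to download and extract the peach model, provided the assumed URL pattern holds.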

Updates and support

We plan to provide support and updates for these models when Oxford Nanopore Technologies releases new flow cells and library preparation kits. Intermediate updates could be released if the reference models undergo version updates (e.g. dna_r10.4.1_e8.2_400bps_hac@v5.0.0 -> dna_r10.4.1_e8.2_400bps_hac@v6.0.0). We also plan to fine-tune and release fast and super-accurate models for these five crops in the coming months. In the meantime, the high-accuracy models offer a middle ground between call quality and basecalling run time.

Tutorial

Example using the drMalDom_v1.0 model:

 

wget https://www.rosaceae.org/rosaceae_downloads/ONTmodels/drMalDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0.tar.gz
tar -xzf drMalDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0.tar.gz
dorado basecaller ./drMalDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0 ./pod5 > ./reads.bam
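
The basecalling step above fails if the model archive has not been extracted or the POD5 folder path is wrong. As a minimal sketch, the wrapper below checks both folders before calling Dorado. The function name `basecall_with_model` is hypothetical, and the sketch assumes `dorado` is on the PATH; the Dorado invocation itself is the same as in the tutorial.

```shell
#!/usr/bin/env bash
# basecall_with_model MODEL_DIR POD5_DIR OUT_BAM
# Hypothetical wrapper: verifies inputs, then runs the tutorial's dorado command.
basecall_with_model() {
  local model_dir="$1" pod5_dir="$2" out_bam="$3"

  # The extracted model must be a folder, not the .tar.gz archive.
  if [ ! -d "$model_dir" ]; then
    echo "Model folder not found: $model_dir (was the archive extracted?)" >&2
    return 1
  fi

  # The raw signal data must be in a folder of POD5 files.
  if [ ! -d "$pod5_dir" ]; then
    echo "POD5 folder not found: $pod5_dir" >&2
    return 1
  fi

  # Same call as in the tutorial: model folder, then data folder, BAM on stdout.
  dorado basecaller "$model_dir" "$pod5_dir" > "$out_bam"
}
```

With the apple model extracted as in the tutorial, the call would be `basecall_with_model ./drMalDom_v1.0_dna_r10.4.1_e8.2_400bps_hac@v5.0.0 ./pod5 ./reads.bam`.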