Overview
Analysis Name | Fragaria x ananassa 'Florida Brilliance' Genome v1.0 Assembly & Annotation |
Method | Hifiasm (0.16.1) |
Source | FaFB1 HiFi and Hi-C reads |
Date performed | 2022-09-19 |
Publication
Han, H., Salinas, N., Barbey, C. R., Jang, Y. J., Fan, Z., Verma, S., Whitaker, V. M., & Lee, S. (2024). A telomere-to-telomere phased genome of an octoploid strawberry reveals a receptor kinase conferring anthracnose resistance. GigaScience. Manuscript submitted for publication.
Disclaimer
The data is released under the CC0 Universal Public Domain Dedication.
Background
Trio binning was the first algorithm to generate a haplotype-resolved assembly; however, its requirement for parental data often limits haplotype-phased genome assembly in practice. Recently, a new algorithm combining PacBio HiFi reads with Hi-C chromatin interaction data has generated fully haplotype-phased genome assemblies without parental data.
Without parental sequencing, we present the first telomere-to-telomere octoploid strawberry genome assembly, consisting of two haploid assemblies (phased-1 and phased-2).
Genome facts and statistics
The phased-1 assembly contained 3,716 contigs with an N50 of 23.7 Mb, and the phased-2 assembly contained 1,226 contigs with an N50 of 26.7 Mb. Only fifteen contigs accounted for 50% of the phased-2 assembly, suggesting that individual contigs correspond to chromosomes. In addition, the largest contig in each of the phased-1 and phased-2 genome assemblies was over 36 Mb. Before scaffolding, the Benchmarking Universal Single-Copy Orthologs (BUSCO) scores were 99.2% for the phased-1 assembly and 99.1% for the phased-2 assembly, indicating a high-quality initial assembly. Comparison of the full assembly to whole genome sequencing HiFi reads of ‘Florida Brilliance’ using Merqury showed very high base accuracy (QV>69.8), which corresponds to a consensus base accuracy of about 99.99999% for the combined phased-1 and phased-2 contigs.
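As a rough illustration of the statistics quoted above, the following Python sketch shows how N50 (and the matching contig count, often called L50) and the accuracy implied by a Phred-scaled QV are conventionally computed. The function names and the toy contig lengths are hypothetical and are not taken from the assembly; only the QV value of 69.8 comes from the text.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the contig at which the running total of
    lengths, sorted longest first, reaches half of the assembly size.
    L50 is the number of contigs needed to reach that point.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


def qv_to_accuracy(qv):
    """Convert a Phred-scaled quality value to per-base accuracy (0..1)."""
    return 1 - 10 ** (-qv / 10)


# Toy contig lengths (hypothetical, not the real assembly).
toy_lengths = [40, 30, 20, 10]
print(n50_l50(toy_lengths))                    # (30, 2)

# QV threshold reported in the text.
print(f"{qv_to_accuracy(69.8) * 100:.5f}%")    # 99.99999%
```

Under this definition, the "fifteen contigs" figure for phased-2 is an L50 of 15, and a QV of 69.8 corresponds to roughly one expected error per ten million bases.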
We observed 99.1% complete gene models, with the majority (96.6%) being duplicated complete gene models, in both the phased-1 and phased-2 genome assemblies. The final assembly of ‘Florida Brilliance’ consisted of 784.9 Mb and 781.0 Mb in the phased-1 and phased-2 assemblies, respectively. All 56 pseudo-chromosomes from the phased-1 and phased-2 assemblies contained putative telomere sequences at the 5’ and/or 3’ ends.
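The text does not say how putative telomere sequences were identified. As a minimal sketch of one common approach, the Python snippet below scans the first and last stretch of a sequence for tandem copies of a telomere repeat. It assumes the Arabidopsis-type plant repeat TTTAGGG (reverse complement CCCTAAA); the motif, the minimum of three tandem copies, the window size, and the function name are all illustrative assumptions rather than the authors' method.

```python
import re

# Assumption: Arabidopsis-type plant telomere repeat and its reverse
# complement, required in at least three tandem copies.
TELOMERE_RE = re.compile(r"(?:TTTAGGG){3,}|(?:CCCTAAA){3,}")


def telomere_ends(seq, window=10_000):
    """Report whether each end of `seq` contains a telomere-like repeat.

    Only the first and last `window` bases are searched.
    """
    seq = seq.upper()
    head = seq[:window]
    tail = seq[-window:]
    return {
        "5prime": TELOMERE_RE.search(head) is not None,
        "3prime": TELOMERE_RE.search(tail) is not None,
    }


# Toy sequence (hypothetical): telomere-like repeats at both ends.
toy = "CCCTAAA" * 5 + "ACGT" * 50 + "TTTAGGG" * 5
print(telomere_ends(toy, window=50))
```

A pseudo-chromosome with a hit at both ends would count as telomere-to-telomere under this check, while a hit at only one end matches the "5’ and/or 3’" wording above.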