Potentilla anserina Genome v1.0 Assembly & Annotation
Gan, X.; Li, S.; Zong, Y.; Cao, D.; Li, Y.; Liu, R.; Cheng, S.; Liu, B.; Zhang, H. Chromosome-Level Genome Assembly Provides New Insights into Genome Evolution and Tuberous Root Formation of Potentilla anserina. Genes 2021, 12, 1993. https://doi.org/10.3390/genes12121993
Potentilla anserina is a perennial stoloniferous plant in the Rosaceae with edible tuberous roots, and it has served as an important source of food and medicine for Tibetans on the Qinghai-Tibetan Plateau (QTP), China, for thousands of years. However, a lack of genome information has hindered genetic study of the species. Here, we present a chromosome-level genome assembly produced using single-molecule long-read sequencing and the Hi-C technique. The assembled genome was 454.28 Mb, containing 14 chromosomes, with a contig N50 of 2.14 Mb. A total of 46,495 protein-coding genes, 169.74 Mb of repeat regions, and 31.76 Kb of non-coding RNA were predicted. P. anserina diverged from Potentilla micrantha ∼28.52 million years ago (Mya). Furthermore, P. anserina underwent a recent tetraploidization ∼6.4 Mya. The species-specific genes were enriched in the Starch and sucrose metabolism and Galactose metabolism pathways. We identified the sub-genome structure of P. anserina: the A sub-genome was larger than the B sub-genome and phylogenetically closer to P. micrantha. Despite lacking significant genome-wide expression dominance, the A sub-genome had higher homoeologous gene expression in shoot apical meristem, flower, and tuberous root. Resistance genes were contracted in the P. anserina genome. Key genes involved in starch biosynthesis were expanded and highly expressed in tuberous roots, which probably drives tuber formation. The genomics and transcriptomics data generated in this study advance our understanding of the genomic landscape of P. anserina and will accelerate genetic studies and breeding programs.
Summary of Potentilla anserina genome assembly statistics
Supplementary Table S4. Statistics of completeness validation of genome assembly.
BUSCO eudicotyledons_odb10 (genome mode)
Homology of the Potentilla anserina Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr database (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
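For readers who want to reproduce a best hit report from their own blastp run, the following is a minimal sketch, not the pipeline used to build the downloadable files. It assumes the hits are in the default BLAST+ tabular layout (-outfmt 6) and that "best hit" means the highest bit score per query; both are assumptions, as are the function name best_hits and the example identifiers.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, evalue_cutoff):
    """Return {query: row}, keeping the top-scoring hit with E-value < cutoff.

    Assumption: the best hit is the one with the highest bit score.
    """
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank and comment lines
        row = dict(zip(COLUMNS, fields))
        if float(row["evalue"]) >= evalue_cutoff:
            continue  # the cutoff is strict: keep only E-values below it
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical hits for illustration only; these are not real results.
EXAMPLE = (
    "geneA\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "geneA\tAT2G02020.1\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-12\t75.5\n"
    "geneB\tAT3G03030.1\t35.0\t80\t52\t0\t1\t80\t1\t80\t1e-5\t40.1\n"
)

# 1e-6 is the cutoff stated above for Araport11, SwissProt, and TrEMBL.
hits = best_hits(io.StringIO(EXAMPLE), evalue_cutoff=1e-6)
for query, row in hits.items():
    print(query, row["sseqid"], row["evalue"])
```

Passing evalue_cutoff=1e-9 instead would apply the stricter threshold stated for NCBI nr.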
The Potentilla anserina Genome v1.0 assembly file is available in FASTA format.
The Potentilla anserina Genome v1.0 gene prediction files are available in GFF3 and FASTA formats.
Functional annotations for the Potentilla anserina Genome v1.0 are available for download below. The Potentilla anserina Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).
Transcript alignments were performed by the GDR Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Potentilla anserina genome assembly. Alignments with an alignment length of 97% and 97% identity were preserved. The available files are in GFF3 format.
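The 97% filter described above can be sketched on BLAT's native PSL output as follows. This is an illustration under stated assumptions, not the GDR team's actual script: it assumes both thresholds are inclusive, that alignment length means the aligned span as a fraction of transcript (query) length, and that identity is matching bases over aligned bases. The function name psl_passes and the example records are hypothetical.

```python
def psl_passes(line, min_coverage=0.97, min_identity=0.97):
    """Return True if one PSL record meets both thresholds.

    Assumed definitions (not confirmed by the source):
      identity = (matches + repMatches) / (matches + repMatches + misMatches)
      coverage = (qEnd - qStart) / qSize
    """
    f = line.rstrip("\n").split("\t")
    # PSL columns: 0 matches, 1 misMatches, 2 repMatches,
    # 10 qSize, 11 qStart, 12 qEnd.
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_size, q_start, q_end = int(f[10]), int(f[11]), int(f[12])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    identity = (matches + rep_matches) / aligned
    coverage = (q_end - q_start) / q_size
    return coverage >= min_coverage and identity >= min_identity


# Hypothetical 21-column PSL records for illustration only.
good = "\t".join(["980", "10", "0", "0", "0", "0", "0", "0", "+", "tx1",
                  "1000", "0", "990", "chr1", "5000000", "100", "1090",
                  "1", "990,", "0,", "100,"])
bad = "\t".join(["900", "50", "0", "0", "0", "0", "0", "0", "+", "tx2",
                 "1000", "0", "950", "chr1", "5000000", "2000", "2950",
                 "1", "950,", "0,", "2000,"])

kept = [rec.split("\t")[9] for rec in (good, bad) if psl_passes(rec)]
print(kept)
```

The first record covers 99% of its transcript at about 99% identity and is kept; the second falls below both thresholds and is dropped.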