Prunus mandshurica CH264_4 Whole Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Prunus mandshurica CH264_4 Whole Genome v1.0 Assembly & Annotation
Method: Falcon/Falcon-Unzip/SMARTdenovo/Ra (na)
Source: Long reads (Oxford Nanopore Technologies or Pacific Biosciences) and Illumina
Date performed: 2021-03-09

Publication

Groppi A, Liu S, Cornille A, Decroocq S, Bui QT, Tricon D, Cruaud C, Arribat S, Belser C, Marande W, Salse J, Huneau C, Rodde N, Rhalloussi W, Cauet S, Istace B, Denis E, Carrère S, Audergon JM, Roch G, Lambert P, Zhebentyayeva T, Liu WS, Bouchez O, Lopez-Roques C, Serre RF, Debuchy R, Tran J, Wincker P, Chen X, Pétriacq P, Barre A, Nikolski M, Aury JM, Abbott AG, Giraud T, Decroocq V. Population genomics of apricots unravels domestication history and adaptive events. Nature Communications. 2021 06 25; 12(1):3956.

Background

The specific traits that have recently evolved in domesticated organisms under strong and recent human-driven selection provide unique opportunities to understand adaptive processes, as such selection leaves footprints in the genome that are easier to detect than those left by natural selection. Here we studied the evolutionary history and selection footprints in apricots (Armeniaca spp.) using a population genomics approach. We obtained the genomes of nearly 600 accessions and four high-quality assemblies of Armeniaca genomes, anchored on genetic maps. Population structure inferences and approximate Bayesian computation using these genomes indicated that Chinese and European apricots formed two differentiated gene pools with relatively high genetic diversity, resulting from independent domestication events from two different wild Central Asian populations, and with gene flow between wild and cultivated populations. We detected genomic footprints of selection (i.e., selective sweeps and recurrent changes in amino acids) in the two groups of cultivated apricots. Consistent with large effective population sizes and outcrossing in fruit trees, we found a relatively low proportion of the genome affected by selection. Different genomic regions have been affected by selection in European and Chinese cultivated apricots, despite convergent phenotypic traits. Selection footprints appeared more abundant in European apricots, with a hotspot on chromosome 4, while admixture was much more pervasive in Chinese cultivated apricots. In both cultivated groups, the genes affected by selection had predicted functions involved in the perennial life cycle, fruit quality and disease resistance, and some of them colocalized with previously mapped genomic regions associated with these functions.
In addition to improving our fundamental knowledge on the processes of adaptation, identified genes under positive selection provide clues to the biology of selected traits and targets for fruit tree research and breeding.

Homology

Homology of the Prunus mandshurica CH264_4 genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for the NCBI nr database (Release 2018-05), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2019-01), and UniProtKB/TrEMBL (Release 2019-01) databases. The best hit reports are available for download in Excel format.
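The E-value cutoff and best-hit selection described above can be illustrated with a minimal sketch. It assumes blastp results in the standard 12-column tabular layout (E-value in column 11, bit score in column 12) and assumes that "best hit" means the highest bit score per query; neither the output format nor the selection rule is stated on this page. The function name `best_hits` and the example identifiers are hypothetical.

```python
import csv

def best_hits(lines, max_evalue):
    """Return {query: (subject, evalue, bitscore)} for hits with
    evalue < max_evalue, keeping the highest bit score per query.

    Assumes 12-column tab-separated BLAST output (an assumption,
    not something this page documents).
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue  # the page specifies a cutoff of "less than"
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best

# Made-up records, for illustration only.
EXAMPLE = (
    "protA\tsubj1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t400\n"
    "protA\tsubj2\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-12\t90\n"
    "protB\tsubj3\t35.0\t80\t52\t0\t1\t80\t1\t80\t1e-7\t45\n"
).splitlines()

nr_like = best_hits(EXAMPLE, max_evalue=1e-9)     # cutoff used for NCBI nr
other_like = best_hits(EXAMPLE, max_evalue=1e-6)  # cutoff used for the other databases
```

With the stricter cutoff only `protA` keeps a hit; with the looser cutoff `protB` is retained as well, which is why the hit and noHit files can differ between databases.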

 

Protein Homologs

Prunus mandshurica CH264_4 v1.0 proteins with NCBI nr homologs (EXCEL file) Pmandshurica_CH264_4_v1.0_vs_nr.xlsx.gz
Prunus mandshurica CH264_4 v1.0 proteins with NCBI nr (FASTA file) Pmandshurica_CH264_4_v1.0_vs_nr_hit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins without NCBI nr (FASTA file) Pmandshurica_CH264_4_v1.0_vs_nr_noHit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) Pmandshurica_CH264_4_v1.0_vs_arabidopsis.xlsx.gz
Prunus mandshurica CH264_4 v1.0 proteins with Arabidopsis (Araport11) (FASTA file) Pmandshurica_CH264_4_v1.0_vs_arabidopsis_hit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins without Arabidopsis (Araport11) (FASTA file) Pmandshurica_CH264_4_v1.0_vs_arabidopsis_noHit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins with SwissProt homologs (EXCEL file) Pmandshurica_CH264_4_v1.0_vs_swissprot.xlsx.gz
Prunus mandshurica CH264_4 v1.0 proteins with SwissProt (FASTA file) Pmandshurica_CH264_4_v1.0_vs_swissprot_hit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins without SwissProt (FASTA file) Pmandshurica_CH264_4_v1.0_vs_swissprot_noHit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins with TrEMBL homologs (EXCEL file) Pmandshurica_CH264_4_v1.0_vs_trembl.xlsx.gz
Prunus mandshurica CH264_4 v1.0 proteins with TrEMBL (FASTA file) Pmandshurica_CH264_4_v1.0_vs_trembl_hit.fasta.gz
Prunus mandshurica CH264_4 v1.0 proteins without TrEMBL (FASTA file) Pmandshurica_CH264_4_v1.0_vs_trembl_noHit.fasta.gz

 

Assembly

The Prunus mandshurica CH264_4 Genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) pMandshurica_CH264_4_v1.0.fasta.gz

 

Gene Predictions

The Prunus mandshurica CH264_4 v1.0 genome gene prediction files are available in FASTA and GFF3 formats.

Downloads

Protein sequences  (FASTA file) pMandshurica_CH264_4_v1.0.proteins.fasta.gz
CDS  (FASTA file) pMandshurica_CH264_4_v1.0.cds.fasta.gz
Genes (GFF3 file) pMandshurica_CH264_4_v1.0.genes.gff3.gz
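As a quick sanity check after downloading the GFF3 file, one might tally the feature types it contains. The sketch below relies only on the general GFF3 layout (nine tab-separated columns, feature type in column 3, `#` comment lines, optional `##FASTA` section). The function names and the example records are hypothetical, and the feature types actually present in this annotation are not stated on this page.

```python
import gzip
from collections import Counter

def count_feature_types(lines):
    """Count GFF3 feature types (column 3) from an iterable of lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section ends the feature table
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) == 9:
            counts[fields[2]] += 1
    return counts

def count_in_gzip(path):
    """Same tally, reading directly from a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)

# Made-up records, for illustration only.
EXAMPLE = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tID=g1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1",
    "chr1\tsrc\tCDS\t100\t400\t.\t+\t0\tID=g1.t1.cds1;Parent=g1.t1",
    "chr1\tsrc\tCDS\t600\t900\t.\t+\t2\tID=g1.t1.cds2;Parent=g1.t1",
]
counts = count_feature_types(EXAMPLE)
```

For the real file, `count_in_gzip("pMandshurica_CH264_4_v1.0.genes.gff3.gz")` would return the same kind of tally, assuming the file has been downloaded to the working directory.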

 

Functional Analysis

Functional annotation files for the Prunus mandshurica CH264_4 Genome v1.0 are available for download below. The Prunus mandshurica CH264_4 Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan Pmandshurica_CH264_4_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan Pmandshurica_CH264_4_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs Pmandshurica_CH264_4_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways Pmandshurica_CH264_4_v1.0_KEGG-pathways.xlsx.gz

 

Transcript Alignments
Transcript alignments were performed by the GDR team of the Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map transcripts to the Prunus mandshurica CH264_4 genome assembly. Alignments covering at least 97% of the transcript length with at least 97% identity were retained. The available files are in GFF3 format.

 

Fragaria x ananassa GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_f.x.ananassa_GDR_reftransV1
Malus x domestica GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_m.x.domestica_GDR_reftransV1
Prunus avium GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_p.persica_GDR_reftransV1
Pyrus GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 Prunus mandshurica CH264_4_v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 Prunus mandshurica CH264_4_v1.0_rubus_GDR_reftransV2
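
The 97% length and 97% identity thresholds used for the transcript alignments can be sketched as a filter over BLAT's PSL output. This is an illustration under stated assumptions, not the GDR pipeline: the page does not say how coverage or identity were computed, so here coverage is taken as (qEnd - qStart) / qSize and identity as (matches + repMatches) / (matches + misMatches + repMatches). The helper `make_psl` and all example values are hypothetical.

```python
def passes_filter(psl_line, min_cov=0.97, min_ident=0.97):
    """Return True if a PSL record meets both thresholds.

    Uses the standard 21-column PSL layout: matches (col 1),
    misMatches (col 2), repMatches (col 3), qSize (col 11),
    qStart (col 12), qEnd (col 13). The coverage and identity
    formulas are assumptions made for this sketch.
    """
    f = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_size, q_start, q_end = int(f[10]), int(f[11]), int(f[12])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    coverage = (q_end - q_start) / q_size
    identity = (matches + rep_matches) / aligned
    return coverage >= min_cov and identity >= min_ident

def make_psl(matches, mismatches, q_size, q_start, q_end):
    """Build a made-up 21-column PSL line (hypothetical helper for the demo)."""
    fields = [matches, mismatches, 0, 0, 0, 0, 0, 0, "+",
              "transcript1", q_size, q_start, q_end,
              "chr1", 1000000, 5000, 5000 + (q_end - q_start),
              1, f"{q_end - q_start},", f"{q_start},", "5000,"]
    return "\t".join(str(x) for x in fields)

good = passes_filter(make_psl(990, 10, 1000, 0, 1000))          # 100% coverage, 99% identity
low_identity = passes_filter(make_psl(900, 100, 1000, 0, 1000))  # 100% coverage, 90% identity
low_coverage = passes_filter(make_psl(500, 0, 1000, 0, 500))     # 50% coverage, 100% identity
```

Only the first record passes; the other two fail on identity and on coverage respectively.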