Prunus armeniaca Marouch n14 Whole Genome v1.0 Assembly & Annotation

Overview
Analysis Name: Prunus armeniaca Marouch n14 Whole Genome v1.0 Assembly & Annotation
Method: Falcon/Falcon-Unzip/SMARTdenovo/Ra
Source: Long reads (Oxford Nanopore Technologies or Pacific Biosciences) and short reads (Illumina).
Date performed: 2021-02-10

Publication

Groppi A, Liu S, Cornille A, Decroocq S, Bui QT, Tricon D, Cruaud C, Arribat S, Belser C, Marande W, Salse J, Huneau C, Rodde N, Rhalloussi W, Cauet S, Istace B, Denis E, Carrère S, Audergon JM, Roch G, Lambert P, Zhebentyayeva T, Liu WS, Bouchez O, Lopez-Roques C, Serre RF, Debuchy R, Tran J, Wincker P, Chen X, Pétriacq P, Barre A, Nikolski M, Aury JM, Abbott AG, Giraud T, Decroocq V. Population genomics of apricots unravels domestication history and adaptive events. Nature Communications. 2021-06-25; 12(1):3956.

Background

The specific traits that have recently evolved in domesticated organisms under strong and recent human-driven selection provide unique opportunities to understand adaptive processes, because such selection leaves footprints in the genome that are easier to detect than those left by natural selection. Here we studied the evolutionary history and selection footprints in apricots (Armeniaca spp.) using a population genomics approach. We obtained the genomes of nearly 600 accessions and four high-quality assemblies of Armeniaca genomes, anchored on genetic maps. Population structure inferences and approximate Bayesian computation using these genomes indicated that Chinese and European apricots formed two differentiated gene pools with relatively high genetic diversity, resulting from independent domestication events from two different wild Central Asian populations, and with gene flow between wild and cultivated populations. We detected genomic footprints of selection (i.e., selective sweeps and recurrent changes in amino acids) in the two groups of cultivated apricots. Consistent with large effective population sizes and outcrossing in fruit trees, we found a relatively low proportion of the genome affected by selection. Different genomic regions have been affected by selection in European and Chinese cultivated apricots, despite convergent phenotypic traits. Selection footprints appeared more abundant in European apricots, with a hotspot on chromosome 4, while admixture was much more pervasive in Chinese cultivated apricots. In both cultivated groups, the genes affected by selection had predicted functions involved in the perennial life cycle, fruit quality and disease resistance, and some of them colocalized with previously mapped genomic regions associated with these functions. In addition to improving our fundamental knowledge of the processes of adaptation, the identified genes under positive selection provide clues to the biology of selected traits and targets for fruit tree research and breeding.

Homology

Homology of the Prunus armeniaca Marouch n14 genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-9 was used for the NCBI nr database (Release 2018-05), and a cutoff of less than 1e-6 for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2019-01), and UniProtKB/TrEMBL (Release 2019-01) databases. The best hit reports are available for download in Excel format.
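As a rough illustration of how a best-hit report can be derived from blastp results, the sketch below filters BLAST tabular output (the standard 12-column `-outfmt 6` layout) by an E-value cutoff and keeps one hit per query. The function name, the sample rows, and the tie-breaking by bit score are assumptions made for illustration; this is not the pipeline used to produce the files listed here.

```python
import csv
import io


def best_hits(tabular_text, evalue_cutoff):
    """Return {query_id: (subject_id, evalue, bitscore)} with one best hit per query.

    Expects BLAST tabular output (-outfmt 6), where the E-value is column 11
    and the bit score is column 12. Hits are kept only if evalue < evalue_cutoff.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= evalue_cutoff:
            continue  # fails the expectation value cutoff
        current = best.get(query)
        # Prefer the lower E-value; break ties with the higher bit score (assumption).
        if current is None or (evalue, -bitscore) < (current[1], -current[2]):
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: the IDs and scores are made up for this example.
SAMPLE = "\n".join([
    "protA\tsubj1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t400",
    "protA\tsubj2\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-20\t150",
    "protB\tsubj3\t35.0\t60\t39\t0\t1\t60\t1\t60\t1e-4\t40",
])

hits = best_hits(SAMPLE, evalue_cutoff=1e-6)
```

With the 1e-6 cutoff, `protA` keeps its strongest hit while `protB` has no qualifying hit, which corresponds to the split between the "hit" and "noHit" FASTA files.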

 

Protein Homologs

Prunus armeniaca Marouch n14 v1.0 proteins with NCBI nr homologs (EXCEL file) Prunus_armeniaca_Marouch_n14_v1.0_vs_nr.xlsx.gz
Prunus armeniaca Marouch n14 v1.0 proteins with NCBI nr (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_nr_hit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins without NCBI nr (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_nr_noHit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins with arabidopsis (Araport11) homologs (EXCEL file) Prunus_armeniaca_Marouch_n14_v1.0_vs_arabidopsis.xlsx.gz
Prunus armeniaca Marouch n14 v1.0 proteins with arabidopsis (Araport11) (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_arabidopsis_hit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins without arabidopsis (Araport11) (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_arabidopsis_noHit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins with SwissProt homologs (EXCEL file) Prunus_armeniaca_Marouch_n14_v1.0_vs_swissprot.xlsx.gz
Prunus armeniaca Marouch n14 v1.0 proteins with SwissProt (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_swissprot_hit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins without SwissProt (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_swissprot_noHit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins with TrEMBL homologs (EXCEL file) Prunus_armeniaca_Marouch_n14_v1.0_vs_trembl.xlsx.gz
Prunus armeniaca Marouch n14 v1.0 proteins with TrEMBL (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_trembl_hit.fasta.gz
Prunus armeniaca Marouch n14 v1.0 proteins without TrEMBL (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0_vs_trembl_noHit.fasta.gz

 

Assembly

The Prunus armeniaca Marouch n14 Genome v1.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0.fasta.gz

 

Gene Predictions

The Prunus armeniaca Marouch n14 v1.0 genome gene prediction files are available in FASTA and GFF3 formats.

Downloads

Protein sequences  (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0.proteins.fasta.gz
CDS  (FASTA file) Prunus_armeniaca_Marouch_n14_v1.0.cds.fasta.gz
Genes (GFF3 file) Prunus_armeniaca_Marouch_n14_v1.0.genes.gff3.gz
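
For readers who want a quick sanity check of a downloaded gene prediction file, the following is a minimal sketch that tallies feature types (column 3) in a gzip-compressed GFF3 file. The helper name and the in-memory sample are hypothetical; a real file would be read from disk instead.

```python
import gzip
import io
from collections import Counter


def count_feature_types(gff3_gz_bytes):
    """Count GFF3 feature types (third column) in gzip-compressed GFF3 content."""
    counts = Counter()
    with gzip.open(io.BytesIO(gff3_gz_bytes), "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            columns = line.rstrip("\n").split("\t")
            if len(columns) == 9:  # a GFF3 feature line has nine tab-separated columns
                counts[columns[2]] += 1
    return counts


# Hypothetical records: the IDs and coordinates are made up for this example.
SAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\texample\tCDS\t100\t900\t.\t+\t0\tID=cds1;Parent=mrna1\n"
    "chr1\texample\tgene\t2000\t2600\t.\t-\t.\tID=gene2\n"
)

counts = count_feature_types(gzip.compress(SAMPLE_GFF3.encode()))
```

For a file on disk, the same loop works with `gzip.open(path, "rt")` in place of the in-memory buffer.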

 

Functional Analysis

Functional annotations for the Prunus armeniaca Marouch n14 Genome v1.0 are available for download below. The Prunus armeniaca Marouch n14 Genome v1.0 proteins were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan Prunus_armeniaca_Marouch_n14_v1.0_genes2GO.xlsx.gz
IPR assignments from InterProScan Prunus_armeniaca_Marouch_n14_v1.0_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs Prunus_armeniaca_Marouch_n14_v1.0_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways Prunus_armeniaca_Marouch_n14_v1.0_KEGG-pathways.xlsx.gz

 

Transcript Alignments
Transcript alignments were performed by the GDR Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the Prunus armeniaca Marouch n14 genome assembly. Only alignments with an alignment length of at least 97% and an identity of at least 97% were retained. The available files are in GFF3 format.
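
The two 97% thresholds can be sketched as a filter over BLAT's PSL output. The definitions used here are assumptions, since the text does not spell them out: coverage is taken as the aligned query span divided by the transcript length, and identity as matching bases divided by aligned bases (ignoring gaps). The function name and sample lines are hypothetical.

```python
def passes_thresholds(psl_line, min_coverage=0.97, min_identity=0.97):
    """Return True if a PSL alignment line meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize, 11 qStart, 12 qEnd.
    """
    fields = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = (int(fields[i]) for i in (0, 1, 2))
    q_size, q_start, q_end = (int(fields[i]) for i in (10, 11, 12))
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    identity = (matches + rep_matches) / aligned   # assumed definition
    coverage = (q_end - q_start) / q_size          # assumed definition
    return coverage >= min_coverage and identity >= min_identity


# Hypothetical PSL lines (21 tab-separated fields each); all values are made up.
GOOD = "\t".join([
    "980", "10", "0", "0", "0", "0", "0", "0", "+", "tx1", "1000", "0", "990",
    "chr1", "5000000", "100", "1090", "1", "990,", "0,", "100,",
])
BAD = "\t".join([
    "900", "50", "0", "0", "0", "0", "0", "0", "+", "tx2", "1000", "0", "950",
    "chr1", "5000000", "200", "1150", "1", "950,", "0,", "200,",
])

kept = [line for line in (GOOD, BAD) if passes_thresholds(line)]
```

Here `GOOD` spans 99% of the transcript at roughly 99% identity and is kept, while `BAD` fails both thresholds.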

 

Fragaria x ananassa GDR RefTrans v1 Prunus armeniaca Marouch n14 _v1.0_f.x.ananassa_GDR_reftransV1
Prunus avium GDR RefTrans v1 Prunus armeniaca Marouch n14 _v1.0_p.avium_GDR_reftransV1
Prunus persica GDR RefTrans v1 Prunus armeniaca Marouch n14 _v1.0_p.persica_GDR_reftransV1
Pyrus GDR RefTrans v1 Prunus armeniaca Marouch n14 _v1.0_p.persica_GDR_reftransV1
Rosa GDR RefTrans v1 Prunus armeniaca Marouch n14 _v1.0_rosa_GDR_reftransV1
Rubus GDR RefTrans v2 Prunus armeniaca Marouch n14 _v1.0_rubus_GDR_reftransV2