Pyrus GDR RefTrans V1.0

Analysis Name: Pyrus GDR RefTrans V1.0
Method: reftrans (1.0)
Date performed: 2019-12-13

Materials & Methods

GDR Pyrus RefTrans V1.0 combines peer-reviewed published RNA-Seq and EST data sets to create a reference transcriptome (RefTrans, 63,681 sequences) for Pyrus and provides putative gene function identified by homology to known proteins.

In Pyrus RefTrans V1.0, 4.5 billion RNA-Seq reads from publicly available, peer-reviewed Pyrus RNA-Seq data sets (Busatto et al. 2019 [SRP142284], Cao et al. 2019 [SRP148620], Cheng et al. 2019 [SRP116053], Li et al. 2019 [SRP167064, SRP167421], Meng et al. 2019 [SRP065003], Yang et al. 2019 [SRP013102], Zhao et al. 2019 [SRP018328], Cao et al. 2018 [SRP026375, SRP041408, SRP051914], Jiao et al. 2018 [SRP009196], Wang et al. 2018 [SRP063385]), and 1,967 ESTs, were downloaded from the NCBI Sequence Read Archive (SRA), the EBI database, and the NCBI dbEST database, respectively. The RNA-Seq reads and ESTs were assembled using the Mainlab RefTrans pipeline (manuscript in preparation; details of the pipeline are available on request ahead of publication). The RefTrans sequences were functionally characterized by pairwise comparison using the BLASTX algorithm against the Swiss-Prot (UniProtKB/Swiss-Prot Release 2019_01) and TrEMBL (UniProtKB/TrEMBL Release 2019_01) protein databases. Information on the top 10 matches with an expectation (E) value of ≤ 1E-06 was recorded and stored in GDR together with the RefTrans sequences. InterPro domain and Gene Ontology assignments were made for Pyrus RefTrans V1.0 using InterProScan. The transcriptome and associated annotation can be downloaded, searched by name, keyword (functional description), or mapped location, and viewed on the genome through JBrowse.
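The hit-recording rule described above (top 10 matches with E ≤ 1E-06) can be illustrated with a minimal sketch. It is not the Mainlab RefTrans pipeline, whose details are unpublished. It assumes the BLASTX results are available in the standard 12-column BLAST+ tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11, bit score in column 12), which the source does not state. The function name `top_hits` is hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-6  # cutoff stated in the text
MAX_HITS = 10          # "top 10 matches" stated in the text


def top_hits(tabular_lines, cutoff=E_VALUE_CUTOFF, max_hits=MAX_HITS):
    """Return {query_id: [(subject_id, evalue, bitscore), ...]}.

    Assumes 12-column BLAST tabular rows (an assumption, not confirmed
    by the source). Hits above the E-value cutoff are dropped; the rest
    are ranked by E-value, then by descending bit score.
    """
    hits = defaultdict(list)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular row
        query_id, subject_id = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= cutoff:
            hits[query_id].append((subject_id, evalue, bitscore))

    result = {}
    for query_id, rows in hits.items():
        rows.sort(key=lambda r: (r[1], -r[2]))
        result[query_id] = rows[:max_hits]
    return result
```

Queries with no hit at or below the cutoff do not appear in the returned dictionary, which matches the idea of a sequence "without homologies".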




RefTrans in FASTA format (63,681 sequences): Pyrus RefTrans V1.0 FASTA format


Homology Analysis

Homology of the Pyrus RefTrans V1.0 sequences was determined by pairwise sequence comparison using the BLASTX algorithm against several protein databases. An expectation (E) value cutoff of less than 1e-9 was used for NCBI nr (Release 2018-05), and 1e-6 for the Arabidopsis proteins (Araport11), UniProtKB/Swiss-Prot (Release 2019-01), and UniProtKB/TrEMBL (Release 2019-01) databases. The best-hit reports are available for download in Excel format.


ExPASy SwissProt (Excel file) (69% RefTrans with homologies): Pyrus RefTrans V1.0 vs Swissprot.xlsx.gz
RefTrans with homologies (FASTA file): Pyrus RefTrans V1.0 vs Swissprot_hit.fasta.gz
RefTrans without homologies (FASTA file): Pyrus RefTrans V1.0 vs Swissprot_noHit.fasta.gz


ExPASy TrEMBL (Excel file) (88% RefTrans with homologies): Pyrus RefTrans V1.0 vs TrEMBL.xlsx.gz
RefTrans with homologies (FASTA file): Pyrus RefTrans V1.0 vs TrEMBL_hit.fasta.gz
RefTrans without homologies (FASTA file): Pyrus RefTrans V1.0 vs TrEMBL_noHit.fasta.gz

Araport11 (Arabidopsis) (Excel file) (74% RefTrans with homologies): Pyrus RefTrans V1.0 vs Araport11.xlsx.gz
RefTrans with homologies (FASTA file): Pyrus RefTrans V1.0 vs Araport11_hit.fasta.gz
RefTrans without homologies (FASTA file): Pyrus RefTrans V1.0 vs Araport11_noHit.fasta.gz


NCBI nr (Excel file) (88% RefTrans with homologies): Pyrus RefTrans V1.0 vs NCBI nr.xlsx.gz
RefTrans with homologies (FASTA file): Pyrus RefTrans V1.0 vs NCBI nr_hit.fasta.gz
RefTrans without homologies (FASTA file): Pyrus RefTrans V1.0 vs NCBI nr_noHit.fasta.gz
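Each database above comes with a pair of FASTA files, one for sequences with homologies and one for sequences without. The following sketch shows one way such a partition, and the reported percentage with homologies, could be derived from a FASTA file and a set of query IDs that had at least one hit. The function names and the in-memory approach are illustrative assumptions, not the method used to produce the GDR files.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def split_by_hits(fasta_lines, hit_ids):
    """Partition FASTA records into (with_hits, without_hits).

    A record counts as a hit when the first whitespace-delimited token
    of its header is in hit_ids (hypothetical ID convention).
    """
    with_hits, without_hits = [], []
    for header, seq in read_fasta(fasta_lines):
        tokens = header.split()
        seq_id = tokens[0] if tokens else ""
        target = with_hits if seq_id in hit_ids else without_hits
        target.append((header, seq))
    return with_hits, without_hits


def percent_with_hits(with_hits, without_hits):
    """Percentage of records that have at least one homology hit."""
    total = len(with_hits) + len(without_hits)
    return 100.0 * len(with_hits) / total if total else 0.0
```

For the full 63,681-sequence transcriptome, a streaming version that writes each record to its output file as it is read would avoid holding both lists in memory.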


InterProScan Analysis

InterPro domain and Gene Ontology assignments were made for the Pyrus RefTrans V1.0 using InterProScan at the EBI through Blast2GO.
Gene Ontology annotations by RefTrans (Excel file): Pyrus RefTrans V1.0 Gene Ontology annotations.xlsx.gz
InterPro annotations by RefTrans (Excel file): Pyrus RefTrans V1.0 InterPro annotations.xlsx.gz


KEGG Analysis

KEGG pathway and ortholog assignments were made for the Pyrus RefTrans V1.0 using the KEGG/KAAS server.
KEGG pathway annotations by RefTrans (Excel file): Pyrus RefTrans V1.0 KEGG pathways.xlsx.gz
KEGG ortholog annotations by RefTrans (Excel file): Pyrus RefTrans V1.0 KEGG orthologs.xlsx.gz



The alignment tool BLAT was used to map the Pyrus RefTrans V1.0 sequences to the genome. Alignments with an alignment length of at least 98% and at least 95% identity were retained.
BLAT of RefTrans to the P. ussuriensis x communis Zhongai 1 genome v1.0 (Excel file): Pyrus RefTrans V1.0_Pussuriensis x communis Zhongai 1 genome v1.0.xlsx.gz
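The BLAT filtering step can be sketched as follows, under several assumptions the source does not confirm. The sketch assumes that the BLAT output is in the standard 21-column PSL format. It reads "alignment length of 98%" as the aligned query span divided by the query size. It computes identity as matching bases over aligned bases, which is a simplification and not necessarily the formula used for the GDR files. The function names are hypothetical.

```python
def psl_passes(line, min_coverage=0.98, min_identity=0.95):
    """Return True if one PSL row meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize, 11 qStart, 12 qEnd.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 21 or not fields[0].isdigit():
        return False  # header line or malformed row
    matches, mismatches, rep_matches = (int(x) for x in fields[0:3])
    q_size, q_start, q_end = (int(x) for x in fields[10:13])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    coverage = (q_end - q_start) / q_size          # assumed meaning of "alignment length"
    identity = (matches + rep_matches) / aligned   # simplified identity
    return coverage >= min_coverage and identity >= min_identity


def filter_psl(lines, min_coverage=0.98, min_identity=0.95):
    """Keep only the PSL rows that pass both thresholds."""
    return [ln for ln in lines if psl_passes(ln, min_coverage, min_identity)]
```

Both thresholds are parameters, so the same function can be reused if a different interpretation of the cutoffs turns out to be the intended one.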