Pyrus GDR RefTrans V1.0
Materials & Methods
GDR Pyrus RefTrans V1.0 combines peer-reviewed published RNA-Seq and EST data sets to create a reference transcriptome (RefTrans, 63,681 sequences) for Pyrus and provides putative gene function identified by homology to known proteins.
Pyrus RefTrans V1.0 was built from 4.5 billion RNA-Seq reads drawn from publicly available, peer-reviewed Pyrus RNA-Seq data sets (Busatto et al. 2019 [SRP142284], Cao et al. 2019 [SRP148620], Cheng et al. 2019 [SRP116053], Li et al. 2019 [SRP167064, SRP167421], Meng et al. 2019 [SRP065003], Yang et al. 2019 [SRP013102], Zhao et al. 2019 [SRP018328], Cao et al. 2018 [SRP026375, SRP041408, SRP051914], Jiao et al. 2018 [SRP009196], Wang et al. 2018 [SRP063385]) and from 1,967 ESTs. The RNA-Seq reads were downloaded from the NCBI Sequence Read Archive and the EBI database, and the ESTs from the NCBI dbEST database.
The RNA-Seq reads and ESTs were assembled using the Mainlab RefTrans pipeline (manuscript in preparation; details of the pipeline are available on request ahead of publication). The RefTrans sequences were functionally characterized by pairwise comparison using the BLASTX algorithm against the Swiss-Prot (UniProtKB/Swiss-Prot Release 2019_01) and TrEMBL (UniProtKB/TrEMBL Release 2019_01) protein databases. For each sequence, information on the top 10 matches with an expectation (E) value of ≤ 1E-06 was recorded and stored in GDR together with the RefTrans sequences. InterPro domains and Gene Ontology terms were assigned to Pyrus RefTrans V1.0 using InterProScan.
The transcriptome and associated annotation can be downloaded, searched by name, keyword (functional description), or mapped location, and viewed on the genome through JBrowse.
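The hit-retention rule used for the BLASTX annotation (top 10 matches per sequence with E ≤ 1E-06) can be illustrated with a short Python sketch. This is not the Mainlab RefTrans pipeline, whose details are unpublished. It assumes the BLASTX results are in the standard 12-column tabular format (BLAST+ `-outfmt 6`), and it assumes matches are ranked by ascending E-value with ties broken by higher bit score. The function name `top_hits` and the sample identifiers are hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-6  # threshold stated in the methods
MAX_HITS = 10          # number of matches kept per sequence


def top_hits(lines, cutoff=E_VALUE_CUTOFF, max_hits=MAX_HITS):
    """Group tabular BLAST lines by query and keep the best hits.

    Returns {query_id: [(subject_id, evalue, bitscore), ...]}.
    Assumes 12 tab-separated columns with the E-value in column 11
    and the bit score in column 12 (BLAST+ -outfmt 6 defaults).
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= cutoff:
            hits[query].append((subject, evalue, bitscore))
    # Assumed ranking: lowest E-value first, then highest bit score.
    return {
        query: sorted(found, key=lambda h: (h[1], -h[2]))[:max_hits]
        for query, found in hits.items()
    }


# Hypothetical sample records; the identifiers are made up.
SAMPLE = [
    "tx1\tP1\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200",
    "tx1\tP2\t50.0\t80\t40\t0\t1\t240\t1\t80\t1e-3\t40",
    "tx2\tP3\t70.0\t90\t27\t0\t1\t270\t1\t90\t2e-10\t65",
]

if __name__ == "__main__":
    for query, found in top_hits(SAMPLE).items():
        print(query, found)
```

In this sample the second `tx1` record is discarded because its E-value (1e-3) is above the cutoff. A sequence with no match at or below the cutoff does not appear in the result at all.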