Using RNA-Seq to create sample-specific proteomic databases that enable mass spectrometric discovery of splice junction peptides

SRP019939

Homo sapiens

9 Downloadable Samples

Illumina HiSeq 2000

Submitter Supplied Information

Description

Many new alternative splice forms have been detected at the transcript level using next-generation sequencing (NGS) methods, especially RNA-Seq, but it is not known how many of these transcripts are translated. Leveraging the capabilities of NGS, we collected RNA-Seq and proteomics data from the same cell population (Jurkat cells) and created a bioinformatics pipeline that builds customized databases for the discovery of novel splice-junction peptides. Results: Eighty million paired-end Illumina reads and ~500,000 tandem mass spectra were used to identify 12,873 transcripts (19,320 including isoforms) and 6,810 proteins. We developed a bioinformatics workflow to retrieve high-confidence, novel splice-junction sequences from the RNA data, translate these sequences into the corresponding polypeptide sequences, and create a customized splice-junction database for MS searching. Overall design: Jurkat T-cell mRNA was analyzed on an Illumina HiSeq 2000. Approximately 80 million paired-end reads (2x200 bp, ~350 bp lengths) were collected.
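The translation step of the workflow described above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline: the function names (`translate`, `junction_peptides`, `to_fasta`), the `min_len` cutoff, the FASTA header layout, and the toy sequences are all assumptions made for illustration. The sketch joins the exon sequence on either side of a junction, translates the three forward reading frames with the standard genetic code, and keeps only stop-free peptide segments that cross the junction.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA string in one forward frame; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(frame, len(seq) - 2, 3)
    )


def junction_peptides(upstream, downstream, min_len=6):
    """Return (frame, peptide) pairs for stop-free segments spanning the junction.

    `upstream` and `downstream` are the exon sequences flanking the junction.
    `min_len` is a hypothetical length cutoff, not a value from the study.
    """
    seq = upstream + downstream
    junction = len(upstream)  # nucleotide index where the downstream exon starts
    hits = []
    for frame in range(3):
        protein = translate(seq, frame)
        start = 0  # amino acid index of the current segment
        for segment in protein.split("*"):
            end = start + len(segment)
            nt_start = frame + 3 * start
            nt_end = frame + 3 * end
            # Keep segments with nucleotides on both sides of the junction.
            if len(segment) >= min_len and nt_start < junction < nt_end:
                hits.append((frame, segment))
            start = end + 1  # step over the stop codon
    return hits


def to_fasta(hits, junction_id):
    """Format peptides as FASTA text; the header layout is an assumption."""
    return "".join(
        f">{junction_id}|frame={frame}\n{peptide}\n" for frame, peptide in hits
    )


if __name__ == "__main__":
    # Toy sequences, not taken from the dataset.
    peptides = junction_peptides("ATGGCC", "AAATAA", min_len=3)
    print(to_fasta(peptides, "toy_junction"))
```

The resulting FASTA text could be appended to a protein database for an MS search engine. A real workflow would also need to filter junctions by read support and handle strand orientation, which this sketch leaves out.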

PubMed ID

23629695

Publication Title

Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq.

Total Samples

Submitter’s Institution

No associated institution

Authors

Sheynkman GM, Shortreed MR, Frey BL, Smith LM

Source Repository

Sequence Read Archive (SRA)
