DOI: 10.1145/2488551.2488581

A dynamic pipeline for RNA sequencing on multicore processors

Published: 15 September 2013

Abstract

We present a concurrent algorithm for mapping short and long RNA sequences on multicore processors. Our solution processes the data, initially stored on disk, in batches of reads that are passed between the consecutive stages of a pipeline. A major operational reorganization of the original static pipeline, combined with a complete reimplementation based on POSIX threads, decouples threads from stages (task types): any thread can execute any type of pending task, which results in a dynamic pipeline. Experiments on a multicore platform reveal that this reorganization yields significantly higher performance, especially on architectures equipped with a small to moderate number of cores.
As an additional contribution, our experiments also reveal that using 16-nucleotide (nt) seeds during one of the stages of the pipeline, instead of the 15-nt seeds proposed originally, yields a remarkable reduction in the execution time of the global alignment process while maintaining the sensitivity of the algorithm.
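
The dynamic pipeline idea in the abstract, where a thread is not bound to one stage but picks up whatever task is pending, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses POSIX threads, while the sketch below uses Python's `threading` and `queue` modules for brevity. The stage functions (`to_upper`, `reverse`, `measure`), the function `run_dynamic_pipeline`, and the task layout are all hypothetical placeholders, since the abstract does not list the real stages.

```python
import queue
import threading

# Hypothetical placeholder stages; the real pipeline stages are not
# described in the abstract. Each stage maps a batch of reads to a new batch.
def to_upper(batch):
    return [read.upper() for read in batch]

def reverse(batch):
    return [read[::-1] for read in batch]

def measure(batch):
    return [(read, len(read)) for read in batch]

STAGES = [to_upper, reverse, measure]

def run_dynamic_pipeline(batches, num_threads=4):
    """Process every batch through all stages with a shared pool of workers.

    A task is (batch_id, stage_index, data). Any worker may execute any
    pending task, so no thread is tied to a particular stage.
    """
    tasks = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work
                tasks.task_done()
                return
            batch_id, stage, data = item
            out = STAGES[stage](data)
            if stage + 1 < len(STAGES):
                # Enqueue the follow-up task before marking this one done,
                # so the count of unfinished tasks never drops to zero early.
                tasks.put((batch_id, stage + 1, out))
            else:
                with results_lock:
                    results[batch_id] = out
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for batch_id, batch in enumerate(batches):
        tasks.put((batch_id, 0, batch))
    tasks.join()                      # wait until every stage of every batch ran
    for _ in threads:
        tasks.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(batches))]
```

For example, `run_dynamic_pipeline([["acgu", "gg"], ["uuac"]], num_threads=3)` returns the processed batches in their original order. A static pipeline would instead dedicate a fixed thread to each stage, so a slow stage leaves the other threads idle; with a shared task queue, idle threads can take on whichever stage has work waiting.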

Published In

EuroMPI '13: Proceedings of the 20th European MPI Users' Group Meeting
September 2013
289 pages
ISBN:9781450319034
DOI:10.1145/2488551
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • ARCOS: Computer Architecture and Technology Area, Universidad Carlos III de Madrid

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RNA sequencing
  2. high performance computing
  3. multicore processors
  4. pipelining
  5. short-read alignment

Qualifiers

  • Research-article

Conference

EuroMPI '13
Sponsor:
  • ARCOS
EuroMPI '13: 20th European MPI Users' Group Meeting
September 15 - 18, 2013
Madrid, Spain

Acceptance Rates

EuroMPI '13 Paper Acceptance Rate 22 of 47 submissions, 47%
Overall Acceptance Rate 66 of 139 submissions, 47%
