Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit

GPGPU '16: Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit

March 2016

2016 Proceeding

Conference Chairs:
David Kaeli,
John Cavazos

Publisher:

Association for Computing Machinery, New York, NY, United States

Conference:

PPoPP '16: 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Barcelona, Spain, 12 March 2016

ISBN:

978-1-4503-4195-0

Published:

12 March 2016

Bibliometrics (reflects downloads up to 01 Jan 2025)

Citation Count

115

Downloads (6 weeks)

Downloads (12 months)

296

Downloads (cumulative)

3,935

Abstract

No abstract available.

Proceeding Downloads

Front matter (title page, message from the organizers, TOC)

SESSION: Algorithm

abstract

Runtime aware architectures

Mateo Valero

Page 1, https://doi.org/10.1145/2884045.2884055

In the last years the traditional ways to keep the increase of hardware performance to the rate predicted by the Moore's Law vanished. When uni-cores were the norm, hardware design was decoupled from the software stack thanks to a well defined ...

Total Citations: 0

research-article

GPU centric extensions for parallel strongly connected components computation

Shrinivas Devshatwar, Madhur Amilkanthwar, Rupesh Nasre

Pages 2–11, https://doi.org/10.1145/2884045.2884048

Finding Strongly Connected Components (SCC) of a directed graph is a fundamental graph problem. Many of the state-of-the-art sequential algorithms use depth-first search (DFS) to find SCCs. Since, in general DFS is hard to parallelize, researchers rely ...

Total Citations: 9
Total Downloads: 427
Last 12 Months: 21
Last 6 weeks: 4

research-article

General-purpose join algorithms for large graph triangle listing on heterogeneous systems

Daniel Zinn, Haicheng Wu, Jin Wang, Molham Aref, Sudhakar Yalamanchili

Pages 12–21, https://doi.org/10.1145/2884045.2884054

We investigate applying general-purpose join algorithms to the triangle listing problem on heterogeneous systems that feature a multi-core CPU and multiple GPUs. In particular, we consider an out-of-core context where graph data are available on ...

Total Citations: 4
Total Downloads: 218
Last 12 Months: 11
Last 6 weeks: 1

SESSION: Heterogeneous languages, extensions and runtimes

research-article

Performance portable GPU code generation for matrix multiplication

Toomas Remmelg, Thibaut Lutz, Michel Steuwer, Christophe Dubach

Pages 22–31, https://doi.org/10.1145/2884045.2884046

Parallel accelerators such as GPUs are notoriously hard to program; exploiting their full performance potential is a job best left for ninja programmers. High-level programming languages coupled with optimizing compilers have been proposed to attempt to ...

Total Citations: 19
Total Downloads: 306
Last 12 Months: 20
Last 6 weeks: 1

research-article

Multi-stage programming for GPUs in C++ using PACXX

Michael Haidl, Michel Steuwer, Tim Humernbrum, Sergei Gorlatch

Pages 32–41, https://doi.org/10.1145/2884045.2884049

Writing and optimizing programs for high performance on systems with Graphics Processing Units (GPUs) remains a challenging task even for expert programmers. A promising optimization technique is multi-stage programming -- evaluating parts of the ...

Total Citations: 6
Total Downloads: 273
Last 12 Months: 8
Last 6 weeks: 1

research-article

Simplifying programming and load balancing of data parallel applications on heterogeneous systems

Borja Pérez, José Luis Bosque, Ramón Beivide

Pages 42–51, https://doi.org/10.1145/2884045.2884051

Heterogeneous architectures have experienced a great development thanks to their excellent cost/performance ratio and low power consumption. But heterogeneity significantly complicates both programming and efficient use of the resources. As a result, ...

Total Citations: 32
Total Downloads: 346
Last 12 Months: 5
Last 6 weeks: 2

SESSION: Tasking and scheduling

abstract

Working together to build the heterogeneous processing ecosystem

Andrew Richards

Page 52, https://doi.org/10.1145/2884045.2884056

We can now say that almost all future performance improvements will come from heterogeneous acceleration. But the reality of building successful software and platforms is that no one company or individual can create everything. That means we need to ...

Total Citations: 0

research-article

Implementing directed acyclic graphs with the heterogeneous system architecture

Sooraj Puthoor, Ashwin M. Aji, Shuai Che, Mayank Daga, Wei Wu, Bradford M. Beckmann, Gregory Rodgers

Pages 53–62, https://doi.org/10.1145/2884045.2884052

Achieving optimal performance on heterogeneous computing systems requires a programming model that supports the execution of asynchronous, multi-stream, and out-of-order tasks in a shared memory environment. Asynchronous dependency-driven tasking is one ...

Total Citations: 16
Total Downloads: 453
Last 12 Months: 26
Last 6 weeks: 3

research-article

GPUpIO: the case for I/O-driven preemption on GPUs

Lior Zeno, Avi Mendelson, Mark Silberstein

Pages 63–71, https://doi.org/10.1145/2884045.2884053

As GPUs become general purpose, they are outgrowing the coprocessor model and require convenient I/O abstractions such as files and network sockets. Recent studies have shown the benefits of native GPU I/O layers, in terms of both programmability and ...

Total Citations: 8
Total Downloads: 265
Last 12 Months: 17
Last 6 weeks: 3

SESSION: Stencil optimization

research-article

A systems perspective on GPU computing: a tribute to Karsten Schwan

Naila Farooqui

Pages 72–81, https://doi.org/10.1145/2884045.2884057

Over a distinguished career, Regents Professor Karsten Schwan has made significant contributions across a diverse array of topics in computer systems, including operating systems for multi-core platforms, virtualization technologies, enterprise ...

Total Citations: 1
Total Downloads: 192
Last 12 Months: 5
Last 6 weeks: 1

research-article

Public Access

Designing high performance communication runtime for GPU managed memory: early experiences

Dip Sankar Banerjee, Khaled Hamidouche, Dhabaleswar K. Panda

Pages 82–91, https://doi.org/10.1145/2884045.2884050

Graphics Processing Units (GPUs) have gained the position of a main stream accelerator due to its low power footprint and massive parallelism. CUDA 6.0 onward, NVIDIA has introduced the Managed Memory capability which unifies the host and device memory ...

Total Citations: 4
Total Downloads: 662
Last 12 Months: 78
Last 6 weeks: 9

research-article

Public Access

Effective resource management for enhancing performance of 2D and 3D stencils on GPUs

Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, P. Sadayappan

Pages 92–102, https://doi.org/10.1145/2884045.2884047

GPUs are an attractive target for data parallel stencil computations prevalent in scientific computing and image processing applications. Many tiling schemes, such as overlapped tiling and split tiling, have been proposed in past to improve the ...

Total Citations: 16
Total Downloads: 744
Last 12 Months: 105
Last 6 weeks: 16

Contributors

David R. Kaeli
Northeastern University
- Publication Years: 1991 - 2024
- Publication counts: 192
- Citation count: 2,478
- Available for Download: 106
- Downloads (cumulative): 64,364
- Downloads (12 months): 11,147
- Downloads (6 weeks): 1,405
- Average Downloads per Article: 607
- Average Citation per Article: 13

John Cavazos
University of Delaware
- Publication Years: 1996 - 2018
- Publication counts: 46
- Citation count: 1,675
- Available for Download: 37
- Downloads (cumulative): 25,503
- Downloads (12 months): 2,587
- Downloads (6 weeks): 199
- Average Downloads per Article: 689
- Average Citation per Article: 36

Acceptance Rates

GPGPU '16 Paper Acceptance Rate: 9 of 23 submissions, 39%

Overall Acceptance Rate: 57 of 129 submissions, 44%

Year	Submitted	Accepted	Rate
GPGPU '20	12	7	58%
GPGPU '19	15	6	40%
GPGPU-10	15	8	53%
GPGPU '16	23	9	39%
GPGPU-7	27	12	44%
GPGPU-6	37	15	41%
Overall	129	57	44%
