Front Matter
Locality-Based Optimizations in the Chapel Compiler
One of the main challenges of distributed memory programming is achieving efficient access to data. Low-level programming paradigms such as MPI and SHMEM require programmers to explicitly move data between compute nodes, which typically results in ...
iCetus: A Semi-automatic Parallel Programming Assistant
The iCetus tool is a new interactive parallelizer that provides users with a range of capabilities for the source-to-source transformation of C programs using OpenMP directives on shared-memory machines. While the tool can parallelize code fully ...
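As a generic illustration (not taken from the paper) of the kind of transformation a source-to-source OpenMP parallelizer performs, the sketch below annotates a C loop with independent iterations using a `parallel for` directive; the array names and sizes are made up for the example.

```c
/* Hypothetical before/after sketch of OpenMP loop parallelization;
 * the directive shown is generic and not necessarily what iCetus emits. */
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N], c[N];

    /* Initialize inputs sequentially. */
    for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }

    /* Iterations are independent, so the loop can be annotated for
     * shared-memory parallel execution. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c[i];

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}
```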
Hybrid Register Allocation with Spill Cost and Pattern Guided Optimization
Modern compilers have relied on various best-effort heuristics to solve the register allocation problem due to its high computational complexity. A “greedy” algorithm that performs a scan of prioritized live intervals for allocation followed by ...
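For readers unfamiliar with interval-based allocation, the following is a minimal textbook-style sketch of a greedy scan over live intervals with K physical registers, spilling the interval that ends furthest away when no register is free. It is a simplification under assumed data (intervals, K) and omits the spill-cost modeling and pattern-guided optimizations the paper actually studies.

```c
/* Minimal linear-scan-style sketch: assign K registers to live intervals,
 * spilling the interval with the furthest end point when none are free.
 * A textbook simplification, not the allocator described in the paper. */
#include <stdio.h>

#define K 2            /* number of physical registers (assumption) */
#define NVARS 4

typedef struct { int start, end, reg; } Interval;   /* reg = -1 means spilled */

int main(void) {
    /* Intervals sorted by start point (made-up example data). */
    Interval iv[NVARS] = {{0, 9, -1}, {1, 4, -1}, {2, 7, -1}, {5, 8, -1}};
    int active[K], nactive = 0;          /* intervals currently in registers */
    int free_regs[K], nfree = K;
    for (int r = 0; r < K; r++) free_regs[r] = r;

    for (int i = 0; i < NVARS; i++) {
        /* Expire intervals that end before this one starts, freeing registers. */
        int k = 0;
        for (int j = 0; j < nactive; j++) {
            if (iv[active[j]].end < iv[i].start)
                free_regs[nfree++] = iv[active[j]].reg;
            else
                active[k++] = active[j];
        }
        nactive = k;

        if (nfree > 0) {
            iv[i].reg = free_regs[--nfree];          /* take a free register */
            active[nactive++] = i;
        } else {
            /* All registers busy: spill whichever active interval ends last. */
            int v = 0;
            for (int j = 1; j < nactive; j++)
                if (iv[active[j]].end > iv[active[v]].end) v = j;
            if (iv[active[v]].end > iv[i].end) {     /* steal its register */
                iv[i].reg = iv[active[v]].reg;
                iv[active[v]].reg = -1;
                active[v] = i;
            }                                        /* else interval i stays spilled */
        }
    }

    for (int i = 0; i < NVARS; i++) {
        if (iv[i].reg >= 0)
            printf("interval %d [%d,%d]: reg %d\n", i, iv[i].start, iv[i].end, iv[i].reg);
        else
            printf("interval %d [%d,%d]: spilled\n", i, iv[i].start, iv[i].end);
    }
    return 0;
}
```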
Performance Evaluation of OSCAR Multi-target Automatic Parallelizing Compiler on Intel, AMD, Arm and RISC-V Multicores
With an increasing number of shared-memory multicore processor architectures, automatic parallelizing compilers need to support multiple target architectures. The OSCAR (Optimally Scheduled Advanced Multiprocessor) automatic ...
Front Matter
LC-MEMENTO: A Memory Model for Accelerated Architectures
- Kiran Ranganath
- Jesun Firoz
- Joshua Suetterlein
- Joseph Manzano
- Andres Marquez
- Mark Raugas
- Daniel Wong
With the advent of heterogeneous architectures, and in particular the ubiquity of multi-GPU systems, it is becoming increasingly important to manage device memory efficiently in order to reap the benefits of the additional core count. To date, ...
The ORKA-HPC Compiler—Practical OpenMP for FPGAs
ORKA-HPC is a new and downloadable OpenMP-to-FPGA compiler that is easy to set up, easy to use, and easy to extend. It targets a variety of FPGA boards and is distributed with a “batteries included” runtime and development environment. ...
Front Matter
Optimizing Sparse Matrix Multiplications for Graph Neural Networks
Graph neural networks (GNNs) are emerging as a powerful technique for modeling graph structures. Due to the sparsity of real-world graph data, GNN performance is limited by extensive sparse matrix multiplication (SpMM) operations involved in ...
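As a point of reference (not the paper's optimized implementation), a baseline SpMM kernel multiplies a sparse matrix in CSR format by a dense matrix; the minimal C sketch below shows that baseline, with a small made-up matrix in `main`.

```c
/* Minimal CSR-based SpMM kernel: C = A * B, where A is sparse (CSR, m x k)
 * and B, C are dense (k x n and m x n, row-major). Generic sketch only. */
#include <stdio.h>

static void spmm_csr(int m, int n,
                     const int *rowptr, const int *colidx, const double *vals,
                     const double *B, double *C) {
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++)
            C[i * n + j] = 0.0;
        /* Accumulate scaled rows of B for each nonzero of row i of A. */
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++) {
            double a = vals[p];
            const double *brow = &B[colidx[p] * n];
            for (int j = 0; j < n; j++)
                C[i * n + j] += a * brow[j];
        }
    }
}

int main(void) {
    /* 2x3 sparse matrix [[1,0,2],[0,3,0]] in CSR form (made-up data). */
    int rowptr[] = {0, 2, 3};
    int colidx[] = {0, 2, 1};
    double vals[] = {1.0, 2.0, 3.0};
    double B[3 * 2] = {1, 2, 3, 4, 5, 6};   /* 3x2 dense */
    double C[2 * 2];

    spmm_csr(2, 2, rowptr, colidx, vals, B, C);
    printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);   /* expect 11 14 / 9 12 */
    return 0;
}
```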
A Hybrid Synchronization Mechanism for Parallel Sparse Triangular Solve
Sparse triangular solve (SpTS) is an important and recurring component of many sparse linear solvers, which are used extensively in big-data analytics and machine learning algorithms. Despite its inherently sequential execution, a number of ...
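To make the sequential dependence concrete, the sketch below is a generic forward-substitution SpTS over a lower-triangular CSR matrix (solving L x = b): each row's solution depends on previously computed entries of x, which is what any parallel synchronization scheme, including the paper's hybrid one, has to respect. The matrix layout and test data are illustrative assumptions.

```c
/* Minimal sequential sparse triangular solve: forward substitution on a
 * lower-triangular CSR matrix L, solving L x = b. Generic sketch only. */
#include <stdio.h>

/* Assumes each row's last stored entry is the (nonzero) diagonal. */
static void sptrsv_csr_lower(int n, const int *rowptr, const int *colidx,
                             const double *vals, const double *b, double *x) {
    for (int i = 0; i < n; i++) {
        double sum = b[i];
        int p = rowptr[i];
        for (; p < rowptr[i + 1] - 1; p++)        /* off-diagonal entries */
            sum -= vals[p] * x[colidx[p]];        /* depends on earlier rows */
        x[i] = sum / vals[p];                     /* divide by the diagonal */
    }
}

int main(void) {
    /* Lower-triangular L = [[2,0,0],[1,3,0],[0,4,5]] in CSR, b = L * [1,2,3]. */
    int rowptr[] = {0, 1, 3, 5};
    int colidx[] = {0, 0, 1, 1, 2};
    double vals[] = {2, 1, 3, 4, 5};
    double b[] = {2, 7, 23};
    double x[3];

    sptrsv_csr_lower(3, rowptr, colidx, vals, b, x);
    printf("x = %g %g %g\n", x[0], x[1], x[2]);   /* expect 1 2 3 */
    return 0;
}
```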
Techniques for Managing Polyhedral Dataflow Graphs
- Ravi Shankar
- Aaron Orenstein
- Anna Rift
- Tobi Popoola
- MacDonald Lowe
- Shuai Yang
- T. Dylan Mikesell
- Catherine Olschanowsky
Scientific applications, especially legacy applications, contain a wealth of scientific knowledge. As hardware changes, applications need to be ported to new architectures and extended to include scientific advances. As a result, it is common to ...
Languages and Compilers for Parallel Computing: 34th International Workshop, LCPC 2021, Newark, DE, USA, October 13–14, 2021, Revised Selected Papers