PACO: Vol 121, No C

Volume 121, Issue C, Sep 2024 (Current Issue)

Publisher:

Elsevier Science Publishers B. V.
PO Box 211 1000 AE Amsterdam
Netherlands

ISSN:0167-8191

editorial

Editorial Board

https://doi.org/10.1016/S0167-8191(24)00046-2

Regular paper

research-article

WBSP: Addressing stragglers in distributed machine learning with worker-busy synchronous parallel

https://doi.org/10.1016/j.parco.2024.103092

Abstract

Parameter server is widely used in distributed machine learning to accelerate training. However, the increasing heterogeneity of workers’ computing capabilities leads to the issue of stragglers, making parameter synchronization challenging. To ...

research-article

Multi-GPU 3D k-nearest neighbors computation with application to ICP, point cloud smoothing and normals computation

https://doi.org/10.1016/j.parco.2024.103093

Abstract

The k-Nearest Neighbors algorithm is a fundamental algorithm that finds applications in many fields like Machine Learning, Computer Graphics, Computer Vision, and others. The algorithm determines the closest points (d-dimensional) of a reference ...

research-article

NxtSPR: A deadlock-free shortest path routing dedicated to relaying for Triplet-Based many-core Architecture

https://doi.org/10.1016/j.parco.2024.103094

Abstract

Deadlock-free routing is a significant challenge in Network-on-Chip (NoC) design as it affects the network’s latency, power consumption, and load balance, impacting the performance of multi-processor systems-on-chip. However, achieving deadlock-...

Highlights

The topology-related characteristics of Triplet-Based many-core Architecture are defined systematically using graph and group theory, and its correctness is verified through formal verification (proof-based) methods.
A novel and high-...

research-article

Mobilizing underutilized storage nodes via job path: A job-aware file striping approach

https://doi.org/10.1016/j.parco.2024.103095

Abstract

Users’ limited understanding of the storage system architecture prevents them from fully utilizing the parallel I/O capability of the storage system, leading to a negative impact on the overall performance of supercomputers. Therefore, exploring ...

Special issue on The 15th International Workshop on Programming Models and Applications for Multicores and Manycores

research-article

Abstractions for C++ code optimizations in parallel high-performance applications

https://doi.org/10.1016/j.parco.2024.103096

Abstract

Many computational problems consider memory throughput a performance bottleneck, especially in the domain of parallel computing. Software needs to be attuned to hardware features like cache architectures or concurrent memory banks to reach a ...

Highlights

Proposing novel abstraction for flexible traversals of regular data structures.
Designed for traversal-agnostic algorithms in HPC parallel computing.
Reduces traversal code complexity, improving separation of concerns and ...

research-article

An automated OpenMP mutation testing framework for performance optimization

https://doi.org/10.1016/j.parco.2024.103097

Abstract

Performance optimization continues to be a challenge in modern HPC software. Existing performance optimization techniques, including profiling-based and auto-tuning techniques, fail to indicate program modifications at the source level thus ...

Parallel Computing
