Search Page | SpringerLink

Article

GPU-based butterfly counting

When dealing with large bipartite graphs, butterfly counting is a crucial and time-consuming operation. Graphics processing units (GPUs) are widely...

Yifei Xia, Feng Zhang, ... Siqi Ma in The VLDB Journal

27 June 2024

Article

GPU cluster dynamics: insights from Alibaba’s 2023 trace release

In this paper, we present a comprehensive analysis of GPU cluster traces from Alibaba, released in 2023, focusing on understanding the detailed...

Ahmad Siavashi, Mahmoud Momtazpour in Computing

20 November 2024

Article

OpBench: an operator-level GPU benchmark for deep learning

Operators (such as Conv and ReLU) play an important role in deep neural networks. Every neural network is composed of a series of differentiable...

Qingwen Gu, Bo Fan, ... Shimin Hu in Science China Information Sciences

20 August 2024

Article

GPU Side-Channel Attack Classification for Targeted Secure Shader Mitigation

Graphics processing units (GPUs) provide massively parallel processing capabilities, enabling accelerated computation across diverse applications....

Nelson Lungu, Sudhansu Shekhar Patra, ... Mahendra Kumar Gourisaria in SN Computer Science

06 December 2024

Article

Hybridhadoop: CPU-GPU hybrid scheduling in hadoop

As a GPU has become an essential component in high performance computing, it has been attempted by many works to leverage GPU computing in Hadoop....

Chanyoung Oh, Saehanseul Yi, ... Youngmin Yi in Cluster Computing

21 November 2023

Article

A graph pattern mining framework for large graphs on GPU

Graph pattern mining (GPM) is an important problem in graph processing. There are many parallel frameworks for GPM, many of which suffer from low...

Lin Hu, Yinnian Lin, ... M. Tamer Özsu in The VLDB Journal

05 December 2024

Article

Full access

MuxFlow: efficient GPU sharing in production-level clusters with more than 10000 GPUs

Large-scale GPU clusters are widely used to speed up both latency-critical (online) and best-effort (offline) deep learning (DL) workloads. However,...

Xuanzhe Liu, Yihao Zhao, ... Xin Jin in Science China Information Sciences

13 December 2024

Article

Distributed data processing and task scheduling based on GPU parallel computing

Distributed data parallel (DDP) computing ensures data parallelism, enabling execution across several computers. A separate distributed data parallel...

Jun Li in Neural Computing and Applications

30 November 2024

Article

Implementation and analysis of GPU algorithms for Vecchia Approximation

Gaussian Processes have become an indispensable part of the spatial statistician’s toolbox but are unsuitable for analyzing large datasets because of...

Zachary James, Joseph Guinness in Statistics and Computing

28 October 2024

Article

Utilization-prediction-aware energy optimization approach for heterogeneous GPU clusters

Optimizing energy consumption in heterogeneous GPU clusters is of paramount importance to enhance overall system efficiency and reduce operational...

Sheng Wang, Shiping Chen, Yumei Shi in The Journal of Supercomputing

11 December 2023

Article

Full access

Towards GPU-enabled serverless cloud edge platforms for accelerating HEVC video coding

Multimedia streaming has become integral to modern living, reshaping entertainment consumption, information access, and global engagement. The ascent...

Andoni Salcedo-Navarro, Raúl Peña-Ortiz, ... Juan Gutiérrez-Aguado in Cluster Computing

05 November 2024 Open access

Article

High throughput acceleration of NIST lightweight authenticated encryption schemes on GPU platform

Authenticated encryption with associated data (AEAD) has become prominent over time because it offers authenticity and confidentiality...

Jia-Lin Chan, Wai-Kong Lee, ... Bok-Min Goi in Cluster Computing

20 May 2024

Article

GPU-accelerated relaxed graph pattern matching algorithms

Graph pattern matching is widely used in real-world applications, such as social network analysis. Since the traditional subgraph isomorphism is...

Amira Benachour, Saïd Yahiaoui, ... Nadia Nouali-Taboudjemat in The Journal of Supercomputing

16 June 2024

Article

A fine-grained GPU sharing and job scheduling for deep learning jobs on the cloud

This paper introduces an innovative GPU sharing and scheduling method to tackle resource wastage and underutilization in deep learning training jobs....

Wu-Chun Chung, Jyun-Sen Tong, Zhi-Hao Chen in The Journal of Supercomputing

03 January 2025

Article

Full access

An autotuning approach to select the inter-GPU communication library on heterogeneous systems

In this work, an automatic optimisation approach for parallel routines on multi-GPU systems is presented. Several inter-GPU communication libraries...

Jesús Cámara, Javier Cuenca, ... Murilo Boratto in The Journal of Supercomputing

12 December 2024 Open access

Article

GPU thread throttling for page-level thrashing reduction via static analysis

Unified virtual memory was introduced in modern GPUs to enable a new programming model for programmers. This method manages memory pages between the...

Hyunjun Kim, Hwansoo Han in The Journal of Supercomputing

16 December 2023

Article

Full access

CPU-GPU co-execution through the exploitation of hybrid technologies via SYCL

The performance and energy efficiency offered by heterogeneous systems are highly useful for modern C++ applications, but the technological variety...

Raúl Nozal, Jose Luis Bosque in The Journal of Supercomputing

31 January 2025 Open access

Article

A high-performance dynamic scheduling for sparse matrix-based applications on heterogeneous CPU–GPU environment

Efficient utilization of processors in heterogeneous CPU–GPU systems is crucial for improving overall application performance by reducing workload...

Ahmad Shokrani Baigi, Abdorreza Savadi, Mahmoud Naghibzadeh in The Journal of Supercomputing

07 August 2024

Article

Accelerating BERT inference with GPU-efficient exit prediction

BERT is a representative pre-trained language model that has drawn extensive attention for significant improvements in downstream Natural Language...

Lei Li, Chengyu Wang, ... Aoying Zhou in Frontiers of Computer Science

22 January 2024

Article

Full access

GAPS: GPU-accelerated processing service for SM9

SM9 was established in 2016 as a Chinese official identity-based cryptographic (IBC) standard, and became an ISO standard in 2021. It is well-known...

Wenhan Xu, Hui Ma, Rui Zhang in Cybersecurity

02 October 2024 Open access
