Theory and algorithms for application domains

Showing 1 - 20 of 452 Results

research-article
July 2024
InSS: An Intelligent Scheduling Orchestrator for Multi-GPU Inference With Spatio-Temporal Sharing
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 10, Oct. 2024, Pages 1735–1748, https://doi.org/10.1109/TPDS.2024.3430063
As the applications of AI proliferate, it is critical to increase the throughput of online DNN inference services. Multi-process service (MPS) improves the utilization rate of GPU resources by spatial-sharing, but it also brings unique challenges. First, ...
Total Citations: 1
research-article
February 2024
End-to-End Bayesian Networks Exact Learning in Shared Memory
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 4, April 2024, Pages 634–645, https://doi.org/10.1109/TPDS.2024.3366471
Bayesian networks are important Machine Learning models with many practical applications in, e.g., biomedicine and bioinformatics. The problem of Bayesian networks learning is $\mathcal{NP}$ ...
Total Citations: 0
research-article
January 2024
Multi-Agent Deep Reinforcement Learning Framework for Renewable Energy-Aware Workflow Scheduling on Distributed Cloud Data Centers
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 4, April 2024, Pages 604–615, https://doi.org/10.1109/TPDS.2024.3360448
The ever-increasing demand for the cloud computing paradigm has resulted in the widespread deployment of multiple datacenters, the operations of which consume very high levels of energy. The carbon footprint resulting from these operations threatens ...
Total Citations: 0
research-article
December 2023
A Memory-Efficient Hybrid Parallel Framework for Deep Neural Network Training
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 4, April 2024, Pages 577–591, https://doi.org/10.1109/TPDS.2023.3343570
With the increasing volumes of data samples and deep neural network (DNN) models, efficiently scaling the training of DNN models has become a significant challenge for server clusters with AI accelerators in terms of memory and computing efficiency. ...
Total Citations: 0
research-article
December 2023
Graft: Efficient Inference Serving for Hybrid Deep Learning With SLO Guarantees via DNN Re-Alignment
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 2, Feb. 2024, Pages 280–296, https://doi.org/10.1109/TPDS.2023.3340518
Deep neural networks (DNNs) have been widely adopted for various mobile inference tasks, yet their ever-increasing computational demands are hindering their deployment on resource-constrained mobile devices. Hybrid deep learning partitions a DNN into two ...
Total Citations: 1
research-article
November 2023
Batch Jobs Load Balancing Scheduling in Cloud Computing Using Distributional Reinforcement Learning
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 1, Jan. 2024, Pages 169–185, https://doi.org/10.1109/TPDS.2023.3334519
In cloud computing, how to reasonably allocate computing resources for batch jobs to ensure the load balance of dynamic clusters and meet user requests is an important and challenging task. Most existing studies are based on deep Q network, which utilizes ...
Total Citations: 0
research-article
November 2023
US-Byte: An Efficient Communication Framework for Scheduling Unequal-Sized Tensor Blocks in Distributed Deep Learning
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 1, Jan. 2024, Pages 123–139, https://doi.org/10.1109/TPDS.2023.3331372
The communication bottleneck severely constrains the scalability of distributed deep learning, and efficient communication scheduling accelerates distributed DNN training by overlapping computation and communication tasks. However, existing approaches ...
Total Citations: 0
research-article
Open Access
November 2023
SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 1, Jan. 2024, Pages 73–88, https://doi.org/10.1109/TPDS.2023.3330669
The widespread adoption of Internet of Things (IoT) motivated the emergence of mixed workloads in smart cities, where fast arriving geo-referenced big data streams are joined with archive tables, aiming at enriching streams with descriptive attributes ...
Total Citations: 0
research-article
October 2023
FedHAP: Federated Hashing With Global Prototypes for Cross-Silo Retrieval
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 35, Issue 4, April 2024, Pages 592–603, https://doi.org/10.1109/TPDS.2023.3324426
Deep hashing has been widely applied in large-scale data retrieval due to its superior retrieval efficiency and low storage cost. However, data are often scattered in data silos with privacy concerns, so performing centralized data storage and retrieval ...
Total Citations: 0
research-article
September 2023
Consistent Low Latency Scheduler for Distributed Key-Value Stores
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 34, Issue 12, Dec. 2023, Pages 3012–3027, https://doi.org/10.1109/TPDS.2023.3315777
Nowadays, the distributed key-value stores have become the basic building block for large-scale cloud applications. In large-scale distributed key-value stores, many key-value access operations, which will be processed in parallel on different servers, ...
Total Citations: 0
research-article
Open Access
September 2023
RLPTO: A Reinforcement Learning-Based Performance-Time Optimized Task and Resource Scheduling Mechanism for Distributed Machine Learning
- Xiaofeng Lu,
- Chao Liu,
- Senhao Zhu,
- Yilu Mao,
- Pietro Lio,
- Pan Hui
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 34, Issue 12, Dec. 2023, Pages 3266–3279, https://doi.org/10.1109/TPDS.2023.3317388
With the wide application of deep learning, the amount of data required to train deep learning models is becoming increasingly larger, resulting in an increased training time and higher requirements for computing resources. To improve the throughput of a ...
Total Citations: 0
research-article
September 2023
Task Placement and Resource Allocation for Edge Machine Learning: A GNN-Based Multi-Agent Reinforcement Learning Paradigm
- Yihong Li,
- Xiaoxi Zhang,
- Tianyu Zeng,
- Jingpu Duan,
- Chuan Wu,
- Di Wu,
- Xu Chen
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 34, Issue 12, Dec. 2023, Pages 3073–3089, https://doi.org/10.1109/TPDS.2023.3313779
Machine learning (ML) tasks are one of the major workloads in today's edge computing networks. Existing edge-cloud schedulers allocate the requested amounts of resources to each task, falling short of best utilizing the limited edge resources for ...
Total Citations: 0
research-article
December 2022
Tenant-Grained Request Scheduling in Software-Defined Cloud Computing
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4654–4671, https://doi.org/10.1109/TPDS.2022.3199031
Cloud providers host various services for tenants’ requests (e.g., software-as-a-service) and seek to serve as many requests as possible for revenue maximization. Considering a large number of requests, the previous works on fine-grained request ...
Total Citations: 0
research-article
December 2022
A Bi-Objective Learn-and-Deploy Scheduling Method for Bursty and Stochastic Requests on Heterogeneous Cloud Servers
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4547–4562, https://doi.org/10.1109/TPDS.2022.3196475
In this article, we consider the dynamic allocation of bursty requests stochastically arriving at heterogeneous servers with uncertain setup times. Lower expected response time and less power consumption are desirable objectives of users and service ...
Total Citations: 0
research-article
December 2022
Theoretical Analysis of an Adaptive Periodic Multi Installment Scheduling With Result Retrieval for SAR Image Processing
- Gokul Madathupalyam Chinnappan,
- Bharadwaj Veeravalli
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4672–4683, https://doi.org/10.1109/TPDS.2022.3194542
Processing a large-scale Synthetic Aperture Radar (SAR) image dataset on a distributed computing infrastructure poses a challenging problem. Large-scale load distribution strategies like multi-installment scheduling (MIS) assume that the size of the ...
Total Citations: 0
research-article
Open Access
December 2022
Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4383–4394, https://doi.org/10.1109/TPDS.2022.3189270
Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of scheduling kind and ...
Total Citations: 0
research-article
December 2022
Eiffel: Efficient and Fair Scheduling in Adaptive Federated Learning
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4282–4294, https://doi.org/10.1109/TPDS.2022.3187365
Emerging machine learning (ML) technologies, in combination with the increasing computational power of mobile devices, lead to the extensive adoption of ML-based applications. Different from conventional model training that needs to collect all the user ...
Total Citations: 1
research-article
December 2022
PushBox: Making Use of Every Bit of Time to Accelerate Completion of Data-Parallel Jobs
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4256–4269, https://doi.org/10.1109/TPDS.2022.3182037
To minimize a job's completion time, we need to minimize the completion time of its final stage's last task. Scheduling of machine slots and networks largely dominates the variable part of each task's duration. Finding an optimal ...
Total Citations: 0
research-article
December 2022
Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4083–4099, https://doi.org/10.1109/TPDS.2022.3181096
Energy conservation of large data centers for high performance computing workloads, such as deep learning with Big Data, is of critical significance, where cutting down a few percent of electricity translates into million-dollar savings. This work studies ...
Total Citations: 2
research-article
December 2022
Real-Time Scheduling of Parallel Task Graphs With Critical Sections Across Different Vertices
- Xu Jiang,
- Nan Guan,
- Maolin Yang,
- Yang Wang,
- Yue Tang,
- Wang Yi
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 33, Issue 12, Dec. 2022, Pages 4117–4133, https://doi.org/10.1109/TPDS.2022.3179328
All existing work on real-time scheduling of parallel task graph models with shared resources assumes that a critical section must be contained inside a single vertex. However, this assumption does not hold in many realistic parallel real-time software. ...
Total Citations: 1
