ATP: Achieving Throughput Peak for DNN Training via Smart GPU Memory Management
Publisher: Association for Computing Machinery, New York, NY, United States
Qualifiers: Research-article