Cited By
- Jeon J, Gil M, Kim J, Park J, Koo G, Yoon M, Oh Y (2024). VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing. Proceedings of the 53rd International Conference on Parallel Processing, pp. 1012-1021. doi:10.1145/3673038.3673045. Online publication date: 12-Aug-2024.
- Pati S, Aga S, Islam M, Jayasena N, Sinclair M, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 1146-1164. doi:10.1145/3620665.3640410. Online publication date: 27-Apr-2024.
- Wang Z, Wang Y, Deng J, Zheng D, Li A, Ding Y, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 964-979. doi:10.1145/3620665.3640406. Online publication date: 27-Apr-2024.