Cited By
View all- Zhao HDeng JCui WChen QZhang YZeng DGuo M(2025)Adaptive Kernel Fusion for Improving the GPU Utilization While Ensuring QoSIEEE Transactions on Computers10.1109/TC.2024.347799574:2(386-400)Online publication date: Feb-2025
- Han YKim IKim JMoon G(2024)Tensor Core-Adapted Sparse Matrix Multiplication for Accelerating Sparse Deep Neural NetworksElectronics10.3390/electronics1320398113:20(3981)Online publication date: 10-Oct-2024
- Hanindhito BPatel BJohn L(2024)Bandwidth Characterization of DeepSpeed on Distributed Large Language Model Training2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)10.1109/ISPASS61541.2024.00031(241-256)Online publication date: 5-May-2024
- Show More Cited By