PPoPP 2024: Edinburgh, UK
- Michel Steuwer, I-Ting Angelina Lee, Milind Chabbi:
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2024, Edinburgh, United Kingdom, March 2-6, 2024. ACM 2024
Keynote
- Nir Shavit:
Sparsity in Deep Neural Nets (Keynote). 1
Synchronization and Concurrency Control I
- Pedro Ramalhete, Andreia Correia:
Scaling Up Transactions with Slower Clocks. 2-16
- Jonggyu Park, Young Ik Eom:
Locks as a Resource: Fairly Scheduling Lock Occupation with CFL. 17-29
- Daewoo Kim, Trevor Brown, Ajay Singh:
Are Your Epochs Too Epic? Batch Free Can Be Harmful. 30-41
Compilers and Runtimes for Parallel Systems
- Jiangsu Du, Jinhui Wei, Jiazhi Jiang, Shenggan Cheng, Dan Huang, Zhiguang Chen, Yutong Lu:
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference. 42-54
- Jinchen Xu, Guanghui Song, Bei Zhou, Fei Li, Jiangwei Hao, Jie Zhao:
A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs. 55-67
- Stefan K. Muller:
Language-Agnostic Static Deadlock Detection for Futures. 68-79
- Akshay Bhosale, Rudolf Eigenmann:
Recurrence Analysis for Automatic Parallelization of Subscripted Subscripts. 80-93
High Performance Computing
- Kasra Jamshidi, Keval Vora:
OsirisBFT: Say No to Task Replication for Scalable Byzantine Fault Tolerant Analytics. 94-108
- Haozhong Qiu, Chuanfu Xu, Jianbin Fang, Liang Deng, Jian Zhang, Qingsong Wang, Yue Ding, Zhe Dai, Yonggang Che, Shizhao Chen, Jie Liu:
Towards Scalable Unstructured Mesh Computations on Shared Memory Many-Cores. 109-119
- Jiabin Xie, Guangnan Feng, Han Huang, Junxuan Feng, Zhiguang Chen, Yutong Lu:
Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System. 120-132
- James Psota, Armando Solar-Lezama:
Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes. 133-146
Graph Processing
- Sungwoo Park, Seyeon Oh, Min-Soo Kim:
INFINEL: An efficient GPU-based processing method for unpredictable large output graph queries. 147-159
- Xinbiao Gan, Guang Wu, Shenghao Qiu, Feng Xiong, Jiaqi Si, Jianbin Fang, Dezun Dong, Chunye Gong, Tiejun Li, Zheng Wang:
GraphCube: Interconnection Hierarchy-aware Graph Processing. 160-174
- Zhiheng Lin, Ke Meng, Chaoyang Shui, Kewei Zhang, Junmin Xiao, Guangming Tan:
Exploiting Fine-Grained Redundancy in Set-Centric Graph Pattern Mining. 175-187
Synchronization and Concurrency Control II
- Vitaly Aksenov, Nikita Koval, Petr Kuznetsov, Anton Paramonov:
Memory Bounds for Concurrent Bounded Queues. 188-199
- Guy E. Blelloch, Yuanhao Wei:
VERLIB: Concurrent Versioned Pointers. 200-214
- Mohammad Khalaji, Trevor Brown, Khuzaima Daudjee, Vitaly Aksenov:
Practical Hardware Transactional vEB Trees. 215-228
ML Workloads
- Xiaoyan Liu, Xuegui Zheng, Hailong Yang, Zhongzhi Luan, Depei Qian:
Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU. 229-242
- Ismet Dagli, Mehmet E. Belviranli:
Shared Memory-contention-aware Concurrent DNN Execution for Diversely Heterogeneous System-on-Chips. 243-256
- Siyu Hu, Tong Zhao, Qiuchen Sha, Enji Li, Xiangyu Meng, Liping Liu, Lin-Wang Wang, Guangming Tan, Weile Jia:
Training one DeePMD Model in Minutes: a Step towards Online Learning. 257-269
Parallel Algorithms
- Magdalen Dobson Manohar, Zheqi Shen, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, Yihan Sun:
ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms. 270-285
- Quanquan C. Liu, Julian Shun, Igor Zablotchi:
Parallel k-Core Decomposition with Batched Updates and Asynchronous Reads. 286-300
- Xiaojun Dong, Laxman Dhulipala, Yan Gu, Yihan Sun:
Parallel Integer Sort: Theory and Practice. 301-315
- Zafar Ahmad, Reilly Browne, Rezaul Chowdhury, Rathish Das, Yushen Huang, Yimin Zhu:
Fast American Option Pricing using Nonlinear Stencils. 316-332
Optimizing for Memory
- Yuetao Chen, Kun Li, Yuhao Wang, Donglin Bai, Lei Wang, Lingxiao Ma, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang:
ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores. 333-347
- Brian Wheatman, Randal C. Burns, Aydin Buluç, Helen Xu:
CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers. 348-363
- Hunter McCoy, Prashant Pandey:
Gallatin: A General-Purpose GPU Memory Manager. 364-376
Linear Algebra
- Meng Pang, Xiang Fei, Peng Qu, Youhui Zhang, Zhaolin Li:
A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs. 377-389
- Abhinav Jangda, Mohit Yadav:
Fast Kronecker Matrix-Matrix Multiplication on GPUs. 390-403
- Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Langwen Huang, Piotr Luczynski, Saleh Ashkboosh, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler:
Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication. 404-416
Applications
- Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You:
FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters. 417-430
- Seongyeon Park, Junguk Hong, Jaeyong Song, Hajin Kim, Youngsok Kim, Jinho Lee:
AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping. 431-444
POSTER SESSION: Posters
- Zhuoran Ji, Zhaorui Zhang, Jiming Xu, Lei Ju:
POSTER: Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs. 445-447
- Zhichen Feng, Jialin Li, Yaqian Gao, Shaobo Tian, Huang Ye, Jian Zhang:
POSTER: Enabling Extreme-Scale Phase Field Simulation with In-situ Feature Extraction. 448-450
- Lixian Ma, Haoruo Chen, En Shao, Leping Wang, Quan Chen, Guangming Tan:
POSTER: FineCo: Fine-grained Heterogeneous Resource Management for Concurrent DNN Inferences. 451-453
- Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Yafan Huang, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong Chen, Franck Cappello, Yanfei Guo, Rajeev Thakur:
POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters. 454-456
- Guofeng Feng, Weile Jia, Ninghui Sun, Guangming Tan, Jiajia Li:
POSTER: Optimizing Sparse Tensor Contraction with Revisiting Hash Table Design. 457-459
- Juntao Zhao, Borui Wan, Chuan Wu, Yanghua Peng, Haibin Lin:
POSTER: LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization. 460-462
- dePaul Miller, Henry F. Korth, Roberto Palmieri:
POSTER: OCToPus: Semantic-aware Concurrency Control for Blockchain Transactions. 463-465
- Jiaao He, Shengqi Chen, Jidong Zhai:
POSTER: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training. 466-468
- Shunde Li, Junyu Gu, Jue Wang, Tiechui Yao, Zhiqiang Liang, Yumeng Shi, Shigang Li, Weiting Xi, Shushen Li, Chunbao Zhou, Yangang Wang, Xuebin Chi:
POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters. 469-471
- Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, Yingda Chen:
POSTER: RadiK: Scalable Radix Top-K Selection on GPUs. 472-474
- Almog Zur, Nachshon Cohen, Michal Friedman, Erez Petrank:
POSTER: RELAX: Durable Data Structures with Swift Recovery. 475-476
- Yi Zong, Xinliang Wang, Haopeng Huang, Chensong Zhang, Xiaowen Xu, Jian Sun, Bowen Yan, Qin Wang, Sicong Li, Zhaohui Ding, Wei Xue:
POSTER: StructMG: A Fast and Scalable Structured Multigrid. 478-480