PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters
SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
Article No.: 26, Pages 1–18
https://doi.org/10.1109/SC41406.2024.00032
Large-scale computing systems are increasingly using accelerators such as GPUs to enable peta- and exa-scale levels of compute to meet the needs of Machine Learning (ML) and scientific computing applications. Given the widespread and growing use of ML, ...
- research-article, November 2024
Towards Highly Compatible I/O-Aware Workflow Scheduling on HPC Systems
SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
Article No.: 25, Pages 1–15
https://doi.org/10.1109/SC41406.2024.00031
Scientific workflows on High-Performance Computing (HPC) consist of multiple data processing and computing tasks with dependencies. Efficiently scheduling computing resources and multi-tier storage across workflow tasks is crucial for optimizing ...
- research-article, November 2024
EcoLife: Carbon-Aware Serverless Function Scheduling for Sustainable Computing
SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis
Article No.: 12, Pages 1–15
https://doi.org/10.1109/SC41406.2024.00018
This work introduces EcoLife, the first carbon-aware serverless function scheduler to co-optimize carbon footprint and performance. EcoLife builds on the key insight of intelligently exploiting multi-generation hardware to achieve high performance and ...