Ahmad Abdelfattah
2020 – today
- 2024
- [j18] Piotr Luszczek, Ahmad Abdelfattah, Hartwig Anzt, Atsushi Suzuki, Stanimire Tomov:
Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications. Future Gener. Comput. Syst. 160: 359-374 (2024)
- 2023
- [c30] Wissam M. Sid-Lakhdar, Sébastien Cayrols, Daniel Bielich, Ahmad Abdelfattah, Piotr Luszczek, Mark Gates, Stanimire Tomov, Hans Johansen, David B. Williams-Young, Timothy A. Davis, Jack J. Dongarra, Hartwig Anzt:
PAQR: Pivoting Avoiding QR factorization. IPDPS 2023: 322-332
- [c29] Ahmad Abdelfattah, Stanimire Tomov, Piotr Luszczek, Hartwig Anzt, Jack J. Dongarra:
GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure. SC Workshops 2023: 1670-1679
- 2022
- [c28] Chiang-Heng Chien, Hongyi Fan, Ahmad Abdelfattah, Elias P. Tsigaridas, Stanimire Tomov, Benjamin B. Kimia:
GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision. CVPR 2022: 15744-15755
- [c27] Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra:
Batch QR Factorization on GPUs: Design, Optimization, and Tuning. ICCS (1) 2022: 60-74
- [c26] Mark Gates, Asim YarKhan, Dalal Sukkari, Kadir Akbudak, Sébastien Cayrols, Daniel Bielich, Ahmad Abdelfattah, Mohammed A. Al Farhan, Jack J. Dongarra:
Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era. P3HPC@SC 2022: 36-46
- [c25] Ahmad Abdelfattah, Pieter Ghysels, Wajih Boukaram, Stanimire Tomov, Xiaoye Sherry Li, Jack J. Dongarra:
Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers. SC 2022: 26:1-26:14
- [d5] Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Rezgar Shakeri, Jeremy L. Thompson, Stanimire Tomov, James Wright:
libCEED: Efficient Extensible Discretization. Zenodo, 2022
- [d4] Mark Gates, Asim YarKhan, Dalal Sukkari, Kadir Akbudak, Sébastien Cayrols, Daniel Bielich, Ahmad Abdelfattah, Mohammed A. Al Farhan, Jack J. Dongarra:
Reproducability Artifact for Running SLATE's GEMM and POTRF Operations on Summit and Crusher. Version 2. Zenodo, 2022
- 2021
- [j17] Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Alyson Fox, Mark Gates, Nicholas J. Higham, Xiaoye S. Li, Jennifer A. Loe, Piotr Luszczek, Srikara Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry F. Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ulrike Meier Yang:
A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4) (2021)
- [j16] Tzanio V. Kolev, Paul F. Fischer, Misun Min, Jack J. Dongarra, Jed Brown, Veselin Dobrev, Tim Warburton, Stanimire Tomov, Mark S. Shephard, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Noel Chalmers, Yohann Dudouit, Ali Karakus, Ian Karlin, Stefan Kerkemeier, Yu-Hsiang Lan, David S. Medina, Elia Merzari, Aleksandr Obabko, Will Pazner, Thilina Rathnayake, Cameron W. Smith, Lukas Spies, Kasia Swirydowicz, Jeremy L. Thompson, Ananias Tomboulides, Vladimir Z. Tomov:
Efficient exascale discretizations: High-order finite element methods. Int. J. High Perform. Comput. Appl. 35(6): 527-552 (2021)
- [j15] Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie N. Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Rathnayake, Jeremy L. Thompson, Stan Tomov:
libCEED: Fast algebra for high-order element-based discretizations. J. Open Source Softw. 6(63): 2945 (2021)
- [j14] Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Ryan Bleile, Jed Brown, Jean-Sylvain Camier, Robert Carson, Noel Chalmers, Veselin Dobrev, Yohann Dudouit, Paul F. Fischer, Ali Karakus, Stefan Kerkemeier, Tzanio V. Kolev, Yu-Hsiang Lan, Elia Merzari, Misun Min, Malachi Phillips, Thilina Rathnayake, Robert N. Rieben, Thomas Stitt, Ananias Tomboulides, Stanimire Tomov, Vladimir Z. Tomov, Arturo Vargas, Tim Warburton, Kenneth Weiss:
GPU algorithms for Efficient Exascale Discretizations. Parallel Comput. 108: 102841 (2021)
- [j13] Ahmad Abdelfattah, Timothy B. Costa, Jack J. Dongarra, Mark Gates, Azzam Haidar, Sven Hammarling, Nicholas J. Higham, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Mawussi Zounon:
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines. ACM Trans. Math. Softw. 47(3): 21:1-21:23 (2021)
- [d3] Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stan Tomov:
CEED/libCEED: v0.9.0. Zenodo, 2021
- [d2] Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stanimire Tomov:
libCEED: Efficient Extensible Discretization. Zenodo, 2021
- [d1] Jed Brown, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Veselin Dobrev, Yohann Dudouit, Leila Ghaffari, Tzanio V. Kolev, David S. Medina, Will Pazner, Thilina Ratnayaka, Jeremy L. Thompson, Stanimire Tomov:
libCEED: Efficient Extensible Discretization. Zenodo, 2021
- [i5] Tzanio V. Kolev, Paul F. Fischer, Misun Min, Jack J. Dongarra, Jed Brown, Veselin Dobrev, Tim Warburton, Stanimire Tomov, Mark S. Shephard, Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Jean-Sylvain Camier, Noel Chalmers, Yohann Dudouit, Ali Karakus, Ian Karlin, Stefan Kerkemeier, Yu-Hsiang Lan, David S. Medina, Elia Merzari, Aleksandr Obabko, Will Pazner, Thilina Rathnayake, Cameron W. Smith, Lukas Spies, Kasia Swirydowicz, Jeremy L. Thompson, Ananias Tomboulides, Vladimir Z. Tomov:
Efficient Exascale Discretizations: High-Order Finite Element Methods. CoRR abs/2109.04996 (2021)
- [i4] Ahmad Abdelfattah, Valeria Barra, Natalie Beams, Ryan Bleile, Jed Brown, Jean-Sylvain Camier, Robert Carson, Noel Chalmers, Veselin Dobrev, Yohann Dudouit, Paul F. Fischer, Ali Karakus, Stefan Kerkemeier, Tzanio V. Kolev, Yu-Hsiang Lan, Elia Merzari, Misun Min, Malachi Phillips, Thilina Rathnayake, Robert N. Rieben, Thomas Stitt, Ananias Tomboulides, Stanimire Tomov, Vladimir Z. Tomov, Arturo Vargas, Tim Warburton, Kenneth Weiss:
GPU Algorithms for Efficient Exascale Discretizations. CoRR abs/2109.05072 (2021)
- [i3] Chiang-Heng Chien, Hongyi Fan, Ahmad Abdelfattah, Elias P. Tsigaridas, Stanimire Tomov, Benjamin B. Kimia:
GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision. CoRR abs/2112.03444 (2021)
- 2020
- [j12] Mohammed A. Al Farhan, Ahmad Abdelfattah, Stanimire Tomov, Mark Gates, Dalal Sukkari, Azzam Haidar, Robert Rosenberg, Jack J. Dongarra:
MAGMA templates for scalable linear algebra on emerging architectures. Int. J. High Perform. Comput. Appl. 34(6) (2020)
- [j11] Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Matrix multiplication on batches of small matrices in half and half-complex precisions. J. Parallel Distributed Comput. 145: 188-201 (2020)
- [c24] Cade Brown, Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs. HPEC 2020: 1-7
- [c23] Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra:
Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices Using GPUs. ICCS (2) 2020: 237-250
- [c22] Hartwig Anzt, Yuhsiang M. Tsai, Ahmad Abdelfattah, Terry Cojean, Jack J. Dongarra:
Evaluating the Performance of NVIDIA's A100 Ampere GPU for Sparse and Batched Computations. PMBS@SC 2020: 26-38
- [c21] Natalie Beams, Ahmad Abdelfattah, Stan Tomov, Jack J. Dongarra, Tzanio V. Kolev, Yohann Dudouit:
High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs. ScalA@SC 2020: 53-60
- [i2] Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin C. Carson, Terry Cojean, Jack J. Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Xiaoye Sherry Li, Neil Lindquist, Yang Liu, Jennifer A. Loe, Piotr Luszczek, Pratik Nayak, Srikara Pranesh, Sivasankaran Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen J. Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang:
A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic. CoRR abs/2007.06674 (2020)
2010 – 2019
- 2019
- [j10] Ian Masliah, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Marc Baboulin, Joël Falcou, Jack J. Dongarra:
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices. Parallel Comput. 81: 1-21 (2019)
- [c20] Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Progressive Optimization of Batched LU Factorization on GPUs. HPEC 2019: 1-6
- [c19] Jakub Kurzak, Yaohung M. Tsai, Mark Gates, Ahmad Abdelfattah, Jack J. Dongarra:
Massively Parallel Automated Software Tuning. ICPP 2019: 92:1-92:10
- [c18] Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. IPDPS 2019: 111-122
- [c17] Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs. ScalA@SC 2019: 17-24
- 2018
- [j9] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Batched one-sided factorizations of tiny matrices using GPUs: Challenges and countermeasures. J. Comput. Sci. 26: 226-236 (2018)
- [j8] Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Stanimire Tomov, Jack J. Dongarra:
A Guide for Achieving High Performance with Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations. IEEE Trans. Parallel Distributed Syst. 29(5): 973-984 (2018)
- [j7] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs. IEEE Trans. Parallel Distributed Syst. 29(12): 2700-2712 (2018)
- [c16] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Optimizing GPU Kernels for Irregular Batch Workloads: A Case Study for Cholesky Factorization. HPEC 2018: 1-7
- [c15] Azzam Haidar, Ahmad Abdelfattah, Mawussi Zounon, Panruo Wu, Srikara Pranesh, Stanimire Tomov, Jack J. Dongarra:
The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques. ICCS (1) 2018: 586-600
- [c14] Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack J. Dongarra:
Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters. IPDPS 2018: 930-939
- 2017
- [j6] Jack J. Dongarra, Stanimire Tomov, Piotr Luszczek, Jakub Kurzak, Mark Gates, Ichitaro Yamazaki, Hartwig Anzt, Azzam Haidar, Ahmad Abdelfattah:
With Extreme Computing, the Rules Have Changed. Comput. Sci. Eng. 19(3): 52-62 (2017)
- [j5] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Fast Cholesky factorization on GPUs for batch and native modes in MAGMA. J. Comput. Sci. 20: 85-93 (2017)
- [c13] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures. ICCS 2017: 606-615
- [c12] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs. ICS 2017: 5:1-5:10
- [c11] Azzam Haidar, Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra:
High-performance Cholesky factorization for GPU-only execution. GPGPU@PPoPP 2017: 42-52
- 2016
- [j4] Ahmad Abdelfattah, Hartwig Anzt, Jack J. Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Linear algebra software for large-scale accelerated multicore computing. Acta Numer. 25: 1-160 (2016)
- [j3] Ahmad Abdelfattah, Hatem Ltaief, David E. Keyes, Jack J. Dongarra:
Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs. Concurr. Comput. Pract. Exp. 28(12): 3447-3465 (2016)
- [j2] Ahmad Abdelfattah, David E. Keyes, Hatem Ltaief:
KBLAS: An Optimized Library for Dense Matrix-Vector Multiplication on GPU Accelerators. ACM Trans. Math. Softw. 42(3): 18:1-18:31 (2016)
- [c10] Ian Masliah, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Marc Baboulin, Joël Falcou, Jack J. Dongarra:
High-Performance Matrix-Matrix Multiplications of Very Small Matrices. Euro-Par 2016: 659-671
- [c9] Ahmad Abdelfattah, Marc Baboulin, Veselin Dobrev, Jack J. Dongarra, Christopher W. Earl, Joel Falcou, Azzam Haidar, Ian Karlin, Tzanio V. Kolev, Ian Masliah, Stanimire Tomov:
High-Performance Tensor Contractions for GPUs. ICCS 2016: 108-118
- [c8] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs. ICCS 2016: 119-130
- [c7] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures. IPDPS Workshops 2016: 1249-1258
- [c6] Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra:
Performance, Design, and Autotuning of Batched GEMM for GPUs. ISC 2016: 21-38
- 2015
- [b1] Ahmad Abdelfattah:
Accelerating Scientific Applications using High Performance Dense and Sparse Linear Algebra Kernels on GPUs. King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, 2015
- [j1] Jack J. Dongarra, Maksims Abalenkovs, Ahmad Abdelfattah, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan:
Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. Supercomput. Front. Innov. 2(4): 67-86 (2015)
- [c5] Ahmad Abdelfattah, Hatem Ltaief, David E. Keyes:
High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications. Euro-Par 2015: 601-612
- 2014
- [c4] Ahmad Abdelfattah, Eric Gendron, Damien Gratadour, David E. Keyes, Hatem Ltaief, Arnaud Sevin, Fabrice Vidal:
High Performance Pseudo-analytical Simulation of Multi-Object Adaptive Optics over Multi-GPU Systems. Euro-Par 2014: 704-715
- [c3] Ali Charara, Hatem Ltaief, Damien Gratadour, David E. Keyes, Arnaud Sevin, Ahmad Abdelfattah, Eric Gendron, Carine Morel, Fabrice Vidal:
Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System. SC 2014: 262-273
- [i1] Ahmad Abdelfattah, David E. Keyes, Hatem Ltaief:
KBLAS: An Optimized Library for Dense Matrix-Vector Multiplication on GPU Accelerators. CoRR abs/1410.1726 (2014)
- 2012
- [c2] Ahmad Abdelfattah, David E. Keyes, Hatem Ltaief:
Systematic Approach in Optimizing Numerical Memory-Bound Kernels on GPU. Euro-Par Workshops 2012: 207-216
- [c1] Ahmad Abdelfattah, Jack J. Dongarra, David E. Keyes, Hatem Ltaief:
Optimizing Memory-Bound SYMV Kernel on GPU Hardware Accelerators. VECPAR 2012: 72-79
last updated on 2024-10-31 20:15 CET by the dblp team
all metadata released as open data under CC0 1.0 license