Energy efficiency and performance frontiers for sparse computations on GPU supercomputers
Anzt et al., 2015
- Document ID
- 16752959968310277716
- Author
- Anzt H
- Tomov S
- Dongarra J
- Publication year
- 2015
- Publication venue
- Proceedings of the sixth international workshop on programming models and applications for multicores and manycores
Snippet
In this paper we unveil some energy efficiency and performance frontiers for sparse computations on GPU-based supercomputers. To do this, we consider state-of-the-art implementations of the sparse matrix-vector (SpMV) product in libraries like cuSPARSE …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G06F17/5018—Computer-aided design using simulation using finite difference methods or finite element methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30587—Details of specialised database models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3457—Performance evaluation by simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lu et al. | | 86 PFLOPS deep potential molecular dynamics simulation of 100 million atoms with ab initio accuracy |
Ashari et al. | | Fast sparse matrix-vector multiplication on GPUs for graph applications |
Kaya et al. | | Scalable sparse tensor decompositions in distributed memory systems |
Marek et al. | | The ELPA library: scalable parallel eigenvalue solutions for electronic structure theory and computational science |
Anzt et al. | | Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product |
Yamazaki et al. | | Improving the performance of CA-GMRES on multicores with multiple GPUs |
Wang et al. | | A TensorFlow simulation framework for scientific computing of fluid flows on tensor processing units |
Abbas‐Turki et al. | | Pricing derivatives on graphics processing units using Monte Carlo simulation |
Anzt et al. | | Energy efficiency and performance frontiers for sparse computations on GPU supercomputers |
Říha et al. | | Massively parallel hybrid total FETI (HTFETI) solver |
Anzt et al. | | On the performance and energy efficiency of sparse linear algebra on GPUs |
Xiang et al. | | Scalable matrix inversion using mapreduce |
Röhrig-Zöllner et al. | | Increasing the performance of the Jacobi-Davidson method by blocking |
Ortega et al. | | Non-dominated sorting procedure for Pareto dominance ranking on multicore CPU and/or GPU |
Tamascelli et al. | | Improved scaling of time-evolving block-decimation algorithm through reduced-rank randomized singular value decomposition |
Capuzzo-Dolcetta et al. | | A performance comparison of different graphics processing units running direct N-body simulations |
Huchette et al. | | Parallel algebraic modeling for stochastic optimization |
Langr et al. | | Accelerating many-nucleon basis generation for high performance computing enabled ab initio nuclear structure studies |
Riha et al. | | A massively parallel and memory-efficient FEM toolbox with a hybrid total FETI solver with accelerator support |
Chenhan et al. | | INV-ASKIT: A parallel fast direct solver for kernel matrices |
He et al. | | A novel CSR‐based sparse matrix‐vector multiplication on GPUs |
Herholz et al. | | Sparsity-specific code optimization using expression trees |
Kashi et al. | | Integrating batched sparse iterative solvers for the collision operator in fusion plasma simulations on GPUs |
Szałkowski et al. | | Using distributed memory parallel computers and GPU clusters for multidimensional Monte Carlo integration |
Rabbi et al. | | Evaluation of directive-based GPU programming models on a block Eigensolver with consideration of large sparse matrices |