Mathematical analysis

Showing 1 - 16 of 16 Results

research-article
November 2024
Acceleration of Tensor-Product Operations with Tensor Cores
- Cu Cui
ACM Transactions on Parallel Computing (TOPC), Volume 11, Issue 4, Article No.: 15, Pages 1–24, https://doi.org/10.1145/3695466
In this article, we explore the acceleration of tensor product operations in finite element methods, leveraging the computational power of the NVIDIA A100 GPU Tensor Cores. We provide an accessible overview of the necessary mathematical background and ...
Total Citations: 0; Total Downloads: 281; Last 12 Months: 281; Last 6 weeks: 88
introduction
Free
December 2023
Introduction to the Special Issue for SPAA’21
- Yossi Azar,
- Julian Shun
ACM Transactions on Parallel Computing (TOPC), Volume 10, Issue 4, Article No.: 17, Page 1, https://doi.org/10.1145/3630608
Total Citations: 0; Total Downloads: 214; Last 12 Months: 197; Last 6 weeks: 34
research-article
March 2023
Non-overlapping High-accuracy Parallel Closure for Compact Schemes: Application in Multiphysics and Complex Geometry
ACM Transactions on Parallel Computing (TOPC), Volume 10, Issue 1, Article No.: 1, Pages 1–28, https://doi.org/10.1145/3580005
Compact schemes are often preferred in performing scientific computing for their superior spectral resolution. Error-free parallelization of a compact scheme is a challenging task due to the requirement of additional closures at the inter-processor ...
Total Citations: 6; Total Downloads: 188; Last 12 Months: 65; Last 6 weeks: 7
research-article
November 2020
Optimizing the Linear Fascicle Evaluation Algorithm for Multi-core and Many-core Systems
- Karan Aggarwal,
- Uday Bondhugula
ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 4, Article No.: 22, Pages 1–45, https://doi.org/10.1145/3418075

Sparse matrix-vector multiplication (SpMV) operations are commonly used in various scientific and engineering applications. The performance of the SpMV operation often depends on exploiting regularity patterns in the matrix. Various representations and ...
Total Citations: 1; Total Downloads: 70; Last 12 Months: 5; Last 6 weeks: 2
research-article
October 2020
A High Accuracy Preserving Parallel Algorithm for Compact Schemes for DNS
ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 4, Article No.: 21, Pages 1–32, https://doi.org/10.1145/3418073

A new accuracy-preserving parallel algorithm employing compact schemes is presented for direct numerical simulation of the Navier-Stokes equations. Here the connotation of accuracy preservation is having the same level of accuracy obtained by the ...
Total Citations: 20; Total Downloads: 208; Last 12 Months: 24; Last 6 weeks: 3
research-article
June 2020
Algorithms and Data Structures for Matrix-Free Finite Element Operators with MPI-Parallel Sparse Multi-Vectors
- Denis Davydov,
- Martin Kronbichler
ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 3, Article No.: 20, Pages 1–30, https://doi.org/10.1145/3399736

Traditional solution approaches for problems in quantum mechanics scale as O(M³), where M is the number of electrons. Various methods have been proposed to address this issue and obtain a linear scaling O(M). One promising formulation is the direct ...
Total Citations: 3; Total Downloads: 181; Last 12 Months: 21; Last 6 weeks: 6
research-article
Public Access
March 2020
Load-balancing Sparse Matrix Vector Product Kernels on GPUs
ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 1, Article No.: 2, Pages 1–26, https://doi.org/10.1145/3380930

Efficient processing of Irregular Matrices on Single Instruction, Multiple Data (SIMD)-type architectures is a persistent challenge. Resolving it requires innovations in the development of data formats, computational techniques, and implementations that ...
Total Citations: 30; Total Downloads: 2,114; Last 12 Months: 433; Last 6 weeks: 63
research-article
March 2020
Acceleration of PageRank with Customized Precision Based on Mantissa Segmentation
ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 1, Article No.: 4, Pages 1–19, https://doi.org/10.1145/3380934

We describe the application of a communication-reduction technique for the PageRank algorithm that dynamically adapts the precision of the data access to the numerical requirements of the algorithm as the iteration converges. Our variable-precision ...
Total Citations: 9; Total Downloads: 231; Last 12 Months: 14; Last 6 weeks: 1
research-article
May 2019
Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors
- Martin Kronbichler,
- Karl Ljungkvist
ACM Transactions on Parallel Computing (TOPC), Volume 6, Issue 1, Article No.: 2, Pages 1–32, https://doi.org/10.1145/3322813

This article presents matrix-free finite-element techniques for efficiently solving partial differential equations on modern many-core processors, such as graphics cards. We develop a GPU parallelization of a matrix-free geometric multigrid iterative ...
Total Citations: 48; Total Downloads: 750; Last 12 Months: 109; Last 6 weeks: 10
research-article
January 2018
Partitioning Models for Scaling Parallel Sparse Matrix-Matrix Multiplication
ACM Transactions on Parallel Computing (TOPC), Volume 4, Issue 3, Article No.: 13, Pages 1–34, https://doi.org/10.1145/3155292

We investigate outer-product–parallel, inner-product–parallel, and row-by-row-product–parallel formulations of sparse matrix-matrix multiplication (SpGEMM) on distributed memory architectures. For each of these three formulations, we propose a ...
Total Citations: 20; Total Downloads: 557; Last 12 Months: 55; Last 6 weeks: 3
research-article
Public Access
January 2017
Trade-Offs Between Synchronization, Communication, and Computation in Parallel Linear Algebra Computations
ACM Transactions on Parallel Computing (TOPC), Volume 3, Issue 1, Article No.: 3, Pages 1–47, https://doi.org/10.1145/2897188

This article derives trade-offs between three basic costs of a parallel algorithm: synchronization, data movement, and computational cost. These trade-offs are lower bounds on the execution time of the algorithm that are independent of the number of ...
Total Citations: 13; Total Downloads: 802; Last 12 Months: 113; Last 6 weeks: 15
research-article
Public Access
December 2016
Hypergraph Partitioning for Sparse Matrix-Matrix Multiplication
ACM Transactions on Parallel Computing (TOPC), Volume 3, Issue 3, Article No.: 18, Pages 1–34, https://doi.org/10.1145/3015144

We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication (SpGEMM), a key computational kernel in scientific computing and data analysis whose performance is often communication bound. This model correctly describes both the ...
Total Citations: 31; Total Downloads: 1,003; Last 12 Months: 257; Last 6 weeks: 29
research-article
September 2015
Profitable Scheduling on Multiple Speed-Scalable Processors
- Peter Kling,
- Peter Pietrzyk
ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 3, Article No.: 19, Pages 1–19, https://doi.org/10.1145/2809872

We present a new online algorithm for profit-oriented scheduling on multiple speed-scalable processors and provide a tight analysis of the algorithm’s competitiveness. Our results generalize and improve upon work by Chan et al. [2010], which considers a ...
Total Citations: 1; Total Downloads: 137; Last 12 Months: 5; Last 6 weeks: 3
research-article
September 2015
Work-Efficient Matrix Inversion in Polylogarithmic Time
ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 3, Article No.: 15, Pages 1–29, https://doi.org/10.1145/2809812

We present an algorithm for inversion of symmetric positive definite matrices that combines the practical requirement of an optimal number of arithmetic operations and the theoretical goal of a polylogarithmic critical path length. The algorithm reduces ...
Total Citations: 2; Total Downloads: 218; Last 12 Months: 5; Last 6 weeks: 1
research-article
April 2015
Noise-Tolerant Explicit Stencil Computations for Nonuniform Process Execution Rates
ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 1, Article No.: 7, Pages 1–33, https://doi.org/10.1145/2742351

Next-generation HPC computing platforms are likely to be characterized by significant, unpredictable nonuniformities in execution time among compute nodes and cores. The resulting load imbalances from this nonuniformity are expected to arise from a ...
Total Citations: 11; Total Downloads: 177; Last 12 Months: 6; Last 6 weeks: 1
research-article
Public Access
February 2015
Avoiding Communication in Successive Band Reduction
ACM Transactions on Parallel Computing (TOPC), Volume 1, Issue 2, Article No.: 11, Pages 1–37, https://doi.org/10.1145/2686877

The running time of an algorithm depends on both arithmetic and communication (i.e., data movement) costs, and the relative costs of communication are growing over time. In this work, we present sequential and distributed-memory parallel algorithms for ...
Total Citations: 9; Total Downloads: 545; Last 12 Months: 102; Last 6 weeks: 14
