research-article
Open Access
Memory-Aware Functional IR for Higher-Level Synthesis of Accelerators

Specialized accelerators deliver orders of magnitude higher performance than general-purpose processors. The ever-changing nature of modern workloads is pushing the adoption of Field Programmable Gate Arrays (FPGAs) as the substrate of choice. ...

research-article
Open Access
The Forward Slice Core: A High-Performance, Yet Low-Complexity Microarchitecture

Superscalar out-of-order cores deliver high performance at the cost of increased complexity and power budget. In-order cores, in contrast, are less complex and have a smaller power budget, but offer low performance. A processor architecture should ideally ...

research-article
Open Access
MAPPER: Managing Application Performance via Parallel Efficiency Regulation

State-of-the-art systems, whether in servers or desktops, provide ample computational and storage resources to allow multiple simultaneously executing, potentially parallel applications. However, performance tends to be unpredictable, being a function of ...

research-article
Open Access
Low-power Near-data Instruction Execution Leveraging Opcode-based Timing Analysis

Traditional processor architectures utilize an external DRAM for data storage, while they also operate under worst-case timing constraints. Such designs are heavily constrained by the delay costs of the data transfer between the core pipeline and the DRAM,...

research-article
Open Access
GiantVM: A Novel Distributed Hypervisor for Resource Aggregation with DSM-aware Optimizations

We present GiantVM, an open-source distributed hypervisor that provides many-to-one virtualization to aggregate resources from multiple physical machines. We propose techniques to enable distributed CPU and I/O virtualization and distributed shared ...

research-article
Open Access
Cooperative Slack Management: Saving Energy of Multicore Processors by Trading Performance Slack Between QoS-Constrained Applications

Processor resources can be adapted at runtime according to the dynamic behavior of applications to reduce the energy consumption of multicore processors without affecting the Quality-of-Service (QoS). To achieve this, an online resource management scheme ...

research-article
Open Access
Weaving Synchronous Reactions into the Fabric of SSA-form Compilers

We investigate the programming of reactive systems combining closed-loop control with performance-intensive components such as Machine Learning (ML). Reactive control systems are often safety-critical and associated with real-time execution requirements, ...

research-article
Open Access
Register-Pressure-Aware Instruction Scheduling Using Ant Colony Optimization

This paper describes a new approach to register-pressure-aware instruction scheduling, using Ant Colony Optimization (ACO). ACO is a nature-inspired optimization technique that researchers have successfully applied to NP-hard sequencing problems like the ...

research-article
Open Access
MemHC: An Optimized GPU Memory Management Framework for Accelerating Many-body Correlation

The many-body correlation function is a fundamental computation kernel in modern physics computing applications, e.g., Hadron Contractions in Lattice quantum chromodynamics (QCD). This kernel is both computation and memory intensive, involving a series of ...

research-article
Open Access
Dependence-aware Slice Execution to Boost MLP in Slice-out-of-order Cores

Exploiting memory-level parallelism (MLP) is crucial to hide long memory and last-level cache access latencies. While out-of-order (OoO) cores, and techniques building on them, are effective at exploiting MLP, they deliver poor energy efficiency due to ...

research-article
Open Access
MetaSys: A Practical Open-source Metadata Management System to Implement and Evaluate Cross-layer Optimizations

This article introduces the first open-source FPGA-based infrastructure, MetaSys, with a prototype in a RISC-V system, to enable the rapid implementation and evaluation of a wide range of cross-layer techniques in real hardware. Hardware-software ...

research-article
Open Access
ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes

Parallel applications often rely on work stealing schedulers in combination with fine-grained tasking to achieve high performance and scalability. However, reducing the total energy consumption in the context of work stealing runtimes is still challenging,...

research-article
Open Access
Preserving Addressability Upon GC-Triggered Data Movements on Non-Volatile Memory

This article points out an important threat that application-level Garbage Collection (GC) creates to the use of non-volatile memory (NVM). Data movements incurred by GC may invalidate the pointers to objects on NVM and, hence, harm the reusability of ...

research-article
Open Access
A Case For Intra-rack Resource Disaggregation in HPC

The expected halt of traditional technology scaling is motivating increased heterogeneity in high-performance computing (HPC) systems with the emergence of numerous specialized accelerators. As heterogeneity increases, so does the risk of underutilizing ...
