- research-article, August 2024
Joint Pruning and Channel-Wise Mixed-Precision Quantization for Efficient Deep Neural Networks
- Beatrice Alessandra Motetti,
- Matteo Risso,
- Alessio Burrello,
- Enrico Macii,
- Massimo Poncino,
- Daniele Jahier Pagliari
IEEE Transactions on Computers (ITCO), Volume 73, Issue 11, Pages 2619–2633, https://doi.org/10.1109/TC.2024.3449084
The resource requirements of deep neural networks (DNNs) pose significant challenges to their deployment on edge devices. Common approaches to address this issue are pruning and mixed-precision quantization, which lead to latency and memory occupation ...
- research-article, August 2024
Memristor-Based Approximate Query Architecture for In-Memory Hyperdimensional Computing
IEEE Transactions on Computers (ITCO), Volume 73, Issue 11, Pages 2605–2618, https://doi.org/10.1109/TC.2024.3441861
As a new computing paradigm, hyperdimensional computing (HDC) has gradually manifested its advantages in edge-side intelligent applications by virtue of its interpretability, hardware-friendliness and robustness. The core of HDC is to encode input samples ...
- research-article, August 2024
A Parallel Tag Cache for Hardware Managed Tagged Memory in Multicore Processors
IEEE Transactions on Computers (ITCO), Volume 73, Issue 11, Pages 2488–2503, https://doi.org/10.1109/TC.2024.3441835
Hardware-managed tagged memory is the dominant way of supporting tags in current processor designs. Most of these processors reserve a hidden tag partition in the memory dedicated for tags and use a small tag cache (TC) to reduce the extra memory accesses ...
- research-article, June 2024
HYDRA: A Hybrid Resistance Drift Resilient Architecture for Phase Change Memory-Based Neural Network Accelerators
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2123–2135, https://doi.org/10.1109/TC.2024.3404096
In-memory Computing (IMC) using Phase Change Memory (PCM) has proven to be effective for efficient processing of Deep Neural Networks (DNNs). However, with the use of multi-level cell PCM (MLC-PCM) in NVMs-based accelerators, errors due to resistance ...
- research-article, June 2024
Enabling Reliable Memory-Mapped I/O With Auto-Snapshot for Persistent Memory Systems
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2290–2304, https://doi.org/10.1109/TC.2024.3416683
Persistent memory (PM) is promising to be the next-generation storage device with better I/O performance. Since the traditional I/O path is too lengthy to drive PM featuring low latency and high bandwidth, prior works proposed memory-mapped I/O (MMIO) to ...
- research-article, June 2024
ISSA: Architecting CNN Accelerators Using Input-Skippable, Set-Associative Computing-in-Memory
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2136–2149, https://doi.org/10.1109/TC.2024.3404060
Among several emerging architectures, computing in memory (CIM), which features in-situ analog computation, is a potential solution to the data movement bottleneck of the Von Neumann architecture for artificial intelligence (AI). Interestingly, more ...
- research-article, May 2024
SimBU: Self-Similarity-Based Hybrid Binary-Unary Computing for Nonlinear Functions
IEEE Transactions on Computers (ITCO), Volume 73, Issue 9, Pages 2192–2205, https://doi.org/10.1109/TC.2024.3398512
Unary computing is a relatively new method for implementing arbitrary nonlinear functions that uses unpacked thermometer number encoding, enabling much lower hardware costs. In its original form, unary computing provides no trade-off between accuracy and ...
- research-article, January 2024
Achieving DRAM-Like PCM by Trading Off Capacity for Latency
IEEE Transactions on Computers (ITCO), Volume 73, Issue 4, Pages 1180–1189, https://doi.org/10.1109/TC.2024.3355779
Phase Change Memory (PCM) is considered one of the most promising scalable non-volatile main memory alternatives to DRAM. It provides $\sim$ ...
- research-article, January 2024
Enhancing Graph Random Walk Acceleration via Efficient Dataflow and Hybrid Memory Architecture
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 887–901, https://doi.org/10.1109/TC.2023.3347674
Graph random walk sampling is becoming increasingly important with the widespread popularity of graph applications. It aims to capture the desirable graph properties by launching multiple walkers to collect feature paths. However, previous research ...
- research-article, December 2023
Learning the Error Features of Approximate Multipliers for Neural Network Applications
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 842–856, https://doi.org/10.1109/TC.2023.3345163
Approximate multipliers (AMs) have widely been investigated to pursue high-performance and energy-efficient hardware designs for error-tolerant applications, such as neural networks (NNs). The computing accuracy of an AM has been evaluated by using ...
- research-article, December 2023
Honeycomb: Ordered Key-Value Store Acceleration on an FPGA-Based SmartNIC
- Junyi Liu,
- Aleksandar Dragojević,
- Shane Fleming,
- Antonios Katsarakis,
- Dario Korolija,
- Igor Zablotchi,
- Ho-Cheung Ng,
- Anuj Kalia,
- Miguel Castro
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 857–871, https://doi.org/10.1109/TC.2023.3345173
In-memory ordered key-value stores are an important building block in modern distributed applications. We present Honeycomb, a hybrid software-hardware system for accelerating read-dominated workloads on ordered key-value stores that provides ...
- research-article, December 2023
CDS: Coupled Data Storage to Enhance Read Performance of 3D TLC NAND Flash Memory
IEEE Transactions on Computers (ITCO), Volume 73, Issue 3, Pages 694–707, https://doi.org/10.1109/TC.2023.3338474
Due to the strong demand of massive storage capacity, the density of flash memory has been improved in terms of technology node scaling, multi-bit per cell technique, and 3D stacking. However, these techniques also degrade read performance and ...
- research-article, December 2023
Wrong-Path-Aware Entangling Instruction Prefetcher
IEEE Transactions on Computers (ITCO), Volume 73, Issue 2, Pages 548–559, https://doi.org/10.1109/TC.2023.3337308
Instruction prefetching is instrumental for guaranteeing a high flow of instructions through the processor front end for applications whose working set does not fit in the lower-level caches. Examples of such applications are server workloads, whose ...
- research-article, November 2023
Stochastic Circuits for Computing Weighted Ratio With Applications to Multiclass Bayesian Inference Machine
IEEE Transactions on Computers (ITCO), Volume 73, Issue 2, Pages 621–630, https://doi.org/10.1109/TC.2023.3329998
Bayesian inference is one method of statistical inference in machine learning. It predicts the probability that a given test belongs to a certain class and is widely used in various applications such as medical diagnosis, spam classification and fraud ...
- research-article, November 2023
A High-Performance, Energy-Efficient Modular DMA Engine Architecture
- Thomas Benz,
- Michael Rogenmoser,
- Paul Scheffler,
- Samuel Riedel,
- Alessandro Ottaviano,
- Andreas Kurth,
- Torsten Hoefler,
- Luca Benini
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 263–277, https://doi.org/10.1109/TC.2023.3329930
Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the ...
- research-article, October 2023
An Efficient Deep Reinforcement Learning-Based Automatic Cache Replacement Policy in Cloud Block Storage Systems
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 164–177, https://doi.org/10.1109/TC.2023.3325625
With the popularity of cloud services, cloud block storage (CBS) systems have been widely deployed by cloud providers. Cloud cache plays a vital role in maintaining high and stable performance in cloud block storage systems. In the past few decades, much ...
- research-article, September 2023
Split-Radix Based Compact Hardware Architecture for CRYSTALS-Kyber
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 97–108, https://doi.org/10.1109/TC.2023.3320040
Facing the threat of large-scale quantum computers to traditional public-key cryptography, the National Institute of Standards and Technology has conducted Post-Quantum Cryptography algorithms evaluation for a long time, and CRYSTALS-Kyber has been ...
- research-article, August 2023
MemPool: A Scalable Manycore Architecture With a Low-Latency Shared L1 Memory
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3561–3575, https://doi.org/10.1109/TC.2023.3307796
Shared L1 memory clusters are a common architectural pattern (e.g., in GPGPUs) for building efficient and flexible multi-processing-element (PE) engines. However, it is a common belief that these tightly-coupled clusters would not scale beyond a few tens ...
- research-article, August 2023
Unified Digit Selection for Radix-4 Recurrence Division and Square Root
IEEE Transactions on Computers (ITCO), Volume 73, Issue 1, Pages 292–300, https://doi.org/10.1109/TC.2023.3305760
Division and square root are fundamental operations required by most computer systems. They are commonly implemented in hardware using radix-4 recurrence, which produces a 2-bit result digit on each step. Unified digit selection logic chooses the next ...
- research-article, August 2023
An Area-Efficient In-Memory Implementation Method of Arbitrary Boolean Function Based on SRAM Array
IEEE Transactions on Computers (ITCO), Volume 72, Issue 12, Pages 3416–3430, https://doi.org/10.1109/TC.2023.3301156
In-memory computing is an emerging computing paradigm to breakthrough the von-Neumann bottleneck. The SRAM based in-memory computing (SRAM-IMC) attracts great concerns from industries and academia, because the SRAM is technology compatible with the widely-...