- research-article, December 2024, JUST ACCEPTED
MasterPlan: A Reinforcement Learning Based Scheduler for Archive Storage
ACM Transactions on Architecture and Code Optimization (TACO), Just Accepted, https://doi.org/10.1145/3708542
With the sheer volume of data in today’s world, archive storage systems play a significant role in persisting the cold data. Due to stringent cost concerns, one popular design is to organize disks into groups and periodically switch them to be powered on ...
- research-article, December 2024, JUST ACCEPTED
AIS: An Active Idleness I/O Scheduler to Reduce Buffer-Exhausted Degradation of Solid-State Drives
ACM Transactions on Architecture and Code Optimization (TACO), Just Accepted, https://doi.org/10.1145/3708538
Modern solid-state drives (SSDs) continue to boost storage density and I/O bandwidth at the cost of flash-access I/O latency, especially for writes, and hence commonly deploy a built-in buffer to absorb incoming writes. However, when the buffer is used up, ...
- research-article, November 2024, JUST ACCEPTED
exZNS: Extending Zoned Namespace to Support Byte-loggable Zones
ACM Transactions on Architecture and Code Optimization (TACO), Just Accepted, https://doi.org/10.1145/3705318
Emerging Zoned Namespace (ZNS) provides hosts with fine-grained, performance-predictable storage management. ZNS organizes the address space into zones composed of fixed-size, sequentially written, non-overwritable blocks, making it suitable for log-...
- research-article, November 2024
SuccinctKV: a CPU-efficient LSM-tree Based KV Store with Scan-based Compaction
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 4, Article No.: 90, Pages 1–26, https://doi.org/10.1145/3695873
The CPU overhead of the LSM-tree becomes increasingly significant when high-speed storage devices are utilized. In this article, we propose SuccinctKV, a key-value store based on LSM-tree that is optimized to improve CPU efficiency in mixed workload ...
- research-article, November 2024
Optimizing Garbage Collection for ZNS SSDs via In-storage Data Migration and Address Remapping
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 4, Article No.: 77, Pages 1–25, https://doi.org/10.1145/3689336
The NVMe Zoned Namespace (ZNS) is a high-performance interface for flash-based solid-state drives (SSDs), which divides the logical address space into fixed-size and sequential-write zones. Meanwhile, ZNS SSDs eliminate in-device garbage collection (GC) ...
- research-article, November 2024, JUST ACCEPTED
ATP: Achieving Throughput Peak for DNN Training via Smart GPU Memory Management
ACM Transactions on Architecture and Code Optimization (TACO), Just Accepted, https://doi.org/10.1145/3701996
Due to the limited GPU memory, the performance of large DNN training is constrained by the unscalable batch size. Existing research partially addresses the issue of GPU memory limit through tensor recomputation and swapping, but overlooks the exploration ...
- research-article, November 2024, JUST ACCEPTED
SPIRIT: Scalable and Persistent In-Memory Indices for Real-Time Search
ACM Transactions on Architecture and Code Optimization (TACO), Just Accepted, https://doi.org/10.1145/3703351
Today, real-time search over big microblogging data requires low indexing and query latency. Online services, therefore, prefer to host inverted indices in memory. Unfortunately, as datasets grow, indices grow proportionally, and with limited DRAM scaling,...
- research-article, September 2024
Lavender: An Efficient Resource Partitioning Framework for Large-Scale Job Colocation
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 58, Pages 1–23, https://doi.org/10.1145/3674736
Workload consolidation is a widely used approach to enhance resource utilization in modern data centers. However, the concurrent execution of multiple jobs on a shared server introduces contention for essential shared resources such as CPU cores, Last ...
- research-article, September 2024
Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 57, Pages 1–26, https://doi.org/10.1145/3666004
Disaggregated memory separates compute and memory resources into independent pools connected by RDMA (Remote Direct Memory Access) networks, which can improve memory utilization, reduce cost, and enable elastic scaling of compute and memory resources. ...
- research-article, September 2024
Characterizing and Optimizing LDPC Performance on 3D NAND Flash Memories
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 62, Pages 1–26, https://doi.org/10.1145/3663478
With the development of NAND flash memories’ bit density and stacking technologies, while storage capacity keeps increasing, the issue of reliability becomes increasingly prominent. Low-density parity check (LDPC) code, as a robust error-correcting code, ...
- research-article, September 2024
D2Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 46, Pages 1–22, https://doi.org/10.1145/3656584
LSM-based key-value stores suffer from sub-optimal performance due to their slow and heavy background compactions. The compaction brings severe CPU and network overhead on high-speed disaggregated storage. This article further reveals that data-intensive ...
- research-article, February 2024
A Concise Concurrent B+-Tree for Persistent Memory
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 2, Article No.: 24, Pages 1–25, https://doi.org/10.1145/3638717
Persistent memory (PM) presents a unique opportunity for designing data management systems that offer improved performance, scalability, and instant restart capability. As a widely used data structure for managing data in such systems, B+-Tree must ...
- research-article, January 2024
WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDs
Linbo Long, Shuiyong He, Jingcheng Shen, Renping Liu, Zhenhua Tan, Congming Gao, Duo Liu, Kan Zhong, Yi Jiang
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 1, Article No.: 16, Pages 1–23, https://doi.org/10.1145/3637488
ZNS SSDs divide the storage space into sequential-write zones, reducing costs of DRAM utilization, garbage collection, and over-provisioning. The sequential-write feature of zones is well-suited for LSM-based databases, where random writes are organized ...
- research-article, December 2023
Fastensor: Optimise the Tensor I/O Path from SSD to GPU for Deep Learning Training
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 4, Article No.: 62, Pages 1–25, https://doi.org/10.1145/3630108
In recent years, benefiting from the increase in model size and complexity, deep learning has achieved tremendous success in computer vision (CV) and natural language processing (NLP). Training deep learning models using accelerators such as GPUs often requires much iterative data ...
- research-article, December 2023
gPPM: A Generalized Matrix Operation and Parallel Algorithm to Accelerate the Encoding/Decoding Process of Erasure Codes
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 4, Article No.: 51, Pages 1–25, https://doi.org/10.1145/3625005
Erasure codes are widely deployed in modern storage systems, leading to frequent usage of their encoding/decoding operations. The encoding/decoding process for erasure codes is generally carried out using the parity-check matrix approach. However, this ...
- research-article, October 2023
Smart-DNN+: A Memory-efficient Neural Networks Compression Framework for the Model Inference
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 4, Article No.: 49, Pages 1–24, https://doi.org/10.1145/3617688
Deep Neural Networks (DNNs) have achieved remarkable success in various real-world applications. However, running a Deep Neural Network (DNN) typically requires hundreds of megabytes of memory footprints, making it challenging to deploy on resource-...
- research-article, July 2023
rNdN: Fast Query Compilation for NVIDIA GPUs
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 3, Article No.: 41, Pages 1–25, https://doi.org/10.1145/3603503
GPU database systems are an effective solution to query optimization, particularly with compilation and data caching. They fall short, however, in end-to-end workloads, as existing compiler toolchains are too expensive for use with short-running queries. ...
- research-article, November 2022
Lock-Free High-performance Hashing for Persistent Memory via PM-aware Holistic Optimization
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 1, Article No.: 5, Pages 1–26, https://doi.org/10.1145/3561651
Persistent memory (PM) provides large-scale non-volatile memory (NVM) with DRAM-comparable performance. The non-volatility and other unique characteristics of PM architecture bring new opportunities and challenges for the efficient storage system design. ...