ACM Transactions on Architecture and Code Optimization, Volume 17
Volume 17, Number 1, March 2020
- Yuhao Li, Dan Sun, Benjamin C. Lee:
Dynamic Colocation Policies with Reinforcement Learning. 1:1-1:25
- Nikolaos Tampouratzis, Ioannis Papaefstathiou, Antonios Nikitakis, Andreas Brokalakis, Stamatis Andrianakis, Apostolos Dollas, Marco Marcon, Emanuele Plebani:
A Novel, Highly Integrated Simulator for Parallel and Distributed Systems. 2:1-2:28
- Lijuan Jiang, Chao Yang, Wenjing Ma:
Enabling Highly Efficient Batched Matrix Multiplications on SW26010 Many-core Processor. 3:1-3:23
- Mustafa Cavus, Resit Sendag, Joshua J. Yi:
Informed Prefetching for Indirect Memory Accesses. 4:1-4:29
- Yohann Uguen, Florent de Dinechin, Victor Lezaud, Steven Derrien:
Application-Specific Arithmetic in High-Level Synthesis Tools. 5:1-5:23
- Yang Song, Bill Lin:
Improving Memory Efficiency in Heterogeneous MPSoCs through Row-Buffer Locality-aware Forwarding. 6:1-6:26
- Hao Wu, Weizhi Liu, Huanxin Lin, Cho-Li Wang:
A Model-Based Software Solution for Simultaneous Multiple Kernels on GPUs. 7:1-7:26
- Xuanhua Shi, Wei Liu, Ligang He, Hai Jin, Ming Li, Yong Chen:
Optimizing the SSD Burst Buffer by Traffic Detection. 8:1-8:26
Volume 17, Number 2, June 2020
- Charu Kalra, Fritz Previlon, Norm Rubin, David R. Kaeli:
ArmorAll: Compiler-based Resilience Targeting GPU Applications. 9:1-9:24
- Stefano Cherubin, Daniele Cattaneo, Michele Chiari, Giovanni Agosta:
Dynamic Precision Autotuning with TAFFO. 10:1-10:26
- Ahmet Erdem, Cristina Silvano, Thomas Boesch, Andrea C. Ornstein, Surinder Pal Singh, Giuseppe Desoli:
Runtime Design Space Exploration and Mapping of DCNNs for the Ultra-Low-Power Orlando SoC. 11:1-11:25
- Amir Hossein Nodehi Sabet, Junqiao Qiu, Zhijia Zhao, Sriram Krishnamoorthy:
Reliability Analysis for Unreliable FSM Computations. 12:1-12:23
- Jiachen Xue, T. N. Vijaykumar, Mithuna Thottethodi:
Network Interface Architecture for Remote Indirect Memory Access (RIMA) in Datacenters. 13:1-13:22
- Qinggang Wang, Long Zheng, Jieshan Zhao, Xiaofei Liao, Hai Jin, Jingling Xue:
A Conflict-free Scheduler for High-performance Graph Processing on Multi-pipeline FPGAs. 14:1-14:26
- Anita Tino, Caroline Collange, André Seznec:
SIMT-X: Extending Single-Instruction Multi-Threading to Out-of-Order Cores. 15:1-15:23
Volume 17, Number 3, August 2020
- David R. Kaeli:
Editorial: A Message from the Editor-in-Chief. 16:1-16:2
- Ram Rangan, Mark W. Stephenson, Aditya Ukarande, Shyam Murthy, Virat Agarwal, Marc Blackstein:
Zeroploit: Exploiting Zero Valued Operands in Interactive Gaming Applications. 17:1-17:26
- Karel Adámek, Sofia Dimoudi, Mike B. Giles, Wesley Armour:
GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory. 18:1-18:20
- Arnab Das, Sriram Krishnamoorthy, Ian Briggs, Ganesh Gopalakrishnan, Ramakrishna Tipireddy:
FPDetect: Efficient Reasoning About Stencil Programs Using Selective Direct Evaluation. 19:1-19:27
- Tarek S. Abdelrahman:
Cooperative Software-hardware Acceleration of K-means on a Tightly Coupled CPU-FPGA System. 20:1-20:24
- Jaekyu Lee, Yasuo Ishii, Dam Sunwoo:
Securing Branch Predictors with Two-Level Encryption. 21:1-21:25
- Luca Cerina, Marco D. Santambrogio, Giuseppe Franco, Claudio Gallicchio, Alessio Micheli:
EchoBay: Design and Optimization of Echo State Networks under Memory and Time Constraints. 22:1-22:24
- Savvas Sioutas, Sander Stuijk, Twan Basten, Henk Corporaal, Lou J. Somers:
Schedule Synthesis for Halide Pipelines on GPUs. 23:1-23:25
- Muhammad Huzaifa, Johnathan Alsop, Abdulrahman Mahmoud, Giordano Salvador, Matthew D. Sinclair, Sarita V. Adve:
Inter-kernel Reuse-aware Thread Block Scheduling. 24:1-24:27
Volume 17, Number 4, November 2020
- Gokul Subramanian Ravi, Joshua San Miguel, Mikko H. Lipasti:
SHASTA: Synergic HW-SW Architecture for Spatio-temporal Approximation. 25:1-25:26
- Aravind Acharya, Uday Bondhugula, Albert Cohen:
Effective Loop Fusion in Polyhedral Compilation Using Fusion Conflict Graphs. 26:1-26:26
- Steffen Maass, Mohan Kumar Kumar, Taesoo Kim, Tushar Krishna, Abhishek Bhattacharjee:
ECOTLB: Eventually Consistent TLBs. 27:1-27:24
- Anchu Rajendran, V. Krishna Nandivada:
DisGCo: A Compiler for Distributed Graph Analytics. 28:1-28:26
- Yu Zhang, Xiaofei Liao, Lin Gu, Hai Jin, Kan Hu, Haikun Liu, Bingsheng He:
AsynGraph: Maximizing Data Parallelism for Efficient Iterative Graph Processing on GPUs. 29:1-29:21
- Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao:
OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training. 30:1-30:26
- Xinfeng Xie, Xing Hu, Peng Gu, Shuangchen Li, Yu Ji, Yuan Xie:
NNBench-X: A Benchmarking Methodology for Neural Network Accelerator Designs. 31:1-31:25
- S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta, Y. N. Srikant:
IR2VEC: LLVM IR Based Scalable Program Embeddings. 32:1-32:27
- Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu:
GEVO: GPU Code Optimization Using Evolutionary Computation. 33:1-33:28
- Rolando Brondolin, Marco D. Santambrogio:
A Black-box Monitoring Approach to Measure Microservices Runtime Performance. 34:1-34:26
- Utpal Bora, Santanu Das, Pankaj Kukreja, Saurabh Joshi, Ramakrishna Upadrasta, Sanjay V. Rajopadhye:
LLOV: A Fast Static Data-Race Checker for OpenMP Programs. 35:1-35:26
- George Christou, Giorgos Vasiliadis, Vassilis Papaefstathiou, Antonis Papadogiannakis, Sotiris Ioannidis:
On Architectural Support for Instruction Set Randomization. 36:1-36:26
- Athanasios Stratikopoulos, Christos Kotselidis, John Goodacre, Mikel Luján:
FastPath_MP: Low Overhead & Energy-efficient FPGA-based Storage Multi-paths. 37:1-37:23
- Cristóbal Ramírez, César-Alejandro Hernández-Calderón, Oscar Palomar, Osman S. Unsal, Marco Antonio Ramírez, Adrián Cristal:
A RISC-V Simulator and Benchmark Suite for Designing and Evaluating Vector Architectures. 38:1-38:30
- Sam Likun Xi, Yuan Yao, Kshitij Bhardwaj, Paul N. Whatmough, Gu-Yeon Wei, David Brooks:
SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads. 39:1-39:26
- Albin Eldstål-Ahrens, Ioannis Sourdis:
MemSZ: Squeezing Memory Traffic with Lossy Compression. 40:1-40:25
- Dennis Pinto, José-María Arnau, Antonio González:
Design and Evaluation of an Ultra Low-power Human-quality Speech Recognition System. 41:1-41:19