- research-article, June 2019
Optimizing tensor contractions for embedded devices with racetrack memory scratch-pads
LCTES 2019: Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, Pages 5–18, https://doi.org/10.1145/3316482.3326351
Tensor contraction is a fundamental operation in many algorithms, with applications ranging from quantum chemistry and fluid dynamics to image processing and machine learning. The performance of tensor computations critically depends on ...
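For orientation, the sketch below shows the plain operation this paper optimizes: a contraction that sums over one shared index of two tensors. It is a minimal Python illustration with made-up shapes; it does not reflect the paper's racetrack-memory scratch-pad layout or loop ordering.

    # Illustrative only: C[i,j,l] = sum_k A[i,j,k] * B[k,l], first as explicit loops,
    # then checked against numpy's einsum. Shapes are arbitrary examples.
    import numpy as np

    def contract(A, B):
        # Contract the last axis of A with the first axis of B.
        I, J, K = A.shape
        K2, L = B.shape
        assert K == K2, "contracted dimensions must match"
        C = np.zeros((I, J, L))
        for i in range(I):
            for j in range(J):
                for l in range(L):
                    for k in range(K):
                        C[i, j, l] += A[i, j, k] * B[k, l]
        return C

    A = np.random.rand(4, 3, 5)
    B = np.random.rand(5, 2)
    assert np.allclose(contract(A, B), np.einsum('ijk,kl->ijl', A, B))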
- research-article, June 2019
Stream-based memory access specialization for general purpose processors
ISCA '19: Proceedings of the 46th International Symposium on Computer Architecture, Pages 736–749, https://doi.org/10.1145/3307650.3322229
Because of severe limitations in technology scaling, architects have innovated in specializing general purpose processors for computation primitives (e.g. vector instructions, loop accelerators). The general principle is exposing rich semantics to the ...
- extended-abstract, June 2019
Proactive Caching for Low Access-Delay Services under Uncertain Predictions
SIGMETRICS '19: Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems, Pages 89–90, https://doi.org/10.1145/3309697.3331471
Traffic from delay-sensitive services has become a dominant share of overall network traffic. Proactive caching with the aid of predictive information has been proposed as a promising method to enhance delay performance. In this paper, we analytically ...
- research-article, June 2019
Towards a Smart, Internet-Scale Cache Service for Data Intensive Scientific Applications
ScienceCloud '19: Proceedings of the 10th Workshop on Scientific Cloud Computing, Pages 11–18, https://doi.org/10.1145/3322795.3331464
Data and services provided by shared facilities, such as large-scale observing facilities, have become important enablers of scientific insights and discoveries across many science and engineering disciplines. Ensuring satisfactory quality of service ...
- research-article, March 2019
Proactive Caching for Low Access-Delay Services under Uncertain Predictions
Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Volume 3, Issue 1, Article No.: 2, Pages 1–46, https://doi.org/10.1145/3322205.3311073
Traffic from delay-sensitive services has become a dominant share of overall network traffic. Proactive caching with the aid of predictive information has been proposed as a promising method to enhance the delay performance, which is one of the principal ...
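As a rough illustration of the setting (not this paper's analytical model or policies), the sketch below combines a reactive LRU cache with a proactive path that prefetches items a possibly unreliable predictor announces ahead of time. All names and the toy predictor are hypothetical.

    # Illustrative only: a toy proactive cache layered on LRU. Correct predictions
    # turn future requests into hits; wrong predictions just waste capacity.
    from collections import OrderedDict

    class ProactiveLRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.store = OrderedDict()  # key -> value, most recently used at the end

        def _insert(self, key, value):
            self.store[key] = value
            self.store.move_to_end(key)
            while len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used

        def prefetch(self, key, fetch):
            # Proactive path: load a predicted item before it is requested.
            if key not in self.store:
                self._insert(key, fetch(key))

        def get(self, key, fetch):
            # Reactive path: hit if present (prefetched or cached), else fetch on demand.
            if key in self.store:
                self.store.move_to_end(key)
                return self.store[key], True
            value = fetch(key)
            self._insert(key, value)
            return value, False

    # Tiny demo with a predictor that is only partly right.
    origin = lambda k: f"object-{k}"
    cache = ProactiveLRUCache(capacity=2)
    cache.prefetch("a", origin)          # correct prediction
    cache.prefetch("x", origin)          # wasted prefetch
    print(cache.get("a", origin))        # ('object-a', True): hit thanks to prefetch
    print(cache.get("b", origin))        # ('object-b', False): unpredicted, fetched on demand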
- research-article, January 2019
Making content caching policies 'smart' using the DeepCache framework
ACM SIGCOMM Computer Communication Review (SIGCOMM CCR), Volume 48, Issue 5, Pages 64–69, https://doi.org/10.1145/3310165.3310174
In this paper, we present DeepCache, a novel framework for content caching which can significantly boost cache performance. Our framework is based on powerful deep recurrent neural network models. It comprises two main components: i) Object ...
- research-article, January 2019
Performance evaluation of main-memory hash joins on KNL
International Journal of Computational Science and Engineering (IJCSE), Volume 20, Issue 4, Pages 425–438, https://doi.org/10.1504/ijcse.2019.104443
New hardware features have propelled the design and analysis of main-memory hash joins. In previous studies, memory access has always been the primary bottleneck for hash join algorithms. However, there are relatively few studies devoted to bottlenecks ...
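For context, the sketch below is a textbook build/probe hash join in Python; it shows the kind of memory access pattern such studies profile, not this paper's KNL-specific partitioned or vectorized variants.

    # Illustrative only: build a hash table on one relation, probe it with the other.
    from collections import defaultdict

    def hash_join(build_rel, probe_rel, build_key, probe_key):
        # Build phase: hash the (usually smaller) relation on its join key.
        table = defaultdict(list)
        for row in build_rel:
            table[row[build_key]].append(row)
        # Probe phase: stream the other relation and look up matches.
        for row in probe_rel:
            for match in table.get(row[probe_key], ()):
                yield {**match, **row}

    customers = [{"cust": 7, "name": "Ada"}, {"cust": 9, "name": "Lin"}]
    orders = [{"order_id": 1, "cust": 7}, {"order_id": 2, "cust": 9}]
    print(list(hash_join(customers, orders, "cust", "cust")))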
- research-article, December 2018
AVPP: Address-first Value-next Predictor with Value Prefetching for Improving the Efficiency of Load Value Prediction
ACM Transactions on Architecture and Code Optimization (TACO), Volume 15, Issue 4, Article No.: 49, Pages 1–30, https://doi.org/10.1145/3239567
Value prediction improves instruction-level parallelism in superscalar processors by breaking true data dependencies. Although this technique can significantly improve overall performance, most state-of-the-art value prediction approaches require ...
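As background, the sketch below is a software model of a simple last-value load predictor, the kind of baseline that value-prediction work such as AVPP improves on. AVPP's address-first, value-next tables and its value prefetching are not modeled; the table size and PC are arbitrary examples.

    # Illustrative only: predict that a load at a given PC returns the same value as last time.
    class LastValuePredictor:
        def __init__(self, entries=1024):
            self.entries = entries
            self.table = {}          # PC (mod entries) -> last observed load value
            self.correct = 0
            self.total = 0

        def predict(self, pc):
            return self.table.get(pc % self.entries)

        def update(self, pc, actual_value):
            # Called at commit: score the prediction, then train the table.
            self.total += 1
            if self.table.get(pc % self.entries) == actual_value:
                self.correct += 1
            self.table[pc % self.entries] = actual_value

    pred = LastValuePredictor()
    for value in [5, 5, 5, 6, 6]:        # a load at PC 0x40 returning these values
        _ = pred.predict(0x40)
        pred.update(0x40, value)
    print(f"correct predictions: {pred.correct}/{pred.total}")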
- research-article, October 2018
Exploiting locality in graph analytics through hardware-accelerated traversal scheduling
MICRO-51: Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, Pages 1–14, https://doi.org/10.1109/MICRO.2018.00010
Graph processing is increasingly bottlenecked by main memory accesses. On-chip caches are of little help because the irregular structure of graphs causes seemingly random memory references. However, most real-world graphs offer significant potential ...
- poster, October 2018
SSD QoS Improvements through Machine Learning
SoCC '18: Proceedings of the ACM Symposium on Cloud Computing, Page 511, https://doi.org/10.1145/3267809.3275453
The recent deceleration of Moore's law calls for new approaches to resource optimization. Machine learning has been applied to a wide variety of problems across multiple domains; however, the space of machine learning research for storage ...
- research-article, September 2018
Empirically assessing opportunities for prefetching and caching in mobile apps
ASE '18: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, Pages 554–564, https://doi.org/10.1145/3238147.3238215
Network latency in mobile software has a large impact on user experience, with potentially severe economic consequences. Prefetching and caching have been shown effective in reducing the latencies in browser-based systems. However, those techniques ...
- research-article, August 2018
LAPPS: Locality-Aware Productive Prefetching Support for PGAS
ACM Transactions on Architecture and Code Optimization (TACO), Volume 15, Issue 3, Article No.: 28, Pages 1–26, https://doi.org/10.1145/3233299
Prefetching is a well-known technique to mitigate scalability challenges in the Partitioned Global Address Space (PGAS) model. It has been studied as either an automated compiler optimization or a manual programmer optimization. Using the PGAS locality ...
- research-article, August 2018
DeepCache: A Deep Learning Based Framework For Content Caching
NetAI'18: Proceedings of the 2018 Workshop on Network Meets AI & ML, Pages 48–53, https://doi.org/10.1145/3229543.3229555
In this paper, we present DeepCache, a novel framework for content caching which can significantly boost cache performance. Our framework is based on powerful deep recurrent neural network models. It comprises two main components: i) Object ...
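As a rough illustration of the general idea (not DeepCache's actual architecture), the sketch below couples a stubbed popularity forecast to an unmodified LRU cache by issuing synthetic warm-up requests for objects predicted to become popular. The object IDs and the stub predictor are hypothetical stand-ins for a learned recurrent model.

    # Illustrative only: warm an LRU cache with synthetic requests for predicted-popular
    # objects, then serve a real trace.
    from collections import OrderedDict

    def lru_request(cache, capacity, obj):
        hit = obj in cache
        if hit:
            cache.move_to_end(obj)
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
        return hit

    def predicted_popular():
        # Stand-in for a learned popularity model: pretend these objects will trend next.
        return ["video_42", "video_17"]   # hypothetical object IDs

    cache, capacity = OrderedDict(), 3
    for obj in predicted_popular():       # synthetic warm-up requests
        lru_request(cache, capacity, obj)
    real_trace = ["video_42", "video_99", "video_17"]
    hits = sum(lru_request(cache, capacity, obj) for obj in real_trace)
    print(f"hits on real trace: {hits}/{len(real_trace)}")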
- demonstration, June 2018
Low-latency delivery of news-based video content
MMSys '18: Proceedings of the 9th ACM Multimedia Systems Conference, Pages 537–540, https://doi.org/10.1145/3204949.3208110
Nowadays, news-based websites and portals provide significant amounts of multimedia content to accompany news stories and articles. Within this context, HTTP Adaptive Streaming is generally used to deliver video over the best-effort Internet, allowing ...
- research-article, June 2018
Rethinking Belady's algorithm to accommodate prefetching
ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture, Pages 110–123, https://doi.org/10.1109/ISCA.2018.00020
This paper shows that in the presence of data prefetchers, cache replacement policies are faced with a large unexplored design space. In particular, we observe that while Belady's MIN algorithm minimizes the total number of cache misses, including ...
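For reference, the sketch below implements offline Belady/MIN replacement on a fully known trace, evicting the resident block whose next use lies farthest in the future. The paper's argument about how prefetched blocks change this picture is not modeled here.

    # Illustrative only: count misses under Belady's MIN policy for a known trace.
    def belady_min(trace, capacity):
        cache, misses = set(), 0
        for i, block in enumerate(trace):
            if block in cache:
                continue
            misses += 1
            if len(cache) < capacity:
                cache.add(block)
                continue
            # Evict the resident block reused farthest in the future (or never again).
            def next_use(b):
                for j in range(i + 1, len(trace)):
                    if trace[j] == b:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
            cache.add(block)
        return misses

    trace = ["A", "B", "C", "A", "B", "D", "A", "C"]
    print(belady_min(trace, capacity=2))  # minimum possible misses for this trace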
- research-article, June 2018
Criticality aware tiered cache hierarchy: a fundamental relook at multi-level cache hierarchies
ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture, Pages 96–109, https://doi.org/10.1109/ISCA.2018.00019
On-die caches are a popular method to help hide the main memory latency. However, it is difficult to build large caches without substantially increasing their access latency, which in turn hurts performance. To overcome this difficulty, on-die caches ...
- research-article, June 2018
Division of labor: a more effective approach to prefetching
ISCA '18: Proceedings of the 45th Annual International Symposium on Computer Architecture, Pages 83–95, https://doi.org/10.1109/ISCA.2018.00018
Prefetching is a central component in most microarchitectures. Many different algorithms have been proposed with varying degrees of complexity and effectiveness. There are inherent tradeoffs among various metrics, especially when we try to exploit both ...
- research-article, May 2018
Improving energy efficiency of database clusters through prefetching and caching
CCGrid '18: Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Pages 388–391, https://doi.org/10.1109/CCGRID.2018.00065
The goal of this study is to optimize the energy efficiency of database clusters through prefetching and caching strategies. We design a workload-skewness scheme to collectively manage a set of hot and cold nodes in a database cluster system. The prefetching ...
- research-article, March 2018
Minnow: Lightweight Offload Engines for Worklist Management and Worklist-Directed Prefetching
ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 593–607, https://doi.org/10.1145/3173162.3173197
The importance of irregular applications such as graph analytics is rapidly growing with the rise of Big Data. However, parallel graph workloads tend to perform poorly on general-purpose chip multiprocessors (CMPs) due to poor cache locality, low ...
Also Published in:
ACM SIGPLAN Notices: Volume 53, Issue 2
- research-article, March 2018
An Event-Triggered Programmable Prefetcher for Irregular Workloads
ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, Pages 578–592, https://doi.org/10.1145/3173162.3173189
Many modern workloads compute on large amounts of data, often with irregular memory accesses. Current architectures perform poorly for these workloads, as existing prefetching techniques cannot capture the memory access patterns; these applications end ...
Also Published in:
ACM SIGPLAN Notices: Volume 53, Issue 2