tutorial

Performance Optimization by Dynamically Altering Cache Replacement Algorithm in CPU-GPU Heterogeneous Multi-Core Architecture

Authors:

Lijun Sun

CCGrid '17: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Pages 723 - 727

https://doi.org/10.1109/CCGRID.2017.54

Published: 14 May 2017


Abstract

Cache memory speeds up data retrieval for the processors in a heterogeneous multi-core architecture, and it is a main factor affecting system performance and power consumption. The cache replacement algorithms used in current heterogeneous multi-core environments are thread-blind, which leads to lower cache utilization. CPU and GPU applications have different characteristics: the CPU is responsible for task execution and serial logic control, while the GPU has a great advantage in parallel computing, so CPU applications are more sensitive to the availability of cache blocks than GPU applications. With that in mind, this research takes thread priority into account in the cache replacement algorithm and adopts a novel strategy to improve the efficiency of the last-level cache (LLC), in which CPU and GPU applications share the LLC dynamically rather than on strictly equal terms. Furthermore, our methodology switches between the LRU (Least Recently Used) and LFU (Least Frequently Used) policies by comparing the number of cache misses on the LLC, which takes both the recency and the frequency of cache block accesses into consideration. The experimental results indicate that this optimization method can effectively improve system performance.
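The abstract does not say how the miss counts for the two policies are obtained, so the following Python sketch is only one plausible reading, not the paper's implementation. It assumes a small fully associative cache, two hypothetical "shadow" caches that replay every access under LRU and LFU, and a fixed epoch length after which the live cache adopts whichever policy missed less. The names `PolicyCache`, `SwitchingCache`, and `epoch` are illustrative and do not come from the paper, and the CPU/GPU priority aspect is left out.

```python
from collections import OrderedDict


class PolicyCache:
    """Fully associative cache of `capacity` blocks with LRU or LFU eviction."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy         # "lru" or "lfu"
        self.blocks = OrderedDict()  # addr -> access count, oldest access first
        self.misses = 0

    def access(self, addr):
        """Touch `addr`; return True on a hit, False on a miss."""
        if addr in self.blocks:
            self.blocks[addr] += 1
            self.blocks.move_to_end(addr)  # mark as most recently used
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[addr] = 1
        return False

    def _evict(self):
        if self.policy == "lru":
            self.blocks.popitem(last=False)  # drop the least recently used block
        else:
            # Drop the least frequently used block; min() returns the first
            # minimum, so ties go to the least recently used candidate.
            victim = min(self.blocks, key=self.blocks.get)
            del self.blocks[victim]


class SwitchingCache:
    """Live cache that adopts the policy whose shadow cache missed less."""

    def __init__(self, capacity, epoch=8):
        # Assumption: shadow caches see the same accesses as the live cache.
        self.shadows = {p: PolicyCache(capacity, p) for p in ("lru", "lfu")}
        self.live = PolicyCache(capacity, "lru")
        self.epoch = epoch
        self.accesses = 0

    def access(self, addr):
        for shadow in self.shadows.values():
            shadow.access(addr)
        hit = self.live.access(addr)
        self.accesses += 1
        if self.accesses % self.epoch == 0:
            self._compare_and_switch()
        return hit

    def _compare_and_switch(self):
        lru_misses = self.shadows["lru"].misses
        lfu_misses = self.shadows["lfu"].misses
        if lfu_misses < lru_misses:
            self.live.policy = "lfu"
        elif lru_misses < lfu_misses:
            self.live.policy = "lru"
        for shadow in self.shadows.values():
            shadow.misses = 0  # start a fresh comparison window


if __name__ == "__main__":
    cache = SwitchingCache(capacity=2, epoch=8)
    # One hot block "A" mixed with single-use blocks: frequency helps.
    for addr in ["A", "A", "s0", "s1", "A", "s2", "s3", "A"]:
        cache.access(addr)
    print(cache.live.policy)  # lfu
    # The working set shifts to "B" and "C": recency helps.
    for addr in ["B", "C"] * 4:
        cache.access(addr)
    print(cache.live.policy)  # lru
```

In the first eight accesses the LRU shadow keeps evicting the hot block and misses 7 times against 5 for LFU, so the live cache moves to LFU. In the next eight, LFU holds on to the stale hot block and misses on every access, while LRU misses only twice, so the live cache moves back to LRU.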

Published In

CCGrid '17: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

May 2017

1167 pages

ISBN:9781509066100

Publisher

IEEE Press


Qualifiers

Tutorial
Research
Refereed limited

Conference

CCGrid '17

Sponsor:

SIGARCH

CCGrid '17: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

May 14 - 17, 2017

Madrid, Spain
