DOI: 10.1109/CCGRID.2017.54
tutorial

Performance Optimization by Dynamically Altering Cache Replacement Algorithm in CPU-GPU Heterogeneous Multi-Core Architecture

Published: 14 May 2017

Abstract

Cache memory speeds up data retrieval for the processors in a heterogeneous multi-core architecture and is the main factor affecting system performance and power consumption. Cache replacement algorithms in current heterogeneous multi-core environments are thread-blind, which lowers cache utilization. CPU and GPU applications have different characteristics: the CPU is responsible for task execution and serial logic control, while the GPU has a great advantage in parallel computing, so CPU applications are more sensitive to the availability of cache blocks than GPU applications. With that in mind, this research takes thread priority into account in the cache replacement algorithm and adopts a novel strategy to improve the efficiency of the last-level cache (LLC), in which CPU and GPU applications share the LLC dynamically rather than on a strictly equal basis. Furthermore, our methodology switches between the LRU (Least Recently Used) and LFU (Least Frequently Used) policies by comparing the number of cache misses on the LLC, so that both the recency and the frequency of accesses to a cache block are taken into consideration. The experimental results indicate that this optimization method can effectively improve system performance.
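
The policy switching described in the abstract can be illustrated with a small simulation. The abstract does not say how the miss counts for LRU and LFU are obtained, so the sketch below assumes one possible arrangement: two shadow tag stores, one per policy, replay every access and count misses, and at the end of each epoch the main cache adopts whichever policy missed less. The names `TagStore`, `AdaptiveCache`, `hot_plus_scan_trace`, and the `epoch` parameter are hypothetical, and the CPU/GPU thread-priority part of the method is left out. This is an illustration under those assumptions, not the authors' implementation.

```python
from collections import OrderedDict


class TagStore:
    """Fully associative tag store that evicts by LRU or LFU and counts misses."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy          # "LRU" or "LFU"
        self.blocks = OrderedDict()   # tag -> access count, ordered by recency
        self.misses = 0

    def access(self, tag):
        """Touch a block; return True on a hit, False on a miss."""
        if tag in self.blocks:
            self.blocks[tag] += 1
            self.blocks.move_to_end(tag)   # most recently used goes last
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            del self.blocks[self._victim()]
        self.blocks[tag] = 1
        return False

    def _victim(self):
        if self.policy == "LRU":
            return next(iter(self.blocks))  # least recently used is first
        # LFU: smallest access count; ties fall back to recency order
        return min(self.blocks, key=self.blocks.get)


class AdaptiveCache:
    """Assumed scheme: shadow stores measure misses, main store follows the winner."""

    def __init__(self, capacity, epoch=64):
        self.main = TagStore(capacity, "LRU")
        self.shadow_lru = TagStore(capacity, "LRU")
        self.shadow_lfu = TagStore(capacity, "LFU")
        self.epoch = epoch      # accesses between policy decisions (assumed knob)
        self.accesses = 0

    def access(self, tag):
        hit = self.main.access(tag)
        self.shadow_lru.access(tag)
        self.shadow_lfu.access(tag)
        self.accesses += 1
        if self.accesses % self.epoch == 0:
            self._choose_policy()
        return hit

    def _choose_policy(self):
        # Adopt the policy with fewer misses this epoch; keep the current one on a tie.
        if self.shadow_lfu.misses < self.shadow_lru.misses:
            self.main.policy = "LFU"
        elif self.shadow_lru.misses < self.shadow_lfu.misses:
            self.main.policy = "LRU"
        self.shadow_lru.misses = self.shadow_lfu.misses = 0


def hot_plus_scan_trace(rounds):
    """Two hot blocks reused every round, followed by four never-reused blocks."""
    trace, scan_tag = [], 100
    for _ in range(rounds):
        trace += [0, 1, 0, 1]
        trace += range(scan_tag, scan_tag + 4)
        scan_tag += 4
    return trace


cache = AdaptiveCache(capacity=4, epoch=8)
for tag in hot_plus_scan_trace(4):
    cache.access(tag)
print(cache.main.policy)  # prints LFU
```

On this trace the streaming blocks push the two hot blocks out of the LRU shadow every round, while the LFU shadow retains them because of their higher access counts, so the main cache switches to LFU after the second epoch. A trace whose working set shifts over time would favor LRU instead, which is the case for switching between the two rather than fixing one.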

Published In

CCGrid '17: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
May 2017
1167 pages
ISBN: 9781509066100

Publisher

IEEE Press

Author Tags

  1. CPU-GPU
  2. System performance
  3. cache replacement algorithm
  4. heterogeneous multi-core

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

CCGrid '17