DOI: 10.1145/1654059.1654116
research-article

Machine learning-based prefetch optimization for data center applications

Published: 14 November 2009

Abstract

Performance tuning for data centers is both essential and difficult. It is essential because a data center comprises thousands of machines, so even a single-digit percentage performance improvement can significantly reduce cost and power consumption. It is difficult because data centers are dynamic environments in which applications are frequently released and servers are continually upgraded.
In this paper, we study the effectiveness of different processor prefetch configurations, which can greatly influence the performance of the memory system and of the data center as a whole. Comparing the worst and best configurations, we observe a wide performance gap, ranging from 1.4% to 75.1% across 11 important data center applications. We then develop a tuning framework that attempts to predict the optimal configuration from hardware performance counters. For the same set of applications, the framework achieves performance within 1% of the best performance of any single configuration.
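
The idea of predicting a prefetch configuration from hardware performance counters can be illustrated with a toy sketch. The abstract does not say which counters, configurations, or learning algorithm the framework uses, so everything below is an assumption made for illustration: the two features (cycles per instruction and L2 misses per 1000 instructions), the configuration labels, the placeholder training values, and the choice of a 1-nearest-neighbor classifier. It is not the authors' implementation.

```python
from math import dist

# Hypothetical training set: (counter features, best-known prefetch configuration).
# Features are (cycles per instruction, L2 misses per 1000 instructions).
# All names and values are invented placeholders, not measurements from the paper.
TRAINING = [
    ((0.8, 12.0), "all_prefetchers_on"),
    ((0.9, 10.5), "all_prefetchers_on"),
    ((2.1, 45.0), "all_prefetchers_off"),
    ((1.9, 50.0), "all_prefetchers_off"),
]


def feature_ranges(rows):
    """Return a (min, max) pair for each feature column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, ranges):
    """Min-max scale one feature vector so no single counter dominates the distance."""
    return tuple(
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(row, ranges)
    )


def predict_config(counters, training=TRAINING):
    """Return the configuration label of the nearest training example (1-NN)."""
    ranges = feature_ranges([features for features, _ in training])
    query = scale(counters, ranges)
    nearest = min(training, key=lambda item: dist(scale(item[0], ranges), query))
    return nearest[1]


if __name__ == "__main__":
    # Counters sampled from a hypothetical, previously unseen application.
    print(predict_config((2.0, 47.0)))
```

In a setting like the one the abstract describes, the training labels would come from measuring each application under every prefetch configuration and recording the best one; the sketch only shows the prediction step.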


Published In

SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
November 2009
778 pages
ISBN: 9781605587448
DOI: 10.1145/1654059
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SC '09
Acceptance Rates

SC '09 Paper Acceptance Rate 59 of 261 submissions, 23%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
