research-article

Selective resource characterization for evaluation of system dynamics

Published: 09 April 2012

Abstract

Management decisions aimed at peak performance, scalability, and availability in distributed systems require continuous statistical characterization of the data sets produced by server and network monitors. As data centers grow and change continuously, traditional approaches that process all data sets centrally become impractical. We propose a data processing strategy that limits the analysis of the large sets of collected measures to a smaller subset of significant information, with a twofold purpose: to classify the collected data sets into a few classes characterized by similar statistical behavior, and to evaluate the dynamics of the overall system and its most relevant changes. The strategy works both at the level of server resources and at the level of significant aggregations of servers within the overall distributed system. Experimental results demonstrate the feasibility of the proposed strategy, which is validated in real contexts.

Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 39, Issue 4
March 2012
134 pages
ISSN: 0163-5999
DOI: 10.1145/2185395
Publisher

Association for Computing Machinery

New York, NY, United States
