research-article

Towards Accurate Prediction for High-Dimensional and Highly-Variable Cloud Workloads with Deep Learning

Published: 01 April 2020

Abstract

Resource provisioning for cloud computing requires adaptive and accurate prediction of cloud workloads. However, existing methods cannot effectively predict high-dimensional and highly-variable cloud workloads, which leads to wasted resources and violated service level agreements (SLAs). Since recurrent neural networks (RNNs) are naturally suited to sequential data analysis, they have recently been used to tackle the workload prediction problem. However, RNNs often perform poorly at learning long-term dependencies and thus cannot predict workloads accurately. To address these challenges, we propose a deep-learning-based Prediction Algorithm for cloud Workloads (L-PAW). First, a top-sparse auto-encoder (TSA) is designed to effectively extract the essential representations of workloads from the original high-dimensional workload data. Next, we integrate the TSA and gated recurrent unit (GRU) blocks into an RNN to achieve adaptive and accurate prediction of highly-variable workloads. Using real-world workload traces from Google and Alibaba cloud data centers and a DUX-based cluster, extensive experiments demonstrate the effectiveness and adaptability of L-PAW for different types of workloads with various prediction lengths. Moreover, the performance results show that L-PAW achieves superior prediction accuracy compared to classic RNN-based and other workload prediction methods on high-dimensional and highly-variable real-world cloud workloads.
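The sketch below illustrates, in broad strokes, the kind of pipeline the abstract describes: an auto-encoder compresses each high-dimensional workload vector into a low-dimensional representation, and a GRU-based recurrent network predicts the next workload value from a window of those compressed features. It is not the authors' L-PAW implementation; all dimensions and layer sizes are hypothetical, an L1 sparsity penalty stands in for the paper's top-sparse activation selection, and TensorFlow/Keras is used only as a convenient framework for illustration.

```python
# Minimal sketch (not the authors' code): sparse auto-encoder + GRU predictor
# for workload traces, assuming data normalized to [0, 1].
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

TIME_STEPS, N_FEATURES, LATENT_DIM = 24, 64, 8  # hypothetical dimensions

# 1) Auto-encoder: compress each high-dimensional workload vector.
#    An L1 activity penalty is used here as a stand-in for the paper's
#    top-sparse (top-k activation) selection.
encoder = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(LATENT_DIM, activation="relu",
                 activity_regularizer=regularizers.l1(1e-4)),
])
decoder = models.Sequential([
    layers.Input(shape=(LATENT_DIM,)),
    layers.Dense(N_FEATURES, activation="sigmoid"),
])
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")

# 2) GRU predictor: map a window of latent vectors to the next workload value.
predictor = models.Sequential([
    layers.Input(shape=(TIME_STEPS, LATENT_DIM)),
    layers.GRU(32),
    layers.Dense(1),  # e.g., next-step CPU utilization
])
predictor.compile(optimizer="adam", loss="mse")

# Toy usage with random data standing in for normalized workload traces.
X = np.random.rand(1000, N_FEATURES).astype("float32")
autoencoder.fit(X, X, epochs=1, batch_size=32, verbose=0)
Z = encoder.predict(X, verbose=0)                       # latent workload features
windows = np.stack([Z[i:i + TIME_STEPS] for i in range(len(Z) - TIME_STEPS)])
targets = X[TIME_STEPS:, 0:1]                           # predict the first feature
predictor.fit(windows, targets, epochs=1, batch_size=32, verbose=0)
```

In the paper, the compression step discards all but the largest latent activations per sample; the regularizer above merely encourages sparsity and is a deliberate simplification.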

          Information & Contributors

          Information

          Published In

          cover image IEEE Transactions on Parallel and Distributed Systems
          IEEE Transactions on Parallel and Distributed Systems  Volume 31, Issue 4
          April 2020
          248 pages

          Publisher

          IEEE Press
