DOI: 10.5555/2876341.2876360
research-article

An adaptive fault-tolerance scheme for distributed load balancing systems

Published: 12 April 2015

Abstract

Load balancing has become a critical mechanism for distributed virtual simulations as their complexity grows to model more realistic scenarios. As these systems scale up, they become more susceptible to load imbalances caused by the heterogeneity and non-dedicated nature of resources and by oscillations in their own simulation load. Because of this importance, many balancing systems have been designed for distributed simulations. Nevertheless, none of the previous systems considers failures within the balancing system itself, which can partially hamper or completely interrupt its balancing capabilities. Therefore, a fault-tolerant mechanism is introduced for load balancing systems to keep a minimal set of services running properly or to enable the recovery of components when faults occur unpredictably. The proposed solution employs election and grouping tools to reconfigure the balancing system dynamically. Experiments have been conducted to evaluate the benefit of the proposed fault-tolerant balancing system.



Published In

ANSS '15: Proceedings of the 48th Annual Simulation Symposium
April 2015
217 pages
ISBN: 9781510800991


Publisher

Society for Computer Simulation International

San Diego, CA, United States


Author Tags

  1. adaptation
  2. fault tolerance
  3. high level architecture

Qualifiers

  • Research-article

Conference

SpringSim '15: 2015 Spring Simulation Multiconference
April 12 - 15, 2015
Alexandria, Virginia


