skip to main content
10.1145/3575870.3587117acmconferencesArticle/Chapter ViewAbstractPublication PagescpsweekConference Proceedingsconference-collections
research-article
Open access

Interval Markov Decision Processes with Continuous Action-Spaces

Published: 09 May 2023 Publication History

Abstract

Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a-priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we decompose the max-min problem associated to value iteration to |đť’¬| max problems, where |đť’¬| is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in certain cases where the action set đť’ś is a polytope, synthesis over a discrete-action IMDP, where the actions are the vertices of đť’ś, is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.

References

[1]
Dimitri P Bertsekas and Steven Shreve. 2004. Stochastic optimal control: the discrete-time case.
[2]
Stephen Boyd and Lieven Vandenberghe. 2004. Convex optimization. Cambridge university press.
[3]
Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, and Ufuk Topcu. 2021. Convex Optimization for Parameter Synthesis in MDPs. IEEE Trans. Automat. Control (2021).
[4]
Giannis Delimpaltadakis, Luca Laurenti, and Manuel Mazo Jr. 2022. Formal Analysis of the Sampling Behaviour of Stochastic Event-Triggered Control. arXiv preprint arXiv:2202.10178 (2022).
[5]
Maxence Dutreix, Jeongmin Huh, and Samuel Coogan. 2022. Abstraction-based synthesis for stochastic systems with omega-regular objectives. Nonlinear Analysis: Hybrid Systems 45 (2022), 101204.
[6]
James E Falk. 1973. A linear max—min problem. Mathematical Programming 5, 1 (1973), 169–188.
[7]
Sicun Gao, Soonho Kong, and Edmund M Clarke. 2013. dReal: An SMT solver for nonlinear theories over the reals. In International conference on automated deduction. Springer, 208–214.
[8]
Robert Givan, Sonia Leach, and Thomas Dean. 2000. Bounded-parameter Markov decision processes. Artificial Intelligence 122, 1-2 (2000), 71–109.
[9]
Ernst Moritz Hahn, Holger Hermanns, and Lijun Zhang. 2011. Probabilistic reachability for parametric Markov models. International Journal on Software Tools for Technology Transfer 13, 1 (2011), 3–19.
[10]
John Jackson, Luca Laurenti, Eric Frew, and Morteza Lahijanian. 2021. Strategy synthesis for partially-known switched stochastic systems. In Proceedings of the 24th International Conference on Hybrid Systems: Computation and Control. 1–11.
[11]
Xenofon Koutsoukos and Derek Riley. 2006. Computational methods for reachability analysis of stochastic hybrid systems. In International Workshop on Hybrid Systems: Computation and Control. Springer, 377–391.
[12]
Morteza Lahijanian, Sean B Andersson, and Calin Belta. 2015. Formal verification and synthesis for discrete-time stochastic systems. IEEE Trans. Automat. Control 60, 8 (2015), 2031–2045.
[13]
Ruggero Lanotte, Andrea Maggiolo-Schettini, and Angelo Troina. 2007. Parametric probabilistic transition systems for system design and analysis. Formal Aspects of Computing 19, 1 (2007), 93–109.
[14]
Luca Laurenti, Morteza Lahijanian, Alessandro Abate, Luca Cardelli, and Marta Kwiatkowska. 2020. Formal and efficient synthesis for continuous-time linear stochastic hybrid processes. IEEE Trans. Automat. Control 66, 1 (2020), 17–32.
[15]
Abolfazl Lavaei, Sadegh Soudjani, Alessandro Abate, and Majid Zamani. 2022. Automated verification and synthesis of stochastic hybrid systems: A survey. Automatica 146 (2022), 110617.
[16]
Arnab Nilim and Laurent El Ghaoui. 2005. Robust control of Markov decision processes with uncertain transition matrices. Operations Research 53, 5 (2005), 780–798.
[17]
Andrzej S Nowak. 1984. On zero-sum stochastic games with general state space I. Probability and Mathematical Statistics 4, 1 (1984), 13–32.
[18]
R Tyrrell Rockafellar. 1970. Convex analysis. Vol. 18. Princeton university press.
[19]
G George Yin and Chao Zhu. 2009. Hybrid switching diffusions: properties and applications. Vol. 63. Springer Science & Business Media.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
HSCC '23: Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control
May 2023
239 pages
ISBN:9798400700330
DOI:10.1145/3575870
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 May 2023

Check for updates

Badges

Author Tags

  1. bounded-parameter Markov decision processes
  2. control synthesis
  3. planning under uncertainty
  4. uncertain Markov decision processes
  5. value iteration

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

Conference

HSCC '23
Sponsor:

Acceptance Rates

Overall Acceptance Rate 153 of 373 submissions, 41%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)245
  • Downloads (Last 6 weeks)43
Reflects downloads up to 24 Dec 2024

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media