Interval Markov Decision Processes with Continuous Action-Spaces

Authors:

Giannis Delimpaltadakis,

Morteza Lahijanian,

Manuel Mazo Jr.,

Luca Laurenti

HSCC '23: Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control

Article No.: 12, Pages 1 - 10

https://doi.org/10.1145/3575870.3587117

Published: 09 May 2023

Abstract

Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we decompose the max-min problem associated with value iteration into |𝒬| max problems, where |𝒬| is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in certain cases where the action set 𝒜 is a polytope, synthesis over a discrete-action IMDP, where the actions are the vertices of 𝒜, is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.
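To make the max-min structure of value iteration concrete, the following is a minimal Python sketch of robust (pessimistic) value iteration over a discrete-action IMDP, i.e., the setting the abstract says is sufficient in certain cases where the actions are the vertices of a polytope 𝒜. It is not the paper's caIMDP algorithm. The function names, the discount factor gamma, the array layout, and the two-state toy model are all assumptions made for illustration; the inner minimization uses the standard ordering argument for interval models (push the free probability mass toward the lowest-value successor states).

```python
import numpy as np

def worst_case_distribution(lower, upper, values):
    """Inner min: choose p with lower <= p <= upper and sum(p) = 1
    that minimizes p @ values (assumes sum(lower) <= 1 <= sum(upper))."""
    p = lower.copy()
    budget = 1.0 - lower.sum()          # mass still to be assigned
    for j in np.argsort(values):        # lowest-value successors first
        add = min(upper[j] - lower[j], budget)
        p[j] += add
        budget -= add
    return p

def robust_value_iteration(lower, upper, reward, gamma=0.9, iters=200):
    """lower, upper: shape (n_states, n_actions, n_states) interval bounds.
    reward: shape (n_states, n_actions). Returns (values, greedy policy)."""
    n_states, n_actions, _ = lower.shape
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = np.empty((n_states, n_actions))
        for q in range(n_states):
            for a in range(n_actions):
                p = worst_case_distribution(lower[q, a], upper[q, a], V)
                Q[q, a] = reward[q, a] + gamma * p @ V
        V = Q.max(axis=1)               # outer max over actions
    return V, Q.argmax(axis=1)

# Hypothetical toy model: state 1 is absorbing and pays reward 1 per step.
lower = np.array([[[0.6, 0.1], [0.1, 0.7]],
                  [[0.0, 1.0], [0.0, 1.0]]])
upper = np.array([[[0.9, 0.4], [0.3, 0.9]],
                  [[0.0, 1.0], [0.0, 1.0]]])
reward = np.array([[0.0, 0.0],
                   [1.0, 1.0]])

V, policy = robust_value_iteration(lower, upper, reward)
print(V, policy)
```

In this toy model the adversary always shifts as much probability as the intervals allow toward state 0, so the second action (which keeps at least 0.7 of the mass on the rewarding state) is preferred in state 0. With continuous actions the outer max over a finite list would be replaced by an optimization over 𝒜, which is the problem the paper studies.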

Cited By

Wooding B, Lavaei A. (2024). IMPaCT: Interval MDP Parallel Construction for Controller Synthesis of Large-Scale STochastic Systems. Quantitative Evaluation of Systems and Formal Modeling and Analysis of Timed Systems, 249-267. Online publication date: 29-Aug-2024.
https://doi.org/10.1007/978-3-031-68416-6_15
Jiang J, Coogan S, Zhao Y. (2023). Abstraction-Based Planning for Uncertainty-Aware Legged Navigation. IEEE Open Journal of Control Systems 2, 221-234. Online publication date: 2023.
https://doi.org/10.1109/OJCSYS.2023.3296000
Reed R, Laurenti L, Lahijanian M. (2023). Promises of Deep Kernel Learning for Control Synthesis. IEEE Control Systems Letters 7, 3986-3991. Online publication date: 2023.
https://doi.org/10.1109/LCSYS.2023.3340995

Index Terms

Interval Markov Decision Processes with Continuous Action-Spaces
1. Computing methodologies
  1. Artificial intelligence
    1. Control methods
      1. Computational control theory
    2. Planning and scheduling
      1. Planning under uncertainty
2. Mathematics of computing
  1. Mathematical analysis
    1. Mathematical optimization
      1. Continuous optimization
        1. Stochastic control and optimization
  2. Probability and statistics
    1. Stochastic processes
      1. Markov processes

Published In

HSCC '23: Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control

May 2023

239 pages

ISBN:9798400700330

DOI:10.1145/3575870

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 May 2023

Badges

Results Reproduced / v1.1

Qualifiers

Research-article
Research
Refereed limited

Conference

HSCC '23

Sponsor:

SIGBED

HSCC '23: 26th ACM International Conference on Hybrid Systems: Computation and Control

May 9 - 12, 2023

San Antonio, TX, USA

Acceptance Rates

Overall Acceptance Rate 153 of 373 submissions, 41%

Bibliometrics

Total Citations: 2
Total Downloads: 405
Downloads (Last 12 months): 245
Downloads (Last 6 weeks): 43

Reflects downloads up to 24 Dec 2024
