
Showing 1–22 of 22 results for author: Murphy, S A

  1. arXiv:2409.10526  [pdf, other]

    cs.CY cs.AI

    Effective Monitoring of Online Decision-Making Algorithms in Digital Intervention Implementation

    Authors: Anna L. Trella, Susobhan Ghosh, Erin E. Bonar, Lara Coughlin, Finale Doshi-Velez, Yongyi Guo, Pei-Yao Hung, Inbal Nahum-Shani, Vivek Shetty, Maureen Walton, Iris Yan, Kelly W. Zhang, Susan A. Murphy

    Abstract: Online AI decision-making algorithms are increasingly used by digital interventions to dynamically personalize treatment to individuals. These algorithms determine, in real-time, the delivery of treatment based on accruing data. The objective of this paper is to provide guidelines for enabling effective monitoring of online decision-making algorithms with the goal of (1) safeguarding individuals a…

    Submitted 30 August, 2024; originally announced September 2024.

  2. arXiv:2409.02069  [pdf, other]

    cs.AI cs.HC

    A Deployed Online Reinforcement Learning Algorithm In An Oral Health Clinical Trial

    Authors: Anna L. Trella, Kelly W. Zhang, Hinal Jajal, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

    Abstract: Dental disease is a prevalent chronic condition associated with substantial financial burden, personal suffering, and increased risk of systemic diseases. Despite widespread recommendations for twice-daily tooth brushing, adherence to recommended oral self-care behaviors remains sub-optimal due to factors such as forgetfulness and disengagement. To address this, we developed Oralytics, a mHealth i…

    Submitted 3 September, 2024; originally announced September 2024.

  3. arXiv:2406.13127  [pdf, other]

    cs.AI

    Oralytics Reinforcement Learning Algorithm

    Authors: Anna L. Trella, Kelly W. Zhang, Stephanie M. Carpenter, David Elashoff, Zara M. Greer, Inbal Nahum-Shani, Dennis Ruenger, Vivek Shetty, Susan A. Murphy

    Abstract: Dental disease is still one of the most common chronic diseases in the United States. While dental disease is preventable through healthy oral self-care behaviors (OSCB), this basic behavior is not consistently practiced. We have developed Oralytics, an online, reinforcement learning (RL) algorithm that optimizes the delivery of personalized intervention prompts to improve OSCB. In this paper, we…

    Submitted 12 September, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2403.10946  [pdf, other]

    stat.ML cs.LG

    The Fallacy of Minimizing Local Regret in the Sequential Task Setting

    Authors: Ziping Xu, Kelly W. Zhang, Susan A. Murphy

    Abstract: In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret. In a stationary setting, strong theoretical guarantees, like a sublinear ($\sqrt{T}$) regret bound, can be obtained, which typically implies the convergence to an optimal policy and the cessation of explor…

    Submitted 16 March, 2024; originally announced March 2024.

  5. arXiv:2403.05911  [pdf, other]

    cs.HC cs.AI

    Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning

    Authors: Zana Buçinca, Siddharth Swaroop, Amanda E. Paluch, Susan A. Murphy, Krzysztof Z. Gajos

    Abstract: Imagine if AI decision-support tools not only complemented our ability to make accurate decisions, but also improved our skills, boosted collaboration, and elevated the joy we derive from our tasks. Despite the potential to optimize a broad spectrum of such human-centric objectives, the design of current AI tools remains focused on decision accuracy alone. We propose offline reinforcement learning…

    Submitted 14 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  6. arXiv:2402.17003  [pdf, other]

    cs.LG cs.AI cs.CY

    Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials

    Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Iris Yan, Finale Doshi-Velez, Susan A. Murphy

    Abstract: Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in cli…

    Submitted 12 August, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  7. arXiv:2402.03110  [pdf, other]

    cs.LG cs.AI

    Non-Stationary Latent Auto-Regressive Bandits

    Authors: Anna L. Trella, Walter Dempsey, Finale Doshi-Velez, Susan A. Murphy

    Abstract: We consider the stochastic multi-armed bandit problem with non-stationary rewards. We present a novel formulation of non-stationarity in the environment where changes in the mean reward of the arms over time are due to some unknown, latent, auto-regressive (AR) state of order $k$. We call this new environment the latent AR bandit. Different forms of the latent AR bandit appear in many real-world s…

    Submitted 12 August, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  8. arXiv:2306.11208  [pdf, other]

    cs.LG cs.AI stat.ML

    The Unintended Consequences of Discount Regularization: Improving Regularization in Certainty Equivalence Reinforcement Learning

    Authors: Sarah Rathnam, Sonali Parbhoo, Weiwei Pan, Susan A. Murphy, Finale Doshi-Velez

    Abstract: Discount regularization, using a shorter planning horizon when calculating the optimal policy, is a popular choice to restrict planning to a less complex set of policies when estimating an MDP from sparse or noisy data (Jiang et al., 2015). It is commonly understood that discount regularization functions by de-emphasizing or ignoring delayed effects. In this paper, we reveal an alternate view of d…

    Submitted 19 June, 2023; originally announced June 2023.

  9. arXiv:2305.09913  [pdf, other]

    cs.LG cs.AI

    Assessing the Impact of Context Inference Error and Partial Observability on RL Methods for Just-In-Time Adaptive Interventions

    Authors: Karine Karine, Predrag Klasnja, Susan A. Murphy, Benjamin M. Marlin

    Abstract: Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual's time varying state. In this work, we explore the application of re…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted at UAI 2023

  10. arXiv:2208.07406  [pdf, other]

    cs.AI cs.LG

    Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care

    Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

    Abstract: Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in o…

    Submitted 14 September, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

  11. Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines

    Authors: Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

    Abstract: Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurat…

    Submitted 18 August, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

  12. arXiv:2203.00097  [pdf]

    stat.ME cs.AI cs.LG econ.EM math.OC

    Estimating causal effects with optimization-based methods: A review and empirical comparison

    Authors: Martin Cousineau, Vedat Verter, Susan A. Murphy, Joelle Pineau

    Abstract: In the absence of randomized controlled and natural experiments, it is necessary to balance the distributions of (observable) covariates of the treated and control groups in order to obtain an unbiased estimate of a causal effect of interest; otherwise, a different effect size may be estimated, and incorrect recommendations may be given. To achieve this balance, there exist a wide variety of metho…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: In Press, Corrected Proof

    Journal ref: European Journal of Operational Research, 2022, 14 pages

  13. arXiv:2202.07098  [pdf, ps, other]

    cs.LG stat.ME

    Statistical Inference After Adaptive Sampling for Longitudinal Data

    Authors: Kelly W. Zhang, Lucas Janson, Susan A. Murphy

    Abstract: Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or "p…

    Submitted 19 April, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Fixing typos

  14. arXiv:2109.08134  [pdf, other]

    cs.LG stat.ML

    Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning

    Authors: Sarah Rathnam, Susan A. Murphy, Finale Doshi-Velez

    Abstract: In batch reinforcement learning, there can be poorly explored state-action pairs resulting in poorly learned, inaccurate models and poorly performing associated policies. Various regularization methods can mitigate the problem of learning overly-complex models in Markov decision processes (MDPs), however they operate in technically and intuitively distinct ways and lack a common form in which to c…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: ICML Workshop on Reinforcement Learning Theory 2021

  15. arXiv:2104.14074  [pdf, other]

    cs.LG

    Statistical Inference with M-Estimators on Adaptively Collected Data

    Authors: Kelly W. Zhang, Lucas Janson, Susan A. Murphy

    Abstract: Bandit algorithms are increasingly used in real-world sequential decision-making problems. Associated with this is an increased desire to be able to use the resulting datasets to answer scientific questions like: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence interval…

    Submitted 19 November, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Journal ref: Advances in Neural Information Processing Systems, 2021

  16. arXiv:2005.05880  [pdf]

    cs.HC stat.ME

    The Micro-Randomized Trial for Developing Digital Interventions: Experimental Design Considerations

    Authors: Ashley E. Walton, Linda M. Collins, Predrag Klasnja, Inbal Nahum-Shani, Mashfiqui Rabbi, Maureen A. Walton, Susan A. Murphy

    Abstract: Just-in-time adaptive interventions (JITAIs) are time-varying adaptive interventions that use frequent opportunities for the intervention to be adapted such as weekly, daily, or even many times a day. This high intensity of adaptation is facilitated by the ability of digital technology to continuously collect information about an individual's current context and deliver treatments adapted to this…

    Submitted 23 April, 2020; originally announced May 2020.

    MSC Class: 62P15

  17. arXiv:2003.12881  [pdf, other]

    stat.ML cs.LG

    Streamlined Empirical Bayes Fitting of Linear Mixed Models in Mobile Health

    Authors: Marianne Menictas, Sabina Tomkins, Susan A. Murphy

    Abstract: To effect behavior change a successful algorithm must make high-quality decisions in real-time. For example, a mobile health (mHealth) application designed to increase physical activity must make contextually relevant suggestions to motivate users. While machine learning offers solutions for certain stylized settings, such as when batch data can be processed offline, there is a dearth of approache…

    Submitted 28 March, 2020; originally announced March 2020.

  18. arXiv:2002.03217  [pdf, other]

    cs.LG stat.ML

    Inference for Batched Bandits

    Authors: Kelly W. Zhang, Lucas Janson, Susan A. Murphy

    Abstract: As bandit algorithms are increasingly utilized in scientific studies and industrial applications, there is an associated increasing need for reliable inference methods based on the resulting adaptively-collected data. In this work, we develop methods for inference on data collected in batches using a bandit algorithm. We first prove that the ordinary least squares estimator (OLS), which is asympto…

    Submitted 8 January, 2021; v1 submitted 8 February, 2020; originally announced February 2020.

    Journal ref: NeurIPS 2020

  19. arXiv:1706.09090  [pdf, other]

    stat.ML cs.LG

    An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

    Authors: Huitian Lei, Yangyi Lu, Ambuj Tewari, Susan A. Murphy

    Abstract: Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions. A Just-In-Time Adaptive Intervention (JITAI) uses real-time data collection and communication capabilities of modern mobile devices to deliver interventions in real-time that are adapted to the in-the-moment needs of the u…

    Submitted 22 April, 2022; v1 submitted 27 June, 2017; originally announced June 2017.

    Comments: The theoretical analyses in this version are stronger compared to the previous one. This manuscript is not intended for publication

  20. arXiv:1607.05047  [pdf, other]

    stat.ML cs.LG

    A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward

    Authors: S. A. Murphy, Y. Deng, E. B. Laber, H. R. Maei, R. S. Sutton, K. Witkiewitz

    Abstract: We develop an off-policy actor-critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. This algorithm is developed with a view towards its use in mobile health.

    Submitted 18 July, 2016; originally announced July 2016.

  21. arXiv:1206.3274  [pdf]

    cs.LG stat.ML

    Small Sample Inference for Generalization Error in Classification Using the CUD Bound

    Authors: Eric B. Laber, Susan A. Murphy

    Abstract: Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator follows a known distribution to form a confidence set [Kohavi 1995, Martin 1996,Yang 2006]. Alternatively, one might bootstrap the resampled estimator of the genera…

    Submitted 13 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-357-365

  22. arXiv:1202.3714  [pdf]

    cs.LG stat.ML

    Active Learning for Developing Personalized Treatment

    Authors: Kun Deng, Joelle Pineau, Susan A. Murphy

    Abstract: The personalization of treatment via bio-markers and other risk categories has drawn increasing interest among clinical scientists. Personalized treatment strategies can be learned using data from clinical trials, but such trials are very costly to run. This paper explores the use of active learning techniques to design more efficient trials, addressing issues such as whom to recruit, at what poin…

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-161-168