A new Q(λ) with interim forward view and Monte Carlo equivalence

Q-learning, the most popular of reinforcement learning algorithms, has always included an extension to eligibility traces, but its existing Q(λ) variants do not achieve exact equivalence to Monte Carlo methods. In this paper, we introduce a new version of Q(λ) that does exactly that, without significantly increased algorithmic complexity.
En route to our new Q(λ), we introduce a new derivation technique based on the forward-view/backward-view analysis familiar from TD(λ), but extended to apply at every time step rather than only at the end of episodes. We apply this technique to derive first a new off-policy version of TD(λ), called PTD(λ), and then our new Q(λ), called PQ(λ). A generic sketch of the backward view in question appears below.
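To make the backward view concrete, the following is a minimal sketch of linear TD(λ) with accumulating eligibility traces, plus an optional per-decision importance-sampling ratio rho of the kind off-policy derivations introduce. The function name, signature, and exact placement of rho are illustrative assumptions, not the paper's PTD(λ).

import numpy as np

def td_lambda_step(w, e, phi, phi_next, reward, gamma, lam, alpha, rho=1.0):
    # One backward-view TD(lambda) update with linear function approximation.
    # w: weight vector (the value estimate of a state is w . phi).
    # e: accumulating eligibility trace vector, same shape as w.
    # rho: importance-sampling ratio; 1.0 recovers on-policy TD(lambda).
    #      (Illustrative placement only; PTD(lambda) differs in detail.)
    delta = reward + gamma * w.dot(phi_next) - w.dot(phi)  # TD error
    e = rho * (gamma * lam * e + phi)                      # decay, then accumulate the trace
    w = w + alpha * delta * e                              # credit all recently visited features
    return w, e

Called once per transition, carrying w and e forward, the sum of these incremental updates matches the corresponding forward-view (λ-return) target under offline updating; that is the kind of equivalence the forward-view/backward-view analysis establishes.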
Finally, we introduce an interim forward view for action values and use it to derive and prove the equivalence of our new Q(λ). Like the original equivalences, ours is exact only in the offline case; in the conventional online case, the new algorithm approaches exactness as the step-size parameter approaches zero.
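For reference, the equivalence at issue relates a backward-view algorithm to a forward view defined by the λ-return. A minimal sketch of the standard state-value definitions, assuming \hat{v} is the current value estimate and h is the horizon up to which data is available (the paper's interim forward view for action values is the analogous construction for action-value estimates, with off-policy corrections not shown here):

\begin{align}
G_t^{(n)} &= \sum_{k=1}^{n} \gamma^{k-1} R_{t+k} + \gamma^{n} \hat{v}(S_{t+n}), \\
G_t^{\lambda} &= (1-\lambda) \sum_{n=1}^{\infty} \lambda^{n-1} G_t^{(n)}, \\
G_t^{\lambda|h} &= (1-\lambda) \sum_{n=1}^{h-t-1} \lambda^{n-1} G_t^{(n)} + \lambda^{h-t-1} G_t^{(h-t)}.
\end{align}

An interim forward view updates toward the truncated target G_t^{\lambda|h} at every horizon h, rather than waiting for the end of the episode, which is what lets the equivalence analysis apply at every time step.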
Sutton, R. S., Mahmood, A. R., Precup, D., and van Hasselt, H. (2014). A new Q(λ) with interim forward view and Monte Carlo equivalence. In Proceedings of the 31st International Conference on Machine Learning (ICML), Beijing, pp. 568-576.