Cited By
- Zhou T, Hairi F, Yang H, Liu J, Tong T, Yang F, Momma M, Gao Y (2024). Finite-time convergence and sample complexity of actor-critic multi-objective reinforcement learning. In: Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (eds.), Proceedings of the 41st International Conference on Machine Learning, pp. 61913-61933. DOI: 10.5555/3692070.3694632. Online publication date: 21-Jul-2024.
- Yamamoto K, Oko K, Yang Z, Suzuki T (2024). Mean field Langevin actor-critic. In: Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (eds.), Proceedings of the 41st International Conference on Machine Learning, pp. 55706-55738. DOI: 10.5555/3692070.3694366. Online publication date: 21-Jul-2024.
- Wang T, Herbert S, Gao S (2024). Mollification effects of policy gradient methods. In: Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (eds.), Proceedings of the 41st International Conference on Machine Learning, pp. 50580-50598. DOI: 10.5555/3692070.3694140. Online publication date: 21-Jul-2024.