|
ADDQ: adaptive distributional double Q-learning
Döring, Leif ; Wille, Benedikt ; Birr, Maximilian ; Bîrsan, Mihail ; Slowik, Martin
|
URL:
|
https://proceedings.mlr.press/v267/doring25a.html
|
|
URN:
|
urn:nbn:de:bsz:180-madoc-719430
|
|
Document Type:
|
Conference or workshop publication
|
|
Year of publication:
|
2025
|
|
Book title:
|
Proceedings of the 42nd International Conference on Machine Learning, PMLR
|
|
Journal or publication series:
|
Proceedings of Machine Learning Research : PMLR
|
|
Volume:
|
267
|
|
Page range:
|
14344-14390
|
|
Conference title:
|
International Conference on Machine Learning
|
|
Location of the conference venue:
|
Vancouver, Canada
|
|
Date of the conference:
|
13-19 July 2025
|
|
Editors:
|
Singh, Aarti ; Fazel, Maryam ; Hsu, Daniel ; Lacoste-Julien, Simon ; Berkenkamp, Felix ; Maharaj, Tegan ; Wagstaff, Kiri ; Zhu, Jerry
|
|
Place of publication:
|
Red Hook, NY
|
|
Publisher:
|
Curran Associates, Inc.
|
|
ISSN:
|
2640-3498
|
|
Publication language:
|
English
|
|
Institution:
|
School of Business Informatics and Mathematics > Stochastics (Juniorprofessur) (Slowik 2021-)
|
|
Subject:
|
004 Computer science, internet
|
|
Abstract:
|
Bias in the estimation of Q-values is a well-known obstacle that slows the convergence of Q-learning and actor-critic methods. Part of the success of modern RL algorithms is due to direct or indirect mechanisms for reducing overestimation. We introduce an easy-to-implement method, built on top of distributional reinforcement learning (DRL) algorithms, that deals with overestimation in a locally adaptive way. With our framework, ADDQ, existing DRL implementations can be improved with a few lines of code. We provide theoretical support, experimental results in tabular, Atari, and MuJoCo environments, comparisons with state-of-the-art methods, and a proof of convergence in the tabular case.
|
 | This entry is part of the university bibliography. |
 | The document is provided by the publication server of the Mannheim University Library. |
ORCID:
Döring, Leif ORCID: 0000-0002-4569-5083 ; Wille, Benedikt ; Birr, Maximilian ; Bîrsan, Mihail ; Slowik, Martin ORCID: 0000-0001-5373-5754