Mitigating information loss in tree-based reinforcement learning via direct optimization


Marton, Sascha; Grams, Tim; Vogt, Florian; Lüdtke, Stefan; Bartelt, Christian; Stuckenschmidt, Heiner


PDF: 1507_Mitigating_Information_Lo.pdf (Published, 1 MB)

URL: https://openreview.net/forum?id=qpXctF2aLZ
URN: urn:nbn:de:bsz:180-madoc-695066
Document Type: Conference or workshop publication
Year of online publication: 2025
Book title: The Thirteenth International Conference on Learning Representations
Conference title: ICLR 2025, The Thirteenth International Conference on Learning Representations
Location of the conference venue: Singapore
Date of the conference: April 24-28, 2025
Publisher: OpenReview.net
Publication language: English
Institution: Non-Faculty Institutions > Institut für Enterprise Systems (InES)
School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Pre-existing license: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science, internet
Keywords (English): symbolic reinforcement learning, interpretable reinforcement learning, reinforcement learning, decision trees, policy gradient, proximal policy optimization
Abstract: Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, which makes them difficult to interpret. In contrast, symbolic policies represent decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based, end-to-end learning of interpretable, axis-aligned decision trees within standard on-policy RL algorithms. Therefore, SYMPOL can become the foundation for a new class of interpretable RL based on decision trees. Our implementation is available at: https://github.com/s-marton/sympol
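
To illustrate the idea summarized in the abstract, the sketch below shows one common way to make an axis-aligned decision tree differentiable so it can be trained end-to-end with a policy gradient: soften the per-node feature selection (softmax over features) and the split decision (sigmoid over the threshold comparison) during training, then harden both at inference to recover an interpretable tree. This is an illustrative assumption-laden sketch, not the authors' implementation: the class name SoftTreePolicy, the fixed tree depth, and the plain REINFORCE update are hypothetical choices made here; SYMPOL itself integrates its tree model with standard on-policy algorithms such as PPO (see https://github.com/s-marton/sympol for the actual method).

# Illustrative sketch only (hypothetical names; not SYMPOL's actual code).
import torch
import torch.nn as nn

class SoftTreePolicy(nn.Module):
    """Complete binary tree of fixed depth. Each internal node holds a
    feature-selection logit vector and a threshold, so taking the argmax
    feature and a hard threshold at inference yields an axis-aligned tree."""
    def __init__(self, obs_dim, n_actions, depth=3):
        super().__init__()
        self.depth = depth
        n_internal = 2 ** depth - 1
        n_leaves = 2 ** depth
        self.feature_logits = nn.Parameter(torch.randn(n_internal, obs_dim))
        self.thresholds = nn.Parameter(torch.zeros(n_internal))
        self.leaf_logits = nn.Parameter(torch.zeros(n_leaves, n_actions))

    def forward(self, obs):                      # obs: (batch, obs_dim)
        feat = torch.softmax(self.feature_logits, dim=-1)  # soft feature pick
        split = obs @ feat.T - self.thresholds             # (batch, n_internal)
        p_right = torch.sigmoid(split)                     # soft routing prob
        # Accumulate path probabilities level by level (heap ordering:
        # nodes at depth d occupy indices 2^d - 1 .. 2^(d+1) - 2).
        path = obs.new_ones(obs.shape[0], 1)
        for d in range(self.depth):
            lo, hi = 2 ** d - 1, 2 ** (d + 1) - 1
            pr = p_right[:, lo:hi]                         # nodes at depth d
            path = torch.stack([path * (1 - pr), path * pr], dim=-1)
            path = path.reshape(obs.shape[0], -1)
        # Mix leaf action distributions by leaf-arrival probability.
        return path @ torch.softmax(self.leaf_logits, dim=-1)

# Usage sketch: sample actions and apply a REINFORCE-style update
# (placeholder returns stand in for environment rollouts).
policy = SoftTreePolicy(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
probs = policy(torch.randn(8, 4))              # batch of 8 observations
dist = torch.distributions.Categorical(probs)
actions = dist.sample()
returns = torch.randn(8)                       # placeholder returns
loss = -(dist.log_prob(actions) * returns).mean()
opt.zero_grad()
loss.backward()
opt.step()

Because every internal node scores features with a softmax rather than a hard argmax, the whole policy is differentiable and fits directly into any on-policy gradient loop; the interpretable tree is read off afterwards by discretizing each node to its top-scoring feature and learned threshold.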




This entry is part of the university bibliography.

The document is provided by the publication server of the University Library of Mannheim.

This publication has so far appeared online only.



