Investigating, predicting, and mitigating collusive behavior in deep reinforcement learning-based pricing AIs


Schlechtinger, Michael


PDF: Dissertation_FINAL.pdf (published version, 30 MB)

URN: urn:nbn:de:bsz:180-madoc-680183
Document Type: Doctoral dissertation
Year of publication: 2024
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Paulheim, Heiko
Date of oral examination: 26 September 2024
Publication language: English
Institution: School of Business Informatics and Mathematics > Data Science (Paulheim 2018-)
Subject: 004 Computer science, internet
Keywords (English): reinforcement learning, pricing algorithms, collusion
Abstract: In the rapidly evolving landscape of eCommerce, dynamic pricing using Artificial Intelligence (AI) has become increasingly prevalent. Pricing AIs, particularly those utilizing (deep) Reinforcement Learning, continuously adjust to dynamic market conditions, raising concerns about potential market collusion. This thesis addresses this issue through several approaches. First, we investigate a modified prisoner’s dilemma scenario in which three agents play rock-paper-scissors. The results indicate potentially collaborative behavior, characterized by action selection that can be dissected into specific stages. This suggests that collusion prevention systems could be developed to recognize situations that might lead to collusion between competitors. We also provide evidence that agents can perform tacit cooperation strategies without being explicitly trained to do so. Second, this research employs an experimental oligopoly model of repeated price competition, systematically varying the environment to cover scenarios from basic economic theory to subjective consumer demand preferences. We explore strategies and emerging pricing patterns leading to collusion, including scenarios where agents cannot observe competitors’ prices. A comprehensive legal analysis is provided across all scenarios. Third, we examine the agents’ ability to use pricing information to predict their competitors’ behavior. To this end, predictive statistical techniques and time series analysis methodologies are employed to anticipate collusive outcomes. The findings indicate that self-learning pricing algorithms’ convergence towards a collusive market outcome can be accurately anticipated using machine learning methodologies. Finally, we develop a method to mitigate predictive and supracompetitive pricing using a combination of various training strategies. By implementing a supervision algorithm that penalizes collusion and rewards competitiveness, we effectively incentivize the agents to act competitively rather than collaboratively. The results demonstrate that the convergence of self-learning pricing algorithms towards collusive outcomes can be accurately predicted and mitigated in real time.




This entry is part of the university bibliography.

The document is provided by the publication server of the Universitätsbibliothek Mannheim.



