Investigating, predicting, and mitigating collusive behavior in deep reinforcement learning-based pricing AIs


Schlechtinger, Michael


PDF: Dissertation_FINAL.pdf - Published version (30MB)

URN: urn:nbn:de:bsz:180-madoc-680183
Document type: Doctoral dissertation
Year of publication: 2024
Place of publication: Mannheim
University: Universität Mannheim
Reviewer: Paulheim, Heiko
Date of oral examination: 26 September 2024
Publication language: English
Institution: Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Data Science (Paulheim 2018-)
Subject: 004 Computer science
Keywords (English): reinforcement learning, pricing algorithms, collusion
Abstract: In the rapidly evolving landscape of eCommerce, dynamic pricing using Artificial Intelligence (AI) has become increasingly prevalent. Pricing AIs, particularly those using (deep) Reinforcement Learning, continuously adjust to dynamic market conditions, raising concerns about potential market collusion. This thesis addresses the issue through several approaches. First, we investigate a modified prisoner’s dilemma scenario in which three agents play rock-paper-scissors. The results indicate potentially collaborative behavior whose action selection can be dissected into specific stages, which suggests that collusion prevention systems could be developed to recognize situations that might lead to collusion between competitors. We provide evidence that agents can perform tacit cooperation strategies without being explicitly trained to do so. Second, we employ an experimental oligopoly model of repeated price competition, systematically varying the environment to cover scenarios ranging from basic economic theory to subjective consumer demand preferences. We explore strategies and emerging pricing patterns that lead to collusion, including scenarios in which agents cannot observe competitors’ prices, and provide a comprehensive legal analysis across all scenarios. Third, we examine the agents’ ability to use pricing information to predict their competitors’ behavior. To this end, predictive statistical techniques and time series analysis are employed to anticipate collusive outcomes. The findings indicate that the convergence of self-learning pricing algorithms towards a collusive market outcome can be accurately anticipated using machine learning methods. Finally, we develop a method to mitigate predictive and supracompetitive pricing using a combination of training strategies.
By implementing a supervision algorithm that penalizes collusion and rewards competitiveness, we effectively incentivize the agents to act competitively rather than collaboratively. The results demonstrate that the convergence of self-learning pricing algorithms towards collusive outcomes can be accurately predicted and mitigated in real time.




This entry is part of the university bibliography.

The document is provided by the publication server of the Universitätsbibliothek Mannheim.



