Investigating, predicting, and mitigating collusive behavior in deep reinforcement learning-based pricing AIs
Schlechtinger, Michael
URN: urn:nbn:de:bsz:180-madoc-680183
Document type: Dissertation
Year of publication: 2024
Place of publication: Mannheim
University: Universität Mannheim
Reviewer: Paulheim, Heiko
Date of oral examination: 26 September 2024
Language of publication: English
Institution: Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Data Science (Paulheim 2018-)
Subject: 004 Computer science
Free keywords (English): reinforcement learning, pricing algorithms, collusion
Abstract:
In the rapidly evolving landscape of eCommerce, dynamic pricing using Artificial Intelligence (AI) has become increasingly prevalent. Pricing AIs, particularly those utilizing (deep) Reinforcement Learning, continuously adjust to dynamic market conditions, raising concerns about potential market collusion. This thesis addresses this issue through several approaches.
First, we investigate a modified prisoner’s dilemma scenario in which three agents play rock-paper-scissors. The results indicate potentially collaborative behavior whose action selection can be broken down into distinct stages, which opens the possibility of developing collusion prevention systems that recognize situations likely to lead to collusion between competitors. We provide evidence that agents can adopt tacit cooperation strategies without being explicitly trained to do so.
Second, this research employs an experimental oligopoly model of repeated price competition, systematically varying the environment to cover scenarios from basic economic theory to subjective consumer demand preferences. We explore strategies and emerging pricing patterns leading to collusion, including scenarios where agents cannot observe competitors’ prices. A comprehensive legal analysis is provided for all scenarios.
Third, we examine the agents’ ability to use pricing information to predict their competitors’ behavior. To this end, predictive statistical techniques and time series analysis methods are employed to anticipate collusive outcomes. The findings indicate that the convergence of self-learning pricing algorithms towards a collusive market outcome can be accurately anticipated using machine learning methods.
Finally, we develop a method to mitigate predictive and supracompetitive pricing using a combination of various training strategies. By implementing a supervision algorithm that penalizes collusion and rewards competitiveness, we effectively incentivize the agents to act competitively rather than collaboratively. The results demonstrate that the convergence of self-learning pricing algorithms towards collusive outcomes can be accurately predicted and mitigated in real time.
This entry is part of the university bibliography.
The document is provided by the publication server of the Universitätsbibliothek Mannheim.