Root cause analysis in IT infrastructures using ontologies and abduction in Markov Logic Networks


Schönfisch, Jörg ; Meilicke, Christian ; Stülpnagel, Janno von ; Ortmann, Jens ; Stuckenschmidt, Heiner



DOI: https://doi.org/10.1016/j.is.2017.11.003
URL: http://www.sciencedirect.com/science/article/pii/S...
Additional URL: http://publications.wim.uni-mannheim.de/informatik...
Document Type: Article
Year of publication: 2018
Journal or publication series: Information Systems : IS
Volume: 74
Issue number: 2
Page range: 103-116
Place of publication: Amsterdam
Publishing house: Elsevier
ISSN: 0094-453X , 0306-4379
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Subject: 004 Computer science, internet
Keywords (English): Root Cause Analysis , IT Infrastructure Management , Markov Logic Network , Ontology , Abductive Reasoning
Abstract: Information systems play a crucial role in most of today’s business operations. High availability and reliability of services and hardware, and, in the case of outages, short response times are essential. Thus, a high amount of tool support and automation in risk management is desirable to decrease downtime. We propose a new approach for calculating the root cause for an observed failure in an IT infrastructure. Our approach is based on abduction in Markov Logic Networks. Abduction aims to find an explanation for a given observation in the light of some background knowledge. In failure diagnosis, the explanation corresponds to the root cause, the observation to the failure of a component, and the background knowledge to the dependency graph extended by potential risks. We apply a method to extend a Markov Logic Network in order to conduct abductive reasoning, which is not naturally supported in this formalism. Our approach exhibits a high amount of reusability and facilitates modeling by using ontologies as background knowledge. This enables users without specific knowledge of a concrete infrastructure to gain viable insights in the case of an incident. We implemented the method in a tool and illustrate its suitability for root cause analysis by applying it to a sample scenario and testing its scalability on randomly generated infrastructures.
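Illustration: the mapping described in the abstract (explanation = root cause, observation = failed component, background knowledge = dependency graph plus risks) can be sketched in a few lines of Python. This is not the authors' Markov Logic Network method or their tool. It is a simplified stand-in that enumerates every combination of risks and ranks those consistent with the observations by prior probability. The component names, risk names, and probabilities are all hypothetical, and risks are assumed to be independent.

```python
from itertools import combinations

# Hypothetical dependency graph: component -> components it depends on.
DEPENDS_ON = {
    "web_shop": ["app_server"],
    "app_server": ["database", "network"],
    "database": ["disk"],
    "network": [],
    "disk": [],
}

# Hypothetical risks: name -> (component it takes down, prior probability).
RISKS = {
    "disk_failure": ("disk", 0.02),
    "switch_outage": ("network", 0.05),
    "db_misconfiguration": ("database", 0.01),
}


def failed_components(active_risks):
    """Propagate failures upward: a component fails if any dependency fails."""
    failed = {RISKS[risk][0] for risk in active_risks}
    changed = True
    while changed:
        changed = False
        for component, deps in DEPENDS_ON.items():
            if component not in failed and any(d in failed for d in deps):
                failed.add(component)
                changed = True
    return failed


def explain(observed_failed, observed_ok=()):
    """Return (probability, risks) pairs consistent with the observations,
    most probable first. Brute force, so only suitable for tiny examples."""
    candidates = []
    for size in range(1, len(RISKS) + 1):
        for subset in combinations(sorted(RISKS), size):
            failed = failed_components(subset)
            if not set(observed_failed) <= failed:
                continue  # does not explain every observed failure
            if set(observed_ok) & failed:
                continue  # contradicts a component known to be working
            prob = 1.0
            for risk, (_, prior) in RISKS.items():
                prob *= prior if risk in subset else 1.0 - prior
            candidates.append((prob, subset))
    return sorted(candidates, reverse=True)


if __name__ == "__main__":
    for prob, risks in explain(["web_shop"], observed_ok=["network"]):
        print(f"{prob:.5f}  {', '.join(risks)}")
```

Adding the observation that the network is working removes the switch outage from the candidates, so the most probable remaining explanation becomes the disk failure. This mirrors how additional observations narrow down the root cause in the abstract's setting.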




This entry is part of the university bibliography.



