Automated Knowledge Base Extension Using Open Information

Dutta, Arnab

dutta.dissertation.pdf - Published


URN: urn:nbn:de:bsz:180-madoc-404692
Document Type: Doctoral dissertation
Year of publication: 2016
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Stuckenschmidt, Heiner
Date of oral examination: 4 February 2016
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Subject: 004 Computer science, internet
Subject headings (SWD): Wissensverarbeitung (knowledge processing), Wissenstechnik (knowledge engineering)
Keywords (English): Data Integration, Markov Clustering, Enriching Knowledge Bases, Probabilistic Inference
Abstract: Open Information Extraction (OIE) frameworks (such as NELL and ReVerb) provide domain-independent facts in natural-language form, containing knowledge from varied sources. Extraction mechanisms for structured knowledge bases (KBs) (such as DBpedia and YAGO) often fail to retrieve such facts because of their resource-specific extraction schemes. Hence, structured KBs can extend their coverage with the facts discovered by OIE systems. This possibility motivates us to integrate these two genres of extraction into one interactive framework. In this work, we present a complete, ontology-independent, generalized architecture for achieving this integration. Our proposed solution is modular, with each module solving a specific task: (1) mapping subject and object terms from OIE facts to KB instances, and (2) mapping the OIE relational phrases to object properties defined in the KB. Furthermore, in an open extraction setting, identical semantic relationships can be represented by different surface forms, making it necessary to group them together. To solve this problem, (3) we propose the use of Markov clustering to cluster OIE relations. The key to our approach lies in exploiting the inherent dependencies between relations and their arguments. This makes our approach completely context-agnostic and generally applicable. We evaluated our method on the two state-of-the-art extraction systems, achieving over 85% precision on instance mappings and over 90% on relation mappings. We also created a distant-supervision-based gold standard for this purpose, and the data has been released as part of this work. Furthermore, we analyze the effect of clustering and empirically show its effectiveness as a relation mapping technique compared with other techniques. Overall, our work positions itself at the intersection of information extraction, ontology mapping and reasoning.
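
Illustration (not part of the dissertation): the abstract names Markov clustering as the technique for grouping OIE relational phrases. The sketch below shows the generic Markov clustering loop (expansion, inflation, column normalization) on a small similarity matrix. The function names, the example phrases, the similarity values, and the parameter defaults are all assumptions made for this illustration; how the dissertation actually computes similarities between phrases is not stated in this record.

```python
# Minimal sketch of generic Markov clustering (MCL); hypothetical helper names.

def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] = m[i][j] / total
    return out


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a symmetric similarity matrix; returns sorted index lists."""
    n = len(similarity)
    # Add self-loops, then turn the matrix into column-wise transition probabilities.
    m = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        inflated = normalize_columns(  # inflation: strengthen strong flows
            [[v ** inflation for v in row] for row in expanded]
        )
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Every row that still carries flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical relational phrases with made-up pairwise similarities.
phrases = ["is married to", "is the spouse of", "is the wife of",
           "was born in", "is a native of"]
similarity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
for cluster in markov_cluster(similarity):
    print([phrases[i] for i in cluster])
```

In this toy input the two groups of phrases share no similarity, so they end up in separate clusters. The `inflation` parameter controls granularity: higher values tend to produce more, smaller clusters.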

This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.
