Profiling the potential of web tables for augmenting cross-domain knowledge bases
Ritze, Dominique ; Lehmberg, Oliver ; Oulabi, Yaser ; Bizer, Christian
DOI:
|
https://doi.org/10.1145/2872427.2883017
|
URL:
|
http://dl.acm.org/citation.cfm?doid=2872427.288301...
|
Document type:
|
Conference publication
|
Year of publication:
|
2016
|
Book title:
|
Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11 - 15, 2016
|
Page range:
|
251-261
|
Event title:
|
25th International World Wide Web Conference (WWW 2016)
|
Event location:
|
Montréal, Canada
|
Event date:
|
April 11-15, 2016
|
Editor:
|
Bourdeau, Jacqueline
|
Place of publication:
|
Geneva, Switzerland
|
Publisher:
|
ACM
|
ISBN:
|
978-1-4503-4143-1
|
Publication language:
|
English
|
Institution:
|
School of Business Informatics and Mathematics > Information Systems V: Web-based Systems (Bizer 2012-)
|
Subject:
|
004 Computer science
|
Keywords (English):
|
data fusion, data profiling, knowledge base augmentation, schema and data matching, slot filling, web tables
|
Abstract:
|
Cross-domain knowledge bases such as DBpedia, YAGO, or the Google Knowledge Graph have gained increasing attention in recent years and are starting to be deployed in various use cases. However, the content of such knowledge bases is far from complete, is not always correct, and becomes outdated over time (e.g., population numbers go stale). Hence, there are efforts to leverage various types of Web data to complement, update, and extend such knowledge bases. One source of Web data that potentially provides very wide coverage is the millions of relational HTML tables found on the Web. Existing work on using data from Web tables to augment cross-domain knowledge bases reports only aggregated performance numbers. The actual content of the Web tables and the topical areas of the knowledge bases that can be complemented using the tables remain unclear. In this paper, we match a large, publicly available Web table corpus to the DBpedia knowledge base. Based on the matching results, we profile the potential of Web tables for augmenting different parts of cross-domain knowledge bases and report detailed statistics about the classes, properties, and instances for which missing values can be filled using Web table data as evidence. To estimate the potential quality of the new values, we empirically examine the Local Closed World Assumption and use it to determine the maximal number of correct facts that an ideal data fusion strategy could generate. Using this as ground truth, we compare three data fusion strategies and conclude that knowledge-based trust outperforms PageRank- and voting-based fusion.
|
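As a rough illustration of two ideas named in the abstract, the following Python sketch shows voting-based fusion and an evaluation under the Local Closed World Assumption (LCWA). It is not the authors' implementation: the function names, the triple layout `(subject, property, value)`, and the toy data are all assumptions made for this example. Under the LCWA as commonly stated, a fused value is judged only if the knowledge base already holds at least one value for the same subject and property; otherwise it is left unjudged.

```python
from collections import Counter, defaultdict

def fuse_by_voting(candidates):
    """For each (subject, property) slot, keep the most frequently proposed value."""
    votes = defaultdict(Counter)
    for subject, prop, value in candidates:
        votes[(subject, prop)][value] += 1
    return {slot: counter.most_common(1)[0][0] for slot, counter in votes.items()}

def lcwa_label(kb, slot, value):
    """Label a value as 'correct', 'incorrect', or 'unknown' under the LCWA."""
    known = kb.get(slot)
    if not known:
        return "unknown"  # KB has no value for this slot, so nothing can be judged
    return "correct" if value in known else "incorrect"

def lcwa_precision(kb, fused):
    """Share of judged fused values that the KB confirms (None if none are judged)."""
    labels = [lcwa_label(kb, slot, value) for slot, value in fused.items()]
    judged = [label for label in labels if label != "unknown"]
    return judged.count("correct") / len(judged) if judged else None

def ideal_fusion_correct(kb, candidates):
    """Upper bound: slots where at least one candidate value is confirmed by the KB."""
    return len({(s, p) for s, p, v in candidates if v in kb.get((s, p), set())})

# Hypothetical toy data, not taken from the paper.
kb = {
    ("Mannheim", "country"): {"Germany"},
    ("Montreal", "country"): {"Canada"},
}
candidates = [
    ("Mannheim", "country", "Germany"),
    ("Mannheim", "country", "Germany"),
    ("Mannheim", "country", "France"),
    ("Montreal", "country", "USA"),
    ("Montreal", "population", "1700000"),
]

fused = fuse_by_voting(candidates)
print(fused)
print(lcwa_precision(kb, fused))
print(ideal_fusion_correct(kb, candidates))
```

The PageRank-based and knowledge-based trust strategies compared in the paper would replace the uniform vote with a per-source weight; they are not sketched here because the record gives no detail on how those weights are computed.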
| This entry is part of the university bibliography. |
ORCID:
Bizer, Christian: 0000-0003-2367-0237