Matching HTML tables to DBpedia
Ritze, Dominique
;
Lehmberg, Oliver
;
Bizer, Christian
DOI:
|
https://doi.org/10.1145/2797115.2797118
|
URL:
|
http://doi.acm.org/10.1145/2797115.2797118
|
Weitere URL:
|
https://www.semanticscholar.org/paper/Matching-HTM...
|
Dokumenttyp:
|
Konferenzveröffentlichung
|
Erscheinungsjahr:
|
2015
|
Buchtitel:
|
Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, WIMS 2015, Larnaca, Cyprus, July 13-15, 2015
|
Seitenbereich:
|
Paper 10, 1-6
|
Veranstaltungsdatum:
|
13.-15.7.2015
|
Herausgeber:
|
Akerkar, Rajendra
|
Ort der Veröffentlichung:
|
New York, NY
|
Verlag:
|
ACM
|
ISBN:
|
978-1-4503-3293-4
|
Sprache der Veröffentlichung:
|
Englisch
|
Einrichtung:
|
Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Information Systems V: Web-based Systems (Bizer 2012-)
|
Fachgebiet:
|
004 Informatik
|
Freie Schlagwörter (Englisch):
|
html tables , knowledge base extension , tables matching
|
Abstract:
|
Millions of HTML tables containing structured data can be found on the Web. With their wide coverage, these tables are potentially very useful for filling missing values and extending cross-domain knowledge bases such as DBpedia, YAGO, or the Google Knowledge Graph. As a prerequisite for being able to use table data for knowledge base extension, the HTML tables need to be matched with the knowledge base, meaning that correspondences between table rows/columns and entities/schema elements of the knowledge base need to be found. This paper presents the T2D gold standard for measuring and comparing the performance of HTML table to knowledge base matching systems. T2D consists of 8 700 schema-level and 26 100 entity-level correspondences between the WebDataCommons Web Tables Corpus and the DBpedia knowledge base. In contrast related work on HTML table to knowledge base matching, the Web Tables Corpus (147 million tables), the knowledge base, as well as the gold standard are publicly available. The gold standard is used afterward to evaluate the performance of T2K Match, an iterative matching method which combines schema and instance matching. T2K Match is designed for the use case of matching large quantities of mostly small and narrow HTML tables against large cross-domain knowledge bases. The evaluation using the T2D gold standard shows that T2K Match discovers table-to-class correspondences with a precision of 94%, row-to-entity correspondences with a precision of 90%, and column-to-property correspondences with a precision of 77%.
|
| Dieser Eintrag ist Teil der Universitätsbibliographie. |
Suche Autoren in
BASE:
Ritze, Dominique
;
Lehmberg, Oliver
;
Bizer, Christian
Google Scholar:
Ritze, Dominique
;
Lehmberg, Oliver
;
Bizer, Christian
ORCID:
Ritze, Dominique, Lehmberg, Oliver and Bizer, Christian ORCID: https://orcid.org/0000-0003-2367-0237
Sie haben einen Fehler gefunden? Teilen Sie uns Ihren Korrekturwunsch bitte hier mit: E-Mail
Actions (login required)
|
Eintrag anzeigen |
|
|