Exploiting microdata annotations to consistently categorize product offers at web scale


Meusel, Robert ; Primpeli, Anna ; Meilicke, Christian ; Paulheim, Heiko ; Bizer, Christian



DOI: https://doi.org/10.1007/978-3-319-27729-5_7
URL: https://www.researchgate.net/publication/300113438...
Additional URL: http://dws.informatik.uni-mannheim.de/fileadmin/le...
Document Type: Conference or workshop publication
Year of publication: 2015
Book title: E-Commerce and Web Technologies : 16th International Conference on Electronic Commerce and Web Technologies, EC-Web 2015, Valencia, Spain, September 2015, revised selected papers
Title of journal or publication series: Lecture Notes in Business Information Processing
Volume: 239
Page range: 83-99
Conference title: EC-Web 2015
Location of the conference venue: Valencia, Spain
Date of the conference: September 01-02, 2015
Author/Publisher of the book (only the first ones mentioned): Stuckenschmidt, Heiner
Place of publication: Cham
Publishing house: Springer International Publishing
ISBN: 978-3-319-27728-8, 978-3-319-27729-5
Publication language: English
Institution: School of Business Informatics and Mathematics > Wirtschaftsinformatik V (Bizer)
School of Business Informatics and Mathematics > Web Data Mining (Juniorprofessur) (Paulheim 2013-2017)
Subject: 004 Computer science, internet
Keywords (English): Microdata, RDFa, Structured Web Data, Classification
Abstract: Semantically annotated data, using markup languages like RDFa and Microdata, has become more and more publicly available on the Web, especially in the area of e-commerce. Thus, a large number of structured product descriptions are freely available and can be used for various applications, such as product search or recommendation. However, little effort has been made to analyze the categories of the available product descriptions. Although some products have an explicit category assigned, the categorization schemes vary a lot, as the products originate from thousands of different sites. This heterogeneity makes supervised methods, which have been proposed by most previous works, hard to apply. Therefore, in this paper, we explain how distantly supervised approaches can be used to exploit the heterogeneous category information in order to map the products to a set of target categories from an existing product catalogue. Our results show that, even though this task is far from trivial, we can reach almost 56% accuracy for classifying products into 37 categories.

This entry is part of the university bibliography.




Citation example:

Meusel, Robert ; Primpeli, Anna ; Meilicke, Christian ; Paulheim, Heiko (ORCID: 0000-0003-4386-8195) ; Bizer, Christian (ORCID: 0000-0003-2367-0237): Exploiting microdata annotations to consistently categorize product offers at web scale. In: Stuckenschmidt, Heiner: E-Commerce and Web Technologies : 16th International Conference on Electronic Commerce and Web Technologies, EC-Web 2015, Valencia, Spain, September 2015, revised selected papers. Lecture Notes in Business Information Processing, 239, pp. 83-99. Cham, 2015. EC-Web 2015 (Valencia, Spain) [Conference or workshop publication]

