Explaining differences between unaligned table snapshots

Fink, Manuel ; Meilicke, Christian ; Stuckenschmidt, Heiner

URL: https://madoc.bib.uni-mannheim.de/55359
Additional URL: https://openproceedings.org/2020/conf/edbt/paper_5...
URN: urn:nbn:de:bsz:180-madoc-553594
Document Type: Conference or workshop publication
Year of publication: 2020
Book title: Advances in Database Technology - EDBT 2020, 23rd International Conference on Extending Database Technology, Copenhagen, Denmark, March 30 - April 02, proceedings
Page range: 133-144
Conference title: EDBT 2020
Location of the conference venue: Copenhagen, Denmark
Date of the conference: 30.03.-02.04.2020
Editor: Bonifati, Angela
Place of publication: Copenhagen
Publishing house: OpenProceedings.org
ISBN: 978-3-89318-083-7
ISSN: 2367-2005
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Pre-existing license: Creative Commons Attribution, Non-Commercial, No Derivatives 4.0 International (CC BY-NC-ND 4.0)
Subject: 004 Computer science, internet
Abstract: We study the problem of explaining differences between two snapshots of the same database table including record insertions, deletions and in particular record updates. Unlike existing alternatives, our solution induces transformation functions and does not require knowledge of the correct alignment between the record sets. This allows profiling snapshots of tables with unspecified or modified primary keys. In such a problem setting, there are always multiple explanations for the differences. Our goal is to find the simplest explanation. We propose to measure the complexity of explanations on the basis of minimum description length in order to formulate the task as an optimization problem. We show that the problem is NP-hard and propose a heuristic search algorithm to solve practical problem instances. We implement a prototype called Affidavit to assess the explanatory qualities of our approach in experiments based on different real-world data sets. We show that it can scale to both a large number of records and attributes and is able to reliably provide correct explanations under practical levels of modifications.
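
The idea of ranking explanations by description length can be illustrated with a toy sketch. This is not the Affidavit implementation or the paper's cost model: the snapshots, the candidate functions, their costs, and the names `description_length` and `simplest_explanation` are all assumptions made up for illustration. Each candidate explanation applies one function per attribute to the old snapshot; records that still do not match are listed explicitly as deletions or insertions, and every listed cell adds to the cost.

```python
from collections import Counter

# Toy snapshots of one table without a reliable primary key.
# All values are invented for illustration.
OLD = [("alice", 1000), ("bob", 2000), ("carol", 3000)]
NEW = [("ALICE", 1100), ("BOB", 2100), ("dave", 500)]


def identity(v):
    return v


def add_100(v):
    return v + 100


# A candidate explanation holds one (function, cost) pair per attribute.
# The costs are hypothetical: a rough count of symbols needed to write
# the function down.
CANDIDATES = {
    "no transformation": [(identity, 0), (identity, 0)],
    "upper-case name, add 100": [(str.upper, 1), (add_100, 2)],
}


def description_length(functions, old, new):
    """Toy cost: function costs plus every cell that must be listed explicitly."""
    transformed = Counter(
        tuple(f(v) for (f, _), v in zip(functions, row)) for row in old
    )
    target = Counter(new)
    explained = transformed & target  # multiset intersection, no alignment needed
    deleted = sum((transformed - explained).values())
    inserted = sum((target - explained).values())
    width = len(functions)
    function_cost = sum(cost for _, cost in functions)
    return function_cost + width * (deleted + inserted)


def simplest_explanation(candidates, old, new):
    """Return the name of the cheapest candidate and all computed costs."""
    costs = {
        name: description_length(fs, old, new) for name, fs in candidates.items()
    }
    return min(costs, key=costs.get), costs


best, costs = simplest_explanation(CANDIDATES, OLD, NEW)
print(best, costs)
```

In this toy setting the transformation-based explanation is cheaper because it accounts for two of the three records with a short rule, leaving only one deletion and one insertion to list. The sketch enumerates a fixed candidate set; the abstract notes that the general problem is NP-hard and is addressed with a heuristic search instead.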

This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.
