A Linear-Time Algorithm for Optimal Tree Sibling Partitioning and its Application to XML Data Stores

Kanne, Carl-Christian ; Moerkotte, Guido

TR_2006_006.pdf - Published

Download (248kB)

URL: http://ub-madoc.bib.uni-mannheim.de/1168
URN: urn:nbn:de:bsz:180-madoc-11689
Document Type: Working paper
Year of publication: 2006
The title of a journal, publication series: None
Publication language: English
Institution: School of Business Informatics and Mathematics > Sonstige - Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik
MADOC publication series: Veröffentlichungen der Fakultät für Mathematik und Informatik > Institut für Informatik > Technical Reports
Subject: 004 Computer science, internet
Subject headings (SWD): Datenbankmanagementsysteme , Baum <Mathematik> , Partition <Informatik> , Algorithmus
Individual keywords (German): tree partitioning
Keywords (English): tree partitioning
Abstract: Document insertion into a native XML Data Store (XDS) requires to partition the document tree into a number of storage units with limited capacity, such as records on disk pages. As intra partition navigation is much faster than navigation between partitions, minimizing the number of partitions has a beneficial effect on query performance. We present a linear time algorithm to optimally partition an ordered, labeled, weighted tree such that each partition does not exceed a fixed weight limit. Whereas traditionally tree partitioning algorithms only allow child nodes to share a partition with their parent node (i.e. a partition corresponds to a subtree), our algorithm also considers partitions containing several subtrees as long as their roots are adjacent siblings. We call this sibling partitioning. Based on our study of the optimal algorithm, we further introduce two novel, near-optimal heuristics. They are easier to implement, do not need to hold the whole document instance in memory, and require much less runtime than the optimal algorithm. Finally, we provide an experimental study comparing our novel and existing algorithms. One important finding is that compared to partitioning that exclusively considers parent-child partitions, including sibling partitioning as well can decrease the total number of partitions by more than 90%, and improve query performance by more than a factor of two.
Additional information:

Das Dokument wird vom Publikationsserver der Universitätsbibliothek Mannheim bereitgestellt.

Metadata export


+ Search Authors in

+ Download Statistics

Downloads per month over past year

View more statistics

You have found an error? Please let us know about your desired correction here: E-Mail

Actions (login required)

Show item Show item