Scalable frequent sequence mining with flexible subsequence constraints


Renz-Wieland, Alexander ; Bertsch, Mattias ; Gemulla, Rainer



DOI: https://doi.org/10.1109/ICDE.2019.00134
URL: https://madoc.bib.uni-mannheim.de/48219
Additional URL: https://ieeexplore.ieee.org/document/8731375
URN: urn:nbn:de:bsz:180-madoc-482199
Document Type: Conference or workshop publication
Year of publication: 2019
Book title: IEEE 35th International Conference on Data Engineering : ICDE 2019 : Macau SAR, China, 8-11 April 2019 : proceedings
Page range: 1490-1501
Conference title: 2019 IEEE International Conference on Data Engineering (ICDE)
Location of the conference venue: Macao, Macao, China
Date of the conference: 8-11 April 2019
Place of publication: Piscataway, NJ
Publishing house: IEEE
ISBN: 978-1-5386-7474-1, 978-1-5386-7475-8
ISSN: 1063-6382, 2375-026X
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science I: Data Analytics (Gemulla 2014-)
Subject: 004 Computer science, internet
Abstract: We study scalable algorithms for frequent sequence mining under flexible subsequence constraints. Such constraints enable applications to specify concisely which patterns are of interest and which are not. We focus on the bulk synchronous parallel model with one round of communication; this model is suitable for platforms such as MapReduce or Spark. We derive a general framework for frequent sequence mining under this model and propose the D-SEQ and D-CAND algorithms within this framework. The algorithms differ in what data are communicated and how computation is split up among workers. To the best of our knowledge, D-SEQ and D-CAND are the first scalable algorithms for frequent sequence mining with flexible constraints. We conducted an experimental study on multiple real-world datasets that suggests that our algorithms scale nearly linearly, outperform common baselines, and offer acceptable generalization overhead over existing, less general mining algorithms.




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.



