Automatically curated data sets
Kessel, Marcus ; Atkinson, Colin
DOI: https://doi.org/10.1109/SCAM.2019.00010
URL: https://ieeexplore.ieee.org/document/8930881
Document Type: Conference or workshop publication
Year of publication: 2019
Book title: SCAM 2019 : 19th International Working Conference on Source Code Analysis and Manipulation, September 30 - October 1, 2019, Cleveland, Ohio : proceedings
Page range: 56-61
Conference title: 19th International Working Conference on Source Code Analysis and Manipulation (SCAM)
Location of the conference venue: Cleveland, OH
Date of the conference: 30.09.-01.10.19
Publisher: O'Conner, Lisa
Place of publication: Los Alamitos, CA [u.a.]
Publishing house: IEEE
ISBN: 978-1-7281-4938-7, 978-1-7281-4937-0
ISSN: 1942-5430, 2470-6892
Publication language: English
Institution: School of Business Informatics and Mathematics > Software Engineering (Atkinson 2003-)
Subject: 004 Computer science, internet
Abstract:
To validate hypotheses and tools that depend on the semantics of software, it is necessary to assemble, prepare and maintain (i.e. curate) large, high-quality corpora of executable software systems exhibiting certain desired behavior and/or properties. Today this is a highly tedious and laborious activity requiring significant human time and effort. In this paper we therefore present a prototype platform that supports the notion of “live data sets” where almost all aspects of the data set curation process are automated. Instead of curating data sets by hand, or writing dedicated tools to select and check software samples on a case-by-case basis, a live data set allows users to simply describe their requirements as abstract scripts written in a declarative domain specific language. After explaining the approach and the key ideas behind its implementation, in this paper we present two examples of executable corpora generated automatically from a live data set populated from Maven Central. The first illustrates a “semantics agnostic” use case where the actual behavior of the software is unimportant, while the second illustrates a “semantics specific” use case where software implementing a specific functional abstraction is selected.
This entry is part of the university bibliography.