Learning a graph-based classifier for fault localization
Author affiliation: Department of Computer Science and Engineering, Shanghai Jiao Tong University
Publication: Science China (Information Sciences) (English edition of 中国科学:信息科学)
Year, volume, issue: 2020, Vol. 63, No. 6
Pages: 195-216
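Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (may confer degrees in management or engineering)]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0811 [Engineering: Control Science and Engineering]; 081202 [Engineering: Computer Software and Theory]; 0812 [Engineering: Computer Science and Technology (may confer degrees in engineering or science)]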
Funding: Sponsored by the National Key R&D Program of China (Grant No. 2018YFC0830500), the National Natural Science Foundation of China (Grant No. 61572313), and the Science and Technology Commission of Shanghai Municipality (Grant No. 15DZ1100305)
Topics: fault classifier; partial code analysis; bug fix analysis
Abstract: Since software first emerged, locating software faults has been intensively researched, culminating in various approaches and tools that have been applied in real development. Despite the success of these developments, programmers still demand better tools, and some programmers are reluctant to use any tool when locating faults in their own development. This state of affairs can naturally be improved by learning how programmers locate faults. The rapid development of open-source software has accumulated many bug fixes. A bug fix is a specific type of commit containing a set of buggy files and their corresponding fixed files, which together reveal how programmers repair bugs. An automatic model could feasibly learn fault locations from bug fixes, but prior attempts to achieve this vision have been hindered by various technical challenges. For example, most bug fixes are not compilable after being checked out, which prevents most advanced static and dynamic tools from analyzing them. This paper proposes an approach called ClaFa that trains a graph-based fault classifier from bug fixes. ClaFa is built on a recent partial-code tool called Grapa, which enables the complete-code tool WALA to analyze partial programs. Once Grapa has built program dependency graphs from a bug fix, ClaFa compares the graph of the buggy code with the graph of the fixed code, locates the buggy nodes, and extracts various graph features of the buggy and clean nodes. Based on the extracted features, ClaFa trains a classifier that combines AdaBoost and decision tree learning. The trained ClaFa can predict whether a node of a program dependency graph is buggy or not. We evaluate ClaFa on thousands of buggy files collected from four open-source projects: Aries, Mahout, Derby, and Cassandra. The f-scores that ClaFa achieves are approximately 80% on all projects.
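The pipeline described in the abstract (label nodes by comparing the buggy and fixed graphs, then train a boosted tree classifier on node features) can be illustrated with a minimal sketch. This is not ClaFa's implementation: the node names, the two features (in-degree and out-degree), matching nodes by name, and the use of depth-1 decision trees (stumps) as the weak learner are all assumptions made for illustration, and the toy data is small enough that a single stump already separates it.

```python
import math

def label_nodes(buggy_nodes, fixed_nodes):
    """Assumed labeling rule: a node of the buggy graph with no
    same-named node in the fixed graph is buggy (+1), else clean (-1)."""
    return {n: (1 if n not in fixed_nodes else -1) for n in buggy_nodes}

def stump_predict(stump, row):
    """Apply a one-feature threshold rule (a depth-1 decision tree)."""
    _, feature, threshold, sign = stump
    return sign if row[feature] > threshold else -sign

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                cand = (0.0, feature, threshold, sign)
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(cand, row) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def train_adaboost(X, y, rounds=5):
    """Standard discrete AdaBoost over decision stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        # Clamp the error so the log below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified nodes, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Weighted vote of all stumps: +1 means buggy, -1 means clean."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score > 0 else -1

# Hypothetical toy data: node name -> (in_degree, out_degree).
buggy_graph = {"a": (0, 1), "b": (3, 2), "c": (1, 1),
               "d": (4, 3), "e": (0, 2), "f": (5, 1)}
fixed_graph_nodes = {"a", "c", "e"}  # nodes that survive the fix

labels = label_nodes(buggy_graph, fixed_graph_nodes)
names = sorted(buggy_graph)
X = [buggy_graph[n] for n in names]
y = [labels[n] for n in names]
model = train_adaboost(X, y)
```

Calling `predict(model, (6, 0))` then classifies a previously unseen node from its features. A real system would use richer graph features and deeper trees; the sketch only shows how the labeling and boosting steps fit together.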