MineCPP: Mining Bug Fix Pairs and Their Structures
Source
FSE Companion: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering
Date Issued
2024-07-10
Author(s)
Avula, Sai Krishna
Abstract
Modern software repositories serve as valuable sources of information for understanding and addressing software bugs. In this paper, we present MineCPP, a tool for large-scale generation of bug-fixing datasets that extends the capabilities of a recently proposed approach, Minecraft. MineCPP not only captures bug locations and types across multiple programming languages but also introduces novel features, such as the offset of a bug in a buggy source file and the sequence of syntactic constructs up to and including the location of the bug. We discuss the architectural and operational aspects of MineCPP and show how it can be used to automatically mine GitHub repositories. A graphical user interface (GUI) further enhances the user experience by providing interactive visualizations and quantitative analyses, facilitating fine-grained insights into the structure of bug-fix pairs. MineCPP serves as a helpful solution for researchers, practitioners, and developers seeking comprehensive bug-fixing datasets and insights into coding practices. A tool demonstration is available at https://youtu.be/ln99irvbADE.
Subjects
Bug Fixes | Coding Effort | LLMs | Mining Software Repositories
