SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair (1901.01808v3)

Published 24 Dec 2018 in cs.SE, cs.LG, and stat.ML

Abstract: This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning. We devise, implement, and evaluate a system, called SequenceR, for fixing bugs based on sequence-to-sequence learning on source code. This approach uses the copy mechanism to overcome the unlimited vocabulary problem that occurs with big code. Our system is data-driven; we train it on 35,578 samples, carefully curated from commits to open-source repositories. We evaluate it on 4,711 independent real bug fixes, as well as on the Defects4J benchmark used in program repair research. SequenceR is able to perfectly predict the fixed line for 950/4711 testing samples, and find correct patches for 14 bugs in Defects4J. It captures a wide range of repair operators without any domain-specific top-down design.

Citations (399)

Summary

  • The paper introduces SequenceR, a novel seq2seq approach that repairs buggy code by translating faulty token sequences into fixed ones.
  • The paper employs a copy mechanism to handle unlimited vocabularies, effectively generating rare tokens and unique identifiers in code.
  • Empirical evaluation on real-world datasets and the Defects4J benchmark demonstrates SequenceR’s ability to predict precise patches, validating its potential for automated debugging.

Sequence-to-Sequence Learning for End-to-End Program Repair

The paper "Sequence-to-Sequence Learning for End-to-End Program Repair" presents an innovative approach to automatic program repair using machine learning, specifically sequence-to-sequence (seq2seq) neural networks. The proposed system, unnamed in the document but referred to in discussion as "SequenceR," focuses on repairing bugs in source code by predicting line-level changes in a data-driven manner. This approach is grounded in the concept of translating buggy sequences of tokens into fixed sequences, similar to how machine translation systems function.

Overview of the Approach

SequenceR leverages the seq2seq neural network model, commonly employed in machine translation, to address the challenges inherent in source code repairs. The paper highlights three key contributions of this work:

  1. Contextual Abstraction: The approach constructs an abstract buggy context that includes relevant code sections around the identified buggy line (see the sketch after this list). This abstraction allows the model to consider a broader context for understanding and fixing the bug without being bound by method size limitations.
  2. Copy Mechanism Utilization: To tackle the unlimited vocabulary problem in program code, SequenceR integrates a copy mechanism that lets the decoder emit tokens absent from the training vocabulary by copying them from the input. This is crucial for project-specific identifiers and method names, which are common in source code yet too rare to be included in any fixed vocabulary.
  3. Empirical Evaluation: The system is rigorously evaluated on a large dataset of real-world code changes and on the established Defects4J benchmark. SequenceR perfectly predicts the fixed line for 950 of 4,711 testing samples and generates plausible patches for 19 Defects4J bugs, 14 of which are correct fixes.
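The sketch below illustrates the abstract buggy context idea from item 1, under the assumption that the suspicious line is wrapped in marker tokens, its enclosing method is kept in full, and the rest of the class is reduced to field declarations and method signatures. The function name, marker tokens, truncation budget, and input structure are illustrative assumptions, not the paper's exact preprocessing pipeline.

```python
# Rough sketch of assembling an "abstract buggy context" for the encoder.
# Marker names, the token budget, and the input structure are assumptions.

MAX_TOKENS = 1000  # assumed truncation budget for the encoder input

def build_abstract_context(buggy_method_lines, buggy_line_idx,
                           class_fields, other_method_signatures):
    """Assemble one flat token sequence around the suspicious line."""
    method = list(buggy_method_lines)
    # Mark the suspicious line so the decoder knows what to rewrite.
    method[buggy_line_idx] = (
        "<START_BUG> " + method[buggy_line_idx].strip() + " <END_BUG>"
    )
    # Keep fields and the full buggy method; reduce other methods to signatures.
    parts = class_fields + method + [
        sig + " { ... }" for sig in other_method_signatures
    ]
    tokens = " ".join(parts).split()
    return " ".join(tokens[:MAX_TOKENS])

# Hypothetical Java input, represented here as plain strings:
context = build_abstract_context(
    buggy_method_lines=[
        "public int size() {",
        "    return count + 1;",   # suspicious off-by-one line
        "}",
    ],
    buggy_line_idx=1,
    class_fields=["private int count;"],
    other_method_signatures=["public void add(Object o)"],
)
print(context)
```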

Key Findings

The evaluation showcases the capacity of SequenceR to effectively learn and apply a wide range of repair operators. It demonstrates adaptability in handling diverse programming constructs, including changes to method calls, conditional statements, Java keywords, and off-by-one errors. Importantly, the copy mechanism proved essential in achieving high prediction accuracy, allowing SequenceR to make informed alterations by directly referencing context-specific tokens outside the fixed vocabulary.
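One common way to realize such a copy mechanism is in the style of pointer-generator networks: the output distribution mixes a generation distribution over a fixed vocabulary with a copy distribution formed by scattering attention weights onto the source tokens. The sketch below shows a single decoding step; tensor shapes, parameter names, and the interface are assumptions for illustration, not SequenceR's exact implementation.

```python
# Minimal single-step sketch of a copy mechanism (pointer-generator style).
# Shapes and names are illustrative assumptions, not the paper's exact code.
import torch
import torch.nn.functional as F

def copy_generator_step(dec_state, context, attn, src_ids, num_src_oov,
                        W_vocab, w_gen, vocab_size):
    """Mix generation over a fixed vocabulary with copying from the source.

    dec_state   : (batch, hidden)        decoder hidden state at this step
    context     : (batch, hidden)        attention-weighted encoder context
    attn        : (batch, src_len)       attention weights over source positions
    src_ids     : (batch, src_len) long  source token ids in an extended vocab
                                         (fixed vocab + per-example OOV slots)
    num_src_oov : number of out-of-vocabulary source tokens in this batch
    W_vocab     : (2*hidden, vocab_size) generation projection
    w_gen       : (2*hidden, 1)          copy/generate gate projection
    """
    features = torch.cat([dec_state, context], dim=-1)       # (batch, 2*hidden)
    p_vocab = F.softmax(features @ W_vocab, dim=-1)          # generate in-vocab tokens
    p_gen = torch.sigmoid(features @ w_gen)                  # (batch, 1) mixing gate

    # Pad the generation distribution with zero-probability slots for the
    # source OOV tokens, then scatter attention mass onto the source ids so
    # rare identifiers can still be produced by copying.
    p_vocab_ext = F.pad(p_vocab, (0, num_src_oov))
    p_copy = torch.zeros_like(p_vocab_ext).scatter_add(1, src_ids, attn)

    return p_gen * p_vocab_ext + (1.0 - p_gen) * p_copy      # (batch, vocab+oov)
```

In this formulation, tokens that the fixed vocabulary cannot express receive probability mass only through the copy term, which is why the mechanism is decisive for project-specific identifiers and method names.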

Implications

The work bridges machine learning and program repair, highlighting the potential for data-driven models to outperform traditional rule-based techniques, which rely heavily on domain knowledge and handcrafted rules. SequenceR's success provides a compelling case for continued research in leveraging deep learning for software engineering tasks, which could lead to more generalized and adaptable program repair systems.

Moreover, the approach has theoretical implications for the development of future machine learning models in program repair. The extension of seq2seq models with a copy mechanism can be further explored to enhance their application to other programming languages and contexts.

Future Directions

The authors suggest several directions for future research, including the extension of the approach to handle multi-line patches and the integration of tree-to-tree transformation learning to complement token-based methods. Additionally, the problem of comprehensive fault localization and its integration with such machine learning models poses an opportunity for refinement in automated debugging processes.

In conclusion, SequenceR represents a substantial step forward in the application of machine learning to program repair, offering both practical benefits and a foundation for future innovations in the field.