Automated Patch Assessment for Program Repair at Scale (1909.13694v3)

Published 30 Sep 2019 in cs.SE

Abstract: In this paper, we perform automatic correctness assessment of patches generated by program repair systems. We treat the human-written patch as the ground-truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al. that we call Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems and evaluate automated patch assessment on this dataset. The results of this study are novel and significant. First, we improve the state-of-the-art performance of automatic patch assessment with RGT by 190% by improving the oracle. Second, we show that RGT is reliable enough to help scientists perform overfitting analysis when they evaluate program repair systems. Third, we improve the external validity of program repair knowledge with the largest study to date.
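The RGT idea described in the abstract can be illustrated with a toy sketch. This is not the paper's tooling or dataset: the functions `human_patch`, `overfitting_patch`, `generate_rgt_tests`, and `assess_patch` are hypothetical names invented for illustration, and the "program" is a one-line absolute-value function rather than a Defects4J bug. The sketch assumes only what the abstract states: tests are generated at random, their expected outputs come from the human-written patch, and a machine-generated patch that fails any of them is flagged as overfitting.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical ground truth: the human-written fix (absolute value).
def human_patch(x: int) -> int:
    return -x if x < 0 else x

# Hypothetical machine-generated patch that only handles the single
# input exercised by the original failing test (x == -5).
def overfitting_patch(x: int) -> int:
    return 5 if x == -5 else x

def generate_rgt_tests(
    oracle: Callable[[int], int], n: int, seed: int
) -> List[Tuple[int, int]]:
    """Draw random inputs and record the oracle's output as the expected value."""
    rng = random.Random(seed)
    inputs = [rng.randint(-100, 100) for _ in range(n)]
    return [(x, oracle(x)) for x in inputs]

def assess_patch(
    candidate: Callable[[int], int], tests: List[Tuple[int, int]]
) -> str:
    """Flag the candidate as overfitting if it disagrees with the oracle on any test."""
    for x, expected in tests:
        if candidate(x) != expected:
            return "overfitting"
    # Passing all generated tests is evidence, not proof, of correctness.
    return "no failure found"

tests = generate_rgt_tests(human_patch, n=200, seed=0)
verdict_overfit = assess_patch(overfitting_patch, tests)
verdict_truth = assess_patch(human_patch, tests)
print(verdict_overfit, "|", verdict_truth)
```

Note the asymmetry in the verdicts: a failing generated test shows that the candidate's behavior differs from the ground truth, whereas passing every generated test only means that random testing found no difference.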

Citations (72)
