Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 47 tok/s
Gemini 2.5 Pro 44 tok/s Pro
GPT-5 Medium 13 tok/s Pro
GPT-5 High 12 tok/s Pro
GPT-4o 64 tok/s Pro
Kimi K2 160 tok/s Pro
GPT OSS 120B 452 tok/s Pro
Claude Sonnet 4 36 tok/s Pro
2000 character limit reached

Advaita: Bug Duplicity Detection System (2001.10376v1)

Published 24 Jan 2020 in cs.SE and cs.AI

Abstract: Bugs are prevalent in software development. To improve software quality, bugs are filed using a bug tracking system. Properties of a reported bug would consist of a headline, description, project, product, component that is affected by the bug and the severity of the bug. Duplicate bugs rate (% of duplicate bugs) are in the range from single digit (1 to 9%) to double digits (40%) based on the product maturity , size of the code and number of engineers working on the project. Duplicate bugs range are between 9% to 39% in some of the open source projects like Eclipse, Firefox etc. Detection of duplicity deals with identifying whether any two bugs convey the same meaning. This detection of duplicates helps in de-duplication. Detecting duplicate bugs help reduce triaging efforts and saves time for developers in fixing the issues. Traditional natural language processing techniques are less accurate in identifying similarity between sentences. Using the bug data present in a bug tracking system, various approaches were explored including several machine learning algorithms, to obtain a viable approach that can identify duplicate bugs, given a pair of sentences(i.e. the respective bug descriptions). This approach considers multiple sets of features viz. basic text statistical features, semantic features and contextual features. These features are extracted from the headline, description and component and are subsequently used to train a classification algorithm.

Citations (1)

Summary

We haven't generated a summary for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Lightbulb On Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.