
Mining Software Repositories for Defect Categorization

Early detection of software defects is important for reducing software cost and, in turn, improving software quality. The success of a software organization depends not only on gaining knowledge about software defects but also, to a large extent, on how information about defects is collected and used. In the software industry, people at different levels, from customers to engineers, apply diverse mechanisms to assign defects to a particular class. Categorizing bugs by their characteristics helps the software development team take appropriate action to reduce similar defects that might be reported in future releases. Manual classification is time-consuming and effort-intensive, and it requires people with expert testing skills and domain knowledge to label the data. The need for automatic classification of software defects is therefore high.

This work attempts to categorize defects by proposing an algorithm called Software Defect CLustering (SDCL). It mines existing online bug repositories such as Eclipse, Bugzilla, and JIRA to analyze defect descriptions and their categorization. The proposed algorithm is based on text clustering and uses three major modules to determine the class to which a defect should be assigned. Software bug repositories hold defect data with attributes such as defect description, status, and open and close dates. The defect extraction module extracts defect descriptions from the various bug repositories and converts them into a unified format for further processing. The data preprocessing module removes unnecessary and irrelevant text from the defect data. Finally, a clustering technique groups the defect data into clusters of similar defects. The algorithm achieves a classification accuracy of more than 80% on all three of the above-mentioned repositories.
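
The three modules described above (extraction into a unified format, preprocessing, and clustering) can be illustrated with a minimal Python sketch. This is not the SDCL algorithm itself, whose details the abstract does not give. It assumes TF-IDF weighting, cosine similarity, and a greedy single-pass clustering rule. The field names (`description`, `summary`), the stop-word list, and the similarity threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which words are removed.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "to", "of",
              "and", "when", "with", "for", "it"}


def extract(record):
    """Module 1 (sketch): map a repository-specific record to a unified format.
    The input field names are assumptions, not real repository schemas."""
    text = record.get("description") or record.get("summary") or ""
    return {"id": record.get("id"), "text": text}


def preprocess(text):
    """Module 2 (sketch): lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.3):
    """Module 3 (sketch): greedy single-pass clustering. Each defect joins the
    first cluster whose seed (first member) is similar enough, else starts a
    new cluster. The threshold is an assumed value."""
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up defect records standing in for data from different repositories.
raw = [
    {"id": 1, "description": "Crash when opening the editor with a null file"},
    {"id": 2, "summary": "Editor crash on null file open"},
    {"id": 3, "description": "Login page button color is wrong"},
]
defects = [extract(r) for r in raw]
docs = [preprocess(d["text"]) for d in defects]
groups = cluster(tfidf_vectors(docs))
print(groups)  # [[0, 1], [2]]
```

The two editor-crash reports share most of their terms and fall into one cluster, while the unrelated login-page report forms its own. A real pipeline would likely add stemming and a more robust clustering method, and measuring classification accuracy would require labeled defect data.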

Related Results

Data mining tools
The development and application of data mining algorithms requires the use of powerful software tools. As the number of available tools continues to grow, the choice of the...
Categorizing Motion: Story-Based Categorizations
Our primary goal is to provide a motion categorization for moving entities. A motion categorization that is related to how humans categorize motion, i.e., that is cognitive ...
Clinical and Radiographic Assessment of Periodontal Infrabony Defect Depth and Width and Their Correlation
Brief Background: There is preliminary evidence of periodontal defect depth, number of walls and the width of infrabony defects exerting influence on the regenerative potential of p...
Impact of Mining on Socioeconomic Status in Puno, Peru
This study examines the direct and indirect effects of mining activities on key socioeconomic indicators such as per capita income, the Human Development Index (HDI), and education...
Towards Transparent Presentation of FAIR-enabling Data Repository Functions & Characteristics
Identifying, finding and gaining a sufficient overview of the functions and characteristics of data repositories and their catalogues is essential for users of data repositories an...
IRUS-UK: Improving understanding of the value and impact of institutional repositories
Many educational institutions have repositories for research outputs. The number of items available through institutional repositories ...
Optimisation of potash mining technology for cell and pillar mining method
The diverse demand for inorganic fertilizers has predetermined the intensification of potash mining, which is a raw material for their production. In this regard, it has become nec...
The Significance of Text Mining in Research: A Comprehensive Review
Text mining has emerged as a pivotal tool in various domains of research, revolutionizing the way scholars and scientists extract valuable insights from vast volumes of textual dat...
