Cite as:
@misc{yu2020identifying,
title={Identifying Self-Admitted Technical Debts with Jitterbug: A Two-step Approach},
author={Zhe Yu and Fahmid Morshed Fahid and Huy Tu and Tim Menzies},
year={2020},
eprint={2002.11049},
archivePrefix={arXiv},
primaryClass={cs.SE}
}
- Original data from Maldonado and Shihab, "Detecting and quantifying different types of self-admitted technical debt," in 2015 IEEE 7th International Workshop on Managing Technical Debt (MTD). IEEE, 2015, pp. 9–15.
- Corrected: 439 labels checked, 431 labels corrected.
Jitterbug$ pip install -r requirements.txt
Jitterbug$ python
>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('wordnet')
Jitterbug$ cd src
- Prepare data:
src$ python main.py parse
- Find patterns with Easy (target project = apache-ant-1.7.0):
src$ python main.py find_patterns
{'fp': 367.0, 'tp': 2493.0, 'fitness': 0.8716783216783217}
{'fp': 53.0, 'tp': 330.0, 'fitness': 0.8616187989556136}
{'fp': 7.0, 'tp': 87.0, 'fitness': 0.925531914893617}
{'fp': 3.0, 'tp': 46.0, 'fitness': 0.9387755102040817}
{'fp': 28.0, 'tp': 61.0, 'fitness': 0.6853932584269663}
Patterns:
[u'todo' u'fixme' u'hack' u'workaround']
Precisions on training set:
{u'fixme': 0.8616187989556136, u'todo': 0.8716783216783217, u'workaround': 0.9387755102040817, u'hack': 0.925531914893617}
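The per-pattern fitness values above are simply each pattern's precision on the training set, tp / (tp + fp). A minimal sketch (not the repository's code) reproducing the reported numbers from the tp/fp counts printed during pattern mining:

```python
def fitness(tp, fp):
    """Precision of a keyword pattern: true positives over all matches."""
    return tp / (tp + fp)

# (tp, fp) counts reported above for each mined pattern:
patterns = {
    "todo":       (2493.0, 367.0),
    "fixme":      (330.0, 53.0),
    "hack":       (87.0, 7.0),
    "workaround": (46.0, 3.0),
}

for name, (tp, fp) in patterns.items():
    print(name, fitness(tp, fp))
# e.g. todo -> 0.8716783216783217, matching the precisions printed above
```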
- Test Easy on every target project, save output as step1_Easy_original.csv:
src$ python main.py Easy_results original
- Test MAT on every target project, save output as step1_MAT_original.csv:
src$ python main.py MAT_results original
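MAT labels a comment as SATD whenever it contains a task-annotation tag. The sketch below illustrates the idea, assuming the four standard tags (todo, fixme, hack, xxx) and whole-word, case-insensitive matching; the repository's implementation may differ in tokenization details:

```python
import re

# Assumed tag set for the MAT heuristic (illustrative, not repo code).
MAT_TAGS = ("todo", "fixme", "hack", "xxx")
TAG_RE = re.compile(r"\b(" + "|".join(MAT_TAGS) + r")\b", re.IGNORECASE)

def mat_label(comment: str) -> str:
    """Return 'yes' (SATD) if the comment contains a task tag, else 'no'."""
    return "yes" if TAG_RE.search(comment) else "no"

print(mat_label("TODO: remove this workaround"))  # yes
print(mat_label("compute the checksum"))          # no
print(mat_label("hackathon schedule"))            # no (whole-word match only)
```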
- Find conflicting labels (GT=no AND Easy=yes) and save them as CSV files under the conflicts directory:
src$ python main.py validate_ground_truth
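The conflict rule itself is a simple filter: a label is suspicious when the ground truth says "no" but the mined Easy patterns predict "yes". A hedged sketch (the dict keys below are illustrative, not the repository's actual schema):

```python
def find_conflicts(rows):
    """Keep rows where the ground truth and the Easy prediction disagree
    in the direction that suggests a missed SATD label."""
    return [r for r in rows if r["label"] == "no" and r["easy"] == "yes"]

rows = [
    {"id": 1, "label": "no",  "easy": "yes"},  # conflict: candidate mislabel
    {"id": 2, "label": "yes", "easy": "yes"},  # agreement
    {"id": 3, "label": "no",  "easy": "no"},   # agreement
]
print([r["id"] for r in find_conflicts(rows)])  # [1]
```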
- Validate the conflicting labels manually; the results are saved under the validate directory.
- Summarize validation results and save as validate_sum.csv:
src$ python main.py summarize_validate
- Correct the ground truth labels with the validation results; the new data is saved under the corrected directory:
src$ python main.py correct_ground_truth
- Test Easy on every target project with the corrected labels, save output as step1_Easy_corrected.csv, and also output the data with the "easy to find" SATDs removed to the rest directory:
src$ python main.py Easy_results corrected
- Test MAT on every target project with corrected labels, save output as step1_MAT_corrected.csv:
src$ python main.py MAT_results corrected
- Test Hard, TM, and other supervised learners on every target project with "easy to find" SATDs removed, save results (rest_*.csv) to the results directory, and dump results as rest_result.pickle:
src$ python main.py rest_results
- Plot recall-cost curves of Step2 experiments to figures_rest directory:
src$ python main.py plot_recall_cost rest
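A recall-cost curve tracks how many SATDs are found as more comments are inspected in ranked order. The sketch below assumes cost = fraction of comments inspected and recall = fraction of SATDs found (a simplified illustration, not the repository's plotting code):

```python
def recall_cost_curve(ranked_labels):
    """Given a ranked list where 1 marks an SATD, return (cost, recall)
    points as comments are inspected one at a time in ranked order."""
    total = sum(ranked_labels)
    n = len(ranked_labels)
    found = 0
    curve = []
    for i, y in enumerate(ranked_labels, start=1):
        found += y
        curve.append((i / n, found / total))
    return curve

print(recall_cost_curve([1, 0, 1, 0]))
# [(0.25, 0.5), (0.5, 0.5), (0.75, 1.0), (1.0, 1.0)]
```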
- Use estimator to stop at 90% recall, plot curves to figures_est directory:
src$ python main.py estimate_results
- Test Jitterbug, Easy+RF, Hard, MAT+RF, TM, RF on every target project, save APFD results as overall_APFD.csv, AUC results as overall_AUC.csv, and dump results as overall_result.pickle:
src$ python main.py overall_results
- Plot overall recall-cost curves to figures_overall directory:
src$ python main.py plot_recall_cost overall
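APFD (Average Percentage of Faults Detected) summarizes each ranking with a single number: higher is better, and rankings that surface SATDs earlier score higher. A sketch using the standard APFD formula (not necessarily byte-identical to the repository's implementation):

```python
def apfd(ranked_labels):
    """Standard APFD: 1 - (sum of 1-based SATD positions) / (n * m) + 1 / (2n),
    where n = number of comments and m = number of SATDs."""
    n = len(ranked_labels)
    positions = [i for i, y in enumerate(ranked_labels, start=1) if y]
    m = len(positions)
    return 1 - sum(positions) / (n * m) + 1 / (2 * n)

print(apfd([1, 1, 0, 0]))  # SATDs ranked first  -> 0.75
print(apfd([0, 0, 1, 1]))  # SATDs ranked last   -> 0.25
```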
- Collect precision, recall, F1, and cost results of Jitterbug on the original dataset, and save the results as stopping_0.9_original.csv:
src$ python main.py stopping_results original
- Collect precision, recall, F1, and cost results of Jitterbug on the corrected dataset, and save the results as stopping_0.9_corrected.csv:
src$ python main.py stopping_results corrected
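The collected metrics follow the standard definitions from the confusion counts. A minimal sketch, assuming tp/fp/fn denote true positives, false positives, and false negatives:

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from confusion counts (standard definitions)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(prf1(90, 10, 10))  # (0.9, 0.9, 0.9)
```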
- Extract code comments from the target project with srcML. The file httpd.csv contains the extracted code comments from Apache httpd-2.4.6.
- Apply Jitterbug interactively on httpd.csv:
src$ python main.py apply_Jitterbug
- It will first train on the 10 corrected projects to find the patterns (Easy), then extract the "easy-to-find" SATDs as httpd_easy.csv. The remaining comments are output as httpd_rest.csv.
- Then it will execute Hard and pause at a breakpoint, outputting a csv file called httpd_query.csv for the user to inspect and label. An example httpd_query.csv is shown below; the user should edit the "code" column based on the comments in the "Abstract" column: "yes" if the comment is an SATD, "no" otherwise.
- After labeling the httpd_query.csv file, the user should continue the execution by entering:
(Pdb) c
- Jitterbug will load the labeled httpd_query.csv file and retrain its internal model to make better predictions for the "hard-to-find" SATDs. It will pause again at the breakpoint and output a new httpd_query.csv for the user to label. Current status will also be shown as # SATDs found / # comments inspected / # estimated SATDs:
41/50/3855
- This process iterates until # SATDs found >= target recall (e.g. 90%) * # estimated SATDs.
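The stopping rule can be sketched directly from that condition (an illustration, not the repository's estimator code, which also has to produce the estimate itself):

```python
def should_stop(found, estimated, target_recall=0.9):
    """Stop querying once the SATDs found reach the target fraction of the
    estimated total number of SATDs."""
    return found >= target_recall * estimated

# With the example status line "41/50/3855": 41 found, 3855 estimated,
# so the process keeps querying.
print(should_stop(41, 3855))  # False
```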
- Below shows the first five iterations:
- Once stopped, the labeled SATDs can be found in httpd_rest_coded.csv.