Cross-language refactoring detection method based on edit sequence

To address the problems of unreliable commit messages, caused by developers not consistently recording refactoring operations, and the single-language limitation of deep learning-based refactoring detection methods, a cross-language refactoring detection method, RefCode, was proposed. Firstly, refactoring collectio...

Bibliographic Details
Main Authors: Tao LI, Dongwen ZHANG, Yang ZHANG, Kun ZHENG
Format: Article
Language: Chinese (zho)
Published: Hebei University of Science and Technology 2024-12-01
Series: Journal of Hebei University of Science and Technology
Subjects: software engineering; refactoring detection; deep learning; cross-language; code change; edit sequence
Online Access: https://xuebao.hebust.edu.cn/hbkjdx/article/pdf/b202406007?st=article_issue
author Tao LI
Dongwen ZHANG
Yang ZHANG
Kun ZHENG
collection DOAJ
description To address two problems, the unreliability of commit messages caused by developers not consistently recording refactoring operations and the single-language limitation of deep learning-based refactoring detection methods, a cross-language refactoring detection method, RefCode, was proposed. First, refactoring collection tools were used to collect commit messages, code change information, and refactoring types from different programming languages; edit sequences were generated from the code change information, and all the data were combined into a dataset. Second, the CodeBERT pre-trained model was combined with a BiLSTM-attention model, which was trained and tested on the dataset. Finally, the effectiveness of the proposed method was evaluated from six perspectives. The results show that RefCode improves both precision and recall by about 50% compared with a refactoring detection method that feeds only commit messages into an LSTM model. The work realizes cross-language refactoring detection and effectively compensates for unreliable commit messages, providing a reference for detecting refactorings in other programming languages and of other refactoring types.
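The description says that edit sequences are generated from code change information, but this record does not give their exact format. The following is a minimal Python sketch of one plausible construction, assuming a token-level alignment of the code before and after a change. The `tokenize` and `edit_sequence` helpers, the regular-expression tokenizer, and the (operation, old token, new token) triple layout are all illustrative assumptions, not the authors' implementation.

```python
import difflib
import re


def tokenize(code: str) -> list[str]:
    """Split code into identifiers, integer literals, and single punctuation marks.

    Assumption: a language-agnostic regex tokenizer is enough for illustration.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def edit_sequence(before: str, after: str) -> list[tuple[str, str, str]]:
    """Align two code versions and return (operation, old_token, new_token) triples.

    Operations are "equal", "replace", "delete", or "insert"; an empty string
    marks the missing side of a delete or an insert.
    """
    old, new = tokenize(before), tokenize(after)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    sequence: list[tuple[str, str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            sequence.extend(("equal", tok, tok) for tok in old[i1:i2])
        elif tag == "delete":
            sequence.extend(("delete", tok, "") for tok in old[i1:i2])
        elif tag == "insert":
            sequence.extend(("insert", "", tok) for tok in new[j1:j2])
        else:
            # "replace": pair tokens position by position; leftovers on the
            # longer side become plain deletes or inserts.
            olds, news = old[i1:i2], new[j1:j2]
            for k in range(max(len(olds), len(news))):
                o = olds[k] if k < len(olds) else ""
                n = news[k] if k < len(news) else ""
                op = "replace" if o and n else ("delete" if o else "insert")
                sequence.append((op, o, n))
    return sequence
```

For example, `edit_sequence("int total = sum(a, b);", "int total = add(a, b);")` yields ten triples, of which the only non-"equal" one is `("replace", "sum", "add")`. Because the tokenizer does not depend on any one language's grammar, the same routine applies to changes in different programming languages, which is the property a cross-language detector would need from its input representation.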
format Article
id doaj-art-ad17a784f3924fdd9789900a4a6213a3
institution Kabale University
issn 1008-1542
language zho
publishDate 2024-12-01
publisher Hebei University of Science and Technology
record_format Article
series Journal of Hebei University of Science and Technology
spelling doaj-art-ad17a784f3924fdd9789900a4a6213a3; record timestamp 2025-01-05T06:35:22Z; zho; Hebei University of Science and Technology; Journal of Hebei University of Science and Technology; ISSN 1008-1542; 2024-12-01; vol. 45, no. 6, pp. 627-635; DOI 10.7535/hbkd.2024yx06007; article ID b202406007
Cross-language refactoring detection method based on edit sequence
Tao LI, Dongwen ZHANG, Yang ZHANG, Kun ZHENG (all four authors: School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
title Cross-language refactoring detection method based on edit sequence
topic software engineering; refactoring detection; deep learning; cross-language; code change; edit sequence
url https://xuebao.hebust.edu.cn/hbkjdx/article/pdf/b202406007?st=article_issue