Cross-language refactoring detection method based on edit sequence

To address two problems, namely that commit messages are unreliable because developers do not consistently record refactoring operations, and that deep learning-based refactoring detection methods support only a single language, a cross-language refactoring detection method, RefCode, is proposed. Firstly, refactoring collectio...

Bibliographic Details
Main Authors:	Tao LI, Dongwen ZHANG, Yang ZHANG, Kun ZHENG
Format:	Article
Language:	Chinese (zho)
Published:	Hebei University of Science and Technology 2024-12-01
Series:	Journal of Hebei University of Science and Technology
Subjects:	software engineering; refactoring detection; deep learning; cross-language; code change; edit sequence
Online Access:	https://xuebao.hebust.edu.cn/hbkjdx/article/pdf/b202406007?st=article_issue

author	Tao LI, Dongwen ZHANG, Yang ZHANG, Kun ZHENG
collection	DOAJ
description	To address two problems, namely that commit messages are unreliable because developers do not consistently record refactoring operations, and that deep learning-based refactoring detection methods support only a single language, a cross-language refactoring detection method, RefCode, is proposed. First, refactoring collection tools were used to gather commit messages, code change information, and refactoring types from different programming languages; edit sequences were generated from the code change information, and all of the data were combined into a dataset. Second, the CodeBERT pre-trained model was combined with a BiLSTM-attention model, which was trained and tested on the dataset. Finally, the effectiveness of the proposed method was evaluated from six perspectives. The results show that RefCode improves both precision and recall by about 50% compared with a refactoring detection method that feeds only commit messages into an LSTM model. The method achieves cross-language refactoring detection and effectively compensates for unreliable commit messages, providing a reference for detecting refactorings in other programming languages and of other refactoring types.
format	Article
id	doaj-art-ad17a784f3924fdd9789900a4a6213a3
institution	Kabale University
issn	1008-1542
spelling	doaj-art-ad17a784f3924fdd9789900a4a6213a3; 2025-01-05T06:35:22Z; zho; Hebei University of Science and Technology; Journal of Hebei University of Science and Technology; ISSN 1008-1542; 2024-12-01; vol. 45, no. 6, pp. 627-635; DOI 10.7535/hbkd.2024yx06007; b202406007; Cross-language refactoring detection method based on edit sequence; Tao LI, Dongwen ZHANG, Yang ZHANG, Kun ZHENG (all four authors: School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
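
The description above says that edit sequences are generated from code change information, but this record does not specify their format. As a rough illustration only, the following Python sketch derives a token-level edit sequence from the before and after versions of a code fragment using the standard-library difflib module. The function names `tokenize` and `edit_sequence`, the regular-expression tokenizer, and the `keep`/`delete`/`insert`/`replace` labels are assumptions made for this example and are not taken from RefCode.

```python
import difflib
import re


def tokenize(code: str) -> list[str]:
    # Crude, language-agnostic tokenizer (an assumption for this sketch):
    # runs of word characters, or any single non-space symbol.
    return re.findall(r"\w+|[^\w\s]", code)


def edit_sequence(before: str, after: str) -> list[tuple[str, str, str]]:
    """Return (operation, old_token, new_token) triples describing the change."""
    old_tokens, new_tokens = tokenize(before), tokenize(after)
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    ops: list[tuple[str, str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.extend(("keep", t, t) for t in old_tokens[i1:i2])
        elif tag == "delete":
            ops.extend(("delete", t, "") for t in old_tokens[i1:i2])
        elif tag == "insert":
            ops.extend(("insert", "", t) for t in new_tokens[j1:j2])
        else:
            # "replace": pair tokens positionally, padding the shorter side.
            old, new = old_tokens[i1:i2], new_tokens[j1:j2]
            for k in range(max(len(old), len(new))):
                o = old[k] if k < len(old) else ""
                n = new[k] if k < len(new) else ""
                ops.append(("replace", o, n))
    return ops


if __name__ == "__main__":
    # A variable rename, one of the simplest refactorings.
    for op in edit_sequence("int total = a + b;", "int sum = a + b;"):
        print(op)
```

Because the tokenizer relies only on word characters and symbols, the same routine can be applied to fragments in different programming languages, which is the property a cross-language dataset would need. A real pipeline would more likely use a language-aware or subword tokenizer before passing the sequence to a model.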


Similar Items