Sentiment Analysis on the PT Pertamina Corruption Case using IndoBERT and RCNN Methods

This study aims to evaluate the performance of a hybrid IndoBERT-RCNN model in classifying public sentiment toward the PT Pertamina corruption case, with a focus on how different hyperparameter combinations affect model accuracy. The dataset consists of 10,078 YouTube comments collected via the YouT...

Bibliographic Details
Main Authors: Wildan Jaya Kusoema, Ichsan Ibrahim
Format: Article
Language: Indonesian
Published: Islamic University of Indragiri 2025-09-01
Series: Sistemasi: Jurnal Sistem Informasi
Subjects: sentiment analysis, IndoBERT, RCNN, corruption, deep learning
Online Access: https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5392
author Wildan Jaya Kusoema
Ichsan Ibrahim
collection DOAJ
description This study evaluates the performance of a hybrid IndoBERT-RCNN model in classifying public sentiment toward the PT Pertamina corruption case, focusing on how different hyperparameter combinations affect model accuracy. The dataset consists of 10,078 YouTube comments collected via the YouTube Data API. The comments were preprocessed, automatically labeled using an Indonesian-language RoBERTa model, and class-balanced through undersampling and contextual embedding-based augmentation with IndoBERT. The architecture uses IndoBERT as a feature extractor and RCNN as the classifier, and was tested with various combinations of learning rates and batch sizes. The best configuration, a learning rate of 2e-5 with a batch size of 16, yielded an accuracy of 84% and an F1-score of 83%. The model classified negative comments well, but accuracy on the neutral and positive classes was lower because of semantic overlap and ambiguity in user expressions. This study contributes to Indonesian-language sentiment analysis by (1) applying the IndoBERT-RCNN architecture to sociopolitical issues, (2) systematically evaluating hyperparameter combinations on three-class public opinion data, and (3) using YouTube comments as a relevant source of informal public discourse. The findings have potential applications in real-time digital public opinion monitoring systems for strategic national issues.
format Article
id doaj-art-9d62cf3e3ba34ffaa4780ab55c16d14c
institution Kabale University
issn 2302-8149
2540-9719
language Indonesian
publishDate 2025-09-01
publisher Islamic University of Indragiri
record_format Article
series Sistemasi: Jurnal Sistem Informasi
spelling doaj-art-9d62cf3e3ba34ffaa4780ab55c16d14c | 2025-08-20T03:58:15Z | ind | Islamic University of Indragiri | Sistemasi: Jurnal Sistem Informasi | 2302-8149 | 2540-9719 | 2025-09-01 | 14 | 5 | 2246 | 2257 | 10.32520/stmsi.v14i5.5392 | 1181 | Sentiment Analysis on the PT Pertamina Corruption Case using IndoBERT and RCNN Methods | Wildan Jaya Kusoema (STMIK Indonesia Mandiri) | Ichsan Ibrahim (STMIK Indonesia Mandiri) | https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5392 | sentiment analysis | indobert | rcnn | corruption | deep learning
title Sentiment Analysis on the PT Pertamina Corruption Case using IndoBERT and RCNN Methods
topic sentiment analysis
indobert
rcnn
corruption
deep learning
url https://sistemasi.ftik.unisi.ac.id/index.php/stmsi/article/view/5392
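
The description in this record mentions three evaluation steps: balancing classes by undersampling, searching over learning rate and batch size, and reporting accuracy and F1. The following is a minimal Python sketch of those steps, not the authors' implementation. The function names, the `train_and_eval` callback, and the choice of macro-averaged F1 are assumptions made for illustration; the record does not say how the F1-score was averaged or which grid values were tried.

```python
import random
from itertools import product

# Three sentiment classes named in the description.
LABELS = ("negative", "neutral", "positive")


def undersample(rows, seed=0):
    """Randomly keep as many (text, label) rows per class as the rarest class has."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def grid_search(train_and_eval, learning_rates, batch_sizes):
    """Try every (learning rate, batch size) pair; return (best_score, lr, batch_size).

    train_and_eval is a hypothetical callback that trains a model with the
    given settings and returns a validation score where higher is better.
    """
    best = None
    for lr, bs in product(learning_rates, batch_sizes):
        score = train_and_eval(lr, bs)
        if best is None or score > best[0]:
            best = (score, lr, bs)
    return best
```

In practice `train_and_eval` would fine-tune the IndoBERT-RCNN model and score it on held-out comments; here it is left as a parameter so the search logic can be exercised without any deep learning dependency.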