Profiling Astroturfers on Facebook: A Complete Framework for Labeling, Feature Extraction, and Classification

The practice of online astroturfing has become increasingly pervasive in recent years, with the growth in popularity of social media. Astroturfing consists of promoting social, political, or other agendas in a non-transparent or deceitful way, where the promoters masquerade as normative users while...

Bibliographic Details
Main Authors: Jonathan Schler, Elisheva Bonchek-Dokow
Format: Article
Language: English
Published: MDPI AG 2024-09-01
Series: Machine Learning and Knowledge Extraction
Subjects: astroturfing, Facebook, machine learning, labeling, feature extraction
Online Access: https://www.mdpi.com/2504-4990/6/4/108
author Jonathan Schler
Elisheva Bonchek-Dokow
collection DOAJ
description The practice of online astroturfing has become increasingly pervasive in recent years, alongside the growing popularity of social media. Astroturfing consists of promoting social, political, or other agendas in a non-transparent or deceitful way, where the promoters masquerade as normative users while acting behind a mask that conceals their true identity and, at times, the fact that they are not human. In politics, astroturfing is currently considered one of the most severe online threats to democracy. The ability to automatically identify astroturfers thus constitutes a first step in eradicating this threat. We present a complete framework for handling a dataset of profiles, from data collection and efficient labeling, through feature extraction, to the identification of astroturfers lurking in the dataset. The data were collected over a period of 15 months, during which three consecutive elections were held in Israel. These raw data are unique in scope and size, consisting of several million public comments and reactions to posts on political candidates’ pages. For the manual labeling stage, we present a technique that can zoom in on a sufficiently large subset of astroturfer profiles, thus making the procedure highly efficient. The feature extraction stage consists of a temporal layer of features, which proves useful for identifying astroturfers. We then applied and compared several algorithms in the classification stage and achieved improved results, with an F1 score of 77% and an accuracy of 92%.
format Article
id doaj-art-e46caa80aa2b48b1a5810b68d9e74aaa
institution Kabale University
issn 2504-4990
language English
publishDate 2024-09-01
publisher MDPI AG
record_format Article
series Machine Learning and Knowledge Extraction
spelling doaj-art-e46caa80aa2b48b1a5810b68d9e74aaa
2024-12-27T14:37:26Z
eng
MDPI AG
Machine Learning and Knowledge Extraction
2504-4990
2024-09-01
6
4
2183
2200
10.3390/make6040108
Profiling Astroturfers on Facebook: A Complete Framework for Labeling, Feature Extraction, and Classification
Jonathan Schler
0
Elisheva Bonchek-Dokow
1
School of Computer Science, HIT (Holon Institute of Technology), Holon 5810201, Israel
School of Computer Science, HIT (Holon Institute of Technology), Holon 5810201, Israel
https://www.mdpi.com/2504-4990/6/4/108
astroturfing
Facebook
machine learning
labeling
feature extraction
title Profiling Astroturfers on Facebook: A Complete Framework for Labeling, Feature Extraction, and Classification
topic astroturfing
Facebook
machine learning
labeling
feature extraction
url https://www.mdpi.com/2504-4990/6/4/108