Detection of Adult Content in Arabic Tweets Using Machine Learning Models

Al-Anazi, Aram Ibrahim (2025) Detection of Adult Content in Arabic Tweets Using Machine Learning Models. International Journal of Innovative Science and Research Technology, 10 (9): 25sep393. pp. 1060-1065. ISSN 2456-2165

Abstract

This study evaluates the effectiveness of machine learning and deep learning models in detecting adult content in Arabic tweets, a task complicated by linguistic and cultural challenges specific to Arabic. Using a dataset of 33,691 Arabic tweets, we implemented and compared Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), and AraBERT. The data were preprocessed by cleaning and tokenization and then split into training, validation, and test sets. Model performance was assessed using accuracy, F1 score, and confusion matrices. AraBERT achieved the highest accuracy (100%), demonstrating superior capability in capturing contextual patterns for content classification. CNN and RNN also performed well, with accuracies of 94.27% and 94.22%, respectively, while LSTM achieved an accuracy of 88.37%. These findings highlight AraBERT's potential for effective content moderation in Arabic digital spaces, contributing to safer online environments.

Documents
IJISRT25SEP393.pdf - Published Version