Detection of Adult Content in Arabic Tweets Using Machine Learning Models

Al-Anazi, Aram Ibrahim

Al-Anazi, Aram Ibrahim (2025) Detection of Adult Content in Arabic Tweets Using Machine Learning Models. International Journal of Innovative Science and Research Technology, 10 (9): 25sep393. pp. 1060-1065. ISSN 2456-2165

Abstract

This study evaluates machine learning and deep learning models for detecting adult content in Arabic tweets, a task complicated by linguistic and cultural challenges specific to Arabic. Using a dataset of 33,691 Arabic tweets, we implemented and compared Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), and AraBERT. The data were preprocessed by cleaning and tokenization, then split into training, validation, and test sets. Model performance was assessed using accuracy, F1 score, and confusion matrices. AraBERT achieved the highest accuracy (100%), demonstrating superior capability in capturing contextual patterns for content classification. CNN and RNN also performed well, with accuracies of 94.27% and 94.22%, respectively, while LSTM achieved an accuracy of 88.37%. These findings highlight AraBERT's potential for effective content moderation in Arabic digital spaces, contributing to safer online environments.
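The evaluation metrics named in the abstract (accuracy, F1 score, and the confusion matrix) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes a binary labeling in which 1 marks adult content and 0 marks everything else, and the function names and toy labels are hypothetical, made up purely for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels (assumed: 1 = adult content)."""
    counts = Counter(zip(y_true, y_pred))
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted label matches the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))
print(accuracy(y_true, y_pred))
print(f1_score(y_true, y_pred))
```

Reporting F1 and the confusion matrix alongside accuracy helps when the two classes are unevenly represented, since accuracy alone can hide a high rate of missed positives.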
