Enhancing Hate Speech Detection in the Digital Age: A Novel Model Fusion Approach Leveraging a Comprehensive Dataset

Waqas Sharif; Saima Abdullah; Saman Iftikhar; Daniah Al-Madani; Shahzad Mumtaz

doi:10.1109/ACCESS.2024.3367281

Computing Science

Research output: Contribution to journal › Article › peer-review

Abstract

In the era of digital communication, social media platforms have experienced exponential growth, becoming primary channels for information exchange. However, this surge has also amplified the rapid spread of hate speech, prompting extensive research efforts for effective mitigation. These efforts have prominently featured advanced natural language processing techniques, particularly emphasizing deep learning methods that have shown promising outcomes. This article presents a novel approach to address this pressing issue, combining a comprehensive dataset of 18 sources. It includes 0.45 million comments sourced from various digital platforms spanning different time frames. There were two models utilized to address the diversity in the data and leverage distinct strengths found within deep learning frameworks: CNN and BiLSTM with an attention mechanism. These models were tailored to handle specific subsets of the data, allowing for a more targeted approach. The unique outputs from both models were then fused into a unified model. This methodology outperformed recent models, showcasing enhanced generalization capabilities even when tested on the largest and most diverse dataset. Our model achieved an impressive accuracy of 89%, while maintaining a high precision of 0.88 and recall of 0.91.
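The abstract does not say how the outputs of the two models are fused, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of per-model probabilities. It is not the authors' method. The function names, the fusion weight, the 0.5 threshold, and the toy scores are all assumptions made for illustration. The last helper shows how an F1 score follows from a precision and recall pair such as the 0.88 and 0.91 reported above.

```python
from typing import List, Tuple


def fuse_probabilities(p_cnn: List[float], p_bilstm: List[float],
                       weight_cnn: float = 0.5) -> List[float]:
    """Late fusion (assumed technique): weighted average of two models' scores."""
    if len(p_cnn) != len(p_bilstm):
        raise ValueError("both models must score the same comments")
    return [weight_cnn * a + (1.0 - weight_cnn) * b
            for a, b in zip(p_cnn, p_bilstm)]


def classify(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Label a comment as hate speech (1) when its score reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall_accuracy(y_true: List[int],
                              y_pred: List[int]) -> Tuple[float, float, float]:
    """Compute precision, recall and accuracy for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Toy, made-up scores for four comments (hypothetical, not from the paper).
p_cnn = [0.9, 0.2, 0.6, 0.4]      # hypothetical CNN probabilities
p_bilstm = [0.3, 0.4, 0.3, 0.8]   # hypothetical BiLSTM-with-attention probabilities
y_true = [1, 0, 0, 1]

fused = fuse_probabilities(p_cnn, p_bilstm)
y_pred = classify(fused)
print(precision_recall_accuracy(y_true, y_pred))
print(round(f1_score(0.88, 0.91), 3))
```

In this toy data each model alone mislabels at least one comment, while the averaged scores label all four correctly. That only illustrates why fusing complementary models can help and says nothing about the paper's actual results.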

Original language	English
Pages (from-to)	27225-27236
Number of pages	12
Journal	IEEE Access
Volume	12
Early online date	19 Feb 2024
DOIs	https://doi.org/10.1109/ACCESS.2024.3367281
Publication status	Published - 23 Feb 2024

Bibliographical note

The authors extend their appreciation to the Arab Open University for funding this work through AOU research fund No. (AOUKSA-524008).

Keywords

BiLSTM
CNN
deep learning
Hate speech detection
model fusion
natural language processing

Access to Document

10.1109/ACCESS.2024.3367281 (Licence: CC BY-NC-ND)

Sharif_etal_IEEEA_Enhancing_Hate_Speech_VOR: final published version, 7.25 MB. Licence: CC BY-NC-ND. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

https://ieeexplore.ieee.org/document/10439156/

Cite this

@article{24b4dae6acfe4990bf3d7ca88f13041a,

title = "Enhancing Hate Speech Detection in the Digital Age: A Novel Model Fusion Approach Leveraging a Comprehensive Dataset",

abstract = "In the era of digital communication, social media platforms have experienced exponential growth, becoming primary channels for information exchange. However, this surge has also amplified the rapid spread of hate speech, prompting extensive research efforts for effective mitigation. These efforts have prominently featured advanced natural language processing techniques, particularly emphasizing deep learning methods that have shown promising outcomes. This article presents a novel approach to address this pressing issue, combining a comprehensive dataset of 18 sources. It includes 0.45 million comments sourced from various digital platforms spanning different time frames. There were two models utilized to address the diversity in the data and leverage distinct strengths found within deep learning frameworks: CNN and BiLSTM with an attention mechanism. These models were tailored to handle specific subsets of the data, allowing for a more targeted approach. The unique outputs from both models were then fused into a unified model. This methodology outperformed recent models, showcasing enhanced generalization capabilities even when tested on the largest and most diverse dataset. Our model achieved an impressive accuracy of 89%, while maintaining a high precision of 0.88 and recall of 0.91.",

keywords = "BiLSTM, CNN, deep learning, Hate speech detection, model fusion, natural language processing",

author = "Waqas Sharif and Saima Abdullah and Saman Iftikhar and Daniah Al-Madani and Shahzad Mumtaz",

note = "The authors extend their appreciation to the Arab Open University for funding this work through AOU research fund No. (AOUKSA-524008).",

year = "2024",

month = feb,

day = "23",

doi = "10.1109/ACCESS.2024.3367281",

language = "English",

volume = "12",

pages = "27225--27236",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "IEEE",

}

TY - JOUR

T1 - Enhancing Hate Speech Detection in the Digital Age

T2 - A Novel Model Fusion Approach Leveraging a Comprehensive Dataset

AU - Sharif, Waqas

AU - Abdullah, Saima

AU - Iftikhar, Saman

AU - Al-Madani, Daniah

AU - Mumtaz, Shahzad

N1 - The authors extend their appreciation to the Arab Open University for funding this work through AOU research fund No. (AOUKSA-524008).

PY - 2024/2/23

Y1 - 2024/2/23

N2 - In the era of digital communication, social media platforms have experienced exponential growth, becoming primary channels for information exchange. However, this surge has also amplified the rapid spread of hate speech, prompting extensive research efforts for effective mitigation. These efforts have prominently featured advanced natural language processing techniques, particularly emphasizing deep learning methods that have shown promising outcomes. This article presents a novel approach to address this pressing issue, combining a comprehensive dataset of 18 sources. It includes 0.45 million comments sourced from various digital platforms spanning different time frames. There were two models utilized to address the diversity in the data and leverage distinct strengths found within deep learning frameworks: CNN and BiLSTM with an attention mechanism. These models were tailored to handle specific subsets of the data, allowing for a more targeted approach. The unique outputs from both models were then fused into a unified model. This methodology outperformed recent models, showcasing enhanced generalization capabilities even when tested on the largest and most diverse dataset. Our model achieved an impressive accuracy of 89%, while maintaining a high precision of 0.88 and recall of 0.91.

KW - BiLSTM

KW - CNN

KW - deep learning

KW - Hate speech detection

KW - model fusion

KW - natural language processing

UR - http://www.scopus.com/inward/record.url?scp=85186104352&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2024.3367281

DO - 10.1109/ACCESS.2024.3367281

M3 - Article

SN - 2169-3536

VL - 12

SP - 27225

EP - 27236

JO - IEEE Access

JF - IEEE Access

ER -
