Evaluation of Prompting Strategies for Cyberbullying Detection Using Various Large Language Models
Sentiment analysis helps detect toxic language, supporting safer online spaces, and helps businesses refine
strategies through customer feedback analysis [1, 2]. Advances in Large Language
Models (LLMs) and prompt engineering have introduced novel approaches to sentiment
analysis, cyberbullying detection, and toxicity classification. However, several challenges
persist, particularly in handling text ambiguity, sarcasm, multilingual contexts, and nuanced
emotional comprehension, which limit the ability to achieve accurate and human-aligned
results. This study uses the CYBY23 dataset, which contains 112 human-annotated threads.
To balance the dataset, synthetic threads were generated using ChatGPT, resulting in a final
dataset of 148 threads evenly distributed across two labels: 0 (bullying with no aggression)
and 1 (bullying with aggression). Three publicly available LLMs, Deepseek-r1-distill-llama-70b
(Deepseek), Qwen-2.5-32b (Qwen), and llama3-70b-8192 (Llama), were systematically evaluated
using zero-shot, one-shot, and few-shot prompting strategies, with all models accessed via
Groq Cloud APIs. Model outputs were assessed using recall, precision, F1 score, and accuracy
to compare performance across prompting techniques (PT). Qwen achieved the highest overall
accuracy, 82.43%, in the few-shot 2 setting, and Llama matched that accuracy in one-shot 2
while also performing solidly in few-shot settings. Deepseek showed high variability: it
performed well with the contextual enhancements of zero-shot 2 but struggled in one-shot and
fluctuated in few-shot settings. One-shot prompting proved most effective for Llama, while
few-shot methods worked best for both Qwen and Llama.
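
The four reported metrics can be illustrated with a minimal sketch for the binary labelling described above (0 = bullying with no aggression, 1 = bullying with aggression). This is not the authors' evaluation code: the function name `binary_metrics`, the choice of label 1 as the positive class, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 with label 1 as the positive class.

    Hypothetical helper for illustration; a zero denominator yields 0.0 (an assumed convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled thread is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

As a sanity check on the headline figure, and assuming accuracy is computed over all 148 threads, 82.43% corresponds to 122 correctly classified threads (122/148 ≈ 0.8243).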
Advances in Artificial Intelligence and Machine Learning


