Evaluation of Prompting Strategies for Cyberbullying Detection Using Various Large Language Models

Sentiment analysis helps detect toxic language, making online spaces safer, and helps businesses refine their strategies through customer feedback analysis [1, 2]. Advances in Large Language Models (LLMs) and prompt engineering have introduced new approaches to sentiment analysis, cyberbullying detection, and toxicity classification. However, several challenges persist, particularly in handling ambiguous text, sarcasm, multilingual contexts, and nuanced emotion, all of which limit how accurate and human-aligned the results can be. This study uses the CYBY23 dataset, which contains 112 human-annotated threads. To balance the dataset, synthetic threads were generated using ChatGPT, resulting in a final dataset of 148 threads evenly distributed across two labels: 0 (bullying with no aggression) and 1 (bullying with aggression). Three publicly available LLMs, Deepseek-r1-distill-llama-70b (Deepseek), Qwen-2.5-32b (Qwen), and llama3-70b-8192 (Llama), were systematically evaluated using zero-shot, one-shot, and few-shot prompting strategies, with all models accessed via Groq Cloud APIs. Model outputs were assessed using recall, precision, F1 score, and accuracy to compare performance across the prompting techniques (PT). Qwen achieved the highest overall accuracy, 82.43%, in the few-shot 2 configuration; Llama matched that accuracy in one-shot 2 and also performed solidly in the few-shot settings. Deepseek showed high variability: it thrived with the contextual enhancements of zero-shot 2 but struggled in one-shot and fluctuated across few-shot settings. One-shot prompting proved most effective for Llama, while few-shot methods worked best for both Qwen and Llama.
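The evaluation described in the abstract can be illustrated with a minimal Python sketch. It shows how one prompt builder might cover the zero-shot, one-shot, and few-shot cases, and how recall, precision, F1, and accuracy follow from binary predictions. The instruction wording, the function names `build_prompt` and `binary_metrics`, and the toy labels are all assumptions made for illustration; the abstract does not give the study's actual prompts or per-thread outputs, and no API call is made here.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical instruction text: the study's real prompt wording is not
# given in the abstract, so this string is only a placeholder.
INSTRUCTION = (
    "Classify the thread. Answer 1 if the bullying is aggressive, "
    "otherwise answer 0."
)


def build_prompt(thread: str, examples: Sequence[Tuple[str, int]] = ()) -> str:
    """Build a prompt: zero-shot with no examples, one-shot with one,
    few-shot with several labelled examples."""
    parts = [INSTRUCTION]
    for text, label in examples:
        parts.append(f"Thread: {text}\nLabel: {label}")
    # The thread to classify comes last, with the label left open.
    parts.append(f"Thread: {thread}\nLabel:")
    return "\n\n".join(parts)


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute precision, recall, F1, and accuracy for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = correct / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the study.
    toy_true = [1, 1, 1, 0, 0, 0]
    toy_pred = [1, 1, 0, 0, 0, 1]
    print(binary_metrics(toy_true, toy_pred))
    print(build_prompt("example thread", [("example A", 1), ("example B", 0)]))
```

As a consistency check on the reported figure: if all 148 threads were scored, an accuracy of 82.43% would correspond to 122 correct classifications, since 122/148 ≈ 0.8243.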

Advances in Artificial Intelligence and Machine Learning

Anamika Gupta Sakshi Garg Harsh Bamotra

Advances in Knowledge-Based Systems, Data Science, and Cybersecurity

2025

School Staff's Perceptions and Attitudes towards Cyberbullying

Parallel with the spread of technology use, cyberbullying has become a serious problem in schools, particularly those in developed countries where most young people have r...

Moderasi Anonimitas dalam Pengaruh Celebrity Worship terhadap Cyberbullying NCTzen

Abstract. Cyberbullying behavior has become a prevalent phenomenon within K-pop fandom communities, often influenced by the intensity of attachment to celebrities and the presence ...

Fenomena Cyberbullying di Media Sosial TikTok

TikTok is a social media platform that creates 15-60 second videos with various music features, filters, stickers, and other creative effects. The existence of the Tiktok social ne...

Empati dan Cyberbullying pada Remaja Pengguna Media Sosial: Sebuah Kajian Literatur

The many emerging social media features and easy access when using them have made the younger generation interested in using social media. However, excessive use of social media or...

Cyberbullying and its Relationship with Smartphone Addiction

Smartphone and internet overuse can result in severe problems in users’ personal and social lives. Against this backdrop, the present study aims to find the relationship between cyber...

Remaja, Media Sosial, dan Cyberbullying: Kajian Literatur

Social media and teenagers are interrelated and almost inseparable. Since the pandemic, the level of mobile phone usage has increased because learning activities and social interac...

MENAKAR PEMAAFAN PADA PENYINTAS CYBERBULLYING

The purpose of this study is to see an overview of cyberbullying survivors' forgiveness in post-pandemic life. This research uses a quantitative descriptive approach. The research ...

