
Evaluating Classical and Transformer-Based Models for Urdu Abstractive Text Summarization: A Systematic Review

The rapid growth of digital content in Urdu has created an urgent need for effective automatic text summarization (ATS) systems. While extractive methods have been widely studied, abstractive summarization for Urdu remains largely unexplored, primarily due to the language's complex morphology and rich literary tradition. This paper systematically evaluates four transformer-based language models (BERT-Urdu, BART, mT5, and GPT-2) for Urdu abstractive summarization, comparing their performance against conventional machine learning and deep learning approaches. Using multiple Urdu datasets, including the Urdu Summarization Corpus, Fake News Dataset, and Urdu-Instruct-News, we demonstrate that fine-tuned Transformer Language Models (TLMs) significantly outperform traditional methods, with the multilingual mT5 model achieving a 0.42% average improvement in F1-score over the best baseline. Our analysis reveals that mT5's architecture is particularly effective at handling Urdu-specific challenges such as right-to-left script processing, diacritic interpretation, and complex verb-noun compounding. The study presents empirically validated hyperparameter configurations and training strategies for Urdu ATS, establishing transformer-based approaches as the new state-of-the-art for Urdu text summarization. Our experiments demonstrate that mT5 outperforms Seq2Seq baselines by 20% in ROUGE-L, underscoring the efficacy of Transformer-based models for Urdu summarization despite limited resources, while offering practical insights for low-resource language NLP applications.

Related Results

A Systematic Review and Experimental Evaluation of Classical and Transformer-Based Models for Urdu Abstractive Text Summarization
The rapid growth of digital content in Urdu has created an urgent need for effective automatic text summarization (ATS) systems. While extractive methods have been widely studied, ...
Abstractive text summarization of low-resourced languages using deep learning
Background: Humans must be able to cope with the huge amounts of information produced by the information technology revolution. As a result, automatic text summarizat...
Automatic summarization of Malayalam documents using clause identification method
Text summarization is an active research area in the field of natural language processing. Huge amount of information in the internet necessitates the development of au...
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract: The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
Automatic Text Summarization Berdasarkan Pendekatan Statistika pada Dokumen Berbahasa Indonesia
Abstract—Propelled by the modern technological innovations data and text will be more abundant throughout the year. With this much text, automatic text summarization is needed now ...
Sleep Habits and Occurrence of Lowback Pain among Craftsmen
Automatic text summarization based on extractive-abstractive method
The choice of this study has a significant impact on daily life. In various fields such as journalism, academia, business, and more, large amounts of text need to be processed quic...
