
Automatic summarization of Malayalam documents using clause identification method

Text summarization is an active research area in natural language processing. The huge amount of information on the internet necessitates the development of automatic summarization systems. There are two types of summarization techniques: extractive and abstractive. Extractive summarization selects important sentences from the text and reproduces them in the summary exactly as they appear in the original document. Abstractive summarization produces a summary of the input text much as a human would write it, which requires semantic analysis of the text. Little work has been carried out on abstractive summarization in Indian languages, especially Malayalam, for which only extractive methods have been proposed. This paper proposes an abstractive summarization system for Malayalam documents based on clause identification. As part of this research, a POS tagger and a morphological analyzer for Malayalam words in the cricket domain were also developed. The clauses in the input sentences are identified using a modified clause identification algorithm. The clauses are then semantically analyzed to identify semantic triples: subject, object, and predicate. Each clause is scored using feature extraction, and the important clauses to be included in the summary are selected based on this score. Finally, an algorithm generates sentences from the semantic triples of the selected clauses; these sentences form the abstractive summary of the input documents.
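The last two stages described in the abstract (scoring clauses, then generating sentences from the triples of the selected clauses) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Clause` type, the two features (term frequency and position), the `POSITION_WEIGHT` constant, and the use of English placeholder words instead of Malayalam output from a real clause identifier. Sentences are emitted in subject-object-predicate order because Malayalam is predominantly a verb-final language.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical weight for the position feature; not taken from the paper.
POSITION_WEIGHT = 0.5


@dataclass(frozen=True)
class Clause:
    """A clause reduced to its semantic triple (illustrative representation)."""
    subject: str
    obj: str
    predicate: str
    position: int  # index of the clause in the source document


def score_clauses(clauses):
    """Score clauses with two assumed features: the mean document frequency
    of the triple's terms, plus a bonus for appearing early in the document."""
    term_counts = Counter(
        term for c in clauses for term in (c.subject, c.obj, c.predicate)
    )
    total = len(clauses)
    scores = {}
    for c in clauses:
        frequency = sum(
            term_counts[t] for t in (c.subject, c.obj, c.predicate)
        ) / 3
        earliness = 1.0 - c.position / total
        scores[c] = frequency + POSITION_WEIGHT * earliness
    return scores


def select_clauses(clauses, k):
    """Keep the k highest-scoring clauses, restored to document order."""
    scores = score_clauses(clauses)
    top = sorted(clauses, key=lambda c: scores[c], reverse=True)[:k]
    return sorted(top, key=lambda c: c.position)


def generate_sentence(clause):
    """Emit the triple in subject-object-predicate (verb-final) order."""
    return f"{clause.subject} {clause.obj} {clause.predicate}."


def summarize(clauses, k):
    """Join the generated sentences of the selected clauses."""
    return " ".join(generate_sentence(c) for c in select_clauses(clauses, k))
```

A real system would replace the placeholder features with those produced by the feature extraction step and would need morphological generation to inflect the output words correctly; the sketch only shows how scoring, selection, and generation fit together.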

Related Results

Envisioning Originalism Applied to Bioethics Cases
Originalism is an increasingly prevalent method for interpreting provisions of the US Constitution. It requires strict...
Automatic Text Summarization Berdasarkan Pendekatan Statistika pada Dokumen Berbahasa Indonesia
Propelled by modern technological innovations, data and text will become more abundant throughout the year. With this much text, automatic text summarization is needed now ...
SUBORDINATE CLAUSES IN ADULTERY NOVEL
A subordinate clause (dependent clause) is a clause that cannot stand alone as a complete sentence because it does not express a complete thought. It explains and gives more inform...
Social Network Integration in Document Summarization
In this chapter, the author presents the new role of summarization in the dynamic network of social media and its importance in semantic analysis of social media and large data. Th...
IIST BCI Dataset-1 for Selected Common Malayalam Words
Designing Brain Computer Interfaces (BCIs), for helping patients, needs appropriate datasets which are relevant for the language of the patients. There exists a significant shortag...
Advancements in Automatic Text Summarization using Natural Language Processing
With the rapid expansion of data across various domains, the need for automated text summarization has become increasingly crucial. Given the overwhelming volu...
