International Journal on Science and Technology

E-ISSN: 2229-7677     Impact Factor: 9.88

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal


Clustering technique for Automatic Kannada Text Summarization

Author(s) Arpitha Swamy
Country India
Abstract Text summarization is an application of natural language processing in the field of data mining, used to generate a summary of a document. It reduces the original text to a shorter form that retains the important information useful to the user in different real-life applications. Many techniques have been developed to summarize English text documents, but only a few methods exist for Kannada text because of the lack of resources and tools for the Kannada language. This paper discusses an extractive text summarization technique that selects the main sentences from a Kannada document. In the proposed approach, Term Frequency/Inverse Sentence Frequency (TF/ISF) is first used to compute a score for each sentence, and the sentences are then grouped with the K-means clustering algorithm to produce the extractive summary. The proposed model is evaluated with the ROUGE toolkit, measuring performance by the F-score of the generated summaries. Experimental studies on a custom-built dataset of 50 Kannada text documents show significantly better performance in producing extractive summaries as compared to human summaries.
Keywords Extractive, K-means algorithm, ROUGE, Text summarization, TF-ISF
Field Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In Volume 12, Issue 4, October-December 2021
Published On 2021-12-15
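
The pipeline described in the abstract (TF/ISF sentence scoring followed by K-means grouping) can be illustrated with a minimal sketch. This is not the paper's implementation. The whitespace tokenizer, the evenly spaced centroid initialization, the rule of taking the top-scoring sentence from each cluster, and all function names (`tf_isf_vectors`, `kmeans`, `summarize`) are assumptions made for illustration; real Kannada text would need proper tokenization and stemming.

```python
import math
from collections import Counter


def tf_isf_vectors(sentences):
    """Return (vocab, vectors); each vector holds TF * ISF weights per term."""
    # Assumption: whitespace tokenization stands in for a real Kannada tokenizer.
    tokenized = [s.split() for s in sentences]
    n = len(tokenized)
    # Sentence frequency: number of sentences that contain each term.
    sf = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(sf)
    # Inverse sentence frequency: log(N / sentence frequency).
    isf = {t: math.log(n / sf[t]) for t in vocab}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append([tf[t] * isf[t] for t in vocab])
    return vocab, vectors


def _sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means; returns one cluster label per vector."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of sentences")
    # Assumption: deterministic init from evenly spaced sentences.
    step = len(vectors) / k
    centroids = [list(vectors[int(i * step)]) for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: _sqdist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def summarize(sentences, k):
    """Pick the highest TF/ISF-scored sentence from each of k clusters."""
    _, vectors = tf_isf_vectors(sentences)
    # Sentence score: sum of the TF/ISF weights of its terms.
    scores = [sum(v) for v in vectors]
    labels = kmeans(vectors, k)
    best = {}
    for i, (label, score) in enumerate(zip(labels, scores)):
        if label not in best or score > scores[best[label]]:
            best[label] = i
    # Emit the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(best.values())]
```

Calling `summarize(sentences, k)` with a list of sentence strings returns at most `k` sentences in document order. Choosing one sentence per cluster is only one plausible way to combine scores with clusters; the abstract does not specify the selection rule.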
