ISSN 2063-5346

DOCUMENT CLASSIFICATION SYSTEM USING IMPROVISED RANDOM FOREST CLASSIFIER


Kavyashree Nagarajaiah, Madhu Hanakere Krishnappa, Asha Kempaiah Rukmini
doi: 10.48047/ecb/2023.12.si4.510

Abstract

Text classification is the process of assigning text to predefined categories based on its content. The amount of information available on the Internet has grown significantly over the past few years, making text classification one of the most important yet difficult tasks. It is used in a wide range of applications and for a variety of purposes. Blogs, Twitter, and other social media platforms contribute significantly to the exponential growth of textual data on the Internet, and the people who upload this text do not specify its type. Most researchers in this field are therefore looking for automated ways to categorize data or to assign a class label to unclassified documents, and a number of approaches to text categorization have been proposed. These methods typically involve gathering training data, preprocessing the text, extracting features, reducing features, representing the document, and then applying a classification algorithm to build a model that predicts the class of a new text document. In this paper, a Document Classification System using an Improvised Random Forest (DCS-IRF) classifier is proposed; it attains better performance than other classifiers and is implemented using a Python toolkit. DCS-IRF uses data from the Reuters-21578 dataset, a collection of news articles. The experimental results show that the IRF classifier achieves excellent accuracy and classifies text documents efficiently.
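
The general pipeline outlined in the abstract (preprocessing, feature extraction, feature reduction, classification) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: scikit-learn stands in for the unnamed Python toolkit, the stock RandomForestClassifier stands in for the authors' improvised variant, chi-squared selection stands in for the feature-reduction step, and a six-sentence toy corpus with two made-up topic labels stands in for Reuters-21578.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline

# Toy training data (hypothetical; NOT the Reuters-21578 corpus).
train_docs = [
    "wheat harvest rises as grain exports increase",
    "corn and wheat prices fall after record grain crop",
    "farmers expect strong grain shipments this season",
    "crude oil prices climb as opec cuts output",
    "oil refinery output drops amid crude supply fears",
    "opec ministers discuss crude oil production quotas",
]
train_labels = ["grain", "grain", "grain", "crude", "crude", "crude"]


def build_pipeline(num_features=5):
    """Chain the stages listed in the abstract into one estimator."""
    return Pipeline([
        # Preprocessing + feature extraction: lowercase, drop English
        # stop words, and weight the remaining terms by TF-IDF.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # Feature reduction: keep the terms most associated with the labels.
        ("select", SelectKBest(chi2, k=num_features)),
        # Classification: a standard random forest (assumption: the paper's
        # improvised variant is not reproduced here).
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])


model = build_pipeline()
model.fit(train_docs, train_labels)

# Predict the class of previously unseen documents.
new_docs = [
    "grain exports of wheat increase",
    "crude oil output cut by opec",
]
predictions = model.predict(new_docs)
for doc, label in zip(new_docs, predictions):
    print(f"{label}: {doc}")
```

Wrapping the stages in a single Pipeline means the vocabulary and the selected features are learned only from the training documents and then reused unchanged at prediction time, which avoids leaking information from new documents into the model.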
