IMPROVED FEATURE SET EXTRACTION FROM DOCUMENTS USING MODIFIED BAG OF WORDS

ICTACT Journal on Soft Computing (Volume: 11, Issue: 1)

Abstract

The literature describes several methods of collection and extraction that are also used to reduce dimensionality. Traditional methods are designed intuitively to remove redundant and outdated information so that new test cases can be defined more effectively. In the Bag of Words (BoW) model, however, the number of specific words must be determined manually, which costs time and effort and limits portability. In addition, the number of codebook vectors in BoW rises as the number of cancer types grows, which reduces the efficiency and accuracy of detection. The BoW model is therefore not ideal for multi-operative failure diagnosis. In this paper, we propose an improved BoW that selects the number of special terms required to collect cancer diagnostic functions from different documents. Its overall recognition and accuracy rates are higher than those of other existing extraction models. The improved BoW method has been verified to be highly effective under operating conditions that impose real-time requirements.
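
For readers unfamiliar with the baseline, the following is a minimal sketch of a conventional BoW feature extractor in which the vocabulary (codebook) size is fixed by hand, which is the limitation the abstract describes. It is not the authors' improved method, whose details the abstract does not give. The function names, the `vocab_size` parameter, the choice of document frequency as the ranking criterion, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents, vocab_size):
    """Keep the vocab_size terms that occur in the most documents.

    vocab_size is chosen manually here (an assumption for illustration),
    mirroring the conventional BoW setup rather than the improved one.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokenize(doc)))
    # Sort by descending document frequency, then alphabetically
    # so that ties are broken deterministically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:vocab_size]]


def bow_vector(document, vocabulary):
    """Map a document to a vector of term counts over the vocabulary."""
    counts = Counter(tokenize(document))
    return [counts[term] for term in vocabulary]


# Hypothetical sample documents, not taken from the paper.
documents = [
    "tumor cell growth",
    "tumor cell biopsy",
    "tumor imaging",
]
vocabulary = build_vocabulary(documents, vocab_size=2)
features = bow_vector("tumor tumor cell biopsy", vocabulary)
```

The length of each feature vector equals `vocab_size`, so a larger vocabulary directly increases dimensionality. This is why the choice of the number of terms matters for the efficiency concerns raised in the abstract.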

Authors

R Sathish Babu, R Nagarajan
Annamalai University, India

Keywords

Bag of Words, Cancer Document Retrieval, Codebook, Dimensionality Reduction

Published By
ICTACT
Published In
ICTACT Journal on Soft Computing
(Volume: 11, Issue: 1)
Date of Publication
October 2020
Pages
2213-2217
