ADAPTIVE CONTENT BASED TEXTUAL INFORMATION SOURCE PRIORITIZATION

ICTACT Journal on Soft Computing (Volume: 5, Issue: 1)

Abstract

The World Wide Web offers a multitude of textual information sources that are ready to be utilized for several applications. Given the rapidly evolving nature of online data, there is a real risk of information overload unless we continue to develop and refine techniques to meaningfully segregate these information sources. Specifically, there is a dearth of intelligent, content-oriented techniques that can learn from past search experiences and also adapt to a user’s specific requirements during her current search. In this paper, we tackle the core issue of prioritizing textual information sources on the basis of the relevance of their content to the central theme that a user is currently exploring. We propose a new Source Prioritization Algorithm that adopts an iterative learning approach to assess the proclivity of given information sources towards a set of user-defined seed words in order to prioritize them. The final priorities obtained serve as the initial priorities for the next search request. This serves a dual purpose. Firstly, the system learns incrementally from several users’ cumulative search experiences and re-adjusts the source priorities to reflect the acquired knowledge. Secondly, the refreshed source priorities are used to direct a user’s current search towards more relevant sources while also adapting to the new set of keywords acquired from that user. Experimental results show that the proposed algorithm progressively improves the system’s ability to discriminate between different sources, even in the presence of several random sources. Further, it scales well: when a new, enriched information source is generated by combining existing ones, the algorithm is able to identify the augmented source.
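
The abstract does not state the update rule, so the following is only an illustrative sketch of the general idea: score each source by how densely the seed words occur in it, blend that score with the priorities carried over from the previous request, and hand the result on as the starting point for the next request. The function names, the whitespace tokenization, the density measure, and the `rate` blending parameter are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def term_source_matrix(sources, seed_words):
    """Count how often each seed word occurs in each source (hypothetical helper)."""
    matrix = {}
    for name, text in sources.items():
        counts = Counter(text.lower().split())
        matrix[name] = {w: counts[w.lower()] for w in seed_words}
    return matrix


def update_priorities(sources, seed_words, priors=None, rate=0.5):
    """Blend carried-over priorities with seed-word density (assumed update rule)."""
    n = len(sources)
    matrix = term_source_matrix(sources, seed_words)

    # Seed-word density: seed-word hits divided by the source's token count.
    density = {}
    for name, text in sources.items():
        n_tokens = max(len(text.split()), 1)
        density[name] = sum(matrix[name].values()) / n_tokens

    # Normalize densities into relevance scores; fall back to uniform if no hits.
    total = sum(density.values())
    relevance = {s: (d / total if total else 1 / n) for s, d in density.items()}

    # The first request starts from uniform priorities.
    if priors is None:
        priors = {s: 1 / n for s in sources}

    blended = {
        s: (1 - rate) * priors.get(s, 1 / n) + rate * relevance[s]
        for s in sources
    }
    norm = sum(blended.values())
    return {s: v / norm for s, v in blended.items()}


# Two toy sources and two successive search requests.
sources = {
    "A": "neural network training neural network",
    "B": "cooking recipes pasta sauce",
}
p1 = update_priorities(sources, ["neural", "network"])
p2 = update_priorities(sources, ["training"], priors=p1)  # p1 seeds the next request
print(p1, p2)
```

In this toy run the priority of source A rises across the two requests because both sets of seed words occur only in A, which mirrors the incremental learning behaviour described above without claiming to reproduce the authors' algorithm.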

Authors

Nikhil Mitra, Nilanjana Goel, S. Chakraverty, Gurmeet Singh
Netaji Subhas Institute of Technology, India

Keywords

Textual Information Source Prioritization, Search Engines, Domain Specificity, Term-Source Matrix, Text Information Density

Published By: ICTACT
Published In: ICTACT Journal on Soft Computing (Volume: 5, Issue: 1)
Date of Publication: October 2014
Pages: 829-835
