vioft2nntf2t|tblJournal|Abstract_paper|0xf4ff0ace0a000000108a010001000500
The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to web data and documents. Web content mining and web structure mining have important roles in identifying the relevant web page. Relevancy of web page denotes how well a retrieved web page or set of web pages meets the information need of the user. Page Rank, Weighted Page Rank and Hypertext Induced Topic Selection (HITS) are existing algorithms which considers only web structure mining. Vector Space Model (VSM), Cover Density Ranking (CDR), Okapi similarity measurement (Okapi) and Three-Level Scoring method (TLS) are some of existing relevancy score methods which consider only web content mining. In this paper, we propose a new algorithm, Weighted Page with Relevant Rank (WPRR) which is blend of both web content mining and web structure mining that demonstrates the relevancy of the page with respect to given query for two different case scenarios. It is shown that WPRR’s performance is better than the existing algorithms.