AN IMPROVED TERM WEIGHTING AND DOCUMENT RANKING METHOD USING RANDOM WALK MODEL FOR INFORMATION RETRIEVAL
DOI:
https://doi.org/10.53808/KUS.2010.10.1and2.0837-E
Keywords:
Information retrieval, random walk model, term weight, term position, information gain
Abstract
Document representation is one of the most fundamental issues in information retrieval applications. Graph-based ranking algorithms represent a document as a graph. Once a document is represented as a graph, its similarity to a query can be calculated in various ways, and the resulting scores are used to rank documents. This paper introduces an improved random-walk method that ranks a document by considering the position of each term within the document and the information gain of that term across the whole document set. Experiments on several document collections show that our approach achieves higher recall and precision than previously proposed methods.
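As a rough illustration of the kind of computation described above, the following sketch builds a term co-occurrence graph for one document and runs a PageRank-style random walk over it. It is not the paper's formula: the function names, the window size, the damping factor, and the position prior (earlier first occurrence receives more teleport mass) are all assumptions made for illustration, and the information-gain factor is omitted because the abstract does not define it.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Undirected graph linking each term to the terms in the next `window` tokens."""
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def position_prior(tokens):
    """Hypothetical prior: terms first seen earlier in the document get more mass."""
    first_seen = {}
    for i, term in enumerate(tokens):
        first_seen.setdefault(term, i)
    raw = {term: 1.0 / (1 + pos) for term, pos in first_seen.items()}
    total = sum(raw.values())
    return {term: value / total for term, value in raw.items()}


def random_walk_weights(tokens, window=2, damping=0.85, iterations=50):
    """Random-walk term weights; the teleport step follows the position prior."""
    graph = cooccurrence_graph(tokens, window)
    prior = position_prior(tokens)
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for term in prior:
            # Each neighbor passes on an equal share of its current score.
            incoming = sum(scores[n] / len(graph[n]) for n in graph[term])
            updated[term] = (1 - damping) * prior[term] + damping * incoming
        scores = updated
    return scores


tokens = "information retrieval ranks documents by term weight and term position".split()
weights = random_walk_weights(tokens)
for term, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{term}: {weight:.4f}")
```

A collection-level factor such as information gain could, under the same assumptions, be multiplied into the prior before normalization; how the paper actually combines the two signals is not stated in the abstract.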
References
Blanco, R. and Lioma, C. 2007. Random Walk Term Weighting for Information Retrieval. In: Proceedings of Special Interest Group Information Retrieval Amsterdam, Netherlands
Brin, S. and Page, L. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems
Hassan, S., Mihalcea, R. and Banea, C. 2006. Random Walk Term Weighting for Improved Text Classification. In: Proceedings of TextGraphs: 2nd Workshop on Graph Based Methods for Natural Language Processing ACL: 53-60
Lancaster, F.W. 1968. Information Retrieval Systems: Characteristics, Testing and Evaluation. Wiley, New York
Mihalcea, R. and Tarau, P. 2006. TextRank: Bringing Order into Texts. In: Proceedings of Empirical Methods in Natural Language Processing ACL: 404-411
Mooers, C.N. 1950. Information retrieval viewed as temporal signaling. pp 572-573. In: Proceedings of the International Congress of Mathematicians. Volume 1
Salton, G. and Buckley, C. 1988. Term-weighting approaches in automatic text retrieval. Information Processing and Management: an International Journal 24(5): 513-523
Sun, Y., He, P. and Chen, Z. 2004. An Improved Term Weighting Scheme for Vector Space Model. In: Proceedings of the Third International Conference on Machine Learning & Cybernetics. Shanghai
Yun-tao, Z., Ling, G. and Yong-cheng, W. 2004. An improved TF-IDF approach for text classification. Journal of Zhejiang University SCIENCE:
Hofmann, T. 1999. Probabilistic Latent Semantic Indexing. In: Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval
License
Copyright (c) 2022 Khulna University Studies
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.