AN IMPROVED TERM WEIGHTING AND DOCUMENT RANKING METHOD USING RANDOM WALK MODEL FOR INFORMATION RETRIEVAL

Authors

  • Md. Rafiqul Islam, Computer Science and Engineering Discipline, Khulna University, Khulna 9208, Bangladesh
  • Buddha Dev Sarkar, Computer Science and Engineering Discipline, Khulna University, Khulna 9208, Bangladesh
  • Md. Rakibul Islam, Computer Science and Engineering Discipline, Khulna University, Khulna 9208, Bangladesh

DOI:

https://doi.org/10.53808/KUS.2010.10.1and2.0837-E

Keywords:

Information retrieval, random walk model, term weight, term position, information gain

Abstract

Document representation is one of the most fundamental issues in information retrieval applications. Graph-based ranking algorithms represent a document as a graph. Once a document is represented as a graph, its similarity to a query can be calculated in various ways, and this similarity is used to rank the documents. This paper introduces an improved random-walk method that ranks a document by considering the position of each term within the document and the information gain of that term over the whole document set. Experiments on various collections show that our approach achieves higher recall and precision than previously proposed methods.
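
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the generic idea it builds on: terms become nodes of a co-occurrence graph, and a PageRank-style random walk assigns each term a weight. The function name `random_walk_term_weights`, the window size, the damping factor, and the iteration count are illustrative assumptions, not values from the paper. The term-position and information-gain components that the paper adds are not reproduced here.

```python
from collections import defaultdict


def random_walk_term_weights(tokens, window=2, damping=0.85, iterations=50):
    """Score terms with a PageRank-style walk over a co-occurrence graph.

    Hypothetical helper for illustration; not the authors' method.
    """
    # Build an undirected graph: two distinct terms are linked when they
    # occur at most `window` token positions apart.
    neighbours = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbours[term]  # make sure every term has a node, even if isolated
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != term:
                neighbours[term].add(other)
                neighbours[other].add(term)

    # Start from uniform scores and repeatedly redistribute them: each term
    # receives an equal share of the score of every neighbour.
    scores = {term: 1.0 for term in neighbours}
    for _ in range(iterations):
        updated = {}
        for term in neighbours:
            incoming = sum(
                scores[n] / len(neighbours[n]) for n in neighbours[term]
            )
            updated[term] = (1.0 - damping) + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    doc = "graph ranking uses graph walk over graph terms".split()
    weights = random_walk_term_weights(doc)
    # Print terms from highest to lowest weight.
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.3f}")
```

In a ranking setting, weights of this kind would typically replace raw term frequency when scoring a document against a query; how the paper combines them with position and information gain is described in the full text, not on this page.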

Published

25-11-2010

How to Cite

[1]
M. R. Islam, B. D. Sarkar, and M. R. Islam, “AN IMPROVED TERM WEIGHTING AND DOCUMENT RANKING METHOD USING RANDOM WALK MODEL FOR INFORMATION RETRIEVAL”, Khulna Univ. Stud., pp. 223–232, Nov. 2010.

Section

Science and Engineering
