DOI: 10.1145/1936254.1936283
research-article

A new approach in dynamic prediction for user based web page crawling

Published: 26 October 2010

Abstract

Most available Web prediction techniques rely on a Markov model. A typical Web page contains many Web links (URLs), which makes it hard to predict which page the user will visit next. Existing approaches predict successfully on private (personal) computers using various Markov models. On public computers, such as those in a cyber cafe, these approaches cannot predict at all, because many people share the same machine. In this paper, we propose a new policy for Web prediction based on the dynamic behavior of users, and we demonstrate four procedures that make Web prediction faster. Our technique does not require any Web log or usage history on the client machine. Instead of a Markov model, we track the mouse position and its direction of movement to predict the next Web page. The scheme is fully dynamic, since it uses no Web log or any other static or historical information. At runtime, we try to minimize the number of Web links that must be considered on a Web page in order to achieve better accuracy in dynamic Web prediction. Our approach shows the step-wise build-up of a Web prediction program that is suitable for both private and public scenarios. Overall, the method offers a new way of predicting from the dynamic behavior of the respective users.
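
The abstract does not spell out the four procedures, so the following is only an illustrative sketch of the general idea it describes: using the mouse position and its direction of movement to reduce the set of links considered at runtime. The `Link` class, the `candidate_links` function, the 30-degree cone, and the nearest-first ordering are all assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical record for a link on the page (not from the paper)."""
    url: str
    x: float  # centre of the link on the page, in pixels
    y: float


def candidate_links(links, prev_pos, cur_pos, max_angle_deg=30.0):
    """Keep only links lying within max_angle_deg of the mouse's direction
    of travel, ordered nearest first. The cone width is an assumed parameter."""
    dx = cur_pos[0] - prev_pos[0]
    dy = cur_pos[1] - prev_pos[1]
    if dx == 0 and dy == 0:
        return []  # the mouse has not moved, so there is no direction to use
    heading = math.atan2(dy, dx)

    scored = []
    for link in links:
        lx = link.x - cur_pos[0]
        ly = link.y - cur_pos[1]
        if lx == 0 and ly == 0:
            scored.append((0.0, link))  # pointer is already on the link
            continue
        bearing = math.atan2(ly, lx)
        # Smallest absolute difference between the two angles, in [0, pi].
        diff = abs((bearing - heading + math.pi) % (2 * math.pi) - math.pi)
        if math.degrees(diff) <= max_angle_deg:
            scored.append((math.hypot(lx, ly), link))

    scored.sort(key=lambda pair: pair[0])
    return [link for _, link in scored]


links = [Link("/news", 400, 100), Link("/about", 100, 400), Link("/contact", 600, 110)]
# Mouse moved from (100, 100) to (200, 100), i.e. straight to the right.
print([link.url for link in candidate_links(links, (100, 100), (200, 100))])
```

In this sketch only the links roughly ahead of the pointer remain candidates, which mirrors the abstract's stated goal of considering fewer links without relying on any stored usage history.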


      Published In

      MEDES '10: Proceedings of the International Conference on Management of Emergent Digital EcoSystems
      October 2010
      302 pages
      ISBN:9781450300476
      DOI:10.1145/1936254
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Sponsors

      • NECTEC: National Electronics and Computer Technology Center
      • KU: Kasetsart University

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. dynamic prediction
      2. predictive crawling
      3. user based prediction

      Qualifiers

      • Research-article

      Conference

      MEDES '10

      Acceptance Rates

      MEDES '10 Paper Acceptance Rate 26 of 93 submissions, 28%;
      Overall Acceptance Rate 267 of 682 submissions, 39%
