Agent-based Approach for Web Crawling
Abstract
Since its creation in 1990, the World Wide Web has increased the
popularity of the Internet, which has become an important source of
information and services for people all over the world. The dynamic
nature of the Web draws attention to the need for continuous support and
updating of Web information retrieval systems. Web crawling is the
process of discovering and maintaining large-scale Web data.
Crawlers accomplish this by following the hyperlinks in Web pages to
automatically download a partial snapshot of the Web. This paper
presents an agent-based approach to parallel and distributed Web
crawling, explored through three scenarios. Simulations with ns2 show
that the cloning-based mobile agent scenario outperforms the single and
multiple mobile agent scenarios.
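
To make the link-following process described in the abstract concrete, the following is a minimal sketch of a generic, sequential breadth-first crawler. It is not the agent-based approach of this paper, and it involves no mobile agents or ns2 simulation. The names `crawl`, `LinkExtractor`, `fetch`, `max_pages`, and the toy `TOY_WEB` pages are all assumptions introduced for illustration; `fetch` stands in for whatever download routine a real crawler would use.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from `seed`; returns {url: html} for downloaded pages.

    `fetch` is a caller-supplied (hypothetical) function mapping a URL to its
    HTML text, or to None when the page cannot be downloaded.
    """
    frontier = deque([seed])   # URLs discovered but not yet downloaded
    seen = {seed}              # URLs already queued, to avoid revisiting
    snapshot = {}              # the partial snapshot of the Web
    while frontier and len(snapshot) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        snapshot[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links against the current page and drop #fragments.
            target, _fragment = urldefrag(urljoin(url, href))
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return snapshot


# Toy in-memory "Web" (illustrative data only, no network access).
TOY_WEB = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a> <a href="/c#top">C</a>',
    "http://example.org/b": '<a href="/">home</a>',
    "http://example.org/c": "no links here",
}

if __name__ == "__main__":
    pages = crawl("http://example.org/", TOY_WEB.get, max_pages=3)
    print(list(pages))
```

The `max_pages` cap is what makes the result a partial snapshot: the crawl stops once the budget is spent, even if undiscovered pages remain. A parallel or distributed crawler would, in some form, split the frontier among several workers, which is the kind of design space the three scenarios address.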