org.semanticdesktop.aperture.crawler.web
Class CrawlJob

java.lang.Object
  extended by org.semanticdesktop.aperture.crawler.web.CrawlJob

public class CrawlJob
extends Object

A CrawlJob is used to queue a request for retrieving the content of a URL.

Implementation note: Strings are used to model URLs, rather than java.net.URL, in order to allow the use of schemes other than http(s) and file without requiring registration of a URLStreamHandler for each scheme.
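
A minimal usage sketch (not part of the original Javadoc) illustrating this point: because the URL is a plain String, a job can be queued for any scheme; the example.org and imap URLs below are purely illustrative.

    import org.semanticdesktop.aperture.crawler.web.CrawlJob;

    public class CrawlJobSchemeExample {
        public static void main(String[] args) {
            // A plain http URL, crawled one hop deep.
            CrawlJob webJob = new CrawlJob("http://example.org/index.html", 1);

            // A non-http(s)/file scheme; no URLStreamHandler needs to be
            // registered for "imap" since the URL is just a String.
            CrawlJob imapJob = new CrawlJob("imap://mail.example.org/INBOX", 0);

            System.out.println(webJob);          // uses CrawlJob.toString()
            System.out.println(imapJob.getURL());
        }
    }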


Constructor Summary
CrawlJob(String url, int depth)
          Schedule a URL for crawling.
 
Method Summary
 int getDepth()
           Returns the crawl depth for this job.
 String getURL()
           Returns the URL to be crawled.
 void setDepth(int depth)
           Sets the crawl depth for this job.
 String toString()
           Returns a string representation of this CrawlJob.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CrawlJob

public CrawlJob(String url,
                int depth)
Schedule a URL for crawling. The depth indicates how deep the hypertext graph needs to be crawled. A depth of 0 indicates that only this URL needs to be crawled, 1 indicates that all directly linked URLs also need to be crawled, etc. Use a negative value to indicate that the graph needs to be crawled exhaustively.

Parameters:
url - The URL to crawl.
depth - The number of hops to crawl, starting from this URL, or a negative value to indicate that there is no depth limit.
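
A short sketch (not part of the original Javadoc) of how the depth parameter could be used; the URLs are illustrative, and the depth-decrementing crawler behaviour in the comment is an assumption rather than something specified by this class.

    import org.semanticdesktop.aperture.crawler.web.CrawlJob;

    public class CrawlJobDepthExample {
        public static void main(String[] args) {
            CrawlJob single     = new CrawlJob("http://example.org/", 0);  // only this page
            CrawlJob oneHop     = new CrawlJob("http://example.org/", 1);  // plus directly linked pages
            CrawlJob exhaustive = new CrawlJob("http://example.org/", -1); // no depth limit

            // A crawler might decrement the depth by one for each hop before
            // queueing linked URLs (an assumption, not defined by CrawlJob).
            int childDepth = oneHop.getDepth() - 1;  // 0: linked pages would be leaves
            single.setDepth(childDepth);

            System.out.println(single + " " + exhaustive);
        }
    }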
Method Detail

getURL

public String getURL()
Returns:
The URL to be crawled.

getDepth

public int getDepth()
Returns:
The crawl depth: the number of hops to crawl starting from this URL, or a negative value if there is no depth limit.

setDepth

public void setDepth(int depth)
Parameters:
depth - The new crawl depth, or a negative value to indicate that there is no depth limit.

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2010 Aperture Development Team. All Rights Reserved.