public interface CrawlerHandler
CrawlerHandlers are notified by a Crawler about additions, changes and deletions of resources in a DataSource. They are also notified when the Crawler is clearing all its crawl results.
CrawlerHandlers are more than pure listeners on a Crawler: they are also responsible for producing, on demand, an RDFContainer in which the Crawler can store the source-specific metadata of a DataObject. It is up to the CrawlerHandler implementor to decide whether a new instance is returned for every DataObject or whether a shared instance is used. The CrawlerHandler is also responsible for any transaction and context management.
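The choice between a new container per DataObject and a shared one can be sketched as follows. This is a minimal illustration only: `Container`, `ContainerFactory` and `containerFor` are hypothetical stand-ins invented for this example and are not the real RDFContainer or RDFContainerFactory API.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: the nested types are simplified stand-ins, not the real API classes. */
final class ContainerStrategySketch {

    /** Hypothetical stand-in for an RDFContainer: a bag of property/value pairs. */
    static final class Container {
        final Map<String, String> statements = new HashMap<>();
    }

    /** Hypothetical stand-in for a container factory; the method name is an assumption. */
    interface ContainerFactory {
        Container containerFor(String url);
    }

    /** Strategy 1: a fresh container for every data object. */
    static final class PerObjectFactory implements ContainerFactory {
        @Override
        public Container containerFor(String url) {
            return new Container();
        }
    }

    /** Strategy 2: one shared container for all data objects. */
    static final class SharedFactory implements ContainerFactory {
        private final Container shared = new Container();

        @Override
        public Container containerFor(String url) {
            return shared;
        }
    }

    public static void main(String[] args) {
        ContainerFactory perObject = new PerObjectFactory();
        ContainerFactory shared = new SharedFactory();

        // Different instances for different urls.
        System.out.println(perObject.containerFor("file:/a") != perObject.containerFor("file:/b"));
        // The same instance regardless of url.
        System.out.println(shared.containerFor("file:/a") == shared.containerFor("file:/b"));
    }
}
```

With the shared strategy, metadata from different objects ends up in the same store, which is why the handler then has to take care of context and transaction management itself.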
Method Summary | |
---|---|
void | accessingObject(Crawler crawler, String url): Notification that the Crawler is going to start accessing the specified data object. |
void | clearFinished(Crawler crawler, ExitCode exitCode): Notification that the Crawler has finished clearing the information about the state of the datasource. |
void | clearingObject(Crawler crawler, String url): Notification that the Crawler is removing all information it knows about the specified url. |
void | clearStarted(Crawler crawler): Notification that the specified Crawler has started clearing the information it had about the state of the datasource. |
void | crawlStarted(Crawler crawler): Notification that the specified Crawler has started crawling its DataSource for DataObjects. |
void | crawlStopped(Crawler crawler, ExitCode exitCode): Notification that the specified Crawler has stopped crawling its DataSource for DataObjects. |
RDFContainerFactory | getRDFContainerFactory(Crawler crawler, String url): Returns an RDFContainerFactory that will be used to provide RDFContainers that will hold a DataObject's metadata. |
void | objectChanged(Crawler crawler, DataObject object): Notification that the Crawler has found a changed resource in the domain it is crawling. |
void | objectNew(Crawler crawler, DataObject object): Notification that the Crawler has found a new resource in the domain it is crawling. |
void | objectNotModified(Crawler crawler, String url): Notification that the Crawler has found a resource that has not been modified since the previous crawl. |
void | objectRemoved(Crawler crawler, String url): Notification that the specified resource that has been found in the past could no longer be found. |
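As a rough illustration of how these notifications might be consumed, the sketch below tallies what a crawl reports. The method signatures follow the method summary; everything else is an assumption made for the example: `Crawler`, `DataObject` and `RDFContainerFactory` are empty placeholder types, `ExitCode.COMPLETED` is an invented value, and `CountingHandler` and `summary()` are hypothetical names.

```java
/** Sketch only: the nested types are empty stand-ins, not the real crawler API. */
final class CountingHandlerSketch {

    // Hypothetical placeholder types; the real ones carry state and behaviour.
    interface Crawler {}
    interface DataObject {}
    interface RDFContainerFactory {}
    enum ExitCode { COMPLETED } // assumed value, for illustration only

    /** Method signatures as listed in the method summary. */
    interface CrawlerHandler {
        void crawlStarted(Crawler crawler);
        void crawlStopped(Crawler crawler, ExitCode exitCode);
        void accessingObject(Crawler crawler, String url);
        RDFContainerFactory getRDFContainerFactory(Crawler crawler, String url);
        void objectNew(Crawler crawler, DataObject object);
        void objectChanged(Crawler crawler, DataObject object);
        void objectNotModified(Crawler crawler, String url);
        void objectRemoved(Crawler crawler, String url);
        void clearStarted(Crawler crawler);
        void clearingObject(Crawler crawler, String url);
        void clearFinished(Crawler crawler, ExitCode exitCode);
    }

    /** Tallies what a crawl reported; resets its counters when a crawl starts. */
    static final class CountingHandler implements CrawlerHandler {
        private int added, changed, unmodified, removed;

        @Override public void crawlStarted(Crawler crawler) {
            added = changed = unmodified = removed = 0;
        }
        @Override public void crawlStopped(Crawler crawler, ExitCode exitCode) {
            System.out.println("crawl stopped (" + exitCode + "): " + summary());
        }
        @Override public void accessingObject(Crawler crawler, String url) { }
        @Override public RDFContainerFactory getRDFContainerFactory(Crawler crawler, String url) {
            return new RDFContainerFactory() { }; // placeholder factory
        }
        @Override public void objectNew(Crawler crawler, DataObject object) { added++; }
        @Override public void objectChanged(Crawler crawler, DataObject object) { changed++; }
        @Override public void objectNotModified(Crawler crawler, String url) { unmodified++; }
        @Override public void objectRemoved(Crawler crawler, String url) { removed++; }
        @Override public void clearStarted(Crawler crawler) { }
        @Override public void clearingObject(Crawler crawler, String url) { }
        @Override public void clearFinished(Crawler crawler, ExitCode exitCode) { }

        String summary() {
            return "new=" + added + " changed=" + changed
                    + " unmodified=" + unmodified + " removed=" + removed;
        }
    }

    public static void main(String[] args) {
        Crawler crawler = new Crawler() { };
        DataObject object = new DataObject() { };
        CountingHandler handler = new CountingHandler();

        // Simulate the calls a crawler might make during one crawl.
        handler.crawlStarted(crawler);
        handler.objectNew(crawler, object);
        handler.objectNew(crawler, object);
        handler.objectChanged(crawler, object);
        handler.objectNotModified(crawler, "file:/unchanged.txt");
        handler.objectRemoved(crawler, "file:/gone.txt");
        handler.crawlStopped(crawler, ExitCode.COMPLETED);
    }
}
```

A real handler would typically also store or index the metadata of each DataObject; the counters here only make the call pattern visible.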
Method Detail |
---|
void crawlStarted(Crawler crawler)
crawler - The reporting Crawler.
void crawlStopped(Crawler crawler, ExitCode exitCode)
crawler - The reporting Crawler.
exitCode - The status with which the crawling stopped.
void accessingObject(Crawler crawler, String url)
crawler - The reporting Crawler.
url - The url of the resource that is going to be accessed.
RDFContainerFactory getRDFContainerFactory(Crawler crawler, String url)
crawler - The requesting Crawler.
url - The url of the resource that is currently being accessed.
void objectNew(Crawler crawler, DataObject object)
crawler - The reporting Crawler.
object - The constructed DataObject modeling the new resource.
void objectChanged(Crawler crawler, DataObject object)
crawler - The reporting Crawler.
object - The constructed DataObject modeling the changed resource.
void objectNotModified(Crawler crawler, String url)
crawler - The reporting Crawler.
url - The url of the unmodified resource.
void objectRemoved(Crawler crawler, String url)
crawler - The reporting Crawler.
url - The url that could no longer be found.
void clearStarted(Crawler crawler)
[...] clearingObject(Crawler, String) on every known url.
crawler - The reporting Crawler.
See also: Crawler.clear()
void clearingObject(Crawler crawler, String url)
[...] AccessData instance), not the information in the data source itself.
crawler - The reporting Crawler.
url - The url of the resource whose crawl results are being cleared.
See also: Crawler.clear()
void clearFinished(Crawler crawler, ExitCode exitCode)
crawler - The Crawler concerned.
exitCode - The status with which the clearing stopped.
See also: Crawler.clear()
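The three clear-related callbacks suggest a bracketed sequence: clearStarted, then clearingObject for every known url, then clearFinished. The sketch below imitates that sequence under stated assumptions. `simulateClear` is a hypothetical driver, not the real Crawler.clear(); `ClearCallbacks` models only the three clear methods; `ExitCode.COMPLETED` is an invented value; and dropping the handler's own stored entry in clearingObject is one plausible reaction, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: stand-in types and a simulated driver, not the real crawler API. */
final class ClearSequenceSketch {

    interface Crawler {}
    enum ExitCode { COMPLETED } // assumed value, for illustration only

    /** Only the clear-related callbacks of CrawlerHandler are modelled here. */
    interface ClearCallbacks {
        void clearStarted(Crawler crawler);
        void clearingObject(Crawler crawler, String url);
        void clearFinished(Crawler crawler, ExitCode exitCode);
    }

    /** Hypothetical handler that keeps its own url -> metadata index. */
    static final class IndexingHandler implements ClearCallbacks {
        final Map<String, String> index = new LinkedHashMap<>();
        final List<String> events = new ArrayList<>();

        @Override public void clearStarted(Crawler crawler) {
            events.add("clearStarted");
        }
        @Override public void clearingObject(Crawler crawler, String url) {
            index.remove(url); // drop what this handler stored for the url
            events.add("clearingObject " + url);
        }
        @Override public void clearFinished(Crawler crawler, ExitCode exitCode) {
            events.add("clearFinished " + exitCode);
        }
    }

    /** Hypothetical driver imitating the call order described for a clear operation. */
    static void simulateClear(Crawler crawler, ClearCallbacks handler, List<String> knownUrls) {
        handler.clearStarted(crawler);
        for (String url : knownUrls) {
            handler.clearingObject(crawler, url);
        }
        handler.clearFinished(crawler, ExitCode.COMPLETED);
    }

    public static void main(String[] args) {
        IndexingHandler handler = new IndexingHandler();
        handler.index.put("file:/a.txt", "title=A");
        handler.index.put("file:/b.txt", "title=B");

        // Copy the keys so the handler can modify its index while we iterate.
        List<String> known = new ArrayList<>(handler.index.keySet());
        simulateClear(new Crawler() { }, handler, known);

        System.out.println(handler.events);
        System.out.println("entries left: " + handler.index.size());
    }
}
```

Note that nothing in the data source itself is touched: only the handler's own bookkeeping is discarded.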