org.semanticdesktop.aperture.subcrawler.base
Class AbstractSubCrawler

java.lang.Object
  extended by org.semanticdesktop.aperture.subcrawler.base.AbstractSubCrawler
All Implemented Interfaces:
SubCrawler
Direct Known Subclasses:
AbstractArchiverSubCrawler, AbstractCompressorSubCrawler, MboxSubCrawler, MimeSubCrawler, VcardSubCrawler

public abstract class AbstractSubCrawler
extends Object
implements SubCrawler

A common superclass for all subcrawler implementations.


Constructor Summary
AbstractSubCrawler()
           
 
Method Summary
protected  URI createChildUri(URI objectUri, String childPath)
          Creates a URI for a subcrawled entity.
 DataObject getDataObject(URI parentUri, String path, InputStream stream, DataSource dataSource, Charset charset, String mimeType, RDFContainerFactory factory)
          Get a DataObject from the specified stream with the given path.
abstract  String getUriPrefix()
          Returns the prefix used when generating URIs.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.semanticdesktop.aperture.subcrawler.SubCrawler
stopSubCrawler, subCrawl
 

Constructor Detail

AbstractSubCrawler

public AbstractSubCrawler()
Method Detail

getUriPrefix

public abstract String getUriPrefix()
Returns the prefix used when generating URIs. See the documentation of the SubCrawler interface for more details.

Returns:
the prefix used when generating URIs.

createChildUri

protected URI createChildUri(URI objectUri,
                             String childPath)
Creates a URI for a subcrawled entity. Uses a URI scheme that originated in the Apache Commons VFS project.

Parameters:
objectUri - the URI of the parent data object
childPath - the path within the child object
Returns:
a URI for a subcrawled entity.
See Also:
VFS Filesystems Documentation
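
The Apache Commons VFS convention for layered file systems writes a child URI as a scheme prefix, the parent URI, the separator "!/", and the path inside the parent. The following is a minimal sketch of that composition using plain strings. It is not the Aperture implementation: the class ChildUriSketch, the helper childUri, and the handling of a leading slash are assumptions made for illustration, and the uriPrefix argument stands in for the value a subclass would return from getUriPrefix().

```java
public class ChildUriSketch {

    // Hypothetical helper, not part of Aperture: joins a subcrawler prefix,
    // the parent URI, and a child path in the VFS layered-URI style.
    static String childUri(String uriPrefix, String parentUri, String childPath) {
        // Assumption: drop one leading slash so the "!/" separator is not doubled.
        String relative = childPath.startsWith("/") ? childPath.substring(1) : childPath;
        return uriPrefix + ":" + parentUri + "!/" + relative;
    }

    public static void main(String[] args) {
        // prints "zip:file:///tmp/archive.zip!/docs/readme.txt"
        System.out.println(childUri("zip", "file:///tmp/archive.zip", "docs/readme.txt"));
    }
}
```

Under this sketch, the part before "!/" identifies the parent object and the part after it identifies the entry inside that object, which mirrors the parentUri and path arguments that getDataObject accepts.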

getDataObject

public DataObject getDataObject(URI parentUri,
                                String path,
                                InputStream stream,
                                DataSource dataSource,
                                Charset charset,
                                String mimeType,
                                RDFContainerFactory factory)
                         throws SubCrawlerException,
                                PathNotFoundException
Description copied from interface: SubCrawler
Get a DataObject from the specified stream with the given path.

Specified by:
getDataObject in interface SubCrawler
Parameters:
parentUri - the URI of the parent object where the path will be looked for
path - the path of the requested resource
stream - the stream that contains the resource
dataSource - the data source that will be returned by the DataObject.getDataSource() method of the returned data object. Some implementations may require that this reference is not null and that it contains particular information.
charset - the charset in which the input stream is encoded (optional).
mimeType - the MIME type of the passed stream (optional).
factory - an RDFContainerFactory that delivers the RDFContainer to which the metadata of the DataObject should be added. The provided RDFContainer can later be retrieved as the DataObject's metadata container.
Returns:
the DataObject extracted from the given stream with the given path
Throws:
SubCrawlerException - if any I/O error occurs
PathNotFoundException - if the requested path is not found


Copyright © 2010 Aperture Development Team. All Rights Reserved.