java.lang.Object
  org.semanticdesktop.aperture.crawler.base.CrawlerBase
    org.semanticdesktop.aperture.crawler.mail.AbstractJavaMailCrawler
      org.semanticdesktop.aperture.crawler.mbox.MboxCrawler
public class MboxCrawler
A crawler implementation for mbox files.
Field Summary |
---|
Fields inherited from class org.semanticdesktop.aperture.crawler.mail.AbstractJavaMailCrawler |
---|
ACCESSED_KEY, baseFolders, currentFolder, currentFolderURI, maxDepth, maximumByteSize, store, SUBFOLDERS_KEY |
Fields inherited from class org.semanticdesktop.aperture.crawler.base.CrawlerBase |
---|
accessData, accessorRegistry, crawlReportFile, source, stopRequested |
Constructor Summary | |
---|---|
MboxCrawler() | |
Method Summary | |
---|---|
protected boolean | checkIfCurrentFolderHasBeenChanged(AccessData newAccessData): Applies source-specific methods to determine whether the current folder has changed since it was last crawled. |
protected void | closeConnection(): It seems that mstor doesn't close the opened folders when Service.close() is invoked. |
protected ExitCode | crawlObjects(): Method called by crawl() that should implement the actual crawling of the DataSource. |
protected void | ensureConnectedStore(): Ensures that the crawler is connected to the underlying mail storage system and can perform the crawl. |
protected String | getFolderName(String url): Extracts the name of the folder from the data object URI. |
protected URI | getFolderURI(javax.mail.Folder folder): Returns the URI of the folder, using the URI scheme appropriate for the current crawler. |
protected String | getMessageUri(javax.mail.Folder folder, javax.mail.Message message): Returns the URI of the message, using the URI scheme appropriate for the current crawler. |
protected void | recordCurrentFolderInAccessData(AccessData newAccessData): Records source-specific information about the current folder that will enable the crawler to detect whether the folder has changed on a future crawl. |
protected void | retrieveConfigurationData(DataSource dataSource): Prepares for accessing the specified DataSource by fetching all properties from it that are required to connect to the mailbox. |
Methods inherited from class org.semanticdesktop.aperture.crawler.base.CrawlerBase |
---|
clear, clear, crawl, getAccessData, getCrawlerHandler, getCrawlReport, getCrawlReportFile, getDataAccessorRegistry, getDataSource, getRDFContainerFactory, inDomain, isStopRequested, reportAccessingObject, reportDeletedDataObject, reportFatalErrorCause, reportFatalErrorCause, reportFatalErrorCause, reportModifiedDataObject, reportNewDataObject, reportUnmodifiedDataObject, reportUntouched, runSubCrawler, setAccessData, setCrawlerHandler, setCrawlReportFile, setDataAccessorRegistry, setDataSource, stop, storeCrawlReport, touchObject |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MboxCrawler()
Method Detail |
---|
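protected ExitCode crawlObjects()
Description copied from class: CrawlerBase
Method called by crawl() that should implement the actual crawling of the DataSource.
crawlObjects in class CrawlerBase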
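The relationship between crawl() and crawlObjects() is a template-method arrangement: the base class owns the entry point and the subclass supplies the traversal. The following is a minimal sketch of that idea only. Every type in it (SketchExitCode, SketchCrawlerBase, SketchListCrawler) is a hypothetical stand-in written for this illustration, not the Aperture API, and the bookkeeping done inside crawl() here is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the Aperture types.
enum SketchExitCode { COMPLETED, STOP_REQUESTED, FATAL_ERROR }

abstract class SketchCrawlerBase {
    private volatile boolean stopRequested;

    // Template method: shared bookkeeping lives here, while the
    // source-specific traversal is delegated to crawlObjects().
    public final SketchExitCode crawl() {
        stopRequested = false;
        try {
            return crawlObjects();
        } catch (RuntimeException e) {
            return SketchExitCode.FATAL_ERROR;
        }
    }

    public void stop() {
        stopRequested = true;
    }

    protected boolean isStopRequested() {
        return stopRequested;
    }

    // Subclasses implement the actual crawling of their data source.
    protected abstract SketchExitCode crawlObjects();
}

// A toy subclass that "crawls" an in-memory list instead of an mbox file.
class SketchListCrawler extends SketchCrawlerBase {
    private final List<String> messages;
    final List<String> visited = new ArrayList<>();

    SketchListCrawler(List<String> messages) {
        this.messages = messages;
    }

    @Override
    protected SketchExitCode crawlObjects() {
        for (String message : messages) {
            if (isStopRequested()) {
                return SketchExitCode.STOP_REQUESTED;
            }
            visited.add(message);
        }
        return SketchExitCode.COMPLETED;
    }
}

class CrawlObjectsDemo {
    public static void main(String[] args) {
        SketchListCrawler crawler =
                new SketchListCrawler(Arrays.asList("msg-1", "msg-2"));
        System.out.println(crawler.crawl() + " " + crawler.visited);
    }
}
```

Callers only ever invoke crawl(); keeping crawlObjects() protected, as in the signature above, prevents the traversal from being started without the surrounding bookkeeping.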
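protected void retrieveConfigurationData(DataSource dataSource)
Prepares for accessing the specified DataSource by fetching all properties from it that are required to connect to the mailbox.
retrieveConfigurationData in class AbstractJavaMailCrawler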
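protected void ensureConnectedStore() throws javax.mail.MessagingException
Description copied from class: AbstractJavaMailCrawler
Ensures that the crawler is connected to the underlying mail storage system and can perform the crawl.
ensureConnectedStore in class AbstractJavaMailCrawler
Throws: javax.mail.MessagingException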
protected void closeConnection()
It seems that mstor doesn't close the opened folders when Service.close() is invoked. This is important for the DataAccessor implementation, because when operating as a Crawler, the AbstractJavaMailCrawler.crawlFolder method takes care of this.
closeConnection in class AbstractJavaMailCrawler
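The concern described here is that closing the store does not cascade to folders that are still open, so they have to be closed explicitly first. A minimal sketch of that pattern follows. SketchFolder, SketchStore and CloseConnectionSketch are hypothetical classes invented for this illustration; they do not use javax.mail or mstor, and the real closeConnection() may track its folders differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical folder: only tracks whether it is open.
class SketchFolder {
    private final String name;
    private boolean open = true;

    SketchFolder(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    boolean isOpen() {
        return open;
    }

    void close() {
        open = false;
    }
}

// Hypothetical store whose close() deliberately does NOT close folders,
// mimicking the behaviour attributed to mstor above.
class SketchStore {
    private boolean connected = true;

    boolean isConnected() {
        return connected;
    }

    void close() {
        connected = false;
    }
}

class CloseConnectionSketch {
    // Closes every folder that is still open, then closes the store itself.
    static void closeConnection(SketchStore store, List<SketchFolder> openedFolders) {
        for (SketchFolder folder : openedFolders) {
            if (folder.isOpen()) {
                folder.close();
            }
        }
        openedFolders.clear();
        if (store.isConnected()) {
            store.close();
        }
    }

    public static void main(String[] args) {
        SketchStore store = new SketchStore();
        List<SketchFolder> opened = new ArrayList<>();
        SketchFolder inbox = new SketchFolder("Inbox");
        opened.add(inbox);
        closeConnection(store, opened);
        System.out.println(inbox.getName() + " open: " + inbox.isOpen());
    }
}
```

Closing the folders before the store keeps the order safe in cases where a folder needs a live connection in order to close cleanly.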
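protected void recordCurrentFolderInAccessData(AccessData newAccessData) throws javax.mail.MessagingException
Description copied from class: AbstractJavaMailCrawler
Records source-specific information about the current folder that will enable the crawler to detect whether the folder has changed on a future crawl.
recordCurrentFolderInAccessData in class AbstractJavaMailCrawler
Parameters: newAccessData - the access data where the information should be stored
Throws: javax.mail.MessagingException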
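protected boolean checkIfCurrentFolderHasBeenChanged(AccessData newAccessData) throws javax.mail.MessagingException
Description copied from class: AbstractJavaMailCrawler
Applies source-specific methods to determine whether the current folder has changed since it was last crawled.
checkIfCurrentFolderHasBeenChanged in class AbstractJavaMailCrawler
Parameters: newAccessData - the AccessData instance that is to be consulted
Throws: javax.mail.MessagingException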
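recordCurrentFolderInAccessData and checkIfCurrentFolderHasBeenChanged form a record-then-compare pair: one crawl stores a fingerprint of the folder, and the next crawl compares the folder's current state against it. The sketch below illustrates that pairing under stated assumptions. SketchAccessData and FolderChangeSketch are hypothetical, the fingerprint (file size plus last-modified time) is an assumed choice and not necessarily what MboxCrawler records, and a single store is used where the real methods receive a separate newAccessData argument.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key/value store keyed by object id; not the Aperture AccessData.
class SketchAccessData {
    private final Map<String, Map<String, String>> data = new HashMap<>();

    void put(String id, String key, String value) {
        data.computeIfAbsent(id, k -> new HashMap<>()).put(key, value);
    }

    String get(String id, String key) {
        Map<String, String> entries = data.get(id);
        return entries == null ? null : entries.get(key);
    }
}

class FolderChangeSketch {
    // Assumed fingerprint keys, chosen for this illustration only.
    static final String SIZE_KEY = "size";
    static final String MODIFIED_KEY = "lastModified";

    // Stores the folder's current fingerprint for use by a future crawl.
    static void record(SketchAccessData accessData, String folderUri,
                       long size, long lastModified) {
        accessData.put(folderUri, SIZE_KEY, Long.toString(size));
        accessData.put(folderUri, MODIFIED_KEY, Long.toString(lastModified));
    }

    // Returns true if the folder was never recorded or its fingerprint differs.
    static boolean hasChanged(SketchAccessData accessData, String folderUri,
                              long size, long lastModified) {
        String oldSize = accessData.get(folderUri, SIZE_KEY);
        String oldModified = accessData.get(folderUri, MODIFIED_KEY);
        if (oldSize == null || oldModified == null) {
            return true;
        }
        return !oldSize.equals(Long.toString(size))
                || !oldModified.equals(Long.toString(lastModified));
    }

    public static void main(String[] args) {
        SketchAccessData accessData = new SketchAccessData();
        String uri = "sketch://store/Inbox";
        System.out.println(hasChanged(accessData, uri, 100L, 5L));
        record(accessData, uri, 100L, 5L);
        System.out.println(hasChanged(accessData, uri, 100L, 5L));
    }
}
```

Treating an unrecorded folder as changed means a first crawl always visits it, which is the conservative default for an incremental crawler.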
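protected URI getFolderURI(javax.mail.Folder folder) throws javax.mail.MessagingException
Description copied from class: AbstractJavaMailCrawler
Returns the URI of the folder, using the URI scheme appropriate for the current crawler.
getFolderURI in class AbstractJavaMailCrawler
Parameters: folder - the Folder whose URI we'd like to obtain.
Throws: javax.mail.MessagingException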
protected String getMessageUri(javax.mail.Folder folder, javax.mail.Message message) throws javax.mail.MessagingException
Description copied from class: AbstractJavaMailCrawler
Returns the URI of the message, using the URI scheme appropriate for the current crawler.
getMessageUri in class AbstractJavaMailCrawler
Parameters:
folder - the folder where the message resides
message - the message itself
Throws: javax.mail.MessagingException
protected String getFolderName(String url) throws UrlNotFoundException
Description copied from class: AbstractJavaMailCrawler
Extracts the name of the folder from the data object URI. The name can be passed to the Store.getFolder(String) method to obtain the corresponding Folder instance that directly contains the data object (message or attachment) with the given url.
This method can be called ONLY when all configuration has been read from the DataSource, that is, AFTER AbstractJavaMailCrawler.retrieveConfigurationData(DataSource).
getFolderName in class AbstractJavaMailCrawler
Throws: UrlNotFoundException - if the given url does not belong to the current Store
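To make the contract of getFolderName concrete, here is a small sketch of extracting a folder name from a data object URL and rejecting URLs that belong to a different store. The URL layout used (a store prefix, then the folder path, then a final segment identifying the message) is an assumption made for this example and is not the documented MboxCrawler URI scheme. SketchUrlNotFoundException and FolderNameSketch are hypothetical names, and the exception is unchecked here only to keep the sketch short.

```java
// Hypothetical counterpart of UrlNotFoundException, unchecked for brevity.
class SketchUrlNotFoundException extends RuntimeException {
    SketchUrlNotFoundException(String message) {
        super(message);
    }
}

class FolderNameSketch {
    // Assumed layout: <storePrefix><folder path>/<message id>
    // Returns the folder path, or throws if the url is outside the store.
    static String folderName(String storePrefix, String url) {
        if (!url.startsWith(storePrefix)) {
            throw new SketchUrlNotFoundException(
                    "URL does not belong to this store: " + url);
        }
        String rest = url.substring(storePrefix.length());
        int lastSlash = rest.lastIndexOf('/');
        if (lastSlash < 0) {
            throw new SketchUrlNotFoundException(
                    "URL does not identify an object inside a folder: " + url);
        }
        return rest.substring(0, lastSlash);
    }

    public static void main(String[] args) {
        String prefix = "sketch://store/";
        System.out.println(folderName(prefix, "sketch://store/Inbox/Work/42"));
    }
}
```

The store prefix has to be known before any URL can be resolved, which is consistent with the requirement that the method be called only after the configuration has been read.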