java.lang.Object org.semanticdesktop.aperture.extractor.microsoft.util.PoiUtil
public class PoiUtil
Provides Apache POI-specific utility methods for text and metadata extraction.
Some methods buffer the InputStream so that it can be reset to its start. To change the buffer size, set the "aperture.poiUtil.bufferSize" system property to the number of bytes that the buffer may use.
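The buffer-and-reset behaviour described above can be illustrated with the standard java.io mark/reset mechanism. This is a minimal sketch, not the PoiUtil implementation: the class name ResetSketch and the method readAndReset are hypothetical, and it assumes the buffer is at least as large as the number of bytes read before the reset.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of Aperture or Apache POI.
final class ResetSketch {

    // Wraps the stream in a buffer, consumes up to bytesToRead bytes,
    // then rewinds so the caller can read the stream from its start again.
    static InputStream readAndReset(InputStream in, int bufferSize, int bytesToRead)
            throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(in, bufferSize);
        buffered.mark(bufferSize); // remember the start; valid for up to bufferSize bytes

        byte[] scratch = new byte[bytesToRead];
        int total = 0;
        while (total < bytesToRead) {
            int n = buffered.read(scratch, total, bytesToRead - total);
            if (n < 0) {
                break; // end of stream reached early
            }
            total += n;
        }

        // reset() fails with an IOException if the mark was invalidated,
        // which can happen when more than bufferSize bytes were read.
        buffered.reset();
        return buffered;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
        InputStream rewound = readAndReset(new ByteArrayInputStream(data), 64, 5);
        System.out.println((char) rewound.read()); // first byte of the stream again
    }
}
```

A larger buffer lets more of the document be consumed before the reset, at the cost of memory, which is presumably why the size is configurable.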
Nested Class Summary | |
---|---|
static class |
PoiUtil.NonCloseableStream
|
static interface |
PoiUtil.TextExtractor
A TextExtractor is a delegate that extracts the full text from an MS Office document using a POIFSFileSystem. |
Constructor Summary | |
---|---|
PoiUtil()
|
Method Summary | |
---|---|
static InputStream |
extractAll(InputStream stream,
PoiUtil.TextExtractor textExtractor,
RDFContainer container,
Logger logger)
Extract full-text and metadata from an MS Office document contained in the specified stream. |
static void |
extractMetadata(org.apache.poi.poifs.filesystem.DirectoryNode dirNode,
RDFContainer container)
|
static InputStream |
extractMetadata(InputStream stream,
boolean resetStream,
RDFContainer container)
Extract all metadata from an OLE document. |
static void |
extractMetadata(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem,
RDFContainer container)
Extracts all metadata from the POIFSFileSystem's SummaryInformation and transforms it to RDF statements that are stored in the specified RDFContainer. |
static int |
getBufferSize()
Returns the buffer size to use when buffering the contents of a document. |
static org.apache.poi.hpsf.DocumentSummaryInformation |
getDocumentSummaryInformation(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem)
Returns the DocumentSummaryInformation holding the document metadata from a POIFSFileSystem. |
static org.apache.poi.hpsf.SummaryInformation |
getSummaryInformation(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem)
Returns the SummaryInformation holding the document metadata from a POIFSFileSystem. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public PoiUtil()
Method Detail |
---|
public static org.apache.poi.hpsf.SummaryInformation getSummaryInformation(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem)
poiFileSystem
- The POI file system to obtain the metadata from.
public static org.apache.poi.hpsf.DocumentSummaryInformation getDocumentSummaryInformation(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem)
poiFileSystem
- The POI file system to obtain the metadata from.
public static InputStream extractMetadata(InputStream stream, boolean resetStream, RDFContainer container) throws IOException
stream
- The stream containing the OLE document.
resetStream
- Specifies whether the stream should be buffered and reset. The buffer size can be
set through the system property described in the class documentation.
container
- The RDFContainer to store the metadata in.
IOException
- When resetting the buffer resulted in an IOException.
public static void extractMetadata(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFileSystem, RDFContainer container)
poiFileSystem
- The POI file system to obtain the metadata from.
container
- The RDFContainer to store the created RDF statements in.
public static void extractMetadata(org.apache.poi.poifs.filesystem.DirectoryNode dirNode, RDFContainer container)
public static InputStream extractAll(InputStream stream, PoiUtil.TextExtractor textExtractor, RDFContainer container, Logger logger)
public static int getBufferSize()
systemProperty
- The system property that contains the buffer size.
defaultSize
- The default buffer size, in case the system property is not set or does not contain
a valid value.
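The fallback described for systemProperty and defaultSize could look roughly like the sketch below. It is an assumption about the behaviour, not the actual PoiUtil code: the class BufferSizeSketch, the method resolveBufferSize, and the 8192-byte default used in main are hypothetical, and treating non-positive values as invalid is likewise a guess. Only the property name comes from the class documentation.

```java
// Hypothetical illustration; not part of Aperture or Apache POI.
final class BufferSizeSketch {

    // Property name as given in the class documentation.
    static final String PROPERTY = "aperture.poiUtil.bufferSize";

    // Returns the size held by the system property, or defaultSize when the
    // property is unset, not a number, or not positive (assumed to be invalid).
    static int resolveBufferSize(String systemProperty, int defaultSize) {
        String raw = System.getProperty(systemProperty);
        if (raw == null) {
            return defaultSize;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : defaultSize;
        } catch (NumberFormatException e) {
            return defaultSize;
        }
    }

    public static void main(String[] args) {
        // 8192 is an arbitrary default chosen for this example only.
        System.setProperty(PROPERTY, "65536");
        System.out.println(resolveBufferSize(PROPERTY, 8192));
    }
}
```

In practice the property would typically be supplied on the command line, for example with -Daperture.poiUtil.bufferSize=65536, before any PoiUtil method is called.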