public interface LinkExtractor
A LinkExtractor extracts links from a document, e.g. the anchors inside an HTML document. Implementations are typically MIME type-specific.
The resulting list of links is returned as a collection of Strings rather than URLs, so that links with any kind of scheme can be returned without having to provide a URLStreamHandler for that scheme.
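The reason for preferring Strings can be illustrated with a small sketch. It assumes a stock JVM with no extra protocol handlers installed; the class name SchemeDemo and the sample links are hypothetical and not part of this API. A link whose scheme has no registered handler cannot be turned into a java.net.URL, although it is still a perfectly usable string (and a valid java.net.URI).

```java
import java.net.MalformedURLException;
import java.net.URI;

// Hypothetical demo class, not part of the LinkExtractor API.
final class SchemeDemo {

    /**
     * Returns true if the link can be represented as a java.net.URL
     * with the protocol handlers available in the running JVM.
     */
    static boolean representableAsUrl(String link) {
        try {
            // toURL() needs a handler for the scheme and fails without one.
            URI.create(link).toURL();
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String link = "urn:isbn:096139210x"; // sample link with a scheme that has no built-in handler

        // The scheme is still recoverable from the plain string.
        System.out.println(URI.create(link).getScheme());

        // Conversion to URL is only possible where a handler exists.
        System.out.println(representableAsUrl(link));
        System.out.println(representableAsUrl("http://example.org/"));
    }
}
```

A caller that does need URL objects can convert the returned strings itself, for the schemes it knows how to handle.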
Field Summary | |
---|---|
static Object | BASE_URL_KEY: Suggested key to use in the params map to indicate the base URL against which relative URLs can be resolved. |
static Object | INCLUDE_EMBEDDED_RESOURCES_KEY: Suggested key to use in the params map to indicate that non-navigational links should also be extracted, e.g. embedded images, backgrounds, stylesheets, etc. |
Method Summary | |
---|---|
List | extractLinks(InputStream stream, Map params): Extracts all links occurring in the specified stream. |
Field Detail |
---|
static final Object BASE_URL_KEY
static final Object INCLUDE_EMBEDDED_RESOURCES_KEY
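As a rough illustration of how these keys might be used, the sketch below builds a params map and resolves a relative link against the base URL. Several things here are assumptions rather than documented behaviour: the class ParamsSketch and its two constants are stand-ins for the real constants on the interface, and the value types (a String for the base URL, a Boolean for the embedded-resources flag) are guesses, since this page does not specify them.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the LinkExtractor API.
final class ParamsSketch {

    // Stand-ins for LinkExtractor.BASE_URL_KEY and
    // LinkExtractor.INCLUDE_EMBEDDED_RESOURCES_KEY; real code would
    // reference the constants on the interface instead.
    static final Object BASE_URL_KEY = "baseUrl";
    static final Object INCLUDE_EMBEDDED_RESOURCES_KEY = "includeEmbeddedResources";

    /** Builds a params map. The value types used here are assumptions. */
    static Map<Object, Object> buildParams(String baseUrl, boolean includeEmbedded) {
        Map<Object, Object> params = new HashMap<>();
        params.put(BASE_URL_KEY, baseUrl);
        params.put(INCLUDE_EMBEDDED_RESOURCES_KEY, Boolean.valueOf(includeEmbedded));
        return params;
    }

    /**
     * Resolves a link against the base URL found in params.
     * Returns the link unchanged when params is null or has no base URL.
     * URI.create throws IllegalArgumentException for syntactically invalid input.
     */
    static String resolve(String link, Map<?, ?> params) {
        Object base = (params == null) ? null : params.get(BASE_URL_KEY);
        if (base == null) {
            return link;
        }
        return URI.create(base.toString()).resolve(link).toString();
    }
}
```

Because params is optional, an implementation along these lines would need to tolerate both a null map and a map without either key.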
Method Detail |
---|
List extractLinks(InputStream stream, Map params) throws Exception
Parameters:
stream - The input stream containing the content from which the links should be extracted, e.g. an HTML document.
params - An optional set of parameters to guide the link extraction process.
Throws:
Exception - When an error occurred during processing of the document stream.
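A minimal sketch of what an implementation for HTML might look like follows. It is illustrative only: the LinkExtractor declaration is a local stand-in that mirrors the signature documented above so that the example compiles on its own, HrefLinkExtractor is a hypothetical class, the input is assumed to be UTF-8, and the params map is ignored. A regular expression is used purely for brevity; a real implementation would more likely use a proper HTML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Local stand-in mirroring the documented method signature.
interface LinkExtractor {
    List extractLinks(InputStream stream, Map params) throws Exception;
}

// Hypothetical implementation: collects href values of anchor tags.
final class HrefLinkExtractor implements LinkExtractor {

    // Naive pattern for <a ... href="..."> with single or double quotes.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    @Override
    public List<String> extractLinks(InputStream stream, Map params) throws Exception {
        // Read the whole stream into memory (assumes UTF-8 content).
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        String html = new String(buffer.toByteArray(), StandardCharsets.UTF_8);

        // Links are kept as plain strings, in document order.
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        String html = "<a href=\"page2.html\">next</a> <img src=\"logo.png\">";
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        System.out.println(new HrefLinkExtractor().extractLinks(in, null));
    }
}
```

The sketch returns the href values exactly as written in the document; resolving them against a base URL or including embedded resources would be driven by the params map in a fuller implementation.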