public interface LinkExtractor
A LinkExtractor extracts links from a document, e.g. the anchors inside an HTML document. Implementations are typically MIME type-specific.
The resulting list of links is returned as a collection of Strings rather than URLs in order to allow for any kind of scheme to be used without having to provide a URLStreamHandler for that scheme.
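The reasoning above can be illustrated with a minimal sketch. It assumes a stock JVM with no custom URLStreamHandler installed; the class name `SchemeDemo` and the sample links are hypothetical and not part of this API. A link with a scheme the JVM has no handler for cannot be turned into a `java.net.URL`, but it survives unchanged as a String (and can still be inspected through `java.net.URI`).

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class, not part of the documented API.
class SchemeDemo {

    /** Returns true if java.net.URL accepts the link, i.e. a handler for its scheme is available. */
    static boolean parsesAsUrl(String link) {
        try {
            new URL(link); // throws MalformedURLException when no handler exists for the scheme
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] links = {
            "http://example.org/index.html",       // scheme with a built-in handler
            "imap://user@mail.example.org/INBOX"   // no built-in handler on a stock JVM
        };
        for (String link : links) {
            System.out.println(link
                    + " -> accepted by URL: " + parsesAsUrl(link)
                    + ", scheme seen by URI: " + URI.create(link).getScheme());
        }
    }
}
```

Callers that need URL objects can convert the returned Strings themselves once they know which schemes their environment supports.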
| Field Summary | |
|---|---|
| static Object | BASE_URL_KEY - Suggested key to use in the params map to indicate the base URL with which relative URLs can be resolved. |
| static Object | INCLUDE_EMBEDDED_RESOURCES_KEY - Suggested key to use in the params map to indicate that non-navigational links should also be extracted, e.g. embedded images, backgrounds, stylesheets, etc. |
| Method Summary | |
|---|---|
| List | extractLinks(InputStream stream, Map params) - Extracts all links occurring in the specified stream. |
| Field Detail |
|---|
static final Object BASE_URL_KEY
static final Object INCLUDE_EMBEDDED_RESOURCES_KEY
| Method Detail |
|---|
List extractLinks(InputStream stream, Map params) throws Exception
Parameters:
stream - The input stream containing the content from which the links should be extracted, e.g. an HTML document.
params - An optional set of parameters to guide the link extraction process.
Throws:
Exception - If an error occurs while processing the document stream.
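As an illustration of how an implementation and a caller might fit together, here is a rough sketch. Everything in it beyond the documented method signature and field names is an assumption: the interface is redeclared locally as a stand-in (with generics added and placeholder key values), the base URL is passed as a String, the embedded-resources flag is passed as `Boolean.TRUE`, and `NaiveHtmlLinkExtractor` is a hypothetical regex-based extractor rather than a real HTML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Local stand-in mirroring the documented signature; the key values are assumptions.
interface LinkExtractor {
    Object BASE_URL_KEY = "baseUrl";
    Object INCLUDE_EMBEDDED_RESOURCES_KEY = "includeEmbeddedResources";

    List<String> extractLinks(InputStream stream, Map<Object, Object> params) throws Exception;
}

// Hypothetical implementation: handles only double-quoted href/src attributes.
class NaiveHtmlLinkExtractor implements LinkExtractor {
    private static final Pattern HREF =
            Pattern.compile("<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern SRC =
            Pattern.compile("<img\\s[^>]*?src\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    @Override
    public List<String> extractLinks(InputStream stream, Map<Object, Object> params) throws Exception {
        // readAllBytes requires Java 9+; UTF-8 is assumed for simplicity.
        String html = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        // params is optional, so tolerate null.
        Object base = params == null ? null : params.get(BASE_URL_KEY);
        boolean embedded = params != null
                && Boolean.TRUE.equals(params.get(INCLUDE_EMBEDDED_RESOURCES_KEY));

        List<String> links = new ArrayList<>();
        collect(HREF, html, base, links);      // navigational links
        if (embedded) {
            collect(SRC, html, base, links);   // embedded images, only on request
        }
        return links;
    }

    private static void collect(Pattern pattern, String html, Object base, List<String> out) {
        Matcher m = pattern.matcher(html);
        while (m.find()) {
            String link = m.group(1);
            // Resolve relative links against the base URL when one was supplied.
            out.add(base == null ? link : URI.create(base.toString()).resolve(link).toString());
        }
    }
}

class LinkExtractorDemo {
    public static void main(String[] args) throws Exception {
        String html = "<a href=\"page2.html\">next</a> <img src=\"/img/logo.png\">";
        Map<Object, Object> params = new HashMap<>();
        params.put(LinkExtractor.BASE_URL_KEY, "http://example.org/docs/index.html");
        params.put(LinkExtractor.INCLUDE_EMBEDDED_RESOURCES_KEY, Boolean.TRUE);

        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        for (String link : new NaiveHtmlLinkExtractor().extractLinks(in, params)) {
            System.out.println(link);
        }
    }
}
```

The links stay Strings throughout, so an already absolute link such as a `mailto:` address passes through resolution untouched. A production implementation would use a proper parser for its MIME type and handle character encodings and malformed links explicitly.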