HtmlLinkExtractor (Aperture Core 1.5.0 API)

org.semanticdesktop.aperture.hypertext.linkextractor.html
Class HtmlLinkExtractor

java.lang.Object
  org.semanticdesktop.aperture.hypertext.linkextractor.html.HtmlLinkExtractor

All Implemented Interfaces: TokenHandler, LinkExtractor

public class HtmlLinkExtractor
extends Object
implements LinkExtractor, TokenHandler

A LinkExtractor implementation that can extract links from HTML documents.

Field Summary

Fields inherited from interface org.semanticdesktop.aperture.hypertext.linkextractor.LinkExtractor
`BASE_URL_KEY, INCLUDE_EMBEDDED_RESOURCES_KEY`

Constructor Summary
`HtmlLinkExtractor()`

Method Summary
`void`	`attribute(String name)` Notification of an attribute for the most recently reported element.
`void`	`attribute(String name, String value)` Notification of an attribute for the most recently reported element.
`void`	`comment(String comment)` Notification of a comment.
`void`	`docType(String name, String sysId, String fpi, String uri)` Notification of a document type declaration.
`void`	`endDocument()` Notification of the end of a document.
`void`	`endOfStartTag()` Notification of the end of a start tag.
`void`	`endTag(String name)` Notification of an end tag.
`void`	`error(String message)` Notification of a detected error.
`List`	`extractLinks(InputStream inputStream, Map params)` Extracts all links occurring in the specified stream.
`void`	`startDocument()` Notification of the start of a new document.
`void`	`startOfStartTag(String name)` Notification of the start of a start tag.
`void`	`text(String text)` Notification of text.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

HtmlLinkExtractor

public HtmlLinkExtractor()

Method Detail

extractLinks

public List extractLinks(InputStream inputStream,
                         Map params)
                  throws IOException

Description copied from interface: LinkExtractor

Extracts all links occurring in the specified stream.

Specified by: extractLinks in interface LinkExtractor

Parameters:
  inputStream - The input stream containing the content from which the links should be extracted, e.g. an HTML document.
  params - An optional set of parameters to guide the link extraction process.
Returns: A List of Strings representing the encountered links in the order in which they were encountered in the document.
Throws: IOException
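
The calling pattern for extractLinks can be sketched as follows. This is a minimal, self-contained illustration, not Aperture code: `LinkExtractorLike` is a hypothetical stand-in that mirrors the documented signature (with generics added for readability), and `ToyHrefExtractor` is a deliberately naive regex-based substitute for HtmlLinkExtractor so that the example runs without the Aperture jar. An empty map is passed for `params`; with the real class, the keys inherited from LinkExtractor (`BASE_URL_KEY`, `INCLUDE_EMBEDDED_RESOURCES_KEY`) would presumably go there, but this page does not describe their values.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractLinksSketch {

    /** Hypothetical stand-in mirroring the documented extractLinks signature. */
    interface LinkExtractorLike {
        List<String> extractLinks(InputStream inputStream, Map<String, Object> params) throws IOException;
    }

    /** Toy extractor: collects double-quoted href values. NOT the Aperture implementation. */
    static class ToyHrefExtractor implements LinkExtractorLike {
        private static final Pattern HREF =
                Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

        @Override
        public List<String> extractLinks(InputStream inputStream, Map<String, Object> params)
                throws IOException {
            String html = new String(inputStream.readAllBytes(), StandardCharsets.UTF_8);
            List<String> links = new ArrayList<>();
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                links.add(m.group(1)); // links are kept in document order
            }
            return links;
        }
    }

    /** Wraps a string in a stream, calls the extractor, and returns the links. */
    static List<String> run(String html) {
        LinkExtractorLike extractor = new ToyHrefExtractor();
        Map<String, Object> params = new HashMap<>(); // optional parameters; empty here
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))) {
            return extractor.extractLinks(in, params);
        } catch (IOException e) {
            // extractLinks declares IOException, so callers have to handle it
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<a href=\"a.html\">A</a> <img src=\"logo.png\"> "
                + "<a href=\"http://example.org/b\">B</a>";
        for (String link : run(html)) {
            System.out.println(link);
        }
    }
}
```

Because the documented method returns a raw `List`, a caller of the real class would likely need to cast each element to `String`.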

startDocument

public void startDocument()

Description copied from interface: TokenHandler

Notification of the start of a new document.

Specified by: startDocument in interface TokenHandler

endDocument

public void endDocument()

Description copied from interface: TokenHandler

Notification of the end of a document.

Specified by: endDocument in interface TokenHandler

startOfStartTag

public void startOfStartTag(String name)

Description copied from interface: TokenHandler

Notification of the start of a start tag.

Specified by: startOfStartTag in interface TokenHandler

Parameters: name - The tag name.

endOfStartTag

public void endOfStartTag()

Description copied from interface: TokenHandler

Notification of the end of a start tag.

Specified by: endOfStartTag in interface TokenHandler

endTag

public void endTag(String name)

Description copied from interface: TokenHandler

Notification of an end tag.

Specified by: endTag in interface TokenHandler

Parameters: name - The tag name.

attribute

public void attribute(String name)

Description copied from interface: TokenHandler

Notification of an attribute for the most recently reported element. The reported attribute does not have a value.

Specified by: attribute in interface TokenHandler

Parameters: name - The name of the attribute.

attribute

public void attribute(String name,
                      String value)

Description copied from interface: TokenHandler

Notification of an attribute for the most recently reported element.

Specified by: attribute in interface TokenHandler

Parameters:
  name - The name of the attribute.
  value - The value of the attribute.

text

public void text(String text)

Description copied from interface: TokenHandler

Notification of text.

Specified by: text in interface TokenHandler

Parameters: text - The text.

comment

public void comment(String comment)

Description copied from interface: TokenHandler

Notification of a comment.

Specified by: comment in interface TokenHandler

Parameters: comment - The comment.

docType

public void docType(String name,
                    String sysId,
                    String fpi,
                    String uri)

Description copied from interface: TokenHandler

Notification of a document type declaration.

Specified by: docType in interface TokenHandler

Parameters:
  name - The type name, e.g. HTML.
  sysId - The system id, e.g. PUBLIC or SYSTEM.
  fpi - The Formal Public Identifier, e.g. "-//W3C//DTD HTML 4.0 Transitional//EN".
  uri - The URL of the DTD, e.g. "http://www.w3.org/TR/REC-html40/loose.dtd".
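
To make the four docType parameters concrete, the sketch below reassembles a declaration from them using the example values given in the parameter descriptions. The helper `formatDocType` is hypothetical and not part of Aperture; it only illustrates how the parameters appear to correspond to the parts of a `<!DOCTYPE ...>` declaration.

```java
public class DocTypeSketch {

    /**
     * Hypothetical helper: rebuilds a DOCTYPE declaration from the four
     * values reported to docType. Null parts are skipped.
     */
    static String formatDocType(String name, String sysId, String fpi, String uri) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE ").append(name);
        if (sysId != null) {
            sb.append(' ').append(sysId);              // PUBLIC or SYSTEM
        }
        if (fpi != null) {
            sb.append(" \"").append(fpi).append('"');  // Formal Public Identifier
        }
        if (uri != null) {
            sb.append(" \"").append(uri).append('"');  // URL of the DTD
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(formatDocType(
                "HTML",
                "PUBLIC",
                "-//W3C//DTD HTML 4.0 Transitional//EN",
                "http://www.w3.org/TR/REC-html40/loose.dtd"));
    }
}
```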

error

public void error(String message)

Description copied from interface: TokenHandler

Notification of a detected error.

Specified by: error in interface TokenHandler

Parameters: message - An error message.
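
The TokenHandler callbacks documented above suggest how a handler could collect links while a document is tokenized. The sketch below is an illustration under assumptions, not the Aperture implementation: `TokenHandlerLike` is a hypothetical stand-in declaring a subset of the callbacks, `HrefCollector` is an invented handler, and the event order replayed for `<a href="page.html" download>Next</a>` is inferred from the method descriptions rather than from the actual tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenHandlerSketch {

    /** Hypothetical stand-in declaring a subset of the documented callbacks. */
    interface TokenHandlerLike {
        void startDocument();
        void startOfStartTag(String name);
        void attribute(String name);
        void attribute(String name, String value);
        void endOfStartTag();
        void text(String text);
        void endTag(String name);
        void endDocument();
    }

    /** Collects href values of anchor elements as attribute events arrive. */
    static class HrefCollector implements TokenHandlerLike {
        final List<String> links = new ArrayList<>();
        private String currentTag; // tag whose start tag is currently open, or null

        public void startDocument() { links.clear(); currentTag = null; }
        public void startOfStartTag(String name) { currentTag = name; }
        public void attribute(String name) { /* valueless attribute: nothing to collect */ }
        public void attribute(String name, String value) {
            // Attributes belong to the most recently reported element.
            if ("a".equalsIgnoreCase(currentTag) && "href".equalsIgnoreCase(name)) {
                links.add(value);
            }
        }
        public void endOfStartTag() { currentTag = null; }
        public void text(String text) { }
        public void endTag(String name) { }
        public void endDocument() { }
    }

    /** Replays the assumed event sequence for one anchor element by hand. */
    static List<String> replay() {
        HrefCollector handler = new HrefCollector();
        handler.startDocument();
        handler.startOfStartTag("a");
        handler.attribute("href", "page.html"); // attribute with a value
        handler.attribute("download");          // attribute without a value
        handler.endOfStartTag();
        handler.text("Next");
        handler.endTag("a");
        handler.endDocument();
        return handler.links;
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

Tracking the open start tag between startOfStartTag and endOfStartTag is one simple way to associate an attribute with its element; a real handler would presumably also cover other link-bearing elements and attributes.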

Copyright © 2010 Aperture Development Team. All Rights Reserved.