org.semanticdesktop.aperture.helper.html
Class HtmlParserUtil

java.lang.Object
  extended by org.semanticdesktop.aperture.helper.html.HtmlParserUtil

public class HtmlParserUtil
extends Object

A utility class for parsing HTML with the HTMLParser library. This class encapsulates all the logic needed to drive HTMLParser, at the cost of requiring a slightly more complicated NodeVisitor: one that can be reset.
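
The idea of a visitor that can be reset can be illustrated with a minimal sketch. It does not use the HTMLParser or Aperture classes: ResettableTextCollector, parseWithRestart, and the restart condition are hypothetical stand-ins. It assumes that the driver may have to abandon a parsing pass partway through and start again from the beginning, so the visitor must discard whatever it has collected so far.

```java
import java.util.Arrays;
import java.util.List;

public class ResettableVisitorSketch {

    /** Hypothetical stand-in for a visitor that accumulates text and can start over. */
    static class ResettableTextCollector {
        private final StringBuilder text = new StringBuilder();

        /** Called for every piece of text encountered during a pass. */
        void visitText(String chunk) {
            text.append(chunk);
        }

        /** Discards everything collected so far so that a new pass starts clean. */
        void reset() {
            text.setLength(0);
        }

        String getText() {
            return text.toString();
        }
    }

    /**
     * Hypothetical driver. Feeds the chunks to the collector and, the first time
     * it reaches index restartAt, abandons the pass and starts again from the top.
     * Returns the number of passes that were needed.
     */
    static int parseWithRestart(List<String> chunks, int restartAt,
                                ResettableTextCollector collector) {
        int passes = 0;
        boolean restarted = false;
        while (true) {
            passes++;
            collector.reset(); // every pass begins with an empty collector
            boolean aborted = false;
            for (int i = 0; i < chunks.size(); i++) {
                if (!restarted && i == restartAt) {
                    restarted = true; // simulate a single forced restart
                    aborted = true;
                    break;
                }
                collector.visitText(chunks.get(i));
            }
            if (!aborted) {
                return passes;
            }
        }
    }

    public static void main(String[] args) {
        ResettableTextCollector collector = new ResettableTextCollector();
        int passes = parseWithRestart(Arrays.asList("Hello, ", "world"), 1, collector);
        System.out.println(passes + " passes: " + collector.getText());
    }
}
```

Without the call to reset(), the text gathered in the abandoned first pass would be duplicated in the final result, which is the problem a resettable visitor avoids.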


Nested Class Summary
static class HtmlParserUtil.ContentExtractor
          A NodeVisitor specialization that can be reset and start interpreting parsing events from scratch.
 
Constructor Summary
HtmlParserUtil()
           
 
Method Summary
static void parse(InputStream stream, Charset charset, HtmlParserUtil.ContentExtractor extractor)
          Parses the specified document stream with the HTMLParser library, using the specified ContentExtractor as the NodeVisitor during parsing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HtmlParserUtil

public HtmlParserUtil()
Method Detail

parse

public static void parse(InputStream stream,
                         Charset charset,
                         HtmlParserUtil.ContentExtractor extractor)
                  throws HtmlParserException
Parses the specified document stream with the HTMLParser library, using the specified ContentExtractor as the NodeVisitor during parsing.

Parameters:
stream - The stream containing the HTML document.
charset - The charset of the HTML document (optional).
extractor - The extractor that is informed about encountered document parts.
Throws:
HtmlParserException
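
The following sketch shows one way the stream and charset arguments might be prepared before calling parse. It uses only the Java standard library; ParseArgumentsSketch and resolveCharset are hypothetical names, and treating an unknown charset as null is an assumption based on the charset parameter being described as optional. The actual call is left as a comment because it needs Aperture on the classpath and a ContentExtractor instance.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParseArgumentsSketch {

    /**
     * Hypothetical helper: returns the named charset, or null when the name is
     * missing, malformed, or not supported by this JVM.
     */
    static Charset resolveCharset(String name) {
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] html = "<html><body>Hi</body></html>".getBytes(StandardCharsets.UTF_8);
        Charset charset = resolveCharset("UTF-8");
        try (InputStream stream = new ByteArrayInputStream(html)) {
            // With Aperture available, the call would have this shape
            // (assumption: a null charset is accepted when none is known):
            // HtmlParserUtil.parse(stream, charset, extractor);
            System.out.println("charset: " + charset + ", bytes: " + stream.available());
        }
    }
}
```

Because parse declares HtmlParserException, real calling code would also need to catch or propagate that exception.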


Copyright © 2010 Aperture Development Team. All Rights Reserved.