org.semanticdesktop.aperture.helper.html
Class HtmlParserUtil.ContentExtractor

java.lang.Object
  extended by org.htmlparser.visitors.NodeVisitor
      extended by org.semanticdesktop.aperture.helper.html.HtmlParserUtil.ContentExtractor
Enclosing class:
HtmlParserUtil

public static class HtmlParserUtil.ContentExtractor
extends org.htmlparser.visitors.NodeVisitor

A NodeVisitor specialization that extracts the full text, title, author, description and meta keywords from parsing events, and that can be reset and reused for a new parse.


Constructor Summary
HtmlParserUtil.ContentExtractor()
           
 
Method Summary
 String getAuthor()
          Return the extracted author, if any.
 String getDescription()
          Return the extracted description, if any.
 Iterator getKeywords()
          Return the extracted meta keywords, if any.
 String getText()
          Return the extracted full text, if any.
 String getTitle()
          Return the extracted title, if any.
 void reset()
          Remove all extracted information so that the ContentExtractor can be used anew.
 void visitEndTag(org.htmlparser.Tag tag)
           
 void visitStringNode(org.htmlparser.Text node)
           
 void visitTag(org.htmlparser.Tag tag)
           
 
Methods inherited from class org.htmlparser.visitors.NodeVisitor
beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf, visitRemarkNode
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HtmlParserUtil.ContentExtractor

public HtmlParserUtil.ContentExtractor()

Method Detail

reset

public void reset()
Remove all extracted information so that the ContentExtractor can be used anew.
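
The reuse pattern that reset() enables can be sketched without the htmlparser dependency. The class below, ReusableExtractorSketch, is a hypothetical stand-in and not the Aperture implementation: the onTitle, onKeyword and onText methods are invented substitutes for the visit callbacks, and returning null when nothing was extracted is an assumption about what "if any" means.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in that mirrors the accessor/reset shape of ContentExtractor.
// It is NOT the Aperture class and it does not parse HTML.
public class ReusableExtractorSketch {
    private final StringBuilder text = new StringBuilder();
    private final List<String> keywords = new ArrayList<>();
    private String title;

    // Invented substitutes for the visit callbacks a parser would invoke.
    public void onTitle(String value) { title = value; }
    public void onKeyword(String value) { keywords.add(value); }
    public void onText(String value) { text.append(value); }

    public String getTitle() { return title; }

    // Assumption: null signals that no text was extracted.
    public String getText() { return text.length() == 0 ? null : text.toString(); }

    public Iterator<String> getKeywords() { return keywords.iterator(); }

    // Clears every extracted value so the same instance can serve the next parse.
    public void reset() {
        text.setLength(0);
        keywords.clear();
        title = null;
    }

    public static void main(String[] args) {
        ReusableExtractorSketch extractor = new ReusableExtractorSketch();
        extractor.onTitle("First page");
        extractor.onText("Hello");
        System.out.println(extractor.getTitle()); // prints "First page"

        extractor.reset();
        System.out.println(extractor.getTitle()); // prints "null"
    }
}
```

In this sketch, an iterator obtained from getKeywords() before a reset() should be discarded, because the underlying list is cleared; request a fresh iterator after the next parse.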


getText

public String getText()
Return the extracted full text, if any.


getKeywords

public Iterator getKeywords()
Return the extracted meta keywords, if any.
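
Because getKeywords() returns a raw Iterator, callers typically drain it into a typed collection. The helper below is a minimal sketch of that step. KeywordIteration and toList are hypothetical names, the element type is not documented here so each element is converted with String.valueOf, and the null guard is an assumption about how the absence of keywords might be reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper; not part of Aperture.
public class KeywordIteration {

    // Copies the elements of a raw Iterator into a List<String>.
    @SuppressWarnings("rawtypes")
    public static List<String> toList(Iterator keywords) {
        List<String> result = new ArrayList<>();
        if (keywords == null) {
            return result; // assumption: tolerate null when no keywords exist
        }
        while (keywords.hasNext()) {
            // Element type is undocumented, so convert defensively.
            result.add(String.valueOf(keywords.next()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data standing in for extractor.getKeywords().
        Iterator<String> sample = Arrays.asList("rdf", "desktop").iterator();
        System.out.println(toList(sample)); // prints "[rdf, desktop]"
    }
}
```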


getTitle

public String getTitle()
Return the extracted title, if any.


getAuthor

public String getAuthor()
Return the extracted author, if any.


getDescription

public String getDescription()
Return the extracted description, if any.


visitStringNode

public void visitStringNode(org.htmlparser.Text node)
Overrides:
visitStringNode in class org.htmlparser.visitors.NodeVisitor

visitTag

public void visitTag(org.htmlparser.Tag tag)
Overrides:
visitTag in class org.htmlparser.visitors.NodeVisitor

visitEndTag

public void visitEndTag(org.htmlparser.Tag tag)
Overrides:
visitEndTag in class org.htmlparser.visitors.NodeVisitor


Copyright © 2010 Aperture Development Team. All Rights Reserved.