org.semanticdesktop.aperture.hypertext.linkextractor.html
Class Tokenizer

java.lang.Object
  extended by org.semanticdesktop.aperture.hypertext.linkextractor.html.Tokenizer

public class Tokenizer
extends Object

Tokenizer is a speed-optimized tokenizer for HTML(-like) documents. It reads documents from InputStreams and supplies the tokens it parses to a TokenHandler.


Constructor Summary
Tokenizer(TokenHandler tokenHandler)
          Creates a new Tokenizer.
 
Method Summary
 void read(InputStream input)
          Reads the entire contents of the supplied stream and tokenizes it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Tokenizer

public Tokenizer(TokenHandler tokenHandler)
Creates a new Tokenizer that supplies the tokens it parses to the given TokenHandler.

Parameters:
tokenHandler - A TokenHandler that will handle the parsed tokens.

Method Detail

read

public void read(InputStream input)
          throws IOException
Reads the entire contents of the supplied stream and tokenizes it. The parsed tokens are supplied to the TokenHandler that was passed to the constructor.

Parameters:
input - The stream to read.
Throws:
IOException - If an I/O error occurs.
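
Example:
The following is a minimal, self-contained sketch of the push-style pattern described above: a tokenizer is constructed with a handler, read(InputStream) consumes the whole stream, and each token is reported to the handler through a callback. The methods of the real TokenHandler interface are not listed on this page, so SketchHandler, SketchTokenizer, and the callback names tag and text are hypothetical stand-ins, not the Aperture API. The tag splitting is deliberately naive and assumes single-byte (ASCII) input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TokenizerSketch {

    // Hypothetical stand-in for TokenHandler; the real interface's methods
    // are not documented on this page, so these names are assumptions.
    interface SketchHandler {
        void tag(String tag);
        void text(String text);
    }

    // Hypothetical stand-in for Tokenizer. It mirrors only the documented
    // shape: a constructor taking a handler, and read(InputStream).
    static class SketchTokenizer {
        private final SketchHandler handler;

        SketchTokenizer(SketchHandler handler) {
            this.handler = handler;
        }

        // Reads the stream to its end and reports each token to the handler.
        void read(InputStream input) throws IOException {
            StringBuilder buf = new StringBuilder();
            boolean inTag = false;
            int b;
            while ((b = input.read()) != -1) {
                char c = (char) b; // assumes single-byte (ASCII) input
                if (c == '<' && !inTag) {
                    // A tag starts: flush any text collected so far.
                    if (buf.length() > 0) {
                        handler.text(buf.toString());
                        buf.setLength(0);
                    }
                    inTag = true;
                } else if (c == '>' && inTag) {
                    // A tag ends: report its contents without the brackets.
                    handler.tag(buf.toString());
                    buf.setLength(0);
                    inTag = false;
                } else {
                    buf.append(c);
                }
            }
            // Trailing text is reported; an unterminated tag is dropped.
            if (!inTag && buf.length() > 0) {
                handler.text(buf.toString());
            }
        }
    }

    // Tokenizes a string by wrapping it in a stream and collecting callbacks.
    static List<String> tokenize(String html) throws IOException {
        List<String> tokens = new ArrayList<>();
        SketchHandler collector = new SketchHandler() {
            @Override
            public void tag(String tag) {
                tokens.add("TAG:" + tag);
            }

            @Override
            public void text(String text) {
                tokens.add("TEXT:" + text);
            }
        };
        byte[] bytes = html.getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            new SketchTokenizer(collector).read(in);
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokenize("<a href=\"x.html\">link</a>"));
    }
}
```

With the real class, the same shape would presumably apply: implement TokenHandler, pass the instance to the Tokenizer constructor, call read with an open InputStream, and handle the IOException that read declares. Closing the stream is left to the caller in this sketch, since the page does not say whether read closes it.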


Copyright © 2010 Aperture Development Team. All Rights Reserved.