org.semanticdesktop.aperture.datasource.config
Class DomainBoundaries

java.lang.Object
  extended by org.semanticdesktop.aperture.datasource.config.DomainBoundaries

public class DomainBoundaries
extends Object

A DomainBoundaries uses UrlPatterns (regular expressions or substring checks) to determine whether a URL belongs to the domain of a DataSource.

Each DomainBoundaries maintains a list of include patterns and a list of exclude patterns. A URL is matched against both lists to determine whether it is inside or outside the domain: it is inside when it matches at least one include pattern and no exclude pattern. If no include patterns are specified, every URL that matches none of the exclude patterns is inside the domain.
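
The matching rule described above can be illustrated with a minimal, self-contained sketch. It is not the Aperture implementation: the class DomainBoundariesSketch, the helpers startsWith and regex, and the use of Predicate<String> in place of UrlPattern are all assumptions made for illustration, as are the example URLs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT the Aperture class: it mirrors only the
// include/exclude rule stated in the class description.
class DomainBoundariesSketch {
    // Each pattern is modeled as a Predicate<String> (assumption);
    // Aperture's own UrlPattern type is deliberately not used here.
    private final List<Predicate<String>> includes = new ArrayList<>();
    private final List<Predicate<String>> excludes = new ArrayList<>();

    void addIncludePattern(Predicate<String> pattern) { includes.add(pattern); }
    void addExcludePattern(Predicate<String> pattern) { excludes.add(pattern); }

    boolean inDomain(String url) {
        // A single matching exclude pattern places the URL outside the domain.
        for (Predicate<String> pattern : excludes) {
            if (pattern.test(url)) return false;
        }
        // With no include patterns, every non-excluded URL is inside.
        if (includes.isEmpty()) return true;
        // Otherwise at least one include pattern has to match.
        for (Predicate<String> pattern : includes) {
            if (pattern.test(url)) return true;
        }
        return false;
    }

    // Hypothetical helper: a substring-style check on the URL prefix.
    static Predicate<String> startsWith(String prefix) {
        return url -> url.startsWith(prefix);
    }

    // Hypothetical helper: the regular expression must match the whole URL.
    static Predicate<String> regex(String expression) {
        Pattern compiled = Pattern.compile(expression);
        return url -> compiled.matcher(url).matches();
    }

    public static void main(String[] args) {
        DomainBoundariesSketch domain = new DomainBoundariesSketch();
        domain.addIncludePattern(startsWith("http://example.org/docs/"));
        domain.addExcludePattern(regex(".*\\.tmp"));

        System.out.println(domain.inDomain("http://example.org/docs/a.html")); // true
        System.out.println(domain.inDomain("http://example.org/docs/a.tmp"));  // false
        System.out.println(domain.inDomain("http://example.org/other.html"));  // false
    }
}
```

The sketch checks the exclude patterns first only because it allows an early exit; the rule itself does not depend on the order in which the two lists are consulted.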


Constructor Summary
DomainBoundaries()
           
DomainBoundaries(List includePatterns, List excludePatterns)
           
 
Method Summary
 void addExcludePattern(UrlPattern pattern)
           
 void addIncludePattern(UrlPattern pattern)
           
 List getExcludePatterns()
           
 List getIncludePatterns()
           
 boolean inDomain(String url)
          Checks whether the supplied URL falls inside the specified boundaries.
 void removeAllExcludePatterns()
           
 void removeAllIncludePatterns()
           
 void removeAllPatterns()
           
 boolean removeExcludePattern(UrlPattern pattern)
           
 boolean removeIncludePattern(UrlPattern pattern)
           
 void setExcludePatterns(List excludePatterns)
           
 void setIncludePatterns(List includePatterns)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DomainBoundaries

public DomainBoundaries()

DomainBoundaries

public DomainBoundaries(List includePatterns,
                        List excludePatterns)

Method Detail

addIncludePattern

public void addIncludePattern(UrlPattern pattern)

removeIncludePattern

public boolean removeIncludePattern(UrlPattern pattern)

removeAllIncludePatterns

public void removeAllIncludePatterns()

getIncludePatterns

public List getIncludePatterns()
Returns:
a read-only version of the internal include-list

setIncludePatterns

public void setIncludePatterns(List includePatterns)

addExcludePattern

public void addExcludePattern(UrlPattern pattern)

removeExcludePattern

public boolean removeExcludePattern(UrlPattern pattern)

removeAllExcludePatterns

public void removeAllExcludePatterns()

getExcludePatterns

public List getExcludePatterns()
Returns:
a read-only version of the internal exclude-list
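
getIncludePatterns and getExcludePatterns are documented as returning a read-only version of the internal list. One common way to provide such a list in Java is Collections.unmodifiableList, sketched below. Whether Aperture uses this call, and whether the returned list is a live view or a copy, is not stated here, so the class ReadOnlyViewSketch and its members are purely hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a "read-only version" of an internal list;
// this is an assumption, not the Aperture implementation.
class ReadOnlyViewSketch {
    private final List<String> patterns = new ArrayList<>();

    void addPattern(String pattern) { patterns.add(pattern); }

    // Callers receive a wrapper that rejects every modification attempt.
    List<String> getPatterns() {
        return Collections.unmodifiableList(patterns);
    }

    // Returns true when adding to the supplied list is rejected.
    static boolean rejectsModification(List<String> list) {
        try {
            list.add("unexpected");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ReadOnlyViewSketch sketch = new ReadOnlyViewSketch();
        sketch.addPattern("http://example.org/");
        List<String> view = sketch.getPatterns();
        System.out.println(rejectsModification(view)); // true
        System.out.println(view.size());               // 1
    }
}
```

Under this assumption, callers change the pattern lists only through the add, remove and set methods of the owning object, never through the returned list.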

setExcludePatterns

public void setExcludePatterns(List excludePatterns)

removeAllPatterns

public void removeAllPatterns()

inDomain

public boolean inDomain(String url)
Checks whether the supplied URL falls inside the specified boundaries.

Parameters:
url - The URL to check.
Returns:
'true' if the URL is inside the crawl domain, 'false' otherwise.


Copyright © 2010 Aperture Development Team. All Rights Reserved.