org.semanticdesktop.aperture.datasource.config
Class DomainBoundaries
java.lang.Object
org.semanticdesktop.aperture.datasource.config.DomainBoundaries
public class DomainBoundaries
- extends Object
A DomainBoundaries uses UrlPatterns (regular expressions or substring checks) to determine whether a URL
belongs to a DataSource domain or not.
Each DomainBoundaries maintains lists of include and exclude patterns. A URL is matched against these two
pattern lists to determine whether it is inside or outside the domain. A URL is inside the domain when it
matches at least one of the include patterns but none of the exclude patterns. If no include patterns
are specified, every URL that does not match any of the exclude patterns is inside the domain.
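The matching rule described above can be sketched in a few lines of plain Java. This is only an illustration of the rule, not the Aperture implementation: the class name DomainRuleSketch and its methods are hypothetical, java.util.regex.Pattern stands in for UrlPattern, and the use of find() (match anywhere in the URL) is an assumption about how a pattern is applied.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the include/exclude rule; not part of Aperture.
public class DomainRuleSketch {

    // True if at least one pattern is found somewhere in the URL (assumed semantics).
    static boolean matchesAny(List<Pattern> patterns, String url) {
        for (Pattern p : patterns) {
            if (p.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    // Inside the domain: no exclude pattern matches, and either the include
    // list is empty or at least one include pattern matches.
    static boolean inDomain(List<Pattern> includes, List<Pattern> excludes, String url) {
        if (matchesAny(excludes, url)) {
            return false;
        }
        return includes.isEmpty() || matchesAny(includes, url);
    }

    public static void main(String[] args) {
        // Example patterns and URLs are made up for illustration.
        List<Pattern> includes = Arrays.asList(Pattern.compile("^http://example\\.org/"));
        List<Pattern> excludes = Arrays.asList(Pattern.compile("/private/"));

        System.out.println(inDomain(includes, excludes, "http://example.org/docs/a.html"));
        System.out.println(inDomain(includes, excludes, "http://example.org/private/a.html"));
        System.out.println(inDomain(Collections.<Pattern>emptyList(), excludes, "http://other.net/a.html"));
    }
}
```

Checking the exclude list first makes the precedence explicit: an exclude match always wins, and an empty include list behaves as "include everything".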
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DomainBoundaries
public DomainBoundaries()
DomainBoundaries
public DomainBoundaries(List includePatterns,
List excludePatterns)
addIncludePattern
public void addIncludePattern(UrlPattern pattern)
removeIncludePattern
public boolean removeIncludePattern(UrlPattern pattern)
removeAllIncludePatterns
public void removeAllIncludePatterns()
getIncludePatterns
public List getIncludePatterns()
- Returns:
- a read-only version of the internal include-list
setIncludePatterns
public void setIncludePatterns(List includePatterns)
addExcludePattern
public void addExcludePattern(UrlPattern pattern)
removeExcludePattern
public boolean removeExcludePattern(UrlPattern pattern)
removeAllExcludePatterns
public void removeAllExcludePatterns()
getExcludePatterns
public List getExcludePatterns()
- Returns:
- a read-only version of the internal exclude-list
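Both getters are documented as returning a read-only version of the internal list. One common way to provide that in Java is Collections.unmodifiableList, sketched below. Whether DomainBoundaries uses a live view or a copy is not stated here, so treat this as an assumption; the class ReadOnlyViewSketch is hypothetical and uses String in place of UrlPattern.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder showing a read-only getter; not part of Aperture.
public class ReadOnlyViewSketch {

    private final List<String> patterns = new ArrayList<String>();

    public void addPattern(String pattern) {
        patterns.add(pattern);
    }

    // Returns an unmodifiable view: callers cannot add or remove entries,
    // but later changes made through addPattern are visible in the view.
    public List<String> getPatterns() {
        return Collections.unmodifiableList(patterns);
    }
}
```

With a getter like this, callers change the pattern lists only through the add, remove, and set methods; calling add on the returned list throws UnsupportedOperationException.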
setExcludePatterns
public void setExcludePatterns(List excludePatterns)
removeAllPatterns
public void removeAllPatterns()
inDomain
public boolean inDomain(String url)
- Checks whether the supplied URL falls inside the specified boundaries.
- Parameters:
- url - The URL to check.
- Returns:
- 'true' if the URL is inside the crawl domain, 'false' otherwise.
Copyright © 2010 Aperture Development Team. All Rights Reserved.