au.id.pbw.hyfo.hyph
Class AlphabetBuilder

java.lang.Object
  extended by au.id.pbw.hyfo.hyph.AlphabetBuilder

public class AlphabetBuilder
extends Object

This class builds and returns Alphabet objects. The materials for an Alphabet are supplied progressively: one call to add_class for each character class, plus an optional call to add_break_minima and an optional call to add_hyphen_char. When all materials have been provided, a call to get_alphabet returns the corresponding Alphabet and resets the AlphabetBuilder for use in building a new Alphabet.

This class is not thread-safe.
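
The build-then-reset life cycle described above can be sketched with a small stand-in class. This is an illustration only, not the hyfo implementation: SketchAlphabetBuilder is hypothetical, it returns a plain list of codepoint arrays instead of an Alphabet, and it throws IllegalArgumentException where the real class throws HyphenationException. The method names mirror the documented ones so that the call sequence reads the same.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the builder life cycle; not the hyfo class. */
final class SketchAlphabetBuilder {
    // Each entry holds the codepoints of one character class.
    private List<int[]> classes = new ArrayList<>();

    /** Adds one character class; the first codepoint is the canonical one. */
    void add_class(String str) {
        if (str.isEmpty()) {
            // Assumption: stands in for HyphenationException.
            throw new IllegalArgumentException("empty character class");
        }
        classes.add(str.codePoints().toArray());
    }

    /** Returns the accumulated classes and resets the builder for reuse. */
    List<int[]> get_alphabet() {
        List<int[]> result = classes;
        classes = new ArrayList<>();   // reset: the next alphabet starts empty
        return result;
    }
}
```

Under these assumptions, calling add_class("aA") and add_class("b") followed by get_alphabet() yields two classes, and an immediate second call to get_alphabet() yields none, because the first call reset the builder. Like the documented class, the stand-in is not thread-safe.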

Author:
Peter B. West

Field Summary
static int DEFAULT_ALPHABET_SIZE
          The default number of character classes in an alphabet.
static int DEFAULT_HYPHEN_CHAR
          The default hyphen for this alphabet.
static int ETX
          The ETX is used as an end-of-word marker in patterns.
static int ETX_CLASS
          The character class of ETX is 1.
static Integer ETX_CLASS_INTEGER
          ETX_CLASS as an Integer
protected  boolean requires_codepoints
          True if any alphabet character lies outside the Basic Multilingual Plane (BMP); that is, if the alphabet requires the use of codepoints rather than chars.
static int STX
          The STX is used as a start-of-word marker in patterns.
static int STX_CLASS
          The character class of STX is 0.
static Integer STX_CLASS_INTEGER
          STX_CLASS as an Integer
 
Constructor Summary
AlphabetBuilder()
          Creates a new instance of AlphabetBuilder for an Alphabet with the default number of classes.
AlphabetBuilder(int size)
          Creates a new instance of AlphabetBuilder for an alphabet with the given number of character classes.
 
Method Summary
 void add_break_minima(BreakMinima minima)
          Adds a BreakMinima for this alphabet.
 void add_class(String str)
          Adds a character class for the Alphabet.
 void add_hyphen_char(int hyphen_codept)
          Adds the canonical hyphen character for this alphabet.
 Alphabet get_alphabet()
          Returns the Alphabet constructed from the character classes.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STX

public static final int STX
The STX is used as a start-of-word marker in patterns.

See Also:
Constant Field Values

STX_CLASS

public static final int STX_CLASS
The character class of STX is 0.

See Also:
Constant Field Values

STX_CLASS_INTEGER

public static final Integer STX_CLASS_INTEGER
STX_CLASS as an Integer


ETX

public static final int ETX
The ETX is used as an end-of-word marker in patterns.

See Also:
Constant Field Values

ETX_CLASS

public static final int ETX_CLASS
The character class of ETX is 1.

See Also:
Constant Field Values

ETX_CLASS_INTEGER

public static final Integer ETX_CLASS_INTEGER
ETX_CLASS as an Integer
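
One way to picture the role of the STX and ETX markers is to frame a word with them before matching, so that patterns anchored at the start or end of a word have something to match against. The sketch below is an assumption-laden illustration: WordMarkerSketch and frame are hypothetical names, and the values 0x02 and 0x03 are simply the ASCII control codes of the same names. The actual constants are listed on the Constant Field Values page and may differ.

```java
/** Hypothetical illustration of start-of-word and end-of-word markers. */
final class WordMarkerSketch {
    // Assumption: ASCII control codes; check Constant Field Values for the real ones.
    static final int STX = 0x02;
    static final int ETX = 0x03;

    /** Returns the word's codepoints with STX prepended and ETX appended. */
    static int[] frame(String word) {
        int[] cps = word.codePoints().toArray();
        int[] framed = new int[cps.length + 2];
        framed[0] = STX;
        System.arraycopy(cps, 0, framed, 1, cps.length);
        framed[framed.length - 1] = ETX;
        return framed;
    }
}
```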


DEFAULT_ALPHABET_SIZE

public static final int DEFAULT_ALPHABET_SIZE
The default number of character classes in an alphabet.

See Also:
Constant Field Values

DEFAULT_HYPHEN_CHAR

public static final int DEFAULT_HYPHEN_CHAR
The default hyphen for this alphabet.

See Also:
Constant Field Values

requires_codepoints

protected boolean requires_codepoints
True if any alphabet character lies outside the Basic Multilingual Plane (BMP); that is, if the alphabet requires the use of codepoints rather than chars.
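
The distinction matters because a Java char is a 16-bit UTF-16 code unit, so a character outside the BMP occupies two chars (a surrogate pair) but is a single codepoint. The following sketch shows one way such a check could be made using only standard java.lang methods. CodepointCheckSketch and requiresCodepoints are hypothetical names, and how AlphabetBuilder actually sets this field is not documented here.

```java
/** Hypothetical check for characters outside the BMP. */
final class CodepointCheckSketch {
    /** Returns true if any codepoint in the string is a supplementary one. */
    static boolean requiresCodepoints(String chars) {
        // codePoints() joins surrogate pairs into single int codepoints.
        return chars.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }
}
```

For example, U+10400 (DESERET CAPITAL LETTER LONG I) has a String length of 2 but a codepoint count of 1, so a class containing it would need codepoint-based handling.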

Constructor Detail

AlphabetBuilder

public AlphabetBuilder()
Creates a new instance of AlphabetBuilder for an Alphabet with the default number of classes.


AlphabetBuilder

public AlphabetBuilder(int size)
Creates a new instance of AlphabetBuilder for an alphabet with the given number of character classes.

Parameters:
size - the number of character classes in the alphabet.

Method Detail

add_break_minima

public void add_break_minima(BreakMinima minima)
Adds a BreakMinima for this alphabet. Only the first call takes effect; subsequent attempts are logged and otherwise ignored.

Parameters:
minima - the BreakMinima for the alphabet.

add_hyphen_char

public void add_hyphen_char(int hyphen_codept)
Adds the canonical hyphen character for this alphabet. Only the first call takes effect; subsequent attempts are logged and otherwise ignored.

Parameters:
hyphen_codept - the hyphen character as a codepoint.
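
The "first call wins, later calls are logged and ignored" behaviour documented for add_hyphen_char and add_break_minima might look roughly like the sketch below. It is a guess at the shape of the logic, not the library's code: SetOnceHyphenSketch, the UNSET sentinel, the hyphen() accessor and the use of java.util.logging are all assumptions made for illustration.

```java
import java.util.logging.Logger;

/** Hypothetical set-once holder for the canonical hyphen codepoint. */
final class SetOnceHyphenSketch {
    private static final Logger LOG = Logger.getLogger("SetOnceHyphenSketch");
    private static final int UNSET = -1;   // no valid codepoint is negative
    private int hyphen = UNSET;

    /** Records the hyphen on the first call; later calls are logged and ignored. */
    void add_hyphen_char(int hyphen_codept) {
        if (hyphen != UNSET) {
            LOG.warning("hyphen already set; ignoring U+"
                    + Integer.toHexString(hyphen_codept).toUpperCase());
            return;
        }
        hyphen = hyphen_codept;
    }

    /** Returns the recorded hyphen codepoint, or -1 if none was added. */
    int hyphen() {
        return hyphen;
    }
}
```

With this sketch, add_hyphen_char('-') followed by add_hyphen_char(0x2010) leaves hyphen() at 0x2D; the second call only produces a log message.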

add_class

public void add_class(String str)
               throws HyphenationException
Adds a character class for the Alphabet. The character class is represented as a String consisting of at least one character. The first or only character is the canonical character; any subsequent characters are treated as equivalents of the canonical character. Only canonical characters may appear in hyphenation patterns.

All characters are converted to codepoints for use in hyphenation.

Parameters:
str - the character(s) comprising the class. The first or only character is the canonical character, and any subsequent characters are equivalents.
Throws:
HyphenationException - if the argument string is empty, or if this builder has been invalidated.
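
The relationship between a canonical character and its equivalents can be sketched as a lookup from every codepoint in the class string to the first one. The class below is a hypothetical illustration, not the data structure used by Alphabet: CharClassSketch and canonical are invented names, IllegalArgumentException stands in for HyphenationException, and the return value of -1 for unknown characters is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mapping from equivalent characters to their canonical one. */
final class CharClassSketch {
    private final Map<Integer, Integer> canonicalOf = new HashMap<>();

    /** Registers a class; the first codepoint of str is the canonical character. */
    void add_class(String str) {
        if (str.isEmpty()) {
            // Assumption: stands in for HyphenationException.
            throw new IllegalArgumentException("empty character class");
        }
        int[] cps = str.codePoints().toArray();
        int canonical = cps[0];
        for (int cp : cps) {
            canonicalOf.put(cp, canonical);   // canonical maps to itself
        }
    }

    /** Returns the canonical codepoint for cp, or -1 if cp is in no class. */
    int canonical(int cp) {
        return canonicalOf.getOrDefault(cp, -1);
    }
}
```

After add_class("aA"), both canonical('a') and canonical('A') return the codepoint of 'a', which is why a pattern written with the canonical character could also apply to words containing its equivalents.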

get_alphabet

public Alphabet get_alphabet()
                      throws HyphenationException
Returns the Alphabet constructed from the character classes. The builder is then reset for use in building a new Alphabet.

Returns:
the Alphabet constructed from the classes.
Throws:
HyphenationException - if this builder has been invalidated.


Copyright © 2005-2006 Peter B. West.