Package au.id.pbw.hyfo.hyph

The main package of HyFo.

Interface Summary
PatternElement Represents a pattern element, which can be either a String or an embedded ModifierReference.
TernaryTree Interface describing ternary trees.
TernaryTreeDataStore A TernaryTreeDataStore maintains the data for an associated TernaryTree.
 

Class Summary
Alphabet Represents an alphabet based on a set of character classes.
AlphabetBuilder This class builds and returns Alphabet objects.
BigTernaryTree A TernaryTree whose node count may exceed Short.MAX_VALUE.
BreakMinima Represents the minimum number of characters allowable before and after a hyphenation line-break.
HyphenatedWord Represents a hyphenated word, derived from an original word, with its hyphenation possibilities.
HyphenationTree Represents a tree of hyphenation data and associated structures.
HyphenationTreeBuilder Builds a HyphenationTree from an input pattern file.
HyphenationTreeCache A singleton class which caches instances of HyphenationTree.
HyphenatorDisplay Static methods to display hyphenations from instances of HyphenatedWord.
HyphenBreak This class represents a possible hyphenation break.
HyphenDataCache This class caches the data for an associated HyphenationTree.
HyphenInhibitor Represents an inhibiting value for a hyphenation break point.
HyphenMarker The attribute key for an instance of HyphenBreak, containing the details of a potential hyphenation break in an AttributedString.
Modifier This class represents the modification to a word which occurs when it is hyphenated.
ModifierReference Represents a reference to a Modifier embedded in a text string.
ShortTernaryTree  
Subsearch A class representing the results of a partial search.
TextElement Represents a text component of a pattern element.
TreeElement A TreeElement is returned by a tree walker.
 

Enum Summary
HyphenationTreeBuilder.Element An enum of the elements in a resource file.
InstanceType Types of instance in tree builders.
 

Exception Summary
HyphenationException Exceptions encountered in hyphenation.
 

Package au.id.pbw.hyfo.hyph Description

The main package of HyFo. It defines all objects used to specify hyphenation, including the components of the pattern files, and the structure and building of hyphenation trees.

Main interface

The main user interface to hyphenation is provided by the HyphenationTreeCache, HyphenationTree and HyphenatedWord classes.

The most convenient way to access a HyphenationTree is through the HyphenationTreeCache singleton's HyphenationTreeCache.get_hyphenation_tree(String) method. The argument is described as a locale, but it can in fact be any string that can be matched against the locale-like component of a HyphenationTree name. For example, the string en-US_special will match the HyphenationTrees named HyphenationTree_en_US_special or HyphenationTree_en_US, in that order. Note that hyphens in the string are normalized to underscores before matching.

If the string en has been set as an alias for en_US by a call to the method add_alias("en", "en_US"), then get_hyphenation_tree("en") will also match HyphenationTree_en_US, but not otherwise.
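The matching rules described above can be illustrated with a small, self-contained sketch. It is not HyFo's implementation: the class TreeNameSketch and its methods are hypothetical names introduced here. The sketch normalizes hyphens to underscores, resolves an alias if one has been registered, and then lists candidate tree names from most to least specific. The text above names only the first two candidates for en-US_special; continuing down to the bare first component is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the HyFo API.
class TreeNameSketch {
    private static final String PREFIX = "HyphenationTree_";
    private final Map<String, String> aliases = new HashMap<>();

    // Hyphens are normalized to underscores before any matching.
    static String normalize(String s) {
        return s.replace('-', '_');
    }

    // Registers an alias, e.g. addAlias("en", "en_US").
    void addAlias(String alias, String target) {
        aliases.put(normalize(alias), normalize(target));
    }

    // Lists candidate tree names, most specific first.
    // Assumption: trailing underscore-delimited components are dropped
    // one at a time until only the first component remains.
    List<String> candidateNames(String requested) {
        String key = normalize(requested);
        key = aliases.getOrDefault(key, key);
        List<String> names = new ArrayList<>();
        while (!key.isEmpty()) {
            names.add(PREFIX + key);
            int cut = key.lastIndexOf('_');
            if (cut < 0) {
                break;
            }
            key = key.substring(0, cut);
        }
        return names;
    }

    public static void main(String[] args) {
        TreeNameSketch sketch = new TreeNameSketch();
        // Prints the candidates for "en-US_special", most specific first.
        System.out.println(sketch.candidateNames("en-US_special"));
        // Without an alias, "en" yields only HyphenationTree_en.
        System.out.println(sketch.candidateNames("en"));
        sketch.addAlias("en", "en_US");
        // With the alias, HyphenationTree_en_US is tried first.
        System.out.println(sketch.candidateNames("en"));
    }
}
```

In this sketch, an unaliased request for en never produces HyphenationTree_en_US, which mirrors the "but not otherwise" behaviour described above.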

Accessing hyphenation trees

Any required instance of a HyphenationTree must be on the classpath. The simplest way to ensure this is to add the jar file containing the required instance to a directory on the classpath, for example the directory containing the HyFo jar itself.

Once a HyphenationTree has been obtained, word hyphenation can be attempted by calling HyphenationTree.hyphenate(String). This call returns an instance of HyphenatedWord. For example:

 HyphenationTreeCache cache = HyphenationTreeCache.get_cache_instance();
 cache.add_alias("en", "en_US");
 HyphenationTree en = cache.get_hyphenation_tree("en");
 if (en == null) {
     // No tree found - may also throw an IOException
     ...
 }
 HyphenatedWord hyphenated = en.hyphenate("abolition");
 

Hyphenation results

A HyphenatedWord contains information about all possible hyphenation points in the word it has analysed. This information can be recovered in a number of formats. The simplest for normal use is HyphenatedWord.get_string_breakpoints(). For a full description, see HyphenatedWord.
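As a rough illustration of what a client might do with hyphenation points, the following sketch inserts hyphens into a word at given character offsets. It does not call HyFo: the return type of get_string_breakpoints() is not described here, so the sketch assumes the break points are already available as an int array of offsets. The class BreakRenderSketch, the method withHyphens and the sample offsets are all hypothetical, and the offsets are not actual HyFo output.

```java
// Hypothetical illustration only; not part of the HyFo API.
class BreakRenderSketch {

    // Returns the word with a hyphen inserted before each break offset.
    // Offsets that are out of range or not strictly increasing are skipped.
    static String withHyphens(String word, int[] breakOffsets) {
        StringBuilder out = new StringBuilder();
        int previous = 0;
        for (int offset : breakOffsets) {
            if (offset <= previous || offset >= word.length()) {
                continue;
            }
            out.append(word, previous, offset).append('-');
            previous = offset;
        }
        out.append(word, previous, word.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up offsets for demonstration, not produced by a HyphenationTree.
        int[] offsets = {2, 3, 5};
        System.out.println(withHyphens("abolition", offsets)); // prints "ab-o-li-tion"
    }
}
```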

Building new hyphenation trees

See here for a description of the contents of hyphenation pattern files, and the process of building hyphenation tree jar files from them.



Copyright © 2005-2006 Peter B. West.