java.lang.Object
  au.id.pbw.hyfo.hyph.HyphenatedWord
public class HyphenatedWord
Represents a hyphenated-word, derived from an original word,
with its hyphenation possibilities. Internally, the hyphenated-word
is held as an array of character class codes derived from the Alphabet
in which the hyphenator operates. Corresponding to the boundaries
between the characters of the hyphenated-word (including the positions
before the first character and after the last) is an array of HyphenBreaks.
All but those corresponding to a potential breakpoint are null.
The non-null HyphenBreaks include the strength of the break possibility
between the corresponding codepoints. In the accompanying diagrams, these
are represented by single digits giving their (odd-numbered) weight.
Nominal weights are used in the diagrams.
The hyphenated-word can be retrieved as the original string.
See get_word.
The associated breakpoint information is returned as an array of
HyphenBreaks of the same length as the string. Non-null HyphenBreaks
correspond to breakpoint opportunities immediately following the
corresponding character in the string.
See get_string_breakpoints.
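As a rough illustration of an array of breakpoints aligned with the string, the sketch below marks each opportunity with a hyphen. It follows the description above, where a non-null entry means a break immediately after the corresponding character. The class `StringBreakSketch`, the method `mark`, and the use of `Integer` weights in place of HyphenBreak objects are assumptions made for this example; they are not part of this API.

```java
// Sketch only. Integer weights stand in for HyphenBreak objects, whose
// API is not shown on this page; null means "no break opportunity here".
final class StringBreakSketch {

    /** Inserts '-' after every char whose slot holds a non-null weight. */
    static String mark(String word, Integer[] breaks) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            out.append(word.charAt(i));
            // Skip a break after the last char: nothing follows it.
            if (breaks[i] != null && i < word.length() - 1) {
                out.append('-');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same length as the string; one opportunity, after "hy".
        Integer[] breaks = {null, 1, null, null, null, null};
        System.out.println(mark("hyphen", breaks)); // prints hy-phen
    }
}
```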
Constructor Summary | |
---|---|
HyphenatedWord(int[] char_classes, int[] char_indices, String word, HyphenBreak[] breakpoints, Alphabet alphabet) | Creates a new instance of HyphenatedWord from the given arrays. |
HyphenatedWord(String word) | Creates a NULL instance of HyphenatedWord from the given word. |
Method Summary | |
---|---|
int get_breakpoint_count() | Gets the number of breakpoints in this HyphenatedWord. |
HyphenBreak[] get_char_classes_breakpoints() | Gets the array of hyphenation possibilities corresponding to the codepoints of the original word. |
int[] get_fop_compatible_points() | Gets the Fop-compatible array of breakpoint positions in the word. |
String get_fop_post_hyphen(int offset) | Gets the suffix substring of the original word from the given offset, inclusive, to the end of the word. |
String get_fop_pre_hyphen(int offset) | Gets the prefix substring of the original word up to, but excluding, the given offset. |
HyphenBreak[] get_string_breakpoints() | Gets the array of hyphenation possibilities corresponding to the String representation of the word being hyphenated. |
String to_fop_string() | Returns a fully-hyphenated string, mimicking the Fop Hyphenation.toString() method. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public HyphenatedWord(int[] char_classes, int[] char_indices, String word, HyphenBreak[] breakpoints, Alphabet alphabet)
Creates a new instance of HyphenatedWord from the given arrays.
The first argument is an array of int containing the character classes
of the contents of the original word. A character class is an integer
representing a set of codepoints from the alphabet which are equivalent
in respect of hyphenation; for example, the characters 'D' and 'd'.
For Western alphabets, the upper and lower case versions of a character
are equivalent for hyphenation.
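To make the idea of a character class concrete, here is a toy mapping in which the class code of a codepoint is simply its lower-case codepoint, so 'D' and 'd' share a class. This is only an assumption for illustration: the real class codes come from the Alphabet, whose API is not shown on this page, and `CharClassSketch` and `charClasses` are hypothetical names.

```java
// Sketch only: a stand-in for the Alphabet's mapping, not the real one.
final class CharClassSketch {

    /** Maps each codepoint to a toy class code: its lower-case codepoint. */
    static int[] charClasses(String word) {
        return word.codePoints().map(Character::toLowerCase).toArray();
    }

    public static void main(String[] args) {
        int[] classes = charClasses("Dd");
        // Upper and lower case fall into the same class.
        System.out.println(classes[0] == classes[1]); // prints true
    }
}
```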
The second argument is an array of int containing the offsets of the
first character in the original word from which individual character
classes were derived. This array is only relevant if the original word can
contain supplementary characters, as indicated by the alphabet argument.
Otherwise, it should be null.
For example, if the first character of the word is not a supplementary
character and the second character is a supplementary character, requiring
two chars for its representation, then the first three entries in the
indices array will be
0 (pointing to the char offset of the first character of the word),
1 (pointing to the char offset of the second character of the word)
and 3 (pointing to the char offset of the third character of the word).
The third character will be offset by 2 from the second because the second
requires 2 chars for its representation.
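The offsets in the example above can be reproduced with standard JDK calls. The following is a minimal sketch, assuming the indices array holds one char offset per codepoint; `CharIndexSketch` and `charIndices` are hypothetical names and not part of this class.

```java
// Sketch only: derives per-codepoint char offsets with standard JDK calls.
final class CharIndexSketch {

    /** Returns, for each codepoint of word, the char offset at which it starts. */
    static int[] charIndices(String word) {
        int count = word.codePointCount(0, word.length());
        int[] indices = new int[count];
        int offset = 0;
        for (int i = 0; i < count; i++) {
            indices[i] = offset;
            // A supplementary codepoint occupies two chars, any other one char.
            offset += Character.charCount(word.codePointAt(offset));
        }
        return indices;
    }

    public static void main(String[] args) {
        // 'a', then U+1D538 (supplementary, stored as two chars), then 'b'.
        String word = "a\uD835\uDD38b";
        System.out.println(java.util.Arrays.toString(charIndices(word))); // prints [0, 1, 3]
    }
}
```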
The third argument is the original word.
The fourth argument is the array of HyphenBreaks corresponding to
the character classes of the original word.
N.B. Breakpoints in the array breakpoints represent a breakpoint
following the corresponding character class of the original word.
The array of breakpoints must be the same length as the array of character classes,
FIXME
and positions which do not correspond to a hyphenation point must be null.
Parameters:
char_classes - the array of character classes representing the word.
char_indices - the array of indices from char_classes to corresponding offset in the original word string. May be null.
word - the original word being hyphenated.
breakpoints - the array of HyphenBreaks.
alphabet - the Alphabet of characters which are recognized for hyphenation.
public HyphenatedWord(String word)
Creates a NULL instance of HyphenatedWord from the given word.
This constructor may be used when no hyphenation can be generated for
a word (due to the presence of non-Alphabet characters, for example).
Parameters:
word - the word to hyphenate.
Method Detail |
---|
public HyphenBreak[] get_char_classes_breakpoints()
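Gets the array of hyphenation possibilities corresponding to the
codepoints of the original word. Each non-null position in the returned
array of HyphenBreaks represents a hyphenation possibility
immediately preceding the codepoint to which it corresponds.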
public HyphenBreak[] get_string_breakpoints()
Gets the array of hyphenation possibilities corresponding to the String
representation of the word being hyphenated. Each non-null
position in the returned array of HyphenBreaks represents a
hyphenation possibility immediately preceding the char to which
it corresponds.
Returns:
the array of hyphenation possibilities corresponding to the String
representation of the word being hyphenated.
public int[] get_fop_compatible_points()
Gets the Fop-compatible array of breakpoint positions in the word.
This method corresponds to the Fop method Hyphenation.getHyphenationPoints().
In Fop, the breakpoint position is set on the character following the
breakpoint.
public int get_breakpoint_count()
Gets the number of breakpoints in this HyphenatedWord.
This method corresponds to the Fop method Hyphenation.length(). It is
strongly recommended that this method be used to determine the effective limit
of the array returned by get_fop_compatible_points().
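The recommendation above suggests that callers should bound their loops by the breakpoint count rather than by the length of the points array. The sketch below illustrates that pattern under the assumption that the array may carry unused trailing slots; that assumption, the sample values, and the names `BreakpointLimitSketch` and `effectivePoints` are not taken from this API.

```java
// Sketch only: plain arrays stand in for the values a HyphenatedWord
// would return from get_fop_compatible_points() and get_breakpoint_count().
final class BreakpointLimitSketch {

    /** Copies only the first count entries, ignoring any trailing slots. */
    static int[] effectivePoints(int[] points, int count) {
        return java.util.Arrays.copyOf(points, count);
    }

    public static void main(String[] args) {
        int[] points = {2, 6, 7, 0, 0}; // assumed: two unused trailing slots
        int count = 3;                  // assumed breakpoint count
        for (int p : effectivePoints(points, count)) {
            System.out.println("break before char offset " + p);
        }
    }
}
```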
public String get_fop_pre_hyphen(int offset)
Gets the prefix substring of the original word up to, but excluding,
the given offset.
This method is compatible with the Fop method Hyphenation.getPreHyphenText().
Parameters:
offset - of the character following the prefix.
See Also:
get_fop_compatible_points(), get_breakpoint_count(), get_fop_post_hyphen(int offset)
public String get_fop_post_hyphen(int offset)
Gets the suffix substring of the original word from the given offset,
inclusive, to the end of the word.
This method is compatible with the Fop method Hyphenation.getPostHyphenText().
Parameters:
offset - of the first position of the suffix within the
original word being hyphenated.
See Also:
get_fop_compatible_points(), get_breakpoint_count(), get_fop_pre_hyphen(int offset)
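The prefix and suffix semantics described for the two methods above (offset excluded from the prefix, included in the suffix) match plain `String.substring` calls. The sketch below is a stand-in written against a bare String, not the implementation of this class; `PrePostHyphenSketch`, `preHyphen` and `postHyphen` are hypothetical names.

```java
// Sketch only: mirrors the documented prefix/suffix semantics on a plain String.
final class PrePostHyphenSketch {

    /** Prefix up to, but excluding, the char at offset. */
    static String preHyphen(String word, int offset) {
        return word.substring(0, offset);
    }

    /** Suffix from the char at offset, inclusive, to the end of the word. */
    static String postHyphen(String word, int offset) {
        return word.substring(offset);
    }

    public static void main(String[] args) {
        String word = "hyphenation";
        int offset = 6; // assumed breakpoint position, set on the following char
        System.out.println(preHyphen(word, offset));  // prints hyphen
        System.out.println(postHyphen(word, offset)); // prints ation
    }
}
```

For any valid offset, the prefix and suffix concatenate back to the original word.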
public String to_fop_string()
Returns a fully-hyphenated string, mimicking the Fop Hyphenation.toString() method.
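A fully-hyphenated string of this kind can be built from a word and its Fop-style breakpoint positions. The following is a sketch under the assumption that each position is the offset of the char following a break; it is not the actual implementation of to_fop_string() or of the Fop method, and `FopStringSketch` and `hyphenate` are hypothetical names.

```java
// Sketch only: joins the pieces of a word with '-' at each breakpoint.
final class FopStringSketch {

    /** Inserts '-' before each of the first count positions in points. */
    static String hyphenate(String word, int[] points, int count) {
        StringBuilder out = new StringBuilder();
        int start = 0;
        for (int i = 0; i < count; i++) {
            // Append the piece ending just before the breakpoint, then a hyphen.
            out.append(word, start, points[i]).append('-');
            start = points[i];
        }
        out.append(word, start, word.length());
        return out.toString();
    }

    public static void main(String[] args) {
        int[] points = {2, 6, 7}; // assumed positions for "hyphenation"
        System.out.println(hyphenate("hyphenation", points, points.length)); // prints hy-phen-a-tion
    }
}
```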