au.id.pbw.hyfo.unicode
Class CodepointArrays

java.lang.Object
  extended by au.id.pbw.hyfo.unicode.CodepointArrays

public class CodepointArrays
extends Object

Static methods for manipulating arrays of codepoints.

Author:
Peter B. West

Field Summary
static int CODEPTS
          The index of the array of codepoints within the int[][] returned by to_codepoint_arrays(java.lang.CharSequence).
static int INDICES
          The index of the array of offsets within the int[][] returned by to_codepoint_arrays(java.lang.CharSequence).
 
Method Summary
static int[][] to_codepoint_arrays(CharSequence seq)
          Returns a two-dimensional array corresponding to the CharSequence argument.
static int[][] to_codepoint_arrays(CharSequence seq, int start, int past_end)
          Returns a two-dimensional array corresponding to the specified range of the CharSequence argument.
static int[] to_codepoints(CharSequence seq)
          Returns an array of codepoints (ints) from the given CharSequence.
static int[] to_codepoints(CharSequence seq, int start, int past_end)
          Returns an array of codepoints (ints) from the given CharSequence, starting from position start, and ending at the position immediately preceding past_end.
static String to_string(int[] codepoints)
          Returns a String constructed from the given array of codepoints.
static String to_string(int[] codepoints, int start, int length)
          Returns a String constructed from length codepoints of the given array, starting at offset start.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CODEPTS

public static final int CODEPTS
The index of the array of codepoints within the int[][] returned by to_codepoint_arrays(java.lang.CharSequence).

See Also:
Constant Field Values

INDICES

public static final int INDICES
The index of the array of offsets within the int[][] returned by to_codepoint_arrays(java.lang.CharSequence).

See Also:
Constant Field Values
Method Detail

to_codepoints

public static int[] to_codepoints(CharSequence seq)
Returns an array of codepoints (ints) from the given CharSequence.

Parameters:
seq - the source CharSequence.
Returns:
an array of int containing the codepoints of seq, in order.
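
The conversion described above can be illustrated with the standard library alone. The following is a minimal sketch, not the hyfo source: the class ToCodepointsSketch and the method toCodepoints are hypothetical names, and the sketch assumes that an unpaired surrogate is passed through as its own value.

```java
import java.util.Arrays;

final class ToCodepointsSketch {
    // Hypothetical stand-in for to_codepoints(CharSequence); not the library's code.
    static int[] toCodepoints(CharSequence seq) {
        // A codepoint uses one or two chars, so seq.length() is an upper bound.
        int[] buf = new int[seq.length()];
        int n = 0;
        for (int i = 0; i < seq.length(); ) {
            int cp = Character.codePointAt(seq, i);
            buf[n++] = cp;
            i += Character.charCount(cp); // advance 2 for a surrogate pair, else 1
        }
        return Arrays.copyOf(buf, n);     // trim to the number of codepoints found
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (a surrogate pair), 'b': four chars, three codepoints.
        int[] cps = toCodepoints("a\uD83D\uDE00b");
        System.out.println(cps.length);                  // prints 3
        System.out.println(Integer.toHexString(cps[1])); // prints 1f600
    }
}
```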

to_codepoints

public static int[] to_codepoints(CharSequence seq,
                                  int start,
                                  int past_end)
Returns an array of codepoints (ints) from the given CharSequence, starting from position start, and ending at the position immediately preceding past_end.

Parameters:
seq - the source CharSequence.
start - the start position.
past_end - the position immediately after the last element to convert (exclusive).
Returns:
the array of int containing the codepoints of the specified range, in order.
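
A half-open range like this can be sketched with CharSequence.subSequence and CharSequence.codePoints from the standard library. The names RangeCodepointsSketch and toCodepoints are hypothetical, and how the real method treats a range that splits a surrogate pair is not stated here; in this sketch a lone surrogate simply comes back as its own value.

```java
final class RangeCodepointsSketch {
    // Hypothetical stand-in for to_codepoints(seq, start, past_end); not the library's code.
    // The range is half-open: start is included, pastEnd is excluded.
    static int[] toCodepoints(CharSequence seq, int start, int pastEnd) {
        return seq.subSequence(start, pastEnd).codePoints().toArray();
    }

    public static void main(String[] args) {
        // chars: x(0) a(1) high surrogate(2) low surrogate(3) b(4)
        int[] cps = toCodepoints("xa\uD83D\uDE00b", 1, 4);
        System.out.println(cps.length); // prints 2
    }
}
```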

to_codepoint_arrays

public static int[][] to_codepoint_arrays(CharSequence seq)
Returns a two-dimensional array corresponding to the CharSequence argument. The sub-array at offset CODEPTS is an array of int containing the Unicode codepoints corresponding to the argument. The sub-array at offset INDICES is an array of offsets into the original CharSequence indicating the first character from which the corresponding codepoint in the parallel array was derived.

Because a codepoint may correspond to a pair of surrogates, there is not necessarily a one-to-one correspondence between char and codepoint.

Parameters:
seq - the original sequence of char.
Returns:
the two-dimensional array of codepoints and offsets.
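
The parallel-array layout can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the values 0 and 1 for CODEPTS and INDICES are assumptions made for the sketch (the real values are listed under Constant Field Values).

```java
import java.util.Arrays;

final class CodepointArraysSketch {
    // Assumed values for illustration; the library's actual constants may differ.
    static final int CODEPTS = 0;
    static final int INDICES = 1;

    // Hypothetical stand-in for to_codepoint_arrays(CharSequence).
    static int[][] toCodepointArrays(CharSequence seq) {
        int count = Character.codePointCount(seq, 0, seq.length());
        int[][] result = new int[2][count];
        int n = 0;
        for (int i = 0; i < seq.length(); n++) {
            int cp = Character.codePointAt(seq, i);
            result[CODEPTS][n] = cp; // the codepoint itself
            result[INDICES][n] = i;  // offset of its first char in seq
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] arrays = toCodepointArrays("a\uD83D\uDE00b");
        System.out.println(Arrays.toString(arrays[CODEPTS])); // prints [97, 128512, 98]
        System.out.println(Arrays.toString(arrays[INDICES])); // prints [0, 1, 3]
    }
}
```

The offsets skip from 1 to 3 because the second codepoint occupies two chars, which is the reason the INDICES array is needed to map a codepoint back to its position in the source.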

to_codepoint_arrays

public static int[][] to_codepoint_arrays(CharSequence seq,
                                          int start,
                                          int past_end)
Returns a two-dimensional array corresponding to the specified range of the CharSequence argument. The sub-array at offset CODEPTS is an array of int containing the Unicode codepoints corresponding to the specified range of the argument. The sub-array at offset INDICES is an array of offsets into the original CharSequence indicating the first character from which the corresponding codepoint in the parallel array was derived.

Because a codepoint may correspond to a pair of surrogates, there is not necessarily a one-to-one correspondence between char and codepoint.

Parameters:
seq - the original sequence of char.
start - the index of the start of the sub-sequence.
past_end - the index of the last character of the sub-sequence + 1.
Returns:
the two-dimensional array of codepoints and offsets corresponding to the sub-sequence.
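
For the ranged form, a sketch under stated assumptions: the names are hypothetical, CODEPTS and INDICES are assumed to be 0 and 1, and the recorded offsets are taken relative to the original sequence (as the description says) rather than to the sub-sequence.

```java
import java.util.Arrays;

final class RangeCodepointArraysSketch {
    // Assumed values for illustration; the library's actual constants may differ.
    static final int CODEPTS = 0;
    static final int INDICES = 1;

    // Hypothetical stand-in for to_codepoint_arrays(seq, start, past_end).
    static int[][] toCodepointArrays(CharSequence seq, int start, int pastEnd) {
        // Decode within the sub-sequence so no char outside the range is read.
        CharSequence sub = seq.subSequence(start, pastEnd);
        int count = Character.codePointCount(sub, 0, sub.length());
        int[][] result = new int[2][count];
        int n = 0;
        for (int i = 0; i < sub.length(); n++) {
            int cp = Character.codePointAt(sub, i);
            result[CODEPTS][n] = cp;
            result[INDICES][n] = start + i; // offset into the original sequence
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        // chars: x(0) a(1) high surrogate(2) low surrogate(3) b(4)
        int[][] arrays = toCodepointArrays("xa\uD83D\uDE00b", 1, 5);
        System.out.println(Arrays.toString(arrays[INDICES])); // prints [1, 2, 4]
    }
}
```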

to_string

public static String to_string(int[] codepoints)
Returns a String constructed from the given array of codepoints.

Parameters:
codepoints - the array of codepoints.
Returns:
the corresponding String.
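
The reverse conversion can be sketched with StringBuilder.appendCodePoint, which emits a surrogate pair for any codepoint above U+FFFF. The names below are hypothetical, and the behaviour of the real method on invalid codepoints is not documented here; this sketch throws IllegalArgumentException for them, because appendCodePoint does.

```java
final class CodepointsToStringSketch {
    // Hypothetical stand-in for to_string(int[]); not the library's code.
    static String codepointsToString(int[] codepoints) {
        StringBuilder sb = new StringBuilder(codepoints.length);
        for (int cp : codepoints) {
            sb.appendCodePoint(cp); // one or two chars per codepoint
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = codepointsToString(new int[] {0x61, 0x1F600, 0x62});
        System.out.println(s.length()); // prints 4: three codepoints, four chars
    }
}
```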

to_string

public static String to_string(int[] codepoints,
                               int start,
                               int length)
Returns a String constructed from length codepoints of the given array, starting at offset start.

Parameters:
codepoints - the array of codepoints.
start - the position of the first codepoint to use.
length - the number of codepoints to use.
Returns:
the String.
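
The standard constructor String(int[] codePoints, int offset, int count) takes the same kind of arguments, so a sketch of this overload can delegate to it directly. The class and method names are hypothetical; whether the real method is implemented this way is not known from this page.

```java
final class RangeToStringSketch {
    // Hypothetical stand-in for to_string(int[], start, length); not the library's code.
    // start and length count codepoints (array elements), not chars.
    static String codepointsToString(int[] codepoints, int start, int length) {
        return new String(codepoints, start, length);
    }

    public static void main(String[] args) {
        int[] cps = {0x78, 0x61, 0x1F600, 0x62}; // x, a, U+1F600, b
        String s = codepointsToString(cps, 1, 2);
        System.out.println(s.length()); // prints 3: 'a' plus one surrogate pair
    }
}
```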


Copyright © 2005-2006 Peter B. West.