All Implemented Interfaces: java.lang.Comparable<BinaryComparable>, Writable, WritableComparable<BinaryComparable>

@Stringable @Public @Stable
public class Text extends BinaryComparable implements WritableComparable<BinaryComparable>

In addition, it provides methods for string traversal without converting the byte array to a string.
It also includes utilities for serializing and deserializing a string, encoding and decoding a string, checking whether a byte array contains valid UTF-8, and calculating the length of an encoded string.
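
The string utilities mentioned above (encoding and decoding, UTF-8 validation, encoded length) can be approximated with the JDK alone. The following is a minimal sketch, assuming nothing beyond `java.nio.charset`; the class `Utf8UtilitySketch` and its method names are hypothetical, it is not part of Hadoop, and it does not claim to be how Text implements these operations.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; NOT Hadoop code.
final class Utf8UtilitySketch {

    // Decodes UTF-8 bytes. With replace == true, malformed input becomes U+FFFD;
    // otherwise the decoder reports the problem by throwing.
    static String decode(byte[] utf8, boolean replace) throws CharacterCodingException {
        CodingErrorAction action = replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(utf8)).toString();
    }

    // Validates a byte array by attempting a strict decode.
    static boolean isValidUtf8(byte[] utf8) {
        try {
            decode(utf8, false);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Number of bytes needed to encode the string as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] good = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {(byte) 'a', (byte) 0xFF, (byte) 'b'};
        System.out.println(utf8Length("h\u00e9llo")); // 6: U+00E9 takes two bytes
        System.out.println(isValidUtf8(good));        // true
        System.out.println(isValidUtf8(bad));         // false: 0xFF never occurs in UTF-8
        System.out.println(decode(bad, true).equals("a\uFFFDb")); // true
    }
}
```

The strict and replacing modes differ only in the `CodingErrorAction` handed to the decoder, which mirrors the `replace` flag described for `decode` and `encode` on this page.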

| Modifier and Type | Class | Description |
|---|---|---|
| static class | Text.Comparator | A WritableComparator optimized for Text keys. |

| Modifier and Type | Field | Description |
|---|---|---|
| static int | DEFAULT_MAX_LEN | |

| Constructor | Description |
|---|---|
| Text() | Construct an empty text string. |
| Text(byte[] utf8) | Construct from a byte array. |
| Text(java.lang.String string) | Construct from a string. |
| Text(Text utf8) | Construct from another text. |

| Modifier and Type | Method | Description |
|---|---|---|
| void | append(byte[] utf8, int start, int len) | Append a range of bytes to the end of the given text. |
| static int | bytesToCodePoint(java.nio.ByteBuffer bytes) | |
| int | charAt(int position) | Returns the Unicode Scalar Value (32-bit integer value) for the character at position. |
| void | clear() | Clear the string to empty. |
| byte[] | copyBytes() | |
| static java.lang.String | decode(byte[] utf8) | |
| static java.lang.String | decode(byte[] utf8, int start, int length) | |
| static java.lang.String | decode(byte[] utf8, int start, int length, boolean replace) | |
| static java.nio.ByteBuffer | encode(java.lang.String string) | Converts the provided String to bytes using the UTF-8 encoding. |
| static java.nio.ByteBuffer | encode(java.lang.String string, boolean replace) | Converts the provided String to bytes using the UTF-8 encoding. |
| boolean | equals(java.lang.Object o) | Returns true iff o is a Text with the same length and same contents. |
| int | find(java.lang.String what) | |
| int | find(java.lang.String what, int start) | Finds any occurrence of what in the backing buffer, starting at position start. |
| byte[] | getBytes() | Returns the raw bytes; however, only data up to getLength() is valid. |
| int | getLength() | Returns the number of bytes in the byte array. |
| int | getTextLength() | |
| int | hashCode() | Returns a hash of the bytes returned from getBytes(). |
| void | readFields(java.io.DataInput in) | Deserialize the fields of this object from in. |
| void | readFields(java.io.DataInput in, int maxLength) | |
| static java.lang.String | readString(java.io.DataInput in) | |
| static java.lang.String | readString(java.io.DataInput in, int maxLength) | |
| void | readWithKnownLength(java.io.DataInput in, int len) | Read a Text object whose length is already known. |
| void | set(byte[] utf8) | Set to a UTF-8 byte array. |
| void | set(byte[] utf8, int start, int len) | Set the Text to a range of bytes. |
| void | set(java.lang.String string) | Set to contain the contents of a string. |
| void | set(Text other) | Copy a text. |
| static void | skip(java.io.DataInput in) | Skips over one Text in the input. |
| java.lang.String | toString() | |
| static int | utf8Length(java.lang.String string) | For the given string, returns the number of UTF-8 bytes required to encode the string. |
| static void | validateUTF8(byte[] utf8) | Check if a byte array contains valid UTF-8. |
| static void | validateUTF8(byte[] utf8, int start, int len) | Check to see if a byte array is valid UTF-8. |
| void | write(java.io.DataOutput out) | Serialize. |
| void | write(java.io.DataOutput out, int maxLength) | |
| static int | writeString(java.io.DataOutput out, java.lang.String s) | Write a UTF-8 encoded string to out. |
| static int | writeString(java.io.DataOutput out, java.lang.String s, int maxLength) | |

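
Methods inherited from class BinaryComparable: compareTo, compareTo

public static final int DEFAULT_MAX_LEN

public Text()
Construct an empty text string.

public Text(java.lang.String string)
Construct from a string.
Parameters:
string - input string.

public Text(Text utf8)
Construct from another text.
Parameters:
utf8 - input utf8.

public Text(byte[] utf8)
Construct from a byte array.
Parameters:
utf8 - input utf8.

public byte[] copyBytes()
Returns a copy of the bytes that is exactly the length of the data. See getBytes() for faster access to the underlying array.

public byte[] getBytes()
Returns the raw bytes; however, only data up to getLength() is valid. Please use copyBytes() if you need the returned array to be precisely the length of the data.
Specified by:
getBytes in class BinaryComparable

public int getLength()
Returns the number of bytes in the byte array.
Specified by:
getLength in class BinaryComparable

public int getTextLength()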
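
The relationship between getBytes(), getLength() and copyBytes() described above is easier to see with a small model of a reusable backing array. This is a sketch under stated assumptions: `BackingBufferSketch` is a hypothetical class written for this page, its growth policy is invented for illustration, and it is not Hadoop's Text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical model of a reusable backing array; NOT Hadoop's Text.
final class BackingBufferSketch {
    private byte[] bytes = new byte[0];
    private int length = 0;

    // Copies the new contents in, allocating only when the array is too small.
    // (Assumed policy for this sketch; the real class may grow differently.)
    void set(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length < utf8.length) {
            bytes = new byte[utf8.length];
        }
        System.arraycopy(utf8, 0, bytes, 0, utf8.length);
        length = utf8.length;
    }

    // Raw array: may be longer than the valid data.
    byte[] getBytes() {
        return bytes;
    }

    // Number of valid bytes at the start of the raw array.
    int getLength() {
        return length;
    }

    // Exact-length copy of the valid bytes.
    byte[] copyBytes() {
        return Arrays.copyOf(bytes, length);
    }

    public static void main(String[] args) {
        BackingBufferSketch t = new BackingBufferSketch();
        t.set("hello world");
        t.set("hi"); // reuses the 11-byte array allocated by the first call
        System.out.println(t.getBytes().length);  // 11
        System.out.println(t.getLength());        // 2
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8)); // hi
    }
}
```

Code that passes the raw array on should therefore always pair it with the length, for example `new String(bytes, 0, length, StandardCharsets.UTF_8)`, or take the exact-length copy instead.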

public int charAt(int position)
Returns the Unicode Scalar Value (32-bit integer value) for the character at position. Note that this method avoids using the converter or doing String instantiation.
Parameters:
position - input position.

public int find(java.lang.String what)
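
Both charAt and find operate on the UTF-8 bytes directly, without building a String, and find reports positions as byte offsets in the backing buffer. The sketch below illustrates what working on byte positions means. It assumes a hypothetical class `BytePositionSketch`; the -1 return values are a convention of this sketch, and none of it is taken from Hadoop's implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; NOT Hadoop code.
final class BytePositionSketch {

    // Decodes the UTF-8 sequence starting at byte offset pos and returns its
    // code point. Returns -1 (sketch convention) if pos is out of range or does
    // not start a complete sequence. Overlong and surrogate encodings are not
    // rejected, to keep the sketch short.
    static int codePointAt(byte[] utf8, int length, int pos) {
        if (pos < 0 || pos >= length) {
            return -1;
        }
        int lead = utf8[pos] & 0xFF;
        if (lead < 0x80) {
            return lead;                       // single-byte (ASCII) sequence
        }
        int extra;                             // number of continuation bytes
        int cp;                                // payload bits collected so far
        if (lead >= 0xC2 && lead <= 0xDF) {
            extra = 1;
            cp = lead & 0x1F;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            extra = 2;
            cp = lead & 0x0F;
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            extra = 3;
            cp = lead & 0x07;
        } else {
            return -1;                         // continuation byte or invalid lead
        }
        if (pos + extra >= length) {
            return -1;                         // sequence is truncated
        }
        for (int i = 1; i <= extra; i++) {
            int b = utf8[pos + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                return -1;                     // not a continuation byte
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        return cp;
    }

    // Returns the byte offset of the first occurrence of what at or after byte
    // offset start within the first length bytes, or -1 (sketch convention).
    static int find(byte[] utf8, int length, String what, int start) {
        byte[] needle = what.getBytes(StandardCharsets.UTF_8);
        for (int i = Math.max(start, 0); i + needle.length <= length; i++) {
            int j = 0;
            while (j < needle.length && utf8[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "naive" with U+00EF, a space, the euro sign U+20AC, then '5'.
        String s = "na\u00efve \u20ac5";
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(codePointAt(utf8, utf8.length, 2))); // ef
        System.out.println(codePointAt(utf8, utf8.length, 3));   // -1: middle of a sequence
        System.out.println(find(utf8, utf8.length, "\u20ac", 0)); // 7 (byte offset)
        System.out.println(s.indexOf("\u20ac"));                  // 6 (char index)
    }
}
```

The last two lines show why the two kinds of position must not be mixed: the euro sign sits at char index 6 in the String but at byte offset 7 in the encoded buffer, because U+00EF occupies two bytes.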

public int find(java.lang.String what, int start)
Finds any occurrence of what in the backing buffer, starting at position start. The starting position is measured in bytes and the return value is in terms of byte position in the buffer. The backing buffer is not converted to a string for this operation.
Parameters:
what - input what.
start - input start.

public void set(java.lang.String string)
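Set to contain the contents of a string.
Parameters:
string - input string.

public void set(byte[] utf8)
Set to a UTF-8 byte array. If the length of utf8 is zero, the bytes are actually cleared and any existing data is lost.
Parameters:
utf8 - input utf8.

public void set(Text other)
Copy a text.
Parameters:
other - other.

public void set(byte[] utf8, int start, int len)
Set the Text to a range of bytes.
Parameters:
utf8 - the data to copy from
start - the first position of the new string
len - the number of bytes of the new string

public void append(byte[] utf8, int start, int len)
Append a range of bytes to the end of the given text.
Parameters:
utf8 - the data to copy from
start - the first position to append from utf8
len - the number of bytes to append

public void clear()
Clear the string to empty. Note that this call does not free the underlying byte array that is retrievable via getBytes(). In order to free the byte-array memory, call set(byte[]) with an empty byte array (for example, new byte[0]).

public java.lang.String toString()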
Overrides:
toString in class java.lang.Object

public void readFields(java.io.DataInput in) throws java.io.IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
java.io.IOException - any other problem for readFields.

public void readFields(java.io.DataInput in, int maxLength) throws java.io.IOException
Throws:
java.io.IOException

public static void skip(java.io.DataInput in) throws java.io.IOException
Skips over one Text in the input.
Parameters:
in - input in.
Throws:
java.io.IOException - raised on errors performing I/O.

public void readWithKnownLength(java.io.DataInput in, int len) throws java.io.IOException
Read a Text object whose length is already known.
Parameters:
in - input in.
len - input len.
Throws:
java.io.IOException - raised on errors performing I/O.

public void write(java.io.DataOutput out) throws java.io.IOException
Serialize.
Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
java.io.IOException - any other problem for write.
See Also:
Writable.write(DataOutput)

public void write(java.io.DataOutput out, int maxLength) throws java.io.IOException
Throws:
java.io.IOException

public boolean equals(java.lang.Object o)
Returns true iff o is a Text with the same length and same contents.
Overrides:
equals in class BinaryComparable

public int hashCode()
Description copied from class: BinaryComparable
Returns a hash of the bytes returned from getBytes().
Overrides:
hashCode in class BinaryComparable
See Also:
WritableComparator.hashBytes(byte[],int)

public static java.lang.String decode(byte[] utf8) throws java.nio.charset.CharacterCodingException
Parameters:
utf8 - input utf8.
Throws:
java.nio.charset.CharacterCodingException - when a character encoding or decoding error occurs.

public static java.lang.String decode(byte[] utf8, int start, int length) throws java.nio.charset.CharacterCodingException
Throws:
java.nio.charset.CharacterCodingException

public static java.lang.String decode(byte[] utf8, int start, int length, boolean replace) throws java.nio.charset.CharacterCodingException
If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Parameters:
utf8 - input utf8.
start - input start.
length - input length.
replace - input replace.
Throws:
java.nio.charset.CharacterCodingException - when a character encoding or decoding error occurs.

public static java.nio.ByteBuffer encode(java.lang.String string) throws java.nio.charset.CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding.
Parameters:
string - input string.
Throws:
java.nio.charset.CharacterCodingException - when a character encoding or decoding error occurs.

public static java.nio.ByteBuffer encode(java.lang.String string, boolean replace) throws java.nio.charset.CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Parameters:
string - input string.
replace - input replace.
Throws:
java.nio.charset.CharacterCodingException - when a character encoding or decoding error occurs.

public static java.lang.String readString(java.io.DataInput in) throws java.io.IOException
Parameters:
in - input in.
Throws:
java.io.IOException - raised on errors performing I/O.

public static java.lang.String readString(java.io.DataInput in, int maxLength) throws java.io.IOException
Parameters:
in - input datainput.
maxLength - input maxLength.
Throws:
java.io.IOException - raised on errors performing I/O.

public static int writeString(java.io.DataOutput out, java.lang.String s) throws java.io.IOException
Write a UTF-8 encoded string to out.
Parameters:
out - input out.
s - input s.
Throws:
java.io.IOException - raised on errors performing I/O.

public static int writeString(java.io.DataOutput out, java.lang.String s, int maxLength) throws java.io.IOException
Parameters:
out - input out.
s - input s.
maxLength - input maxLength.
Throws:
java.io.IOException - raised on errors performing I/O.

public static void validateUTF8(byte[] utf8) throws java.nio.charset.MalformedInputException
Check if a byte array contains valid UTF-8.
Parameters:
utf8 - byte array
Throws:
java.nio.charset.MalformedInputException - if the byte array contains invalid UTF-8

public static void validateUTF8(byte[] utf8, int start, int len) throws java.nio.charset.MalformedInputException
Check to see if a byte array is valid UTF-8.
Parameters:
utf8 - the array of bytes
start - the offset of the first byte in the array
len - the length of the byte sequence
Throws:
java.nio.charset.MalformedInputException - if the byte array contains invalid bytes

public static int bytesToCodePoint(java.nio.ByteBuffer bytes)
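Parameters:
bytes - input bytes.

public static int utf8Length(java.lang.String string)
For the given string, returns the number of UTF-8 bytes required to encode the string.
Parameters:
string - text to encode

Copyright © 2008–2025 Apache Software Foundation. All rights reserved.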