Build 1.0_r1(from source)

java.nio.charset
Class CharsetEncoder

java.lang.Object
  extended by java.nio.charset.CharsetEncoder

public abstract class CharsetEncoder
extends Object

A converter that transforms a sequence of 16-bit Unicode characters into a sequence of bytes in a specific charset.

The input character sequence is wrapped in a CharBuffer and the output byte sequence is written to a ByteBuffer. An encoder instance should be used in the following sequence, which is referred to as an encoding operation:

  1. Invoke the reset method to reset the encoder if it has been used before;
  2. Invoke the encode method repeatedly until no additional input is needed. For these invocations the endOfInput parameter must be set to false, and between invocations the input buffer must be refilled and the output buffer must be drained;
  3. Invoke the encode method one last time, with the endOfInput parameter set to true;
  4. Invoke the flush method to flush the output.
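
The four steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class and method names are hypothetical, UTF-8 is an arbitrary choice of charset, and the output buffer is sized for the worst case so that CoderResult.OVERFLOW is not expected.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class EncodingOperationSketch {

    // Encodes two chunks of text as a single encoding operation (hypothetical helper).
    static byte[] encodeChunks(String first, String second) {
        CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
        CharBuffer in = CharBuffer.allocate(first.length() + second.length());
        // Worst-case sizing, so OVERFLOW is not expected in this sketch.
        ByteBuffer out = ByteBuffer.allocate(
                (int) Math.ceil(in.capacity() * (double) encoder.maxBytesPerChar()));

        encoder.reset();                                   // step 1

        in.put(first);
        in.flip();
        requireUnderflow(encoder.encode(in, out, false));  // step 2: more input follows
        in.compact();                                      // keep any characters not yet consumed

        in.put(second);
        in.flip();
        requireUnderflow(encoder.encode(in, out, true));   // step 3: this is the last input

        requireUnderflow(encoder.flush(out));              // step 4

        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // With the default REPORT action, anything other than UNDERFLOW is unexpected here.
    private static void requireUnderflow(CoderResult result) {
        if (!result.isUnderflow()) {
            throw new IllegalStateException("unexpected result: " + result);
        }
    }
}
```

The compact() call matters when a chunk ends in the middle of a surrogate pair: the encoder leaves the unpaired character in the input buffer, and it has to be presented again together with the next chunk.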

The encode method converts as many characters as possible and stops only when the input characters have run out, the output buffer has been filled, or an error has occurred. A CoderResult instance is returned to indicate the reason for stopping; the invoker can inspect the result and choose a further action, such as refilling the input buffer, draining the output buffer, or recovering from the error and trying again.

There are two common encoding errors. One is called malformed input, and it is reported when the input is not a legal 16-bit Unicode character sequence. The other is called unmappable character, and it occurs when the input is legal but cannot be mapped to a valid byte sequence in the specific charset.

Both errors can be handled in three ways. The default is to report the error to the invoker through a CoderResult instance; the alternatives are to ignore the erroneous input or to replace it with the replacement byte array. The replacement byte array is {(byte)'?'} by default and can be changed by invoking the replaceWith method. The invoker of this encoder chooses an approach by specifying a CodingErrorAction instance for each error type via the onMalformedInput and onUnmappableCharacter methods.
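
The three ways of handling errors can be compared with a short sketch. The class and method names are hypothetical, and US-ASCII is chosen only because it makes unmappable characters easy to produce. Note that the sketch uses the encode(CharBuffer) facade, which surfaces a reported error as an exception instead of a CoderResult.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

class ErrorActionSketch {

    // Encodes text to US-ASCII with the given action for both error types,
    // then decodes the bytes again so that the effect is easy to read.
    static String encodeAscii(String text, CodingErrorAction action) {
        Charset ascii = Charset.forName("US-ASCII");
        CharsetEncoder encoder = ascii.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return ascii.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Only reached when the action is CodingErrorAction.REPORT.
            return "reported: " + e.getClass().getSimpleName();
        }
    }
}
```

For the input "caf\u00e9", CodingErrorAction.REPLACE yields "caf?", CodingErrorAction.IGNORE yields "caf", and CodingErrorAction.REPORT yields "reported: UnmappableCharacterException".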

This class is abstract and encapsulates many operations of the encoding process that are common to all charsets. An encoder for a specific charset should extend this class and need only implement the encodeLoop method for the basic encoding loop. If a subclass maintains internal state, it should additionally override the implFlush and implReset methods.

This class is not thread-safe.

See Also:
Charset, CharsetDecoder

Constructor Summary
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)
          Constructs a new CharsetEncoder using the given Charset and the average and maximum number of bytes created by this encoder for one input character.
protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)
          Constructs a new CharsetEncoder using the given Charset, the replacement byte array, and the average and maximum number of bytes created by this encoder for one input character.
 
Method Summary
 float averageBytesPerChar()
          Gets the average number of bytes created by this encoder for a single input character.
 boolean canEncode(char c)
          Checks whether the given character can be encoded by this encoder.
 boolean canEncode(CharSequence sequence)
          Checks whether the given CharSequence can be encoded by this encoder.
 Charset charset()
          Gets the Charset that created this encoder.
 ByteBuffer encode(CharBuffer in)
          A facade method that performs a complete encoding operation.
 CoderResult encode(CharBuffer in, ByteBuffer out, boolean endOfInput)
          Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.
protected abstract  CoderResult encodeLoop(CharBuffer in, ByteBuffer out)
          Encodes characters into bytes.
 CoderResult flush(ByteBuffer out)
          Flushes this encoder.
protected  CoderResult implFlush(ByteBuffer out)
          Flushes this encoder.
protected  void implOnMalformedInput(CodingErrorAction newAction)
          Notifies that this encoder's CodingErrorAction for malformed input errors has been changed.
protected  void implOnUnmappableCharacter(CodingErrorAction newAction)
          Notifies that this encoder's CodingErrorAction for unmappable character errors has been changed.
protected  void implReplaceWith(byte[] newReplacement)
          Notifies that this encoder's replacement has been changed.
protected  void implReset()
          Resets this encoder's charset-related state.
 boolean isLegalReplacement(byte[] repl)
          Checks whether the given argument is legal as this encoder's replacement byte array.
 CodingErrorAction malformedInputAction()
          Gets the CodingErrorAction that this encoder applies when malformed input occurs during the encoding process.
 float maxBytesPerChar()
          Gets the maximum number of bytes that this encoder can create for one input character; the value is always positive.
 CharsetEncoder onMalformedInput(CodingErrorAction newAction)
          Sets this encoder's action on malformed input errors.
 CharsetEncoder onUnmappableCharacter(CodingErrorAction newAction)
          Sets this encoder's action on unmappable character errors.
 byte[] replacement()
          Gets the replacement byte array, which is never null or empty and is always legal.
 CharsetEncoder replaceWith(byte[] replacement)
          Sets a new replacement value.
 CharsetEncoder reset()
          Resets this encoder.
 CodingErrorAction unmappableCharacterAction()
          Gets the CodingErrorAction that this encoder applies when an unmappable character occurs during the encoding process.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CharsetEncoder

protected CharsetEncoder(Charset cs,
                         float averageBytesPerChar,
                         float maxBytesPerChar)
Constructs a new CharsetEncoder using the given Charset and the average and maximum number of bytes created by this encoder for one input character.

Parameters:
cs - the Charset that created this encoder
averageBytesPerChar - the average number of bytes created by this encoder for one input character; must be positive
maxBytesPerChar - the maximum number of bytes that can be created by this encoder for one input character; must be positive
Throws:
IllegalArgumentException - if maxBytesPerChar or averageBytesPerChar is not positive

CharsetEncoder

protected CharsetEncoder(Charset cs,
                         float averageBytesPerChar,
                         float maxBytesPerChar,
                         byte[] replacement)
Constructs a new CharsetEncoder using the given Charset, the replacement byte array, and the average and maximum number of bytes created by this encoder for one input character.

Parameters:
cs - the Charset that created this encoder
averageBytesPerChar - the average number of bytes created by this encoder for a single input character; must be positive
maxBytesPerChar - the maximum number of bytes that can be created by this encoder for a single input character; must be positive
replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified with isLegalReplacement
Throws:
IllegalArgumentException - if any parameters are invalid
Method Detail

averageBytesPerChar

public final float averageBytesPerChar()
Gets the average number of bytes created by this encoder for a single input character.

Returns:
the average number of bytes created by this encoder for a single input character

canEncode

public boolean canEncode(char c)
Checks whether the given character can be encoded by this encoder. Note that this method can change the internal status of this encoder, so it should not be called while another encoding operation is in progress; otherwise it throws IllegalStateException. This method can be overridden to improve performance.

Parameters:
c - the given character
Returns:
true if the given character can be encoded by this encoder
Throws:
IllegalStateException - if another encoding operation is in progress, so that the current internal status is neither RESET nor FLUSH

canEncode

public boolean canEncode(CharSequence sequence)
Checks whether the given CharSequence can be encoded by this encoder. Note that this method can change the internal status of this encoder, so it should not be called while another encoding operation is in progress; otherwise it throws IllegalStateException. This method can be overridden to improve performance.

Parameters:
sequence - the given CharSequence
Returns:
true if the given CharSequence can be encoded by this encoder
Throws:
IllegalStateException - if the current internal status is neither RESET nor FLUSH
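
A short sketch of both canEncode overloads, using a fresh ISO-8859-1 encoder for each check so that no encoding operation can be in progress. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class CanEncodeSketch {

    // True if every character of the text can be represented in ISO-8859-1.
    static boolean fitsLatin1(CharSequence text) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        return encoder.canEncode(text);
    }

    // Same check for a single character.
    static boolean fitsLatin1(char c) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        return encoder.canEncode(c);
    }
}
```

Here fitsLatin1("caf\u00e9") is true, while fitsLatin1('\u20ac') is false because the euro sign has no mapping in ISO-8859-1.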

charset

public final Charset charset()
Gets the Charset that created this encoder.

Returns:
the Charset that created this encoder

encode

public final ByteBuffer encode(CharBuffer in)
                        throws CharacterCodingException
A facade method that performs a complete encoding operation.

This method encodes the remaining character sequence of the given character buffer into a new byte buffer. It performs a complete encoding operation: it first resets the encoder, then encodes, and finally flushes.

This method should not be invoked while another encoding operation is in progress.

Parameters:
in - the input buffer
Returns:
a new ByteBuffer containing the bytes produced by this encoding operation. The buffer's position will be zero and its limit will follow the last byte produced
Throws:
IllegalStateException - if another encoding operation is in progress
MalformedInputException - if an illegal input character sequence for this charset is encountered and the action for malformed input is CodingErrorAction.REPORT
UnmappableCharacterException - if a legal but unmappable input character sequence for this charset is encountered and the action for unmappable characters is CodingErrorAction.REPORT. Unmappable means that the Unicode character sequence at the input buffer's current position cannot be mapped to an equivalent byte sequence.
CharacterCodingException - if another error occurs during the encoding operation
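
The following sketch shows the state of the returned buffer and how the two specific exceptions can be told apart. The class and method names are hypothetical; the encoder keeps its default CodingErrorAction.REPORT actions.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.MalformedInputException;
import java.nio.charset.UnmappableCharacterException;

class FacadeEncodeSketch {

    // Encodes text to US-ASCII in one call and describes the outcome.
    static String describe(String text) {
        CharsetEncoder encoder = Charset.forName("US-ASCII").newEncoder();
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return "ok: position=" + bytes.position() + ", limit=" + bytes.limit();
        } catch (MalformedInputException e) {
            // For example an unpaired surrogate.
            return "malformed input, length " + e.getInputLength();
        } catch (UnmappableCharacterException e) {
            // A legal character with no mapping in the charset.
            return "unmappable character, length " + e.getInputLength();
        } catch (CharacterCodingException e) {
            return "other coding error";
        }
    }
}
```

The two specific exceptions are caught before CharacterCodingException because both are subclasses of it.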

encode

public final CoderResult encode(CharBuffer in,
                                ByteBuffer out,
                                boolean endOfInput)
Encodes characters starting at the current position of the given input buffer, and writes the equivalent byte sequence into the given output buffer from its current position.

The buffers' positions advance as characters are read and bytes are written, but their limits and marks are left intact.

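A CoderResult instance is returned according to the following rules:

  1. CoderResult.UNDERFLOW indicates that as many characters as possible from the input buffer have been encoded. If no further input is available, the invoker can proceed to the next step of the encoding operation; otherwise it should supply more input and invoke this method again;
  2. CoderResult.OVERFLOW indicates that the output buffer does not have enough space. The invoker should drain the output buffer, or supply one with more space, and invoke this method again;
  3. A malformed-input result indicates that malformed input was found at the input buffer's current position. It is returned only if the action for malformed input is CodingErrorAction.REPORT;
  4. An unmappable-character result indicates that an unmappable character was found at the input buffer's current position. It is returned only if the action for unmappable characters is CodingErrorAction.REPORT.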

The endOfInput parameter indicates whether the invoker can provide further input. It should be true if and only if the characters in the current input buffer are all the remaining input for this encoding operation. Note that it is common, and not an error, for the invoker to pass false and then find that no more input is available. Passing true in several consecutive invocations while input is still outstanding, however, may cause errors, because any input remaining in the buffer is then treated as malformed.

This method invokes the encodeLoop method, which implements the basic encoding logic for the specific charset.

Parameters:
in - the input buffer
out - the output buffer
endOfInput - true if all the input characters have been provided
Returns:
a CoderResult instance indicating the result
Throws:
IllegalStateException - if the current state of the encoding operation does not permit this call, for example because the operation has already been flushed or because no more input is expected in this encoding operation
CoderMalfunctionError - if the encodeLoop method threw a BufferUnderflowException or BufferOverflowException
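
A sketch of reacting to CoderResult.OVERFLOW by draining a deliberately small output buffer and invoking encode again. The class name, the method names and the ByteArrayOutputStream sink are assumptions made for illustration; the sketch also assumes a heap buffer, so that out.array() is available.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class OverflowDrainSketch {

    // Encodes text to UTF-8 through an output buffer of the given size.
    static byte[] encodeThroughSmallBuffer(String text, int bufferSize) {
        CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        while (true) {
            // All input is already in the buffer, so endOfInput is true on every call.
            CoderResult result = encoder.encode(in, out, true);
            if (result.isError()) {
                throw new IllegalArgumentException("cannot encode input: " + result);
            }
            if (result.isOverflow() && out.position() == 0) {
                // Nothing fits at all; looping again would never terminate.
                throw new IllegalArgumentException("output buffer too small to make progress");
            }
            drain(out, sink);
            if (result.isUnderflow()) {
                break;                        // every character has been consumed
            }
        }
        while (encoder.flush(out).isOverflow()) {
            drain(out, sink);                 // make room and flush again
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    // Copies the bytes written so far into the sink and empties the buffer.
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }
}
```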

encodeLoop

protected abstract CoderResult encodeLoop(CharBuffer in,
                                          ByteBuffer out)
Encodes characters into bytes. This method is called by encode and implements the essential encoding loop. It does not stop until all the input characters have been read, the output buffer has been filled, or an error has been encountered, and it then returns a CoderResult object indicating the result of the current invocation. The rules for constructing the CoderResult are the same as for encode. When an error is encountered, most implementations of this method return the corresponding result object to the encode method, although some performance-optimized implementations may handle the error and carry out the error action themselves.

The buffers are scanned from their current positions, and their positions are modified accordingly, while their marks and limits are left intact. At most in.remaining() characters are read and at most out.remaining() bytes are written.

Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.

Parameters:
in - the input buffer
out - the output buffer
Returns:
a CoderResult instance indicating the result
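
A minimal sketch of a subclass that implements encodeLoop. The charset "X-SKETCH-7BIT" and all class names are hypothetical and exist only for this illustration. Two simplifications are assumed: decoding is not implemented, so isLegalReplacement is overridden instead of relying on a decoder, and surrogates are reported as unmappable one character at a time.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

class SevenBitSketch {

    // Hypothetical charset that maps U+0000..U+007F to single bytes and nothing else.
    static final class SevenBitCharset extends Charset {
        SevenBitCharset() {
            super("X-SKETCH-7BIT", new String[0]);
        }
        @Override public boolean contains(Charset cs) {
            return cs instanceof SevenBitCharset;
        }
        @Override public CharsetDecoder newDecoder() {
            throw new UnsupportedOperationException("decoding is out of scope for this sketch");
        }
        @Override public CharsetEncoder newEncoder() {
            return new SevenBitEncoder(this);
        }
    }

    static final class SevenBitEncoder extends CharsetEncoder {
        SevenBitEncoder(Charset cs) {
            super(cs, 1.0f, 1.0f);   // one byte per character, on average and at most
        }

        // Overridden because this sketch has no decoder to validate replacements with.
        @Override public boolean isLegalReplacement(byte[] repl) {
            return repl.length == 1 && repl[0] >= 0;
        }

        @Override protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            while (in.hasRemaining()) {
                char c = in.get(in.position());      // peek; the position is unchanged
                if (c > 0x7F) {
                    return CoderResult.unmappableForLength(1);
                }
                if (!out.hasRemaining()) {
                    return CoderResult.OVERFLOW;
                }
                out.put((byte) c);
                in.position(in.position() + 1);      // consume only what was written
            }
            return CoderResult.UNDERFLOW;
        }
    }

    // Usage: encode with REPLACE so that unmappable characters become '?'.
    static byte[] encodeReplacing(String text) {
        CharsetEncoder encoder = new SevenBitCharset().newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer out = ByteBuffer.allocate(text.length());
        encoder.encode(CharBuffer.wrap(text), out, true);
        encoder.flush(out);
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }
}
```

The loop only returns a result; applying the REPLACE action is left to the encode method of the base class, which is the division of labour described above.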

flush

public final CoderResult flush(ByteBuffer out)
Flushes this encoder. This method calls implFlush. Some encoders need to write bytes to the output buffer after they have read all input characters; subclasses can override implFlush to perform that writing. No more than out.remaining() bytes are written. If an encoder needs to write more bytes than the output buffer has space remaining, CoderResult.OVERFLOW is returned and this method must be called again with a byte buffer that has more space. Otherwise this method returns CoderResult.UNDERFLOW, which means the encoding operation has completed successfully. During the flush, the output buffer's position changes accordingly, while its mark and limit are left intact.

Parameters:
out - the given output buffer
Returns:
CoderResult.UNDERFLOW or CoderResult.OVERFLOW
Throws:
IllegalStateException - if this encoder has not read all input characters during the current encoding operation, that is, if this method is called neither after encode(CharBuffer) nor after encode(CharBuffer, ByteBuffer, boolean) with true as the last argument

implFlush

protected CoderResult implFlush(ByteBuffer out)
Flushes this encoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.

Parameters:
out - the output buffer
Returns:
CoderResult.UNDERFLOW or CoderResult.OVERFLOW

implOnMalformedInput

protected void implOnMalformedInput(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction for malformed input errors has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newAction - The new action

implOnUnmappableCharacter

protected void implOnUnmappableCharacter(CodingErrorAction newAction)
Notifies that this encoder's CodingErrorAction for unmappable character errors has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newAction - The new action

implReplaceWith

protected void implReplaceWith(byte[] newReplacement)
Notifies that this encoder's replacement has been changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newReplacement - the new replacement byte array

implReset

protected void implReset()
Resets this encoder's charset-related state. The default implementation does nothing; this method can be overridden if needed.


isLegalReplacement

public boolean isLegalReplacement(byte[] repl)
Checks whether the given argument is legal as this encoder's replacement byte array. The given byte array is legal if and only if it can be decoded into 16-bit Unicode characters. This method can be overridden to improve performance.

Parameters:
repl - the given byte array to be checked
Returns:
true if the given argument is legal as this encoder's replacement byte array.
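
A small sketch of checking candidate replacements against a US-ASCII encoder. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

class LegalReplacementSketch {

    // True if the candidate bytes form a legal replacement for a US-ASCII encoder.
    static boolean isLegalForAscii(byte[] candidate) {
        return Charset.forName("US-ASCII").newEncoder().isLegalReplacement(candidate);
    }
}
```

The single byte '_' is legal because it decodes to a character, whereas the single byte 0xFF is not, because US-ASCII cannot decode it.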

malformedInputAction

public CodingErrorAction malformedInputAction()
Gets the CodingErrorAction that this encoder applies when malformed input occurs during the encoding process.

Returns:
the CodingErrorAction that this encoder applies when malformed input occurs during the encoding process.

maxBytesPerChar

public final float maxBytesPerChar()
Gets the maximum number of bytes that this encoder can create for one input character. The value is always positive.

Returns:
the maximum number of bytes that this encoder can create for one input character; always positive
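
One common use of averageBytesPerChar and maxBytesPerChar is sizing output buffers. The sketch below is one possible approach; the class and method names are hypothetical.

```java
import java.nio.charset.CharsetEncoder;

class BufferSizingSketch {

    // A typical size: may be too small, so the caller must be ready to handle OVERFLOW.
    static int estimatedBytes(CharsetEncoder encoder, int charCount) {
        return (int) Math.ceil(charCount * (double) encoder.averageBytesPerChar());
    }

    // An upper bound: a buffer of this size holds any encoding of charCount characters.
    static int worstCaseBytes(CharsetEncoder encoder, int charCount) {
        return (int) Math.ceil(charCount * (double) encoder.maxBytesPerChar());
    }
}
```

Allocating the worst-case size avoids having to handle CoderResult.OVERFLOW at the cost of memory; starting from the estimate and growing on overflow is the usual trade-off in the other direction.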

onMalformedInput

public final CharsetEncoder onMalformedInput(CodingErrorAction newAction)
Set this encoder's action on malformed input error. This method will call the implOnMalformedInput method with the given new action as argument.

Parameters:
newAction - the new action on malformed input error
Returns:
this encoder
Throws:
IllegalArgumentException - if the given newAction is null

onUnmappableCharacter

public final CharsetEncoder onUnmappableCharacter(CodingErrorAction newAction)
Set this encoder's action on unmappable character error. This method will call the implOnUnmappableCharacter method with the given new action as argument.

Parameters:
newAction - the new action on unmappable character error
Returns:
this encoder
Throws:
IllegalArgumentException - if the given newAction is null

replacement

public final byte[] replacement()
Gets the replacement byte array, which is never null or empty and is always legal.

Returns:
the replacement byte array, which is never null or empty and is always legal

replaceWith

public final CharsetEncoder replaceWith(byte[] replacement)
Sets a new replacement value. This method first checks the validity of the given replacement, then changes the replacement value, and finally calls the implReplaceWith method with the new replacement as its argument.

Parameters:
replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified with isLegalReplacement(byte[] repl)
Returns:
this encoder
Throws:
IllegalArgumentException - if the given replacement does not satisfy the requirements mentioned above
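
A sketch of installing a custom replacement and of the validity checks described above. The class and method names are hypothetical, and US-ASCII is used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

class ReplaceWithSketch {

    // Encodes text to US-ASCII, substituting the given bytes for erroneous input.
    static String encodeAsciiWithReplacement(String text, byte[] replacement) {
        Charset ascii = Charset.forName("US-ASCII");
        CharsetEncoder encoder = ascii.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(replacement);   // throws IllegalArgumentException if not valid
        // One byte per character is enough here: maxBytesPerChar bounds the replacement too.
        ByteBuffer out = ByteBuffer.allocate(text.length());
        encoder.encode(CharBuffer.wrap(text), out, true);
        encoder.flush(out);
        out.flip();
        return ascii.decode(out).toString();
    }

    // True if a US-ASCII encoder accepts the given replacement.
    static boolean isAccepted(byte[] replacement) {
        try {
            Charset.forName("US-ASCII").newEncoder().replaceWith(replacement);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With the replacement {'_'}, the input "caf\u00e9" encodes to "caf_". An empty array and an array containing only the byte 0xFF are both rejected with IllegalArgumentException.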

reset

public final CharsetEncoder reset()
Resets this encoder. This method resets the internal status and then calls implReset() to reset any status related to the specific charset.

Returns:
this encoder

unmappableCharacterAction

public CodingErrorAction unmappableCharacterAction()
Gets the CodingErrorAction that this encoder applies when an unmappable character occurs during the encoding process.

Returns:
the CodingErrorAction that this encoder applies when an unmappable character occurs during the encoding process.
