Build 1.0_r1(from source)

java.nio.charset
Class CharsetDecoder

java.lang.Object
  extended by java.nio.charset.CharsetDecoder

public abstract class CharsetDecoder
extends Object

A converter that decodes a byte sequence in a specific charset into a sequence of 16-bit Unicode characters.

The input byte sequence is wrapped in a ByteBuffer and the output character sequence is written to a CharBuffer. A decoder instance should be used in the following sequence, which is referred to as a decoding operation:

  1. Invoke the reset method to reset the decoder if it has been used before.
  2. Invoke the decode method repeatedly, with the endOfInput parameter set to false, for as long as more input may follow. The input buffer must be refilled and the output buffer must be drained between invocations.
  3. Invoke the decode method one last time, with the endOfInput parameter set to true.
  4. Invoke the flush method to flush the output.

The decode method converts as many bytes as possible and stops only when the input bytes have run out, the output buffer is full, or an error has occurred. It returns a CoderResult instance that indicates why it stopped. The invoker can inspect the result and choose a further action, such as refilling the input buffer, draining the output buffer, or recovering from the error and trying again.
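The four steps and the result handling described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name DecodingOperationSketch, the method decodeInChunks, the helper drain, and the buffer sizes are all assumptions made for the example, and UTF-8 is chosen only because its multi-byte sequences can straddle chunk boundaries.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class DecodingOperationSketch {

    /** Decodes UTF-8 bytes that arrive in chunks of at most chunkSize bytes. */
    static String decodeInChunks(byte[] data, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // Extra room for the bytes of an incomplete sequence carried over between chunks.
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        CharBuffer out = CharBuffer.allocate(8);
        StringBuilder sink = new StringBuilder();
        int offset = 0;
        boolean endOfInput;

        decoder.reset();                                    // step 1
        do {
            int count = Math.min(chunkSize, data.length - offset);
            in.put(data, offset, count);                    // refill the input buffer
            offset += count;
            endOfInput = (offset == data.length);
            in.flip();
            CoderResult result;
            do {
                // Step 2 while endOfInput is false, step 3 on the final chunk.
                result = decoder.decode(in, out, endOfInput);
                if (result.isError()) {
                    // The default action is REPORT, so errors surface as results here.
                    throw new IllegalArgumentException("Undecodable input: " + result);
                }
                drain(out, sink);                           // empty the output buffer
            } while (result.isOverflow());
            in.compact();                                   // keep bytes not decoded yet
        } while (!endOfInput);

        while (decoder.flush(out).isOverflow()) {           // step 4
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toString();
    }

    /** Moves everything written to the output buffer into the sink. */
    private static void drain(CharBuffer out, StringBuilder sink) {
        out.flip();
        sink.append(out);
        out.clear();
    }
}
```

Calling compact() on the input buffer after each chunk keeps any bytes the decoder left unread, so a character split across two chunks is completed once the next chunk arrives.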

There are two common decoding errors. A malformed-input error is returned when the input byte sequence is illegal for the current charset. An unmappable-character error is returned when a legal input byte sequence cannot be mapped to an equivalent Unicode character.

Both errors can be handled in three ways. The default is to report the error to the invoker through a CoderResult instance; the alternatives are to ignore the erroneous input or to replace it with the replacement string. The replacement string is "\uFFFD" by default and can be changed by invoking the replaceWith method. The invoker chooses a behavior by specifying a CodingErrorAction instance for each error type via the onMalformedInput and onUnmappableCharacter methods.
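To make the three behaviors concrete, the following sketch decodes the same bytes under each CodingErrorAction. The class ErrorActionSketch and the method decodeWith are hypothetical names used only here; the sample input assumes a UTF-8 decoder, for which the byte 0xFF is never legal.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class ErrorActionSketch {

    /** Decodes bytes as UTF-8, applying the same action to both error types. */
    static String decodeWith(CodingErrorAction action, byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // With REPORT, the one-argument decode method turns the error result into an exception.
            return "error: " + e.getClass().getSimpleName();
        }
    }
}
```

For the input bytes 'a', 0xFF, 'b', REPORT yields "error: MalformedInputException", IGNORE yields "ab", and REPLACE yields "a\uFFFDb".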

This class is abstract and encapsulates the operations of the decoding process that are common to all charsets. A decoder for a specific charset should extend this class and need only implement the decodeLoop method, which provides the basic decoding loop. If a subclass maintains internal state, it should also override the implFlush and implReset methods.

This class is not thread-safe.

See Also:
Charset, CharsetEncoder

Constructor Summary
protected CharsetDecoder(Charset charset, float averageCharsPerByte, float maxCharsPerByte)
          Constructs a new CharsetDecoder using the given Charset, the average and maximum number of characters created by this decoder for one input byte, and the default replacement string "\uFFFD".
 
Method Summary
 float averageCharsPerByte()
          Gets the average number of characters created by this decoder for a single input byte.
 Charset charset()
          Gets the Charset that created this decoder.
 CharBuffer decode(ByteBuffer in)
          A facade method that performs a complete decoding operation.
 CoderResult decode(ByteBuffer in, CharBuffer out, boolean endOfInput)
          Decodes bytes starting at the current position of the given input buffer, and writes the equivalent character sequence into the given output buffer from its current position.
protected abstract  CoderResult decodeLoop(ByteBuffer in, CharBuffer out)
          Decode bytes into characters.
 Charset detectedCharset()
          Gets the charset detected by this decoder (optional operation).
 CoderResult flush(CharBuffer out)
          Flush this decoder.
protected  CoderResult implFlush(CharBuffer out)
          Flush this decoder.
protected  void implOnMalformedInput(CodingErrorAction newAction)
          Notifies this decoder that its CodingErrorAction for malformed-input errors has changed.
protected  void implOnUnmappableCharacter(CodingErrorAction newAction)
          Notifies this decoder that its CodingErrorAction for unmappable-character errors has changed.
protected  void implReplaceWith(String newReplacement)
          Notifies this decoder that its replacement string has changed.
protected  void implReset()
          Reset this decoder's charset related state.
 boolean isAutoDetecting()
          Tells whether this decoder implements an auto-detecting charset.
 boolean isCharsetDetected()
          Tells whether this decoder has detected a charset (optional operation).
 CodingErrorAction malformedInputAction()
          Gets the CodingErrorAction this decoder applies when malformed input is encountered during decoding.
 float maxCharsPerByte()
          Gets the maximum number of characters that this decoder can create for one input byte; the value is always positive.
 CharsetDecoder onMalformedInput(CodingErrorAction newAction)
          Set this decoder's action on malformed input error.
 CharsetDecoder onUnmappableCharacter(CodingErrorAction newAction)
          Set this decoder's action on unmappable character error.
 String replacement()
          Gets the replacement string, which is never null or empty.
 CharsetDecoder replaceWith(String newReplacement)
          Set new replacement value.
 CharsetDecoder reset()
          Reset this decoder.
 CodingErrorAction unmappableCharacterAction()
          Gets the CodingErrorAction this decoder applies when an unmappable character is encountered during decoding.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CharsetDecoder

protected CharsetDecoder(Charset charset,
                         float averageCharsPerByte,
                         float maxCharsPerByte)
Constructs a new CharsetDecoder using the given Charset, the average and maximum number of characters created by this decoder for one input byte, and the default replacement string "\uFFFD".

Parameters:
charset - the Charset that creates this decoder
averageCharsPerByte - the average number of characters created by this decoder for one input byte; must be positive
maxCharsPerByte - the maximum number of characters created by this decoder for one input byte; must be positive
Throws:
IllegalArgumentException - if averageCharsPerByte or maxCharsPerByte is not positive
Method Detail

averageCharsPerByte

public final float averageCharsPerByte()
Gets the average number of characters created by this decoder for a single input byte.

Returns:
the average number of characters created by this decoder for a single input byte

charset

public final Charset charset()
Gets the Charset that created this decoder.

Returns:
the Charset that created this decoder

decode

public final CharBuffer decode(ByteBuffer in)
                        throws CharacterCodingException
A facade method that performs a complete decoding operation.

This method decodes the remaining bytes of the given byte buffer into a newly allocated character buffer. It performs a complete decoding operation: it resets the decoder first, then decodes, and finally flushes.

This method should not be invoked while another decoding operation is in progress.

Parameters:
in - the input buffer
Returns:
a new CharBuffer containing the characters produced by this decoding operation. The buffer's position will be zero and its limit will be just past the last character produced
Throws:
IllegalStateException - if another decoding operation is in progress
MalformedInputException - if an input byte sequence that is illegal for this charset is encountered and the action for malformed-input errors is CodingErrorAction.REPORT
UnmappableCharacterException - if a legal but unmappable input byte sequence is encountered and the action for unmappable-character errors is CodingErrorAction.REPORT. Unmappable means that the byte sequence at the input buffer's current position cannot be mapped to a Unicode character sequence.
CharacterCodingException - if another error occurs during the decoding operation
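A short sketch of the facade method follows, showing that only the remaining bytes are decoded and how the returned buffer is positioned. The class FacadeDecodeSketch and the method decodeRemaining are hypothetical names; wrapping the checked exception in an IllegalArgumentException is a choice made for the example, not part of this API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class FacadeDecodeSketch {

    /** Decodes the bytes between the buffer's position and limit as UTF-8. */
    static CharBuffer decodeRemaining(ByteBuffer in) {
        try {
            // A fresh decoder uses REPORT for both error types, so bad input throws.
            return StandardCharsets.UTF_8.newDecoder().decode(in);
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
    }
}
```

For example, wrapping the UTF-8 bytes of "skip:data", setting the input position to 5, and calling decodeRemaining gives a buffer whose contents are "data", with position 0 and limit 4; the input buffer is left with no remaining bytes.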

decode

public final CoderResult decode(ByteBuffer in,
                                CharBuffer out,
                                boolean endOfInput)
Decodes bytes starting at the current position of the given input buffer, and writes the equivalent character sequence into the given output buffer from its current position.

The buffers' positions are advanced as bytes are read and characters are written, but their limits and marks are left unchanged.

A CoderResult instance is returned to indicate why decoding stopped: CoderResult.UNDERFLOW when the available input has been consumed and more input is needed to continue, CoderResult.OVERFLOW when the output buffer has no room left, or a malformed-input or unmappable-character result when the corresponding error is encountered and its action is CodingErrorAction.REPORT.

The endOfInput parameter indicates whether the invoker can provide further input. It should be true if and only if the bytes in the current input buffer are all of the remaining input for this decoding operation. Note that it is common, and not an error, for the invoker to pass false and then discover that no more input is available. Passing true while more input is still to come, however, may cause an error, because any undecoded bytes left in the input buffer are then treated as malformed input.

This method invokes the decodeLoop method, which implements the basic decoding logic for a specific charset.

Parameters:
in - the input buffer
out - the output buffer
endOfInput - true if all the input bytes have been provided
Returns:
a CoderResult instance that indicates why decoding stopped
Throws:
IllegalStateException - if this call is out of sequence for the current decoding operation, for example when no more input is expected
CoderMalfunctionError - if the decodeLoop method threw a BufferUnderflowException or BufferOverflowException
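The effect of the endOfInput parameter is easiest to see with input that ends in the middle of a multi-byte sequence. The sketch below is an illustration under assumptions: the class EndOfInputSketch and the method decodeTruncated are hypothetical, and the input is "ok" followed by only the first two of the three UTF-8 bytes of the euro sign.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class EndOfInputSketch {

    /** Decodes a truncated UTF-8 sequence and returns the result of the single decode call. */
    static CoderResult decodeTruncated(boolean endOfInput) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // "ok" followed by an incomplete three-byte sequence.
        ByteBuffer in = ByteBuffer.wrap(new byte[] {'o', 'k', (byte) 0xE2, (byte) 0x82});
        CharBuffer out = CharBuffer.allocate(8);
        return decoder.decode(in, out, endOfInput);
    }
}
```

With endOfInput set to false the call returns CoderResult.UNDERFLOW, because the decoder can still hope for the missing byte. With endOfInput set to true the same bytes produce a malformed-input result, since no further input can complete the sequence.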

decodeLoop

protected abstract CoderResult decodeLoop(ByteBuffer in,
                                          CharBuffer out)
Decodes bytes into characters. This method is called by the decode method.

This method implements the essential decoding operation. It does not stop decoding until all the input bytes have been read, the output buffer is full, or an error is encountered. It then returns a CoderResult object indicating the result of the current decoding operation, constructed by the same rules as for the decode method. When an error is encountered, most implementations return the relevant result object to the decode method, although some performance-optimized implementations may handle the error and apply the error action themselves.

The buffers are scanned from their current positions, and their positions are advanced accordingly, while their marks and limits are left unchanged. At most in.remaining() bytes will be read, and at most out.remaining() characters will be written.

Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.

Parameters:
in - the input buffer
out - the output buffer
Returns:
a CoderResult instance indicating the result
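As an illustration of how small a subclass can be, here is a sketch of a stateless decoder whose decodeLoop maps each byte to the character with the same value, which is how ISO-8859-1 is defined. The class Latin1DecoderSketch and its decodeBytes helper are hypothetical; passing the existing ISO_8859_1 charset to the constructor is a shortcut taken for the example, whereas a real decoder would normally be created by its own Charset subclass.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder, introduced only for this illustration.
final class Latin1DecoderSketch extends CharsetDecoder {

    Latin1DecoderSketch() {
        // Exactly one character per input byte, both on average and at most.
        super(StandardCharsets.ISO_8859_1, 1.0f, 1.0f);
    }

    @Override
    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;       // output is full while input remains
            }
            out.put((char) (in.get() & 0xFF));     // byte value equals the code point
        }
        return CoderResult.UNDERFLOW;              // all available input was consumed
    }

    /** Convenience wrapper around the inherited one-argument decode method. */
    static String decodeBytes(byte[] bytes) {
        try {
            return new Latin1DecoderSketch().decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected: this decodeLoop never returns an error result.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the decoder keeps no state between calls, it has no need to override implFlush or implReset.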

detectedCharset

public Charset detectedCharset()
Gets the charset detected by this decoder (optional operation).

If this decoder implements an auto-detecting charset, this method returns the detected charset once it is available. The returned charset remains the same for the rest of the decoding operation.

If insufficient bytes have been read to determine the charset, an IllegalStateException is thrown.

The default implementation always throws UnsupportedOperationException, so subclasses that support detection should override it.

Returns:
the charset detected by this decoder, or null if it is not yet determined
Throws:
UnsupportedOperationException - if this decoder does not implement an auto-detecting charset
IllegalStateException - if insufficient bytes have been read to determine the charset

flush

public final CoderResult flush(CharBuffer out)
Flushes this decoder. This method calls implFlush.

Some decoders may need to write characters to the output buffer after they have read all input bytes; subclasses can override implFlush to perform this writing. No more than out.remaining() characters will be written. If a decoder needs to write more characters than the output buffer has room for, CoderResult.OVERFLOW is returned and this method must be called again with a character buffer that has more space. Otherwise this method returns CoderResult.UNDERFLOW, which means the decoding operation has completed successfully.

During the flush, the output buffer's position is advanced accordingly, while its mark and limit are left unchanged.

Parameters:
out - the given output buffer
Returns:
CoderResult.UNDERFLOW or CoderResult.OVERFLOW
Throws:
IllegalStateException - if this decoder has not yet read all input bytes of the current decoding operation, that is, if the previous call was neither decode(ByteBuffer) nor decode(ByteBuffer, CharBuffer, boolean) with true as the last argument
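The ordering requirement on flush can be illustrated with two small calls. This is a sketch with hypothetical names (FlushSketch, flushRejectedBeforeEndOfInput, flushAfterEndOfInput); the behavior shown is what the Throws clause above describes and what the standard JDK decoders are expected to do.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class FlushSketch {

    /** Returns true if flush is rejected while more input may still follow. */
    static boolean flushRejectedBeforeEndOfInput() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(8);
        decoder.decode(ByteBuffer.wrap(new byte[] {'h', 'i'}), out, false);
        try {
            decoder.flush(out);
            return false;
        } catch (IllegalStateException expected) {
            return true;    // decode was never called with endOfInput set to true
        }
    }

    /** Flushes after the final decode call and returns the flush result. */
    static CoderResult flushAfterEndOfInput() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(8);
        decoder.decode(ByteBuffer.wrap(new byte[] {'h', 'i'}), out, true);
        return decoder.flush(out);
    }
}
```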

implFlush

protected CoderResult implFlush(CharBuffer out)
Flushes this decoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.

Parameters:
out - the output buffer
Returns:
CoderResult.UNDERFLOW or CoderResult.OVERFLOW

implOnMalformedInput

protected void implOnMalformedInput(CodingErrorAction newAction)
Notifies this decoder that its CodingErrorAction for malformed-input errors has changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newAction - The new action

implOnUnmappableCharacter

protected void implOnUnmappableCharacter(CodingErrorAction newAction)
Notifies this decoder that its CodingErrorAction for unmappable-character errors has changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newAction - The new action

implReplaceWith

protected void implReplaceWith(String newReplacement)
Notifies this decoder that its replacement string has changed. The default implementation does nothing; this method can be overridden if needed.

Parameters:
newReplacement - the new replacement string

implReset

protected void implReset()
Resets this decoder's charset-related state. The default implementation does nothing; this method can be overridden if needed.


isAutoDetecting

public boolean isAutoDetecting()
Tells whether this decoder implements an auto-detecting charset.

Returns:
true if this decoder implements an auto-detecting charset

isCharsetDetected

public boolean isCharsetDetected()
Tells whether this decoder has detected a charset (optional operation).

If this decoder implements an auto-detecting charset, this method may start returning true during a decoding operation to indicate that a charset has been detected in the input bytes and can be retrieved by invoking the detectedCharset method.

Note that a decoder that implements an auto-detecting charset may still succeed in decoding a portion of the given input even when it is unable to detect the charset. For this reason users should be aware that a false return value does not indicate that no decoding took place.

The default implementation always throws an UnsupportedOperationException; subclasses that support detection should override it.

Returns:
true if this decoder has detected a charset
Throws:
UnsupportedOperationException - if this decoder doesn't implement an auto-detecting charset
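Because the detection methods are optional, a caller would typically check isAutoDetecting first. The sketch below shows one way to guard the calls; the class DetectionSketch, the method describeDetection, and the returned message formats are hypothetical.

```java
import java.nio.charset.CharsetDecoder;

// Hypothetical helper class, introduced only for this illustration.
final class DetectionSketch {

    /** Describes what is currently known about the charset of the decoder's input. */
    static String describeDetection(CharsetDecoder decoder) {
        if (!decoder.isAutoDetecting()) {
            // Calling isCharsetDetected() here would throw UnsupportedOperationException.
            return "fixed charset: " + decoder.charset().name();
        }
        if (decoder.isCharsetDetected()) {
            return "detected: " + decoder.detectedCharset().name();
        }
        return "not detected yet";
    }
}
```

For a plain UTF-8 decoder, which does not auto-detect, this returns "fixed charset: UTF-8".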

malformedInputAction

public CodingErrorAction malformedInputAction()
Gets the CodingErrorAction this decoder applies when malformed input is encountered during decoding.

Returns:
the CodingErrorAction this decoder applies when malformed input is encountered during decoding

maxCharsPerByte

public final float maxCharsPerByte()
Gets the maximum number of characters that this decoder can create for one input byte. The value is always positive.

Returns:
the maximum number of characters that this decoder can create for one input byte; always positive

onMalformedInput

public final CharsetDecoder onMalformedInput(CodingErrorAction newAction)
Set this decoder's action on malformed input error. This method will call the implOnMalformedInput method with the given new action as argument.

Parameters:
newAction - the new action on malformed input error
Returns:
this decoder
Throws:
IllegalArgumentException - if the given newAction is null

onUnmappableCharacter

public final CharsetDecoder onUnmappableCharacter(CodingErrorAction newAction)
Set this decoder's action on unmappable character error. This method will call the implOnUnmappableCharacter method with the given new action as argument.

Parameters:
newAction - the new action on unmappable character error
Returns:
this decoder
Throws:
IllegalArgumentException - if the given newAction is null

replacement

public final String replacement()
Gets the replacement string, which is never null or empty.

Returns:
the replacement string; never null or empty

replaceWith

public final CharsetDecoder replaceWith(String newReplacement)
Sets a new replacement value. This method first checks the validity of the given replacement, then changes the replacement value, and finally calls the implReplaceWith method with the new replacement as its argument.

Parameters:
newReplacement - the new replacement string; cannot be null or empty
Returns:
this decoder
Throws:
IllegalArgumentException - if the given replacement does not satisfy the requirements stated above
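A custom replacement only takes effect when an error action is set to CodingErrorAction.REPLACE. The sketch below combines the two; the class ReplacementSketch and the method decodeReplacingWith are hypothetical names, and US-ASCII is assumed as the charset so that any byte above 0x7F counts as an error.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class ReplacementSketch {

    /** Decodes bytes as US-ASCII, substituting the given string for undecodable input. */
    static String decodeReplacingWith(String replacement, byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(replacement);   // throws IllegalArgumentException if null or empty
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected: both error types are replaced rather than reported.
            throw new IllegalStateException(e);
        }
    }
}
```

With the replacement "?" and the bytes 'c', 'a', 'f', 0xE9, the result is "caf?". Passing an empty replacement is rejected with an IllegalArgumentException before any decoding happens.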

reset

public final CharsetDecoder reset()
Resets this decoder. This method resets the internal state and then calls implReset() to reset any state related to the specific charset.

Returns:
this decoder
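Resetting is what allows one decoder instance to serve several decoding operations in a row. The following sketch reuses a single UTF-8 decoder; the names ResetSketch, runOperation, decodeTwice, and decodeAfterFlushIsRejected are hypothetical, and the rejection shown in the last method is the behavior expected from the standard JDK implementation rather than something this page states explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this illustration.
final class ResetSketch {

    /** Runs decode-with-endOfInput and flush on short ASCII text; results are ignored for brevity. */
    private static String runOperation(CharsetDecoder decoder, String text) {
        ByteBuffer in = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        CharBuffer out = CharBuffer.allocate(16);
        decoder.decode(in, out, true);
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    /** Reuses one decoder for two operations, resetting in between. */
    static String decodeTwice() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        String first = runOperation(decoder, "one");
        decoder.reset();                       // start a new decoding operation
        String second = runOperation(decoder, "two");
        return first + "," + second;
    }

    /** Returns true if starting a second operation without reset is rejected. */
    static boolean decodeAfterFlushIsRejected() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        runOperation(decoder, "one");
        try {
            runOperation(decoder, "two");      // no reset() after the previous flush
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```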

unmappableCharacterAction

public CodingErrorAction unmappableCharacterAction()
Gets the CodingErrorAction this decoder applies when an unmappable character is encountered during decoding.

Returns:
the CodingErrorAction this decoder applies when an unmappable character is encountered during decoding
