public interface CharactersetFinder
Quite a few libraries detect character set encodings, but none is perfect. The implementation is therefore abstracted so that these finders can be configured in as required.
Implementations should have a default constructor and be completely thread safe and stateless. This will allow them to be constructed and held indefinitely to do the decoding work.
Where the encoding cannot be determined, it is left to the client to decide what to do. Some implementations may guess an encoding or fall back to a default; it is up to each implementation to specify its behaviour.
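The contract above (default constructor, no state, thread safe) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the interface is redeclared locally, with only the two methods listed on this page, so that the block compiles on its own; `BomCharactersetFinder` is a hypothetical class, not one shipped with the library; and returning `null` when nothing is recognised is this sketch's own choice for the "cannot be determined" case.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Local stand-in mirroring the two methods documented on this page.
interface CharactersetFinder {
    Charset detectCharset(InputStream is);
    Charset detectCharset(byte[] buffer);
}

// Hypothetical finder that only recognises UTF-8 and UTF-16 byte order marks.
// It has no fields, so one instance can be shared freely between threads.
class BomCharactersetFinder implements CharactersetFinder {
    private static final int PEEK_BYTES = 3; // longest BOM checked here

    public BomCharactersetFinder() {
    }

    @Override
    public Charset detectCharset(byte[] buffer) {
        if (buffer == null) {
            return null;
        }
        if (startsWith(buffer, 0xEF, 0xBB, 0xBF)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(buffer, 0xFE, 0xFF)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(buffer, 0xFF, 0xFE)) {
            return StandardCharsets.UTF_16LE;
        }
        return null; // assumption: null means "could not be determined"
    }

    @Override
    public Charset detectCharset(InputStream is) {
        if (!is.markSupported()) {
            throw new IllegalArgumentException("Stream must support marking");
        }
        is.mark(PEEK_BYTES);
        try {
            byte[] head = new byte[PEEK_BYTES];
            int total = 0;
            while (total < PEEK_BYTES) {
                int n = is.read(head, total, PEEK_BYTES - total);
                if (n < 0) {
                    break; // end of stream reached early
                }
                total += n;
            }
            return detectCharset(Arrays.copyOf(head, total));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                is.reset(); // put the stream back where the caller left it
            } catch (IOException ignored) {
                // nothing useful can be done if the reset itself fails
            }
        }
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A client that receives `null` from this sketch would then apply its own policy, for example falling back to a configured default charset.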
Modifier and Type | Method and Description |
---|---|
Charset | detectCharset(byte[] buffer) Attempt to detect the character set encoding for the given buffer. |
Charset | detectCharset(InputStream is) Attempt to detect the character set encoding for the given input stream. |
Charset detectCharset(InputStream is)
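Attempt to detect the character set encoding for the given input stream. The stream must support marking; one that does not can be wrapped in a BufferedInputStream.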
The current state of the stream will be restored before the method returns.
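A caller holding a stream that may not support marking could prepare it along these lines. This is only a sketch: `MarkableStreams` and `ensureMarkSupported` are hypothetical names introduced for illustration and are not part of the documented API.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;

// Hypothetical helper for callers of detectCharset(InputStream).
final class MarkableStreams {
    private MarkableStreams() {
    }

    // Returns the stream unchanged if it already supports mark/reset,
    // otherwise wraps it in a BufferedInputStream, which does.
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }
}
```

The returned stream is the one to pass to `detectCharset` and to keep reading from afterwards. Once a raw stream has been wrapped, bytes consumed during detection are held in the wrapper's buffer, so reading from the raw stream directly would skip them.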
is - an input stream that must support marking
Charset detectCharset(byte[] buffer)
buffer - the first n bytes of the character stream

Copyright © 2005–2015 Alfresco Software. All rights reserved.