Class BomCharactersetFinder

  • All Implemented Interfaces:
    CharactersetFinder

    public class BomCharactersetFinder
    extends AbstractCharactersetFinder
    Byte Order Marker encoding detection.
    Since:
    2.1
    Author:
    Pacific Northwest National Lab, Derek Hulley
    • Method Summary

      Modifier and Type Method Description
      protected java.nio.charset.Charset detectCharsetImpl(byte[] buffer)
      Searches only the Byte Order Marker, i.e. the first three bytes, for a sign of the encoding.
      protected int getBufferSize()
      Some implementations may require only a few bytes to detect the stream type, whilst others may be more efficient with larger buffers.
      void setBufferSize(int bufferSize)
      Set the maximum number of bytes to read ahead when attempting to determine the characterset.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BomCharactersetFinder

        public BomCharactersetFinder()
    • Method Detail

      • setBufferSize

        public void setBufferSize(int bufferSize)
        Description copied from class: AbstractCharactersetFinder
        Set the maximum number of bytes to read ahead when attempting to determine the characterset. Most characterset detectors are efficient and can process 8K of buffered data very quickly. Some may need to be constrained to a smaller buffer.
        Overrides:
        setBufferSize in class AbstractCharactersetFinder
        Parameters:
        bufferSize - the number of bytes - default 8K.
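
The read-ahead limit described above can be pictured with a small, self-contained sketch. It assumes the read-ahead is done with `InputStream.mark` and `reset`, which this page does not confirm; `ReadAheadSketch` and `peek` are hypothetical names and are not part of this API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper: NOT part of the documented API.
final class ReadAheadSketch {

    /**
     * Reads at most bufferSize bytes from the stream, then rewinds it so the
     * caller can still consume the stream from the beginning.
     * Assumption: the stream supports mark/reset.
     */
    static byte[] peek(InputStream in, int bufferSize) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(bufferSize);                 // remember the current position
        byte[] buffer = new byte[bufferSize];
        int total = 0;
        while (total < bufferSize) {
            int n = in.read(buffer, total, bufferSize - total);
            if (n < 0) {
                break;                       // end of stream before the limit
            }
            total += n;
        }
        in.reset();                          // rewind to the marked position
        return Arrays.copyOf(buffer, total); // trim to the bytes actually read
    }
}
```

A smaller `bufferSize` bounds both the memory used and the work a detector has to do, which is the trade-off the setter exposes.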
      • detectCharsetImpl

        protected java.nio.charset.Charset detectCharsetImpl(byte[] buffer)
                                                      throws java.lang.Exception
        Searches only the Byte Order Marker, i.e. the first three bytes, for a sign of the encoding.
        Specified by:
        detectCharsetImpl in class AbstractCharactersetFinder
        Parameters:
        buffer - the buffer of data, no larger than the requested buffer size. It can be wrapped efficiently in an InputStream using a ByteArrayInputStream.
        Returns:
        the charset, or null if an accurate conclusion is not possible
        Throws:
        java.lang.Exception - Any exception, checked or not
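
As an illustration of Byte Order Marker detection in general, the following minimal sketch checks the leading bytes of a buffer against the well-known UTF-8 and UTF-16 markers. It is not the source of this class: `BomSniffer` and `detectFromBom` are hypothetical names, and the set of encodings checked is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: NOT the implementation of BomCharactersetFinder.
final class BomSniffer {

    /**
     * Returns the charset implied by a leading Byte Order Marker,
     * or null if no known marker is present.
     * This sketch ignores UTF-32, whose little-endian marker (FF FE 00 00)
     * begins with the UTF-16LE marker.
     */
    static Charset detectFromBom(byte[] buffer) {
        if (buffer == null) {
            return null;
        }
        // UTF-8 marker: EF BB BF
        if (buffer.length >= 3
                && (buffer[0] & 0xFF) == 0xEF
                && (buffer[1] & 0xFF) == 0xBB
                && (buffer[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (buffer.length >= 2) {
            int b0 = buffer[0] & 0xFF;
            int b1 = buffer[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // marker FE FF
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // marker FF FE
            }
        }
        return null; // no marker found: no accurate conclusion is possible
    }
}
```

Returning null when no marker is found matches the documented contract, in which a null result means that an accurate conclusion is not possible.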