Class BomCharactersetFinder

java.lang.Object
org.alfresco.encoding.AbstractCharactersetFinder
org.alfresco.encoding.BomCharactersetFinder
All Implemented Interfaces:
CharactersetFinder

public class BomCharactersetFinder extends AbstractCharactersetFinder
Byte Order Marker encoding detection.
Since:
2.1
Author:
Pacific Northwest National Lab, Derek Hulley
  • Constructor Details

    • BomCharactersetFinder

      public BomCharactersetFinder()
  • Method Details

    • setBufferSize

      public void setBufferSize(int bufferSize)
      Description copied from class: AbstractCharactersetFinder
      Set the maximum number of bytes to read ahead when attempting to determine the characterset. Most characterset detectors are efficient and can process 8K of buffered data very quickly. Some may need to be constrained a bit.
      Overrides:
      setBufferSize in class AbstractCharactersetFinder
      Parameters:
      bufferSize - the number of bytes - default 8K.
    • getBufferSize

      protected int getBufferSize()
      Description copied from class: AbstractCharactersetFinder
      Some implementations may only require a few bytes to detect the stream type, whilst others may be more efficient with larger buffers. In either case, the number of bytes actually present in the buffer cannot be enforced.

      Only override this method if there is a very compelling reason to adjust the buffer size, and then consider handling the AbstractCharactersetFinder.setBufferSize(int) method by issuing a warning. This will prevent users from setting the buffer size when it has no effect.

      Overrides:
      getBufferSize in class AbstractCharactersetFinder
      Returns:
      Returns 64
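The override advice above can be illustrated with a minimal sketch. It assumes a hypothetical stand-in base class (`SketchFinder`) rather than the real `AbstractCharactersetFinder`, and uses `java.util.logging` purely for illustration. Whether `BomCharactersetFinder` itself logs a warning is not stated on this page.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for a finder base class; not the Alfresco type. */
abstract class SketchFinder {
    private int bufferSize = 8192; // default of 8K, as described above

    public void setBufferSize(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    protected int getBufferSize() {
        return bufferSize;
    }
}

/** Hypothetical finder that only ever needs a small, fixed read-ahead. */
final class FixedBufferFinder extends SketchFinder {
    private static final Logger LOG = Logger.getLogger(FixedBufferFinder.class.getName());
    private static final int FIXED_SIZE = 64;

    @Override
    public void setBufferSize(int bufferSize) {
        // The requested value is ignored, so tell the caller instead of silently accepting it.
        LOG.warning("Buffer size is fixed at " + FIXED_SIZE
                + " bytes; ignoring requested size " + bufferSize);
    }

    @Override
    protected int getBufferSize() {
        return FIXED_SIZE;
    }
}
```

With this arrangement, a call such as `setBufferSize(1024)` leaves the effective size at 64 and leaves a trace in the log explaining why.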
    • detectCharsetImpl

      protected Charset detectCharsetImpl(byte[] buffer) throws Exception
      Searches only the Byte Order Marker (BOM), i.e. the first three bytes of the buffer, for a sign of the encoding.
      Specified by:
      detectCharsetImpl in class AbstractCharactersetFinder
      Parameters:
      buffer - the buffer of data no bigger than the requested best buffer size. This can, very efficiently, be turned into an InputStream using a ByteArrayInputStream.
      Returns:
      Returns the charset or null if an accurate conclusion is not possible
      Throws:
      Exception - Any exception, checked or not
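As a rough illustration of BOM-based detection, the sketch below inspects the leading bytes of a buffer and returns a charset only when a well-known marker is present. The class `BomSniffer` and its `detect` method are hypothetical names, not part of the Alfresco API, and the set of markers checked here (UTF-8, UTF-16BE, UTF-16LE) is an assumption; the real implementation may recognise a different set. UTF-32 markers are deliberately ignored, so a UTF-32LE stream would be reported as UTF-16LE by this sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone helper; not the Alfresco implementation. */
final class BomSniffer {

    /** Returns the charset indicated by a leading BOM, or null if none is recognised. */
    static Charset detect(byte[] buffer) {
        if (buffer == null || buffer.length < 2) {
            return null; // too short to hold any of the markers checked below
        }
        // Mask to 0..255 because Java bytes are signed.
        int b0 = buffer[0] & 0xFF;
        int b1 = buffer[1] & 0xFF;

        // UTF-8 BOM: EF BB BF
        if (buffer.length >= 3 && b0 == 0xEF && b1 == 0xBB && (buffer[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        // UTF-16 big-endian BOM: FE FF
        if (b0 == 0xFE && b1 == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        // UTF-16 little-endian BOM: FF FE
        if (b0 == 0xFF && b1 == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null; // no marker: no accurate conclusion is possible
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detect(sample));
    }
}
```

Returning `null` when no marker is found mirrors the contract described above: a missing BOM says nothing about the encoding, so the decision is left to other detectors.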