Class TikaCharsetFinder

  • All Implemented Interfaces:
    org.alfresco.encoding.CharactersetFinder

    public class TikaCharsetFinder
    extends org.alfresco.encoding.AbstractCharactersetFinder
    Uses Apache Tika as a fallback encoding detector.
    Since:
    3.4
    Author:
    Nick Burch
    • Method Summary

      Modifier and Type                    Method                             Description
      protected java.nio.charset.Charset   detectCharsetImpl(byte[] buffer)
      int                                  getThreshold()                     Returns the matching threshold that a detection must reach before it is accepted as a good match.
      void                                 setThreshold(int threshold)        Sets the threshold at which a match is considered good enough, in the range 0-100.
      • Methods inherited from class org.alfresco.encoding.AbstractCharactersetFinder

        detectCharset, detectCharset, getBufferSize, setBufferSize
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • TikaCharsetFinder

        public TikaCharsetFinder()
    • Method Detail

      • detectCharsetImpl

        protected java.nio.charset.Charset detectCharsetImpl(byte[] buffer)
                                                      throws java.lang.Exception
        Specified by:
        detectCharsetImpl in class org.alfresco.encoding.AbstractCharactersetFinder
        Throws:
        java.lang.Exception
      • getThreshold

        public int getThreshold()
        Returns the matching threshold that a detection must reach before it is accepted as a good match. The value is in the range 0-100.
      • setThreshold

        public void setThreshold(int threshold)
        Sets the threshold at which a match is considered good enough, in the range 0-100. If the detection does not reach the threshold, this finder declines, and either another finder handles the content or the fallback encoding is used.
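
The decline-and-fall-back behaviour described for setThreshold can be illustrated with a minimal, self-contained sketch. It does not use the Alfresco or Tika classes: ThresholdSketch, Guess, Finder, thresholdFinder and detect are all hypothetical names introduced for illustration, the fixed Guess stands in for a real detector's output, and treating "reach the threshold" as an inclusive comparison (>=) is an assumption rather than something this page confirms.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types belong to the Alfresco or Tika APIs.
class ThresholdSketch {

    /** A detector's guess: a charset plus a confidence score in the range 0-100. */
    static final class Guess {
        final Charset charset;
        final int confidence;

        Guess(Charset charset, int confidence) {
            this.charset = charset;
            this.confidence = confidence;
        }
    }

    /** A finder returns a charset, or null to decline. */
    interface Finder {
        Charset detect(byte[] buffer);
    }

    /** Accepts the guess only if its confidence reaches the threshold (assumed inclusive). */
    static Finder thresholdFinder(Guess guess, int threshold) {
        return buffer -> guess.confidence >= threshold ? guess.charset : null;
    }

    /** Asks each finder in order; uses the fallback if every finder declines. */
    static Charset detect(List<Finder> finders, byte[] buffer, Charset fallback) {
        for (Finder finder : finders) {
            Charset charset = finder.detect(buffer);
            if (charset != null) {
                return charset;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        byte[] buffer = "example".getBytes(StandardCharsets.UTF_8);
        // A fixed guess stands in for a real detector's output.
        Guess weakGuess = new Guess(StandardCharsets.ISO_8859_1, 20);

        Finder strict = thresholdFinder(weakGuess, 35);   // 20 < 35: declines
        Finder lenient = thresholdFinder(weakGuess, 10);  // 20 >= 10: accepts

        // Only the strict finder: it declines, so the fallback is used.
        System.out.println(detect(Arrays.asList(strict), buffer, StandardCharsets.UTF_8));
        // Strict declines, then the lenient finder accepts the guess.
        System.out.println(detect(Arrays.asList(strict, lenient), buffer, StandardCharsets.UTF_8));
    }
}
```

Running main prints UTF-8 and then ISO-8859-1: with a threshold of 35 the guess is rejected and the fallback is used, while a later finder with a lower threshold accepts the same guess. A higher threshold therefore trades fewer wrong detections for more frequent use of the fallback.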