Class TikaCharsetFinder

  • All Implemented Interfaces:
    org.alfresco.encoding.CharactersetFinder

    public class TikaCharsetFinder
    extends org.alfresco.encoding.AbstractCharactersetFinder
    Uses Apache Tika as a fallback encoding detector.
    Since:
    3.4
    Author:
    Nick Burch
    • Constructor Detail

      • TikaCharsetFinder

        public TikaCharsetFinder()
    • Method Detail

      • detectCharsetImpl

        protected Charset detectCharsetImpl(byte[] buffer)
                                     throws Exception
        Specified by:
        detectCharsetImpl in class org.alfresco.encoding.AbstractCharactersetFinder
        Throws:
        Exception
      • getThreshold

        public int getThreshold()
        Returns the matching threshold: the confidence, in the range 0-100, that a detected charset must reach before it is treated as a good match.
      • setThreshold

        public void setThreshold(int threshold)
        Sets the point at which we decide a match is good enough, in the range 0-100. If the detected match does not reach the threshold, this finder declines, and either another finder will work on the content or the fallback encoding will be taken.
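
The threshold behaviour described for getThreshold and setThreshold can be illustrated with a minimal, self-contained sketch. This is not the Alfresco implementation: the class ThresholdSketch, the nested Candidate type, the accept method, the initial value of 50, and the range check are all hypothetical and exist only for illustration. The sketch assumes that a detector reports a confidence score from 0 to 100 and that returning null signals "declined" to the caller.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical illustration of threshold-gated charset detection.
 * It does not use Tika or Alfresco classes.
 */
public class ThresholdSketch {

    /** A detected charset paired with a confidence score in the range 0-100 (assumed shape). */
    public static final class Candidate {
        final Charset charset;
        final int confidence;

        public Candidate(Charset charset, int confidence) {
            this.charset = charset;
            this.confidence = confidence;
        }
    }

    // Arbitrary starting value chosen for this sketch, not the real default.
    private int threshold = 50;

    public int getThreshold() {
        return threshold;
    }

    public void setThreshold(int threshold) {
        // Range check is an assumption of this sketch; the documented range is 0-100.
        if (threshold < 0 || threshold > 100) {
            throw new IllegalArgumentException("threshold must be in the range 0-100");
        }
        this.threshold = threshold;
    }

    /**
     * Returns the candidate's charset when its confidence reaches the threshold,
     * or null to decline so that another finder or a fallback encoding can be used.
     */
    public Charset accept(Candidate candidate) {
        if (candidate == null || candidate.confidence < threshold) {
            return null;
        }
        return candidate.charset;
    }

    public static void main(String[] args) {
        ThresholdSketch finder = new ThresholdSketch();
        finder.setThreshold(60);
        System.out.println(finder.accept(new Candidate(StandardCharsets.UTF_8, 80)));
        System.out.println(finder.accept(new Candidate(StandardCharsets.UTF_8, 40)));
    }
}
```

A higher threshold makes the finder decline more often and hand the decision to other finders or the fallback encoding; a lower threshold accepts weaker matches.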