Class XPathMetadataExtracter

java.lang.Object
org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter
All Implemented Interfaces:
NamespaceContext, ContentWorker, MetadataEmbedder, MetadataExtracter, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationContextAware

public class XPathMetadataExtracter extends AbstractMappingMetadataExtracter implements NamespaceContext
An extracter that pulls values from XML documents using configurable XPath statements. No default set of mappings can be provided; the mappings come entirely from configuration.

When an instance of this extracter is configured, XPath statements should be provided to extract all the available metadata. The implementation is sensitive to what is actually requested by the configured mapping and will only perform the queries necessary to fulfill the requirements.

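To summarize, this class requires two configurations:

  • A mapping of document property names to the XPath statements that retrieve them (see setXpathMappingProperties).
  • A mapping of document property names to repository model properties, as handled by AbstractMappingMetadataExtracter.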

All values are extracted as text values and therefore all XPath statements must evaluate to a node that can be rendered as text.

Since:
2.1
Author:
Derek Hulley
  • Field Details

    • SUPPORTED_MIMETYPES

      public static String[] SUPPORTED_MIMETYPES
  • Constructor Details

    • XPathMetadataExtracter

      public XPathMetadataExtracter()
      Default constructor
  • Method Details

    • getNamespaceURI

      public String getNamespaceURI(String prefix)
      Specified by:
      getNamespaceURI in interface NamespaceContext
    • getPrefix

      public String getPrefix(String namespaceURI)
      Specified by:
      getPrefix in interface NamespaceContext
    • getPrefixes

      public Iterator getPrefixes(String namespaceURI)
      Specified by:
      getPrefixes in interface NamespaceContext
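The three NamespaceContext methods above can be illustrated with a small map-backed implementation. This is a minimal sketch only: MapNamespaceContext and its register method are hypothetical names, and it does not reproduce how XPathMetadataExtracter stores its prefixes internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical helper (not part of Alfresco): resolves prefixes from a plain map.
class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new HashMap<>();

    // Assumed registration hook; the real class is configured through properties.
    void register(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        // The interface contract returns the empty string for unbound prefixes.
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                matches.add(entry.getKey());
            }
        }
        return matches.iterator();
    }
}
```

An instance like this can be passed to javax.xml.xpath.XPath.setNamespaceContext so that prefixed expressions resolve against the registered URIs.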
    • setXpathMappingProperties

      public void setXpathMappingProperties(Properties xpathMappingProperties)
      Set the properties file that maps document properties to the XPath statements necessary to retrieve them.

      The XPath mapping is of the form:

       # Namespace prefixes
       namespace.prefix.my=http://www....com/alfresco/1.0
       
       # Mapping
       editor=/my:example-element/@cm:editor
       title=/my:example-element/text()
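
One way to read a properties file of this form is to split the keys into namespace declarations and XPath mappings. The sketch below assumes only the namespace.prefix. key convention shown in the sample; XPathMappingParser and its fields are hypothetical names and do not describe the actual Alfresco implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: separates namespace declarations from XPath mappings.
class XPathMappingParser {
    // Key convention taken from the sample mapping above.
    static final String NAMESPACE_KEY_PREFIX = "namespace.prefix.";

    // Namespace prefix -> namespace URI
    final Map<String, String> namespaces = new LinkedHashMap<>();
    // Document property name -> XPath statement
    final Map<String, String> xpaths = new LinkedHashMap<>();

    XPathMappingParser(Properties properties) {
        for (String key : properties.stringPropertyNames()) {
            String value = properties.getProperty(key).trim();
            if (key.startsWith(NAMESPACE_KEY_PREFIX)) {
                String prefix = key.substring(NAMESPACE_KEY_PREFIX.length());
                namespaces.put(prefix, value);
            } else {
                xpaths.put(key, value);
            }
        }
    }
}
```

Keeping the two maps separate lets the namespace entries feed a NamespaceContext while the remaining entries are evaluated as XPath statements.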
       
    • init

      protected void init()
      Description copied from class: AbstractMappingMetadataExtracter
      Provides a hook point for implementations to perform initialization. The base implementation must be invoked or the extracter will fail during extraction. The default mappings will be requested during initialization.
      Overrides:
      init in class AbstractMappingMetadataExtracter
    • getDefaultMapping

      protected Map<String,Set<QName>> getDefaultMapping()
      It is not possible to have any default mappings, but something has to be returned.
      Overrides:
      getDefaultMapping in class AbstractMappingMetadataExtracter
      Returns:
      Returns an empty map
    • extractRaw

      protected Map<String,Serializable> extractRaw(ContentReader reader) throws Throwable
      Description copied from class: AbstractMappingMetadataExtracter
      Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently and more or less of the properties may be used in different installations.

      Raw values must not be trimmed or removed for any reason. Null values, empty strings, and non-serializable values are handled as follows:

      • Null: Removed
      • Empty String: Passed to the OverwritePolicy
      • Non Serializable: Converted to String or fails if that is not possible
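
The three rules above can be pictured as a post-processing step applied to whatever extractRaw returns. The following is an illustrative sketch only: RawValueNormalizer is a hypothetical class, not the code in AbstractMappingMetadataExtracter, and it uses String.valueOf for the conversion, which never fails, so the failure case is not modelled.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the null / empty / non-serializable rules.
class RawValueNormalizer {
    static Map<String, Serializable> normalize(Map<String, Object> rawValues) {
        Map<String, Serializable> result = new HashMap<>();
        for (Map.Entry<String, Object> entry : rawValues.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                // Null: removed
                continue;
            }
            if (value instanceof Serializable) {
                // Includes empty strings, which are kept for the overwrite policy to judge
                result.put(entry.getKey(), (Serializable) value);
            } else {
                // Non-serializable: converted to its String form
                result.put(entry.getKey(), String.valueOf(value));
            }
        }
        return result;
    }
}
```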

      Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:

       editor: - the document editor        -->  cm:author
       title:  - the document title         -->  cm:title
       user1:  - the document summary
       user2:  - the document description   -->  cm:description
       user3:  -
       user4:  -
       
      Specified by:
      extractRaw in class AbstractMappingMetadataExtracter
      Parameters:
      reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
      Returns:
      Returns a map of document property values keyed by property name.
      Throws:
      Throwable - All exception conditions can be handled.
    • processDocument

      protected Map<String,Serializable> processDocument(Document document) throws Throwable
      Executes all the necessary XPath statements to extract values.
      Throws:
      Throwable
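
To show what executing XPath statements against a Document can look like, here is a minimal sketch built on the standard javax.xml.xpath API. XPathTextExtractor, extract, and parse are hypothetical names, the sample uses un-namespaced XML for brevity, and nothing here claims to match the internals of processDocument.

```java
import java.io.Serializable;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: evaluates each configured XPath and keeps the text result.
class XPathTextExtractor {
    static Map<String, Serializable> extract(Document document, Map<String, String> xpathsByProperty) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // For prefixed expressions, call xpath.setNamespaceContext(...) here.
        Map<String, Serializable> rawValues = new HashMap<>();
        try {
            for (Map.Entry<String, String> entry : xpathsByProperty.entrySet()) {
                // evaluate(String, Object) returns the string value of the result
                String text = xpath.evaluate(entry.getValue(), document);
                rawValues.put(entry.getKey(), text);
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
        return rawValues;
    }

    // Convenience parser so the sketch can be run on an inline XML string.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }
}
```

Because every result is taken as a string, this mirrors the class-level note that all XPath statements must evaluate to something that can be rendered as text.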
    • readXPathMappingProperties

      protected void readXPathMappingProperties(Properties xpathMappingProperties)
      A utility method to convert mapping properties to the Map form.