Configuring Ephesoft and Alfresco for CMIS integration

This blog is the result of my discoveries in integrating Ephesoft, the open source mailroom automation solution, with Alfresco, the open source document management solution. Ephesoft can export to a CMIS-enabled repository, Alfresco is such a repository, and both are open source! In this blog I configure a default install of Ephesoft (using the Ephesoft installer) and a default install of Alfresco Community (using the installer). I installed each application on a different VM: I don’t like to make a mess of my laptop, and I don’t want to spend time getting both to run smoothly on a single VM image.

1. Ephesoft

If you mess around too much manually with your batches, delete all work folders (inside ephesoft-system-folder) and switch the variable ‘workflow.deploy’ in file C:\bin\Ephesoft\Application\WEB-INF\classes\META-INF\dcma-workflows\dcma-workflows.properties to true. Restart Ephesoft and the related tables in the database will be cleanly recreated. I needed it, but in my philosophy I learn the boundaries of a system by breaking them…
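The switch of ‘workflow.deploy’ can also be scripted. Below is a minimal sketch, assuming the property is stored as a plain `key=value` line; the helper name `set_property` and the sample text are my own illustration, not the real file content.

```python
import re

def set_property(text: str, key: str, value: str) -> str:
    """Return `text` with the line `key=...` set to `key=value`.

    If the key is not present, it is appended as a new line.
    """
    pattern = re.compile(r"^([ \t]*" + re.escape(key) + r"[ \t]*=).*$", re.MULTILINE)
    if pattern.search(text):
        # Keep the original "key=" prefix and replace only the value.
        return pattern.sub(lambda m: m.group(1) + value, text)
    separator = "" if text.endswith("\n") or not text else "\n"
    return text + separator + f"{key}={value}\n"

# Illustrative content only; the real dcma-workflows.properties will differ.
sample = "# sample content\nworkflow.deploy=false\nother.key=1\n"
updated = set_property(sample, "workflow.deploy", "true")
print(updated)
```

To apply it to the real file, read the file into a string, pass it through `set_property`, and write the result back before restarting Ephesoft.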

In short, these are folder-wise the building blocks within the Ephesoft folder:

  • Application – contains the java/GWT code of the actual application.
  • Dependencies – contains helper tools like
    • ImageMagick (to transform and scale images),
    • Tesseract (to perform OCR) and
    • hocr2pdf to construct a pdf from a hocr and a tiff file.
  • Documents – links to the online documentation
  • JavaAppServer – contains a Tomcat instance
  • SharedFolders – contains:
    • another-monitored-folder – a (configurable) folder that is being watched for incoming batches
    • BC1 (actually BCn, there can be more) – containing all configuration for that version of flow definition (like CMIS binding info)
    • ephesoft-system-folder – contains temporary folders for each of the batches containing the temporary html, tiff, xml and png files
    • FinalDropFolder – the (configurable) location where the end result pdf’s can be dropped
    • SampleBatches – contains two sets of demo batches

Configuring the Document Types

Next thing to do is to make Alfresco capable of receiving CMIS Document objects with additional attributes. Let’s analyse which document types are provided by the Ephesoft demo.

  • Application-Checklist
  • Workers-Comp-02
  • US-Invoice-Data

Each of these document types has the same attributes:

  • Invoice Date (Date)
  • Part Number (Long)
  • Invoice Total (Double)
  • State (String)
  • City (String)

Edit the file C:\bin\Ephesoft\SharedFolders\BC2\cmis-plugin-mapping\DLF-Attribute-mapping.properties. This file contains the mapping of Ephesoft attributes to the attributes of the CMIS target system. It expects a model with a namespace called “ephesoft”. I modified the Alfresco document type from ephesoft-type (ephesoft:document) back into cm:document . Other than that, I will create an Alfresco model/aspect using the same names.

Application-Checklist=D:ephesoft:document
Application-Checklist.PartNumber=ephesoft:partNumber
Application-Checklist.InvoiceTotal=ephesoft:invoiceTotal
Application-Checklist.InvoiceDate=ephesoft:invoiceDate
Application-Checklist.State=ephesoft:state
Application-Checklist.City=ephesoft:city
Workers-Comp-02=D:ephesoft:document
Workers-Comp-02.PartNumber=ephesoft:partNumber
Workers-Comp-02.InvoiceTotal=ephesoft:invoiceTotal
Workers-Comp-02.InvoiceDate=ephesoft:invoiceDate
Workers-Comp-02.State=ephesoft:state
Workers-Comp-02.City=ephesoft:city
US-invoice-Data=D:ephesoft:document
US-invoice-Data.PartNumber=ephesoft:partNumber
US-invoice-Data.InvoiceTotal=ephesoft:invoiceTotal
US-invoice-Data.InvoiceDate=ephesoft:invoiceDate
US-invoice-Data.State=ephesoft:state
US-invoice-Data.City=ephesoft:city
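
A copy-paste slip in a mapping file like this is easy to miss: when the same key appears twice, a Java properties loader keeps only the last value, without complaining. The sketch below flags repeated keys. It is a simplified reader (it only understands `key=value` lines and comment lines), and the document type `Doc-A` in the sample is hypothetical.

```python
from collections import Counter

def duplicate_keys(text: str) -> list[str]:
    """Return the keys that occur more than once in properties-style text."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, _value = line.partition("=")
        if separator:
            keys.append(key.strip())
    return [key for key, count in Counter(keys).items() if count > 1]

# "Doc-A" is a made-up document type, used only to show the check.
sample = """\
Doc-A=D:ephesoft:document
Doc-A.InvoiceTotal=ephesoft:invoiceTotal
Doc-A.InvoiceTotal=ephesoft:invoiceDate
Doc-A.State=ephesoft:state
"""
print(duplicate_keys(sample))  # ['Doc-A.InvoiceTotal']
```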

2. Alfresco

In Alfresco the model (C:\bin\Alfresco\tomcat\shared\classes\alfresco\extension\ephesoftModel.xml) looks like:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Custom Model -->
<!-- Note: This model is pre-configured to load at startup of the Repository.  So, all custom -->
<!--       types and aspects added here will automatically be registered -->

<model name="ephesoft:demomodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
 <!-- Optional meta-data about the model -->
 <description>VLC</description>
 <author>Tjarda Peelen - VLC</author>
 <version>0.1</version>

 <imports>
   <!-- Import Alfresco Dictionary Definitions -->
   <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
   <!-- Import Alfresco Content Domain Model Definitions -->
   <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
 </imports>
 <!-- Introduction of new namespaces defined by this model -->
 <!-- NOTE: The following namespace custom.model should be changed to reflect your own namespace -->
 <namespaces>
   <namespace uri="com.ephesoft.demo" prefix="ephesoft"/>
 </namespaces>
 <constraints />
 <types>
   <type name="ephesoft:document">
     <title>ephesoft_scan</title>
     <parent>cm:content</parent>
     <properties>
       <property name="ephesoft:invoiceDate">
         <title>Invoice Date</title>
         <type>d:datetime</type>
       </property>
       <property name="ephesoft:partNumber">
         <title>Part Number</title>
         <type>d:long</type>
       </property>
       <property name="ephesoft:invoiceTotal">
         <title>Invoice Total</title>
         <type>d:double</type>
       </property>
       <property name="ephesoft:state">
         <title>State</title>
         <type>d:text</type>
       </property>
       <property name="ephesoft:city">
         <title>City</title>
         <type>d:text</type>
       </property>
     </properties>
   </type>
 </types>
</model>

Yes, it is a TYPE instead of an ASPECT. CMIS does not provide for Aspects (in that sense, it is a compromise standard, but a good one)…
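
Before restarting Alfresco, it can help to check that the model file is well-formed XML and declares the properties you expect, since copying XML from a web page easily introduces stray characters. The sketch below uses Python's standard library; the helper name `list_properties` and the trimmed-down sample model are my own illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Alfresco dictionary model, as used in the model file above.
DICT_NS = "http://www.alfresco.org/model/dictionary/1.0"

def list_properties(model_xml: str) -> dict[str, str]:
    """Map each property name in the model to its declared data type.

    ET.fromstring raises ET.ParseError when the XML is not well-formed.
    """
    root = ET.fromstring(model_xml)
    ns = {"d": DICT_NS}
    result = {}
    for prop in root.iterfind(".//d:property", ns):
        result[prop.get("name")] = prop.findtext("d:type", default="", namespaces=ns)
    return result

# Trimmed-down model, for illustration only.
sample = """
<model name="ephesoft:demomodel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
  <types>
    <type name="ephesoft:document">
      <properties>
        <property name="ephesoft:partNumber"><type>d:long</type></property>
        <property name="ephesoft:city"><type>d:text</type></property>
      </properties>
    </type>
  </types>
</model>
"""
print(list_properties(sample))
```

This only checks well-formedness and lists what is declared; it does not validate the model against Alfresco's schema.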

Register the model using C:\bin\Alfresco\tomcat\shared\classes\alfresco\extension\custom-model-context.xml with content:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>
<beans>
<!-- Registration of new models -->
  <bean id="extension.dictionaryBootstrap" parent="dictionaryModelBootstrap" depends-on="dictionaryBootstrap">
    <property name="models">
      <list>
        <value>alfresco/extension/ephesoftModel.xml</value>
      </list>
    </property>
  </bean>
</beans>

Restart Alfresco.

When you go to the Alfresco UI and select the document details, all properties should show up.
Alfresco is shipped with a web based CMIS browser. Target your web browser at:
http://192.168.30.128:8080/alfresco/cmisbrowse

And enter for:

CMIS atom-pub url: http://192.168.30.128:8080/alfresco/service/cmis
Select link “Types collection”, select the “down” just underneath “cmis:document”. This will show all objects inheriting from cmis:document. Find id “D:ephesoft:document”, select it and if your custom model was properly configured, you will see these (additional) properties defined (I leave out the ones inherited from the default Document):
ephesoft:partNumber
id              ephesoft:partNumber
localName       partNumber
localNamespace  http://com.ephesoft.demo/model/content/1.0
displayName     Part Number
queryName       ephesoft:partNumber
propertyType    integer
cardinality     single
updatability    readwrite
inherited       false
required        false
queryable       true
orderable       true
openChoice      false

ephesoft:city
id              ephesoft:city
localName       city
localNamespace  http://com.ephesoft.demo/model/content/1.0
displayName     City
queryName       ephesoft:city
propertyType    string
cardinality     single
updatability    readwrite
inherited       false
required        false
queryable       true
orderable       false
openChoice      false 

ephesoft:invoiceTotal
id              ephesoft:invoiceTotal
localName       invoiceTotal
localNamespace  http://com.ephesoft.demo/model/content/1.0
displayName     Invoice Total
queryName       ephesoft:invoiceTotal
propertyType    decimal
cardinality     single
updatability    readwrite
inherited       false
required        false
queryable       true
orderable       true
openChoice      false

ephesoft:invoiceDate
id              ephesoft:invoiceDate
localName       invoiceDate
localNamespace  http://com.ephesoft.demo/model/content/1.0
displayName     Invoice Date
queryName       ephesoft:invoiceDate
propertyType    datetime
cardinality     single
updatability    readwrite
inherited       false
required        false
queryable       true
orderable       true
openChoice      false

ephesoft:state
id              ephesoft:state
localName       state
localNamespace  http://com.ephesoft.demo/model/content/1.0
displayName     State
queryName       ephesoft:state
propertyType    string
cardinality     single
updatability    readwrite
inherited       false
required        false
queryable       true
orderable       false
openChoice      false
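
Putting the model and the CMIS browser output side by side gives the following correspondence between Alfresco data types and CMIS property types. The small lookup below only records what is observed in the listing above; it is not a complete mapping table, and the names `ALFRESCO_TO_CMIS` and `expected_cmis_types` are my own.

```python
# Alfresco data type -> CMIS propertyType, as observed in the listing above.
ALFRESCO_TO_CMIS = {
    "d:long": "integer",
    "d:double": "decimal",
    "d:text": "string",
    "d:datetime": "datetime",
}

def expected_cmis_types(model_properties: dict[str, str]) -> dict[str, str]:
    """Translate {property name: Alfresco type} into {property name: CMIS type}."""
    return {name: ALFRESCO_TO_CMIS[d_type] for name, d_type in model_properties.items()}

model_properties = {
    "ephesoft:invoiceDate": "d:datetime",
    "ephesoft:partNumber": "d:long",
    "ephesoft:invoiceTotal": "d:double",
    "ephesoft:state": "d:text",
    "ephesoft:city": "d:text",
}
for name, cmis_type in expected_cmis_types(model_properties).items():
    print(f"{name:24} {cmis_type}")
```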

 

3. Setting up the CMIS connection

The CMIS entrance to your Alfresco repository can be found  at http://192.168.30.128:8080/alfresco/service/cmis

Alfresco is shipped with a web based CMIS browser. Target your web browser at:

http://192.168.30.128:8080/alfresco/cmisbrowse

And enter for:

CMIS atom-pub url: http://192.168.30.128:8080/alfresco/service/cmis

Now, pay attention: you need your Repository ID, the long identifier at the top, built up of groups separated by minus signs.
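
The Repository ID can also be read from the AtomPub service document instead of the browser screen. The sketch below assumes the endpoint returns a CMIS 1.0 service document, in which the ID sits in a `repositoryId` element in the CMIS core namespace; the function names, the credentials passed to `fetch_service_document`, and the all-zero ID in the sample are placeholders of mine.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# CMIS 1.0 core namespace (an assumption about the repository's CMIS version).
CMIS_CORE_NS = "http://docs.oasis-open.org/ns/cmis/core/200908/"

def extract_repository_ids(service_doc_xml) -> list[str]:
    """Return every repositoryId found in a service document (str or bytes)."""
    root = ET.fromstring(service_doc_xml)
    tag = "{%s}repositoryId" % CMIS_CORE_NS
    return [el.text.strip() for el in root.iter(tag) if el.text]

def fetch_service_document(url: str, user: str, password: str) -> bytes:
    """Download the service document using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response:
        return response.read()

# Hand-written sample with a placeholder ID, so the parsing can be tried offline.
sample = """
<app:service xmlns:app="http://www.w3.org/2007/app"
             xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/"
             xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
  <app:workspace>
    <cmisra:repositoryInfo>
      <cmis:repositoryId>00000000-0000-0000-0000-000000000000</cmis:repositoryId>
    </cmisra:repositoryInfo>
  </app:workspace>
</app:service>
"""
print(extract_repository_ids(sample))
```

Against a live system you would pass the result of `fetch_service_document("http://192.168.30.128:8080/alfresco/service/cmis", user, password)` to `extract_repository_ids`, with your own credentials.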

Navigate in Ephesoft to http://localhost:8080/dcma/BatchClassManagement.html. Select the batch class BC2 with description “Tesseract Mail Room”, and select “Edit”.

Select the “Export module” from the module list, and select “Edit”

Select the “CMIS Export” plugin, and select “Edit”

Edit the plugin configuration

And select Save. Congratulations, you just configured your CMIS end point. If you navigate back to the main page of the admin console you will notice that the version number of the batch class has increased from 1.0.0.0 to 1.0.0.1.

Propagate a batch through Ephesoft (in the user UI, http://localhost:8080/dcma/BatchList.html).

Problem
You will notice the folders do get created in Alfresco (/EphesoftFinalDropFolder/BI01 or something similar) but your document will not arrive… In the Ephesoft logging you will see complaints that an Integer or Decimal is expected. CMIS does know about Integer and Decimal, but not about Long and Double. It is my assumption that on the Ephesoft side this mapping goes wrong (if it is mapped at all). I have not had the time to investigate yet.

Solution
a) The solution is simple… Remove 3 properties from each document type from your mapping file in Ephesoft (C:\bin\Ephesoft\SharedFolders\BC2\cmis-plugin-mapping\DLF-Attribute-mapping.properties) :

  • ephesoft:partNumber (because: Long)
  • ephesoft:invoiceDate (because: Date –> Strange, don’t know why this one fails)
  • ephesoft:invoiceTotal (because: Double)

Retry pushing a batch through Ephesoft flow, and find out that it actually works (but the 3 properties removed remain empty, of course).
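
Removing the three properties for every document type by hand is tedious, so here is a minimal sketch that filters them out of the mapping text. The helper name `drop_failing_properties` is mine, and it assumes the simple `DocType.Attribute=target` line format shown earlier.

```python
# Attribute suffixes of the three properties that fail in the export.
FAILING_SUFFIXES = (".PartNumber", ".InvoiceDate", ".InvoiceTotal")

def drop_failing_properties(text: str) -> str:
    """Return the mapping text without the lines for the failing attributes."""
    kept = []
    for line in text.splitlines():
        key = line.partition("=")[0].strip()
        if key.endswith(FAILING_SUFFIXES):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"

sample = """\
Application-Checklist=D:ephesoft:document
Application-Checklist.PartNumber=ephesoft:partNumber
Application-Checklist.InvoiceTotal=ephesoft:invoiceTotal
Application-Checklist.InvoiceDate=ephesoft:invoiceDate
Application-Checklist.State=ephesoft:state
Application-Checklist.City=ephesoft:city
"""
print(drop_failing_properties(sample))
```

Only the type line and the State and City mappings survive, which matches what solution a) describes.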

b) Update your CMIS mapping to map against a plain CMIS document type (in C:\bin\Ephesoft\SharedFolders\BC2\cmis-plugin-mapping\DLF-Attribute-mapping.properties), so that it reads:

Application-Checklist=cmis:document
Workers-Comp-02=cmis:document
US-invoice-Data=cmis:document

Mind you, NOT D:cmis:document; remove the D:!!

Retry pushing a batch through Ephesoft flow, and find out that it actually works, and a default Alfresco document is created!

Ephesoft comments
Ephesoft commented on this issue. They successfully tested CMIS types datetime, int and string. Actually, they map:

invoiceTotal --> d:int in Alfresco
partNumber --> d:text in Alfresco
invoiceDate --> d:datetime in Alfresco

I can see the pragmatic approach of mapping a Long onto a cmis:String/d:text. However, it was kind of a surprise. Not really sure if it is a nice solution or a void in the CMIS specs. Unexpected it was.

Just as unexpected is the mapping of the invoiceTotal (a Double in Ephesoft) onto an Integer in CMIS/Alfresco. This is a challenge to test with values just bigger than an int, and see what happens…
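
To pick test values for that experiment, it helps to know where the boundary lies. The sketch below assumes the target property is a 32-bit signed integer (my assumption for d:int) and checks whether a Double value would survive the trip without losing its fraction or overflowing; `fits_in_int` is a name of my own.

```python
# Bounds of a 32-bit signed integer (assumed size of the target int property).
INT_MIN = -2**31
INT_MAX = 2**31 - 1

def fits_in_int(value: float) -> bool:
    """True if `value` is a whole number within the 32-bit signed range."""
    return float(value).is_integer() and INT_MIN <= value <= INT_MAX

# The largest int fits, one more does not, and any fraction is lost as well.
for candidate in (2147483647.0, 2147483648.0, 1234.56):
    print(candidate, fits_in_int(candidate))
```

An invoice total such as 1234.56 does not fit, which is the more everyday worry than overflow.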

I have not yet tested these new ‘insights’ against my test Alfresco setup. I have been thinking about how to deal with reducing more complex types to String values. The native type is more useful to store in a DMS than the String representation of the type (think reporting, or decision making based on metadata (rules)).

Conclusion
The CMIS basics work out well. A default cmis:document can be created in a remote repository. However, CMIS is especially fun if you are able to transfer the metadata as well. That was one of the key reasons to use Ephesoft in the first place. I can conclude that by using a type in Alfresco, the system is able to receive the Ephesoft output. There are however some issues with Date, Double and Long, which makes sense since the CMIS specification knows about Integer and Decimal… Where the Date goes wrong is still a question to me. Maybe I have to get into the source for that…

I would like to be able to model my metadata in an Aspect rather than a Type in Alfresco. I tried this initially and failed. At this point in time I have to try again, since now I know of the limitation in Long and Double. I cannot remember anymore if this caused the error or something else. On the other hand, I would not be surprised if it takes a little more, considering the CMIS standard is more generic than Alfresco as a repository is able to handle… To be continued!

 

[[update 30 dec 2010: added the feedback of Ephesoft, and included the screenshots that were missing]]


19 Responses to “Configuring Ephesoft and Alfresco for CMIS integration”


  1. 1 Ash February 19, 2011 at 16:47

    Thanks for sharing your experience.

    Would you know how to get the right mime type over to alfresco? Following your steps, in our case it doesn’t show up as a pdf in alfresco. We have to change it manually.

  2. 4 dhartford April 20, 2011 at 20:35

    excellent walkthrough!

  3. 5 jim January 20, 2012 at 01:47

    Hi,

    I’m new to xml and tried to create an ephesoftModel.xml file in the location you indicate.

    From the information above it appears the text displayed is incomplete, I just copied and pasted the text as displayed into notepad and saved the file as ephesoftModel.xml.

    Everything is installed on my laptop c:\ephesoft; and c:\alfresco – both run concurrently, no problems but i do not see the custom ephesoft model under the cmis browse screen as you suggest…

    Any suggestions – can you send / post the sample ephesoftModel.xml file. i think I have formatting errors / mistakes etc in my version…

    thanks

    Jim

    • 6 Tjarda Peelen February 15, 2012 at 14:32

      Hi Jim,

      Can it be that by copying the XML from the webpage into your editor also copied some (maybe hidden) control characters (like line breaks, new lines and paragraph breaks)?
      Does your XML show well (and not complain) if you open it using InternetExplorer? (or probably any other webbrowser)?

  4. 7 aminos88 February 15, 2012 at 12:55

    Hi, Thank you for this good job

    i can’t find the EphesoftFinalDropFolder
    -how i can find it?
    -how I can find the output of ephesoft in alfresco

    it can be a bad configuration? or because I used the community versions ?

    thank you

  5. 9 talija March 15, 2012 at 14:56

    Excellent article. Just wondering, what could be the cause of metadata not filling in Alfresco? Document is created and has proper metadata fields, but they are empty.

    • 10 Tjarda Peelen March 15, 2012 at 16:28

      Thank you.

      I cannot recall exactly what was the problem at that time. My best guess was that it was a midnight-time side project, and I was happy I got the mechanism working. I think my set-up needed some final polishing to get it running in all details. Probably had to find out the exact mapping of types in Ephesoft versus the property types in Alfresco.
      I haven’t had the time/customer to revisit this. I think I have to apply for more hours in a day 🙂

      Please let me know if you got it running in a more recent version!

      • 11 Johan Pieterse March 28, 2012 at 12:48

        Hi all

        We got it to work.
        Questions…
        Can I get the name to automatically update with some metadata field I fill (cm:name/cmis:name)
        Can I drop the folder it create in Alfresco? Or is it better to create a rule in Alfresco to move them to another (combined) folder
        Is the Ephesoft CMIS customizable
        Or should I ask all this to Ephesoft 🙂

        • 12 Tjarda Peelen March 28, 2012 at 21:11

          Hi Johan,

          Can you elaborate on “Can I get the name to automatically update with some metadata field I fill (cm:name/cmis:name)”?? I don’t understand what you are saying (there -is- a mapping from the Ephesoft property (cmis:name) to Alfresco (cm:name), so I am confused about what you are trying to achieve…)

          The folder in Alfresco will be recreated if needed. I usually create a behaviour or scheduled job to move stuff into the right position. First of all to delegate responsibility to those who know (which is not Ephesoft, but Alfresco). Secondly, think about the span of transaction, and the consequences of that.
          * If the CMIS transfer fails, then what? (–> always make the CMIS transfer successful, therefore simple)
          * If the transaction (CMIS transfer invoking a rule or behaviour) fails because the rule/behaviour fails, then what?
          * If a scheduled job fails, Alfresco is able to report there are documents stuck in the ‘queue’ filled up by CMIS/Ephesoft. And some admin is able to act upon that. My preferred way of working is to use a scheduled job to move stuff around. If there is a record that the CMIS transfer from Ephesoft to Alfresco was successful, there is less need for complex logic to create a fallback to make up for the failed transaction. (What does a rollback mean anyway?)

          Is the Ephesoft CMIS customizable… Yes and no. What do you want to customize?? It is a brilliant standard for this kind of integration. What I can imagine is that you want to customize Ephesoft to do some business magic with the properties Ephesoft delivers (join metadata fields, transform values into other representations of the same, do whatever). In this setup CMIS is meant to transfer the document + metadata from one ‘system’ to the other… One can argue that the Alfresco way of working with Aspects should be supported. I wrote a blog about that a while ago. Rumours go that Ephesoft is building something like that at this point in time. So check the next release to see if you have this additional feature by then!

          Tjarda

  6. 13 Mohammed June 7, 2012 at 12:35

    ephesoftModel.xml is not available “Alfresco 4.0d”

  7. 14 Mohammed June 8, 2012 at 11:06

    once I made ephesoftModel.xml and tried editing the custom-model-context.xml “sample” I got no change then I made custom-model-context.xml “XML” I got unauthorized to access http://Localhost:8080/alfresco/service/cmis

  8. 15 Gideon October 30, 2012 at 10:17

    please how can i change the final batch names to one of the captured properties in alfresco. the files have batch instance names by default

    • 16 Tjarda Peelen November 7, 2012 at 09:02

      You can try to map some Ephesoft property against the cm:name… Either extend Ephesoft, or fix IT in Alfresco (rule or behaviour). It is too long ago for me to reproduce…


  1. 1 Open Source scanning with Ephesoft and Alfresco « Open Source ECM/WCM Trackback on December 23, 2010 at 16:40
  2. 2 Ephesoft and Alfresco in one Tomcat instance « Open Source ECM/WCM Trackback on February 23, 2011 at 23:10
  3. 3 Ephesoft CMIS Export plugin using Alfresco’s Aspects « Open Source ECM/WCM Trackback on May 29, 2011 at 01:01