Alfresco Business Reporting released

alfresco business reporting logoA while ago I wanted insight in how our Alfresco Share repository was used. ‘Ad-hoc reporting’ and ‘(always changing) management reporting’ is not Alfresco’s strongest feature. There is a need for reporting in a way a business user understands, using their tools of choice. The Alfresco Business Reporting project delivers the missing link. It extracts Alfresco business objects (like documents, folders, datalists, links, discussions) into plain SQL tables. The total set of properties for each object, including your custom aspects, show up as columns in your table. And you, the business, can use your reporting tool of choice, to generate fancy reports. No need for IT consultants to configure Alfresco for every change or additional report. Once the synchronization is configured, you can define as much reports as you like, answering your (or your management’s) questions immediately.

The project is composed of two parts; filling the reporting database, and generating reports from the reporting database.

Filling the reporting database

The most important feature is extracting the Alfresco repository objects into ‘plain’ SQL tables. Alfresco’s database structure is extremely efficient and flexible for its content management purposes. It is however quite hard to report against this database structure, and find all properties belonging to a particular type (and its applied aspects). You cannot let the business report against the Alfresco native database.

Alfresco Business Reporting uses JDBC to push the Alfresco metadata into a seperate reporting database. This reporting database can be located anywhere, for example to off-load Alfresco’s production database server.  Lucene queries are as the basis to fill sql tables. For each table a Lucene query defines the set of objects to push into this particular table. The table definitions as shipped with this release will provide you some Alfresco Types, like Documents, Folders, Links, Posts, DataListItems etc. If your business case has a need to select objects having a certain aspect or objects havind some property value(s), feel free. Lucene caters for a wide variety of query options to select just those objects relevant in your business domain.

For each object found the related table in the reporting database  will be stretched to fit all properties found in the result set of the Lucene query. This will include all properties in the assigned aspects (even if only 1 object in your result list has this aspect).  The limitations on your result set (result size or query time) are eliminated on the go.  You can schedule the process of pushing the metadata from the Alfresco repository into the reporting database. And you can initially bulk-load this reporting database, but you can update incrementally as well. This means, only pushing business objects into the reporting database if they have been modified after the last successful run.

This tool is meant to be configured. My initial purpose was to see how our Alfresco Share instance was used. I configured the tables to match the Share business objects. You can configure your tool to match the business objects relevant in your domain. Do you use Alfresco as a case management system? Find metrics of how the system is used, throughput, statistics per case type, outcome per case type, outcome per case-responsible… Statistics per year, last month. Find out if content is only added, or people actually update content.

Generating reports from the reporting database

The biggest win is in having all Alfresco metadata available in a format the business can process. The business should be in control of creating reports instead of calling an Alfresco engineer if they need a new report type. (From my perspective I don’t have the time for that, I like to do projects.) Personally I use JasperSoft‘s iReport and Pentaho‘s Report Designer if I am in need for some reporting. I created some logic to use Alfresco as a reporting server. Both suppliers mentioned each have their own Reporting Server, which makes sense if your business has serious needs for reporting.  I just wanted my few reports to be executed on a regular basis.

The configuration of this package allows to define an Alfresco space where the report definitions can be found, and where the resulting reports should be created/modified. Given the script provided, all reports in the ‘input’ space will be executed. If you add another report definition (.prpt, .jrxml or .jasper), it will be executed next run. The output is stored in the space you can configure. I put it into an Alfresco Share space where I tweaked permissions according the content of the pdf-reports.

This report-generating part of the project is not mandatory. If you have your reporting solution in place, use it like that! Once you defined the business objects relevant for your domain, any reporting tool will do.

How to use

I know undocumented software is widely available on the web.Therefor I provided some information about choices, what you get (zips, jars etc, but also the table definitions), how to install, stuff that still needs to be done in the project wiki.

Work to do

I just published the first release of Alfresco Business Reporting.  It is not perfect, and if I wait for that moment, it will never get out. I know I need to expand on the Alfresco data types (associations, mutli-value properties). I can see providing an AMP package makes installation more easy (but i need to move the database credentials to the alfresco-global,properties first).

If you have a pointer, see a gab that needs to be crossed, point me to flaws, or (need) help in any other way, please let me know. Enjoy your new insights in how your repository is used!

[Update 20130424: Changed project logo to comply to Alfresco’s trademark guidelines and policies ]

Advertisements

19 Responses to “Alfresco Business Reporting released”


  1. 1 Karl September 25, 2012 at 10:09

    Hi Tjarda, we are having trouble getting this working on our system – logged issue on Google project – any help would be appreciated! Thanks Karl

  2. 2 Rubie Casaña December 5, 2012 at 03:36

    Hi Tjarda, We used PostgreSQL as database of our Alfresco installation. Do you have a guide for installing the reporting in Postgre?

    Thanks,
    Rubie

    • 3 Tjarda Peelen December 5, 2012 at 08:06

      Hi Rubie,
      The install will not be the issue. (Create an empty database +user and use a PostgreSQL driver). I guess the pain will be in the sql dialect. I currently coded against MySQLonly. It is on my roadmap to support postgress too. However, I have a major new release of this module pending and in need of documentation and sample reports. If that is published, the Postgress thing will be high on my todo list.

      Tjarda

      • 4 Rubie Casaña December 5, 2012 at 08:31

        Thanks for the prompt response Tjarda. I tried to use mySQL in the reporting database. (I assumed there will be no conflict if the Alfresco installation is in postgre and the reporting is in Mysql). However when I try to access alfresco it returns 404 error and checking the logfile it has this info: Dec 5, 2012 3:30:30 PM org.quartz.core.ErrorLogger schedulerError
        SEVERE: Job (Solr.CoreTracker-archive threw an exception.
        org.quartz.SchedulerException: Job threw an unhandled exception. [See nested exception: org.alfresco.error.AlfrescoRuntimeException: 11050016 GetModelsDiff return status is 404]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:227)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
        Caused by: org.alfresco.error.AlfrescoRuntimeException: 11050016 GetModelsDiff return status is 404
        at org.alfresco.solr.client.SOLRAPIClient.getModelsDiff(SOLRAPIClient.java:1010)
        at org.alfresco.solr.tracker.CoreTracker.trackModels(CoreTracker.java:1630)
        at org.alfresco.solr.tracker.CoreTracker.trackRepository(CoreTracker.java:1134)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:491)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)

        • 5 Rubie Casaña December 5, 2012 at 08:36

          Having that error, I can’t even login to Alfresco share anymore. Checking the mysql database that I created, it remained empty. No tables were created.

        • 6 Tjarda Peelen December 24, 2012 at 00:44

          Not being able to start Share is a ‘Bad Thing’. I guess you have not defined the properties in alfresco-global.properties that define the cron jobs? See the how-to. (I have been fighting Spring configuration tonight to get that working as-intended, but have not succeeded yet…)

  3. 7 Rubie Casaña December 5, 2012 at 08:37

    If I get this work, I can volunteer to help in your documentation 🙂

    • 8 Tjarda Peelen December 23, 2012 at 23:51

      Hi Rubie,

      Sorry for the late reply. I am just in process of downloading Postgres… (Don’t expect a fix before January though)
      Did you execute the (Alfresco) script createTables.js? This will actually create the tables, they do not appear (yet) by themselves.

      Tjarda

    • 9 Tjarda Peelen December 24, 2012 at 01:12

      I can use some help 🙂
      Next version will make life much easier. So reduces the documentation-importance a little.

      I have been playing with Postgres a little tonight. I guess it does take me quite some time to get this working. Need to get my way in schema’s and the slightly different SQL dialect. If that is solved, the flexibility, the feature of the Alfresco-Business-Reporting tool, needs to become available in both worlds. Got a feeling Oracle comes in play too, so I am actually already looking at MyBatis…

  4. 10 Rubie Casaña January 2, 2013 at 06:37

    Hi Tjarda,

    This is great news 🙂 I’ll wait for your update. Happy New Year!

  5. 11 Rubie Casaña January 24, 2013 at 10:48

    Hi Tjarda,

    I wonder if you were able to make the postgresql worK

    Thanks!

    • 12 Tjarda Peelen January 24, 2013 at 22:47

      Hi Rubie,

      Thanks for your reply.
      First of all, yes, I am working on that almost right now. I did some starting activities, but now I am finalizing a ton of new and very usefull features. Quite some sample reports, and documentation on how to use the tool, and how to configure it to get the best out of your reporting. I want to work to a release first, otherwise I am developing only, and never publishing it.

      it is on the top if my next list…

      Tjarda

  6. 13 Rubie Casaña February 13, 2013 at 03:14

    Hi Tjarda,

    Going back to your comment here about the cron jobs, how about if Alfresco runs on a windows machine? Will the reporting tool work?

    Not being able to start Share is a ‘Bad Thing’. I guess you have not defined the properties in alfresco-global.properties that define the cron jobs? See the how-to. (I have been fighting Spring configuration tonight to get that working as-intended, but have not succeeded yet…)

    • 14 Tjarda Peelen February 13, 2013 at 08:18

      Hi Rubie,

      Mentioning ‘cron job’, I meant the scheduled jobs inside Alfresco. Therefore they are platform independent, triggered by Alfresco’s ‘cron’ mechanism. So it wil lbe running on Linux as well as Windows platforms.

      I fixed the spring stuff mentioned earlier, and I am in testing phase currently. Have to make sure it works against 3.4EE, 4.0EE, 4.1EE and 4.2CE, Solr and Lucene. Also preparing some sample reports, and I have to change a lot of (actually: all) documentation. Expect a new release first half of March. Lots of new stuff. All UI configurable. Reports parameterizable. Great stuff!

  7. 15 Rubie Casaña February 28, 2013 at 02:40

    Hi Tjarda,

    Will the release in March include PostgreSQL support?

    Thanks!

  8. 17 Rubie Casaña March 8, 2013 at 10:19

    Hi Tjarda,

    Thanks for that. Looking forward for the release of the new version.
    We are able to make the business reporting work, we just encountered some issues like it doesn’t sync the contents with categories and tags in the mysql database. I just want to confirm if this is a bug and will it be resolved in the new version or we are just missing some configuration.

    Thanks a lot.

    Rubie

    • 18 Tjarda Peelen March 8, 2013 at 11:39

      Ho Rubie,

      I noticed the lack of tags and categories. Must have been a bug for quite a while I am afraid 😦
      I noticed a few weeks ago and fixed it. The new version will be including tags and categories….

      Tjarda

  9. 19 Rubie Casaña March 11, 2013 at 05:10

    Thanks Tjarda for the response. Glad to know that it will be fixed in the next upgrade.

    I’m sorry if I keep on asking. Will the upgrade be out anytime this week? Also, will it still be compatible with version 4.0.e?


Comments are currently closed.