Alfresco Business Reporting – The out-of-the-box basics

The recently released new version of Alfresco Business Reporting makes life easy for you as a business user. Usually there is some SQL knowledge around to create reports that fulfil your specific reporting needs. In this blog I will show how to find your way around the basic reporting objects in the system once you have installed the AMP (and the JAR).

This blog is part of a series of how-tos about Alfresco Business Reporting, covering how to create a report in Pentaho Business Reporting and how to configure the report in Alfresco.

I assume you already have the Alfresco Business Reporting add-in configured and installed in your Alfresco instance.

The Reporting tool does two things, harvesting and report execution, as described in the schema below:

Alfresco Business Reporting Mechanism


Harvesting is the process of getting metadata from business objects and storing it in a ‘plain SQL’ reporting database. These can be objects you retrieve by query (documents, folders, Sites, YourSpecificWhatevers). Next to that, all Users and Groups are harvested too. Output from the audit framework can be included in the reports as well (think last-login timestamps, failed logins), and the system can store Categories as a tree structure in the reporting database. Once the metadata is in the database you can do whatever you want with it: either use the (parameterized) report execution of Alfresco Business Reporting, or use your existing tooling such as Business Objects or SPSS.
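To give an idea of what ‘plain SQL’ means for a report author, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for the reporting database, and the table name `document` and its columns are made up for illustration; the tables the add-on actually creates may look different.

```python
import sqlite3

# sqlite3 stands in for the reporting database so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns; the real harvested schema may differ.
conn.execute("CREATE TABLE document (id TEXT, site TEXT, creator TEXT)")
conn.executemany(
    "INSERT INTO document VALUES (?, ?, ?)",
    [("d1", "finance", "alice"), ("d2", "finance", "bob"), ("d3", "hr", "alice")],
)

# Plain SQL is all a report needs: count documents per site.
rows = conn.execute(
    "SELECT site, COUNT(*) FROM document GROUP BY site ORDER BY site"
).fetchall()
print(rows)
```

Any tool that can run a query like this against the reporting database can build a report on it.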

Harvesting has two mechanisms. The first is incremental: queries for business objects and Audit framework queries only ask for objects that were modified since the last successful harvesting run. This makes these queries relatively lightweight, so you can run them more often.
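The incremental idea can be sketched in a few lines. This is not the add-on's code, just an illustration under assumptions: sqlite3 stands in for the reporting database, the `document` table and the `harvest_incremental` function are hypothetical, and changed rows are simply overwritten by id (how the add-on stores changed objects is not covered here).

```python
import sqlite3

def harvest_incremental(conn, repo_objects, last_run):
    """Store objects modified after last_run; return (count, new timestamp)."""
    changed = [o for o in repo_objects if last_run is None or o["modified"] > last_run]
    conn.executemany(
        "INSERT OR REPLACE INTO document (id, name, modified) VALUES (?, ?, ?)",
        [(o["id"], o["name"], o["modified"]) for o in changed],
    )
    conn.commit()
    # Remember the newest modification time as the starting point for the next run.
    newest = max((o["modified"] for o in changed), default=last_run)
    return len(changed), newest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, name TEXT, modified TEXT)")

# Hypothetical repository content; ISO timestamps compare correctly as strings.
repo = [
    {"id": "d1", "name": "plan.doc", "modified": "2013-01-10T09:00:00"},
    {"id": "d2", "name": "budget.xls", "modified": "2013-01-11T09:00:00"},
]
first_count, last_run = harvest_incremental(conn, repo, None)

# One document changes; the second run only touches that one.
repo[0] = {"id": "d1", "name": "plan-v2.doc", "modified": "2013-02-01T12:00:00"}
second_count, last_run = harvest_incremental(conn, repo, last_run)
print(first_count, second_count)
```

The first run harvests everything, every later run only the objects changed since the stored timestamp, which is why such runs stay cheap.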


The other mechanism is ‘drop-and-recreate’. For types like Users and Groups, but also Categories, it is hard to identify whether and which structural changes have happened since the last run. Therefore these tables are dropped and recreated. This may well be more resource intensive, so you probably don’t want to do it multiple times a day. This data may also not need to be as fresh as possible. You can therefore schedule this kind of data to be harvested less frequently.
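A minimal sketch of drop-and-recreate, again with sqlite3 standing in for the reporting database. The `person` table, its columns and the `harvest_drop_and_recreate` function are assumptions for illustration only.

```python
import sqlite3

def harvest_drop_and_recreate(conn, users):
    """Throw the table away and rebuild it from the current list of users."""
    conn.execute("DROP TABLE IF EXISTS person")
    conn.execute("CREATE TABLE person (username TEXT PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", users)
    conn.commit()

conn = sqlite3.connect(":memory:")

# First run: two users exist.
harvest_drop_and_recreate(conn, [("alice", "a@example.com"), ("bob", "b@example.com")])

# Second run: bob has been removed in the meantime, so he simply is not re-inserted.
harvest_drop_and_recreate(conn, [("alice", "a@example.com")])

remaining = [r[0] for r in conn.execute("SELECT username FROM person ORDER BY username")]
print(remaining)
```

Deletions and structural changes need no special handling this way, at the price of rewriting the full table on every run.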


At this point in time the reporting database has to be a MySQL database, but cross-vendor support is on its way.

Scheduled Harvesting Execution

Usually harvesting is scheduled by cron jobs. You only need to configure the Harvesting Definition with the right harvesting frequency.

Manual Harvesting Execution

If you really feel like harvesting -now- and waiting for the scheduled moment is not an option (because you changed the config and need to know whether it works), you can turn to the Explorer interface (for now). There are (by default) two Harvesting Definitions. Each has an additional action to trigger the execution of that Harvesting Definition.

Next to that, the Reporting Root also has an action to execute all Harvesting Definitions underneath it.

Report Execution

The Reporting Root is the ‘top element’ of the reporting configuration. It contains the Reporting Containers. Each container has an execution frequency. Report Templates, the Pentaho definitions, are stored inside Reporting Containers.


Properties of a Reporting Root. Notice that you can enable and disable the filling of the reporting database, as well as the execution of reports. This main switch is sometimes handy.


The Reporting Container owns the execution frequency. Next to that you can enable/disable an entire container.


The Reporting Template knows where the resulting output goes, and in what form or shape. The output folder can be fixed or highly configurable. You can define whether your resulting report should be versioned or not. It even allows you to parameterize the report execution. Be aware that when you install the AMP into Alfresco, the values ${site} and ${yyyy}=${MM} are not imported at all. For some reason Alfresco strips these parts of the value before storing the property value. See this screencam for how to fix this.
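To illustrate what such placeholders are for, here is a small sketch of how a configurable output path could be resolved at execution time. It uses Python's `string.Template`, which happens to share the `${name}` syntax. The path pattern and the `resolve_output_path` function are hypothetical and not taken from the add-on.

```python
from datetime import date
from string import Template

def resolve_output_path(pattern, site, run_date):
    """Fill the ${site}, ${yyyy} and ${MM} placeholders in an output path."""
    values = {
        "site": site,
        "yyyy": f"{run_date.year:04d}",
        "MM": f"{run_date.month:02d}",
    }
    return Template(pattern).substitute(values)

# Hypothetical pattern: one folder per site and per month.
path = resolve_output_path(
    "Sites/${site}/reports/${yyyy}-${MM}", "finance", date(2013, 3, 5)
)
print(path)
```

If the placeholders are stripped before the value is stored, there is nothing left to substitute, so every report would end up in the same place.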

Scheduled execution

Usually the report execution is scheduled by cron jobs. You only have to toss the report definition into the container with the right execution frequency.

Manual Execution

Usually you don’t have to execute manually. But if you really feel like it (because you test-drive a new report), you can execute reports by hand. At this point in time this is possible from the Explorer interface only:


The Reporting Root executes all reports underneath.


The Reporting Container executes all reports underneath.


The Reporting Template executes itself. 

Manual execution overrides disabled Reporting Containers or Reporting Roots.
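The interplay between the enable switches and manual execution can be summed up in a tiny sketch. The function `should_execute` and its parameter names are made up to express the rule; they are not part of the add-on.

```python
def should_execute(root_enabled, container_enabled, manual=False):
    """Decide whether a report runs, given the enable switches above it."""
    if manual:
        # A manual trigger ignores disabled Reporting Roots and Containers.
        return True
    # A scheduled run needs both the root and the container to be enabled.
    return root_enabled and container_enabled

scheduled_on_disabled_container = should_execute(True, False)
manual_on_disabled_container = should_execute(True, False, manual=True)
print(scheduled_on_disabled_container, manual_on_disabled_container)
```

So a disabled container keeps the scheduler away from its reports, while you can still test-drive them by hand.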