Monday, September 10, 2012

BIRT, Cassandra and Hector

While BIRT offers many ways to connect to Cassandra, including the Cassandra JDBC driver, this post focuses on using a scripted data source to call the Hector Java client. A BIRT scripted data source allows external Java classes to be called to retrieve data for a BIRT report, and its scripts can be written in Java or JavaScript. The examples below use JavaScript. For this post we used the DataStax Community edition, which is available here, and created a keyspace named users with a column family named User. The User column family contains three string columns: first name, last name, and age. The script used to load the sample data is available in the example download.

Set Designer Classpath


First, set the designer classpath so that the designer can access the following jars:
  • hector-core-version.jar
  • hector-object-mapper-version.jar
  • slf4j-api-version.jar
  • libthrift-version.jar
  • apache-cassandra-thrift-version.jar
  • guava-rversion.jar
  • commons-lang-version.jar


All of these jars, with the exception of the two Hector jars, are available in the /install-directory/DataStax Community/apache-cassandra/lib directory. To get the Hector jars, you can download and build the Hector source or just download them from a Maven repository.

The Hector-object-mapper jar file can be downloaded from here.
The Hector core jar file can be downloaded from here.

One way to set up the classpath is to create a libs directory in your Report Project and then copy all of the jars above to this folder.
Next, select Window->Preferences. Select the Report Design->Classpath preference and click the Configure project specific settings link.
 
Select the BIRT project that you will be using Hector with and click OK.

Select the Enable project specific settings checkbox and add the jars from the folder you created earlier.


Creating a Scripted Data Source using Hector

 
You can now create a report that calls the Hector APIs directly. To do this, first create a new report. Select the Data Explorer view, right-click the Data Sources node, and click New Data Source. Select the Scripted Data Source option and click Finish.
 

Next, right-click the Data Sets node and choose the New Data Set option. Make sure to select the scripted data source that you just created as the data source for this data set.

 

Click on the Next button and enter each column name and data type for the data set.


 
Click on the Finish button. You can now enter the scripts for the data set. To do this, first make sure the data set is selected in the Data Explorer view, then click the Script tab at the bottom of the report canvas.


The script editor lists many events that can be scripted, but this example needs only an open script and a fetch script. First, select open from the script drop-down list and enter a script similar to the following.

// Import the Java and Hector packages used by this script.
importPackage(Packages.java.util);
importPackage(Packages.me.prettyprint.cassandra.serializers);
importPackage(Packages.me.prettyprint.cassandra.service);
importPackage(Packages.me.prettyprint.hector.api);
importPackage(Packages.me.prettyprint.hector.api.beans);
importPackage(Packages.me.prettyprint.hector.api.factory);
importPackage(Packages.me.prettyprint.hector.api.query);

// Connect to the cluster and open the users keyspace.
var cluster = HFactory.getOrCreateCluster("Test Cluster", new CassandraHostConfigurator("localhost:9160"));
var keyspace = HFactory.createKeyspace("users", cluster);

// Read up to 100 rows from the User column family, with up to 10 columns per row.
var rangeSlicesQuery = HFactory.createRangeSlicesQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get())
    .setColumnFamily("User").setRange(null, null, false, 10).setRowCount(100);
var result = rangeSlicesQuery.execute();
myrows = result.get();

// rowsIterator is used later by the fetch script.
rowsIterator = myrows.iterator();

Hector also supports CQL, so you could use the following open script instead.

// Import the Java and Hector packages used by this script.
importPackage(Packages.java.util);
importPackage(Packages.me.prettyprint.cassandra.serializers);
importPackage(Packages.me.prettyprint.cassandra.service);
importPackage(Packages.me.prettyprint.hector.api);
importPackage(Packages.me.prettyprint.hector.api.beans);
importPackage(Packages.me.prettyprint.hector.api.factory);
importPackage(Packages.me.prettyprint.hector.api.query);
importPackage(Packages.me.prettyprint.cassandra.model);

// Connect to the cluster and open the users keyspace.
var cluster = HFactory.getOrCreateCluster("Test Cluster", new CassandraHostConfigurator("localhost:9160"));
var keyspace = HFactory.createKeyspace("users", cluster);

// Run a CQL query against the User column family.
var cqlQuery = new CqlQuery(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
cqlQuery.setQuery("select * from User");
var resultCQL = cqlQuery.execute();

// rowsIterator is used later by the fetch script.
rowsIterator = resultCQL.get().iterator();

Next add a fetch script like the following.

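if (rowsIterator.hasNext()) {
    // Copy each Cassandra column of the next row into the data set row by name.
    var myrow = rowsIterator.next();
    var cols = myrow.getColumnSlice().getColumns();
    for (var ii = 0; ii < cols.size(); ii++) {
        row[cols.get(ii).getName()] = cols.get(ii).getValue();
    }
    return true;
} else {
    // No more rows: tell BIRT that the data set is exhausted.
    return false;
}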
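The fetch script is called once per row and keeps being called until it returns false. The following is a minimal sketch of that contract in plain JavaScript, runnable outside BIRT. Everything in it is a hypothetical stand-in: makeIterator, makeColumn, and makeRow only imitate the getters that the fetch script calls on Hector's objects, fetchRow repeats the fetch logic with row passed in explicitly, and runDataSet imitates the calling loop. None of these names are BIRT or Hector APIs.

```javascript
// Hypothetical stand-in for the iterator that the open script stores in rowsIterator.
function makeIterator(items) {
  var index = 0;
  return {
    hasNext: function () { return index < items.length; },
    next: function () { return items[index++]; }
  };
}

// Hypothetical stand-ins exposing only the getters that the fetch script uses.
function makeColumn(name, value) {
  return {
    getName: function () { return name; },
    getValue: function () { return value; }
  };
}

function makeRow(columns) {
  var list = {
    size: function () { return columns.length; },
    get: function (i) { return columns[i]; }
  };
  return {
    getColumnSlice: function () {
      return { getColumns: function () { return list; } };
    }
  };
}

// Same logic as the fetch script, with the iterator and row passed as arguments.
function fetchRow(rowsIterator, row) {
  if (!rowsIterator.hasNext()) {
    return false;
  }
  var cols = rowsIterator.next().getColumnSlice().getColumns();
  for (var ii = 0; ii < cols.size(); ii++) {
    row[cols.get(ii).getName()] = cols.get(ii).getValue();
  }
  return true;
}

// Imitates the caller: asks for rows until fetchRow reports that none are left.
function runDataSet(rowsIterator) {
  var results = [];
  var row = {};
  while (fetchRow(rowsIterator, row)) {
    results.push(row);
    row = {};
  }
  return results;
}
```

Calling runDataSet with an iterator over two fake rows returns two plain objects keyed by column name, which is how the values end up addressable by name in the data set.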



The fetch script assumes that your scripted data set columns have the same names as the columns in Cassandra. You should now be able to preview the data set. Double-click the data set in the Data Explorer view and select Preview.
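If the data set column names do not match the Cassandra column names, one option is to translate them through a lookup table inside the fetch script. The sketch below is only an illustration: columnMap, copyColumns, and the column names first_name, last_name, firstName, and lastName are hypothetical, since this post does not list the exact column names used in the sample data. The cols argument is expected to offer size() and get(i), as the column list in the fetch script does.

```javascript
// Hypothetical mapping from Cassandra column names to data set column names.
var columnMap = {
  first_name: "firstName",
  last_name: "lastName",
  age: "age"
};

// Copies every mapped column into row and returns the names of the
// Cassandra columns that had no entry in the map.
function copyColumns(cols, row, map) {
  var skipped = [];
  for (var i = 0; i < cols.size(); i++) {
    var name = String(cols.get(i).getName());
    if (Object.prototype.hasOwnProperty.call(map, name)) {
      row[map[name]] = cols.get(i).getValue();
    } else {
      skipped.push(name);
    }
  }
  return skipped;
}
```

Inside the fetch script, the for loop could then be replaced by a call such as copyColumns(cols, row, columnMap). Returning the skipped names makes it easier to spot a column that never reaches the report because its name was not mapped.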

 
You can now use the data set within your report. 


Deploying a Report that Uses the Hector API


 
If you are using the BIRT Viewer and deploy a report that calls the Hector API, verify that all the jars discussed at the beginning of this post (Set Designer Classpath) are placed in the WEB-INF/lib directory of the Viewer. If you are running BIRT reports using the BIRT APIs, verify that the above jars are also on the classpath.
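As a rough aid for that verification, the sketch below takes a list of file names (for example, a directory listing of WEB-INF/lib) and reports which of the jars from the Set Designer Classpath section appear to be missing. It is plain JavaScript and not part of BIRT or Hector; findMissingJars and REQUIRED_JAR_PREFIXES are hypothetical names, and the check only compares file name prefixes, so it cannot tell whether a jar's version is compatible.

```javascript
// Jar name prefixes taken from the list in the Set Designer Classpath section.
var REQUIRED_JAR_PREFIXES = [
  "hector-core-",
  "hector-object-mapper-",
  "slf4j-api-",
  "libthrift-",
  "apache-cassandra-thrift-",
  "guava-",
  "commons-lang-"
];

// Returns the required prefixes for which no .jar file name in fileNames matches.
function findMissingJars(fileNames) {
  return REQUIRED_JAR_PREFIXES.filter(function (prefix) {
    return !fileNames.some(function (name) {
      return name.indexOf(prefix) === 0 && /\.jar$/.test(name);
    });
  });
}
```

An empty result means every jar in the list has at least one matching file name; any prefix that is returned points to a jar that still needs to be copied.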

More information on CQL and Hector is available here.  The example in this post is available on Birt-Exchange.


1 comment:

Unknown said...

Good one. But i am facing a problem now. I can't able to fetch multiple column, current its possible to read only first column. Do you have any idea?