Pentaho Tools :

Pentaho C-Tools(CDE,CDF,CDA),Pentaho CE & EE Server,OLAP-Cubes,Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool,Dashboards,Reports using PRD, PDD,Data Integration using Kettle ETL,Data Mining usign WEKA,Integration of Servers with Databases,Mobile/iPad compatible Dashboards using Bootstrap Css,Drilldown dashboards,Interactive Dashboards

Saturday 30 May 2015

Self-Signed SSL/TLS Configuration HOW-TO with Pentaho CE 5.3 BA Server || change http://localhost:8080/pentaho TO https://localhost:8443/pentaho/Login

It's a weekend playing with Pentaho CE 5.3 Business Analytics  server. Yeah..!!! it is. I love playing with Pentaho  BI suite though it is a small outcome or a bigger. I can't classify this post is a bigger or smaller but I just want to share how I did this and hope this will helpful to someone, somewhere across the globe who is/are looking for the same.

Special thanks to Madhava & Piyush Bhatt for making me to explore this. 


Technology : 
* Pentaho BA Server CE 5.3
* Tomcat-7
* Java 1.7 (Oracle Java)
* Operating System : Windows-7 Home Premium. 
* Web browsers used to test : Google Chrome & Internet Explorer 

Aim of the article : 
Install SSL certificate to Pentaho BA Server 
i.e., change http://localhost:8080/pentaho TO https://localhost:8443/pentaho

What is Secure Sockets Layer (SSL) ?
  • It is  technology which allow web browsers and web servers to communicate over a secured connection. 
  • This means that the data being sent is encrypted by one side, transmitted, then decrypted by the other side before processing. 
  • This is a two-way process, meaning that both the server AND the browser encrypt all traffic before sending out data.
  • Another important aspect of the SSL/TLS protocol is Authentication. 
  • This means that during your initial attempt to communicate with a web server over a secure connection, that server will present your web browser with a set of credentials, in the form of a "Certificate", as proof the site is who and what it claims to be. In certain cases, the server may also request a Certificate from your web browser, asking for proof that you are who you claim to be. This is known as "Client Authentication,"
  •  Although in practice this is used more for business-to-business (B2B) transactions than with individual users. Most SSL-enabled web servers do not request Client Authentication.
Step by Step procedure :

1) Make sure java 1.7 is installed in machine & set JAVA_HOME in environment variables. 

2) Also make sure Pentaho BA server is running as a test drive before installing SSL certificate. 

3) There are two types of certificates we can install. 
     a) Self- Signed certificate ( Not trusted)
     b) Certificate from a certificate authority. 

4) This procedure is with Self-Signed based on self-signed certificate. 

5) Navigate to the bin folder of JAVA installed from command prompt in Admin mode. 
   Example : 
  C:\Program Files\Java\jdk1.7.0_79\bin

6) To create a new JKS keystore from scratch, containing a single self-signed Certificate, execute the following from a terminal command line:
"%JAVA_HOME%\bin\keytool" -genkey -alias tomcat -keyalg RSA
Example : 
  C:\Program Files\Java\jdk1.7.0_79\bin>ketytool.exe -genkey -alias tomcat -keyalg RSA

7) The command will ask us enter password  and few other details. The default password we can use is "changeit" or developer is flexible to give any name. 
Example : changeit (Enter the password twice as it asks). 


8) As shown in above command line utility finish giving all the other details & the last password could be same as first password or it could differ. 

9) As soon as it re-directs to the path back, it will generate a certificate named as ".keystore" in system(computer) user location. 
Example : 
Location of User : C:\Users\sadakar\ (file directory)
File Name to find : .keystore

Note that .keystore is the self-signed certificate generated. 

10) Now, install the above certificate to pentaho tomcat server. i.e., configure the certificate to the tomcat server.xml file.

Location of server.xml file : 
E:\2_Archive_Installed\pentaho\biserver-ce-5.3.0.0-213\biserver-ce\tomcat\conf

Add below code : ( There is already a piece of code available in the file , un-comment the code and add new attributes to the tag). 

The final code should look like as below. 

<!-- Define a SSL HTTP/1.1 Connector on port 8443
         This connector uses the JSSE configuration, when using APR, the 
         connector should be using the OpenSSL style configuration
         described in the APR documentation ${user.home}/.keystore-->
    
    <Connector 
                       URIEncoding="UTF-8" 
                       port="8443" 
                       protocol="HTTP/1.1" 
                      SSLEnabled="true"
                      maxThreads="200" 
                      scheme="https" 
                      secure="true"
             keystoreFile="C:\Users\sadakar.p\.keystore" 
                     keystorePass="changeit"
                     clientAuth="false" 
                     sslProtocol="TLS" />



NOTE : 
The .keystore file is hard coded - we can replace it by  ${user.home}/.keystore

11) Start the pentaho BA server 
(E:\2_Archive_Installed\pentaho\biserver-ce-5.3.0.0-213\biserver-ce\start-pentaho.bat)

12) Go to any web browser and type this URL : https://localhost:8433

Sample pentaho login screen with self-signed certificate installed 

(Google Chrome)


14) Internet Explorer output
a) 

b) 


15) Sample pentaho login screen with out certificate installed
(Google Chrome)



IMP NOTE :
This is not a production related document instead a development experiment. 
* JKS -  Java Key Storke - A repository of security certificates. 


- Sadakar Pochampalli



Tuesday 26 May 2015

Integration work out : USE CASE solved with Switch/Case, Merge Join , If field value is Null , Calculator Steps in Pentaho Data Integration

Hi ,

This post will explain the usage of various components with a USE CASE from community threads.

USE CASE thread : 
URL : 
http://forums.pentaho.com/showthread.php?187380-Data-need-to-be-filtered-and-sum-market-value-and-shares-based-on-Transaction-value&p=408478&posted=1#post408478

USE CASE : 
(A CSV file tab separated input).
Entity_Id Account_Name Trans_Code Total_Share Market_Value
10001 Growth Fund Buy 100 1000
10001 Growth Fund Sell 100 1000
10002 Aggressive Fund Buy 1300 13000
10002 Aggressive Fund Sell 450 4500
10003 Income Fund Buy 250 2500
10004 Brokerage Fund Buy 200 2000
10004 Brokerage Fund Sell 100 1000

USE CASE OUTPUT ( Subtraction for Total_Share(Buy-Sell) & Subtraction for Market_Value(Buy-Sell). 

Entity_Id Account_Name Total_Share Market_Value
10001 Growth Fund
10002 Aggressive Fund   850          8500
10003 Income Fund           250          2500

10004 Brokerage Fund  100                 1000

Design flow of ETL 

Quick understanding of steps used in above USE CASE 

1) Switch/Case
Detailed Information on Switch/Case :
http://wiki.pentaho.com/display/EAI/Switch-Case)


2) Multi way Merge Join
Detailed Information at :
http://wiki.pentaho.com/display/EAI/Multiway+Merge+Join )


3) If field value is Null
Detailed Information at :
http://wiki.pentaho.com/display/EAI/If+field+value+is+null )


4) Calculator
Detailed Information at :
(http://wiki.pentaho.com/display/EAI/Calculator)

Download files : 
( Input TEXT FILE + .ktr file) - Click Me

NOTE : 
Edit the input file path in .ktr file as per your system environment and explore it by running the file.

I hope this post has given some idea on steps subjected by seeing at the images provided.

What if there are more than 1 row with the same Id for sells or buys ? 
Will this procedure works ? ? NO

Write your comments for any queries related..

- Sadakar Pochampalli

Tuesday 19 May 2015

Tip : File path as variable in kettle.properties file to read excel files in Pentaho Kettle

In this post we will see how we can make file path as variable in kettle.properties file. 

Environment : 
OS Windows 7, Kettle ETL(PDI) CE 5.3, Excel files. 

Problem Statement : 
How to give file directory name as variable (PATH for files)

Solution  1 : By selecting the predefined variables from the drop down. 
${Internal.Transformation.Filename.Directory} 

Solution 2 : 
Another way with variable defined in kettle.properties file

1. Create a variable in kettle.properties file. 
    Location of the file :  C:\Users\sadakar.p\.kettle

2. Variable Name to give in kettle.properties file is (for eg) : PATH_FOR_DATA
    Value of the variable would be the path of the directory where files stores. 
  i.e., 
PATH_FOR_DATA=C:/Users/sadakar.p/sadakar drive/BI/Pentaho/1_Kettle/DWH_INHOUSE/Data/Education/


3. Now design transformation with Microsoft Excel input and table(or dummy step) step as shown in below image and press Control+ Space to get the available variables list in selected files tab(File/Directory). 
 
In this example we have to chose the PATH_FOR_DATA variable created in kettle.properties file. 



4. Test whether the data from all excel sheets is populating or not by clicking on preview button(one should get the field names for the first time with hard located path). 



NOTE : 
1) A restart of Kettle(spoon) is required once the variable is defined in kettle.properties file other wise the in the list of variables it will not populate. 

:-) 


Friday 15 May 2015

Pentaho CTools Introduction for Business Analysts (Quick selected points from webdetails.pt) - Download PPT


I have a prepared a very small PPT for Business Analysts & for BI consultants to introduce the features of Open source Dashboard tool. (C-Tools).

Resource for the PPT is:  Pentaho Webdetails site.

I hope the selected points from http://webdetails.pt/ site will give you the quick information to demonstrate or to prepare a like one.

Quick Snapshot :



Download PPT: Click Here


- Sadakar Pochampalli



Wednesday 6 May 2015

Pentaho Data Integration Basics - 6 - (DWH) : A simple time dimension table creation using Kettle tool for Data warehouse

This post will talk about developing a basic time dimension table in a warehouse.

For detailed, time_dimension tables creation find reference links listed at the end of this post.


Software setup :  PDI 5.3, PostgreSQL

Schema diagram for time dimension table : (part of DataMart Design for time dimension table)

ETL design flow in PDI(a.k.a Kettle)


Step by Step procedure
Download the ktr file and have a look into each of the step.

will updating the procedure steps 

Download Files : 
Transformation + PostgreSQL .backup file of dim_time table


References :
1) http://type-exit.org/adventures-with-open-source-bi/2010/09/building-a-detailed-date-dimension-with-pentaho-kettle/
2) http://www.fabianschladitz.de/25-kommentare-zur-welt/software/375-howto-create-a-date-dimension-with-pdi-kettle.html
3) http://forums.pentaho.com/showthread.php?90452-Auto-generated-date-dimension-tables
4) http://forums.pentaho.com/showthread.php?74831-How-to-Date-Dimension
5) http://stackoverflow.com/questions/18841625/pentaho-kettle-conveting-month-numeric-value-to-month-name


Tuesday 5 May 2015

Tip : Kettle ETL : Get only first string in a comma separted text - Modified Java Script Step Example

Problem Statement : 
Get first sting in below text value in a field
"Children out of school, primary, female"
Solution : 
Children out of school

Tool set : PDI 5.3 CE, Excel Data File(or Table input)


1) Get data from input file or input table ( Input-> Table input step).
   Note that the filed name is : indicator_name from which we want to get first string from a comma separated text.   

2) Take Modified Java Script Value step (Scripting -> MJSV) and write below expression
var indicator_sub_name = indicator_name.split(",")[0]

* Hop Table input to MJSV
* After writing expression click on "Get fields" button in MJSV
* The variable will be a new stream filed to the next step

3) Add select values step and preview the transformation.( ignore the unwanted fields in ignore tab).

Quick Image:

Hope this tip helps some one ...!!!
:-)