Building a Data Mart with Pentaho Data Integration

In this course, learn how to create ETL transformations to populate a star schema in a short span of time. Create a fully functional ETL process using a practical approach. Follow the step-by-step instructions for creating an ETL process based on a fictional company: get your hands dirty and learn fast.

  • Create a star schema
  • Comprehensive training through 25 video sessions
  • Implement logging and scheduling for the ETL process
  • Get an overview of the whole process: from source data to the end user analyzing the data

    Self-Paced

    About Building a Data Mart with Pentaho Data Integration

    Companies store a lot of data, but in most cases, it is not available in a format that makes it easily accessible for analysis and reporting tools. Ralph Kimball realized this a long time ago, so he paved the way for the star schema.

    Building a Data Mart with Pentaho Data Integration walks you through the creation of an ETL process that builds a data mart for a fictional company. This course shows you, step by step, how to source the raw data and prepare it for the star schema. Its practical approach will get you up and running quickly and explains the key concepts in an easy-to-understand manner.

    Building a Data Mart with Pentaho Data Integration teaches you how to source raw data with Pentaho Kettle and transform it so that the output is a Kimball-style star schema. After sourcing the raw data with the ETL process, you will quality check the data using an agile approach. Next, you will learn how to load slowly changing dimensions and the fact table. The star schema will reside in a column-oriented database, so you will learn about bulk-loading the data whenever possible. You will also learn how to create an OLAP schema and analyze the output of your ETL process easily.

    By covering all the essential topics in a hands-on approach, this course puts you in a position to create your own ETL processes within a short span of time.

    Course Objectives
    • Create a star schema
    • Populate and maintain slowly changing dimensions type 1 and type 2
    • Load fact and dimension tables in an efficient manner
    • Use a columnar database to store the data for the star schema
    • Analyze the quality of the data in an agile manner
    • Implement logging and scheduling for the ETL process
    • Get an overview of the whole process: from source data to the end user analyzing the data
    • Learn how to auto-generate data for a date dimension
    Curriculum
    Module 1:

    Getting Started

    • The Second-hand Lens Store
    • The Derived Star Schema
    • Setting up Our Development Environment
    Module 2:

    Agile BI – Creating ETLs to Prepare Joined Data Set

    • Importing Raw Data
    • Exporting Data Using the Standard Table Output
    • Exporting Data Using the Dedicated Bulk Loading
    Module 3:

    Agile BI – Building OLAP Schema, Analyzing Data, and Implementing Required ETL Improvements

    • Creating a Pentaho Analysis Model
    • Analyzing Data Using Pentaho Analyzer
    • Improving Your ETL for Better Data Quality
    Module 4:

    Slowly Changing Dimensions

    • Creating a Slowly Changing Dimension of Type 1 Using Insert/Update
    • Creating a Slowly Changing Dimension of Type 1 Using Dimension Lookup Update
    • Creating a Slowly Changing Dimension Type 2
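    The difference between the two dimension types in this module can be sketched outside Pentaho. The Python below is only an illustration of the concept, not the course's Kettle steps: the `DimRow` fields and the `apply_type1` / `apply_type2` helpers are hypothetical names chosen for this example. Type 1 overwrites the attribute and loses history; Type 2 closes the current version and appends a new one with a fresh surrogate key.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    # Hypothetical dimension layout, assumed for illustration only.
    surrogate_key: int
    natural_key: str
    city: str
    valid_from: date
    valid_to: Optional[date]  # None marks the current version


def apply_type1(rows: List[DimRow], natural_key: str, new_city: str) -> List[DimRow]:
    """Type 1: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row.natural_key == natural_key:
            row.city = new_city
    return rows


def apply_type2(rows: List[DimRow], natural_key: str, new_city: str,
                change_date: date) -> List[DimRow]:
    """Type 2: close the current version and append a new one."""
    current = next(
        (r for r in rows if r.natural_key == natural_key and r.valid_to is None),
        None,
    )
    if current is not None and current.city == new_city:
        return rows  # nothing changed, so no new version is needed
    if current is not None:
        current.valid_to = change_date  # end-date the old version
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    rows.append(DimRow(next_key, natural_key, new_city, change_date, None))
    return rows
```

    With Type 2, a customer who moves from one city to another ends up with two rows that share a natural key but have different surrogate keys and validity ranges.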
    Module 5:

    Populating Date Dimension

    • Defining Start and End Date Parameters
    • Auto-generating Daily Rows for a Given Period
    • Auto-generating Year, Month, and Day
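    As a rough idea of what this module's output looks like, the sketch below generates one row per day between a start and an end date and derives year, month, and day. It is plain Python rather than a Pentaho transformation, and the function name, column names, and `YYYYMMDD` integer key are assumptions made for this example.

```python
from datetime import date, timedelta
from typing import Dict, List


def generate_date_dimension(start: date, end: date) -> List[Dict]:
    """Return one row per calendar day from start to end, inclusive."""
    if end < start:
        raise ValueError("end date must not be before start date")
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Assumed key format: the date written as an integer, YYYYMMDD.
            "date_key": int(current.strftime("%Y%m%d")),
            "full_date": current,
            "year": current.year,
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows
```

    Calling `generate_date_dimension(date(2024, 2, 27), date(2024, 3, 1))` yields four rows, including the leap day.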
    Module 6:

    Creating The Fact Transformation

    • Sourcing Raw Data for Fact Table
    • Lookup Slowly Changing Dimension of the Type 1 Key
    • Lookup Slowly Changing Dimension of the Type 2 key
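    The two lookups differ in one respect, which the following Python sketch tries to show. It is an illustration under assumed names (the dictionary keys `natural_key`, `surrogate_key`, `valid_from`, `valid_to` and both helper functions are hypothetical), not the Pentaho lookup steps themselves. A Type 1 lookup matches on the natural key alone, while a Type 2 lookup must also pick the version that was valid on the date of the fact.

```python
from datetime import date
from typing import Dict, List, Optional


def lookup_type1_key(dim_rows: List[Dict], natural_key: str) -> Optional[int]:
    """Type 1: one row per natural key, so the natural key alone is enough."""
    for row in dim_rows:
        if row["natural_key"] == natural_key:
            return row["surrogate_key"]
    return None


def lookup_type2_key(dim_rows: List[Dict], natural_key: str,
                     event_date: date) -> Optional[int]:
    """Type 2: pick the version whose validity range contains the event date."""
    for row in dim_rows:
        if row["natural_key"] != natural_key:
            continue
        started = row["valid_from"] <= event_date
        # valid_to of None means the version is still current (assumed convention).
        not_ended = row["valid_to"] is None or event_date < row["valid_to"]
        if started and not_ended:
            return row["surrogate_key"]
    return None
```

    A sale recorded before a customer's change of address therefore resolves to the older dimension row, and a later sale resolves to the newer one.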
    Module 7:

    Orchestration

    • Loading Dimensions in Parallel
    • Creating Master Jobs
    Module 8:

    ID-Based Change Data Capture

    • Implementing Change Data Capture (CDC)
    • Creating a CDC Job Flow
    Module 9:

    Logging and Scheduling

    • Setting up a Dedicated DB Schema
    • Setting up Built-in Logging
    • Scheduling on the Command Line
    Instructor

    Diethard Steiner, currently working as an independent Senior Consultant in London, U.K., has specialized in open source business intelligence solutions for many years. Diethard is passionate about his work and regularly publishes tutorials on his blog http://diethardsteiner.blogspot.in/, which has gained a loyal following over the years.

    He has implemented end-to-end solutions (from data integration to reporting and dashboards) for several clients and projects, and has gained a deep understanding of the requirements and challenges of such solutions.

    Certification

    A test will be conducted at the end of the course. On completion of the test with a minimum of 70% marks, training.com will issue a certificate of successful completion from NIIT.

    Five re-attempts will be provided in case the candidate scores less than 70%.

    A Participation certificate will be issued if the candidate does not score 70% after five attempts.

    Pre-requisites

    Database Query fundamentals (Data Manipulation Language).

    Fundamentals of database design and schema.

    Basic data extraction from databases and marshalling data into another database.

    FAQs

    Who should go for this Course?

    Individuals aspiring to a career in Business Intelligence using ETL (Extract, Transform & Load) techniques. Professionals involved in Data Mart processes who generate business-specific knowledge from the large volumes of data stored in business databases. Data mining and analytics professionals.

    Where can I find my session schedule?

    The session schedule is available in the Learning Plan section of the training.com Student portal. Log in to your training.com account to view it.

    What is your refund policy?

    If, after registering for the course, you are unable or unwilling to continue, you can apply for a refund. You can initiate the refund any time before the start of the second session of the course by sending an email to support@training.com with your enrolment details and bank account details (where you want the amount to be transferred). Once you initiate a refund request, you will receive the amount within 21 days after confirmation and verification by our team. A refund is available only if you have not downloaded any courseware after registration.

    Why is it called a Self-Paced course?

    Self-Paced courses consist of several learning videos organized into a course structure of Learning Modules and Sessions. The learner goes through the videos topic by topic, in the sequence set by the course structure, to learn the concepts. Because the course is Self-Paced, no external faculty or additional mentor is involved in the learning.

    Being a Self-Paced course, how will my attendance be tracked and marked?

    When you log in to your training.com account to watch the videos, your attendance is marked automatically.

    How will the assessment be conducted for my certification?

    After each module, an online multiple-choice assessment will be conducted. Five attempts will be allowed to complete the assessment. The minimum pass percentage for each assessment is 70%. On successfully clearing the assessment, a verified certificate from NIIT will be awarded; otherwise, a certificate of participation will be issued.

    What are the minimum system requirements to attend the course?

    Minimum system requirements for accessing the courses are:

    • Personal computer or Laptop with web camera
    • Headphone with Mic
    • Minimum 4 Mbps broadband connection

    Is there an official support desk for technical guidance during the training program?

    Yes. For immediate technical support during the live online classroom sessions, you can call 91-9717992809 or 0124-4917203 between 9:00 AM and 8:00 PM IST. For all other queries, write to support@training.com and our team will be happy to help you.

    Related Courses

    AI and Deep Learning with TensorFlow
    AWS Certification and Training Program
    Administration Essentials for New Admins- Salesforce
    Advanced Data Mining projects with R
    Advanced Pay Per Click
    Advanced Program in Data Sciences
    Advanced Social Media Marketing
    Analyzing and Visualizing Data with Excel
    Analyzing and Visualizing Data with Power BI
    Android Game Development for Beginners
    Application Development with Swift 2
    Automated UI Testing in Java
    Big Data Analytics with R
    Big Data Applications using Hadoop
    Building Android Games with OpenGL ES
    Building Applications with Ext JS
    Building Applications with Force.com
    Building a Data Mart with Pentaho Data Integration
    Building iOS 10 Applications with Swift
    Building web application with Spring MVC
    Business Analytics using R from KPMG
    Certified Digital Marketing Professional
    Complete Web and Social Media Analytics
    Data Quality 9.x: Developer, Level 1
    Data Science Orientation
    Data Science with R
    Data Science with Spark
    DevOps Certification Training
    Developing Microsoft SharePoint® Server 2013
    Enabling and Managing Microsoft Office 365
    Executive Program in Applied Finance
    Executive Program in Digital and Social Media Marketing Strategy
    Getting Started with R for Data Science
    Getting started with Apache Solr Search Server
    IBM Cognos Connection and Workspace Advanced
    Implementing Microsoft Azure Solutions-70-533
    Informatica PowerCenter 9.x Level 1
    Introducing Rails 5 Learning Web Development the Ruby Way
    Introduction to ITIL
    Java Enterprise Apps with DevOps
    Joomla Certification Training Program
    Julia for Data Science
    LEAD (Learn. Enhance. Aspire. Deliver)
    Learning Android N Application Development
    Learning Data Mining with R
    Learning Joomla 3 Extension Development
    Learning MongoDB
    Learning R for Data Visualization
    Learning Spring Boot
    Learning Swift 2
    Linux shell scripting solution
    Machine Learning with Python
    Marketing Analytics Data Tools and Techniques
    Master AngularJS 2
    Mastering Magento
    Open Source Web App Development using MEAN Stack
    PMI® Agile Certified Practitioner Training
    Pentaho Reporting
    Post Graduate Certificate in General Management (PGCGM)
    Programming Using Python
    Programming with Python for Data Sciences
    Project Management Professional (PMP®) Training
    R Data Mining Projects
    R for Data Science Solutions
    Reactive Java 9
    SAS Certification Training Program
    Secrets of Viral Video Marketing
    Selenium with Java
    Six Sigma Certification Training Program
    Spring Security
    Supply Chain Management(SCM) Training Program
    Teradata Certification Training
    Test Driven Android
    UNIX Shell Scripting Training
    Web Apps Development using Node.js along with Express.js and MongoDB
    Web Apps Development with HTML5, CSS3, jQuery & Bootstrap
    Web Development with Node.JS and MongoDB
    iOS App Development Certification Training
    jquery UI Development