Companies store a lot of data, but in most cases, it is not available in a format that makes it easily accessible for analysis and reporting tools. Ralph Kimball realized this a long time ago, so he paved the way for the star schema.
Building a Data Mart with Pentaho Data Integration walks you through the creation of an ETL process to build a data mart for a fictional company. This course shows you, step by step, how to source the raw data and prepare it for the star schema. The practical approach of this course will get you up and running quickly and will explain the key concepts in an easy-to-understand manner.
Building a Data Mart with Pentaho Data Integration teaches you how to source raw data with Pentaho Kettle and transform it so that the output is a Kimball-style star schema. After sourcing the raw data with our ETL process, you will quality-check the data using an agile approach. Next, you will learn how to load slowly changing dimensions and the fact table. The star schema will reside in a column-oriented database, so you will learn about bulk-loading the data whenever possible. You will also learn how to create an OLAP schema and easily analyze the output of your ETL process.
By covering all the essential topics in a hands-on approach, this course will put you in a position to create your own ETL processes within a short span of time.
- Create a star schema
- Populate and maintain slowly changing dimensions type 1 and type 2
- Load fact and dimension tables in an efficient manner
- Use a columnar database to store the data for the star schema
- Analyze the quality of the data in an agile manner
- Implement logging and scheduling for the ETL process
- Get an overview of the whole process: from source data to the end user analyzing the data
- Learn how to auto-generate data for a date dimension
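To give a rough picture of what the end result looks like, a minimal star schema can be sketched as plain Python data structures. The table and column names below are illustrative only, not the course's actual schema:

```python
# Dimension tables: one row per member, keyed by a surrogate key.
dim_date = {20240101: {"year": 2024, "month": 1, "day": 1}}
dim_product = {1: {"name": "50mm lens", "category": "prime"}}

# Fact table: one row per sale, holding only surrogate keys and measures.
fact_sales = [
    {"date_key": 20240101, "product_key": 1, "quantity": 2, "revenue": 199.90},
]

def total_revenue_by_category(facts, products):
    """Join facts to the product dimension and aggregate a measure."""
    totals = {}
    for fact in facts:
        category = products[fact["product_key"]]["category"]
        totals[category] = totals.get(category, 0.0) + fact["revenue"]
    return totals
```

Every analytical question then becomes a join of the fact table to one or more dimensions plus an aggregation, which is exactly what the OLAP layer automates later in the course.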
Getting Started
- The Second-hand Lens Store
- The Derived Star Schema
- Setting up Our Development Environment
Agile BI – Creating ETLs to Prepare a Joined Data Set
- Importing Raw Data
- Exporting Data Using the Standard Table Output
- Exporting Data Using the Dedicated Bulk Loading
Agile BI – Building an OLAP Schema, Analyzing Data, and Implementing Required ETL Improvements
- Creating a Pentaho Analysis Model
- Analyzing Data Using Pentaho Analyzer
- Improving Your ETL for Better Data Quality
Slowly Changing Dimensions
- Creating a Slowly Changing Dimension of Type 1 Using Insert/Update
- Creating a Slowly Changing Dimension of Type 1 Using Dimension Lookup/Update
- Creating a Slowly Changing Dimension Type 2
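Outside of Kettle, the difference between the two dimension types covered in this module can be sketched in plain Python. The row layout and column names here are illustrative assumptions, not the course's actual tables:

```python
from datetime import date

def scd_type1_update(dim_row, new_attrs):
    """Type 1: overwrite the attributes in place; history is lost."""
    dim_row.update(new_attrs)
    return dim_row

def scd_type2_update(dim_table, natural_key, new_attrs, today):
    """Type 2: expire the current version and insert a new one,
    preserving the full history of the member."""
    for row in dim_table:
        if row["natural_key"] == natural_key and row["valid_to"] is None:
            row["valid_to"] = today  # close the currently valid version
    dim_table.append({
        "natural_key": natural_key,
        "valid_from": today,
        "valid_to": None,  # open-ended: this is now the current version
        **new_attrs,
    })
    return dim_table
```

Kettle's Insert/Update and Dimension Lookup/Update steps perform this bookkeeping against the database for you; the sketch only shows the logic they encapsulate.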
Populating the Date Dimension
- Defining Start and End Date Parameters
- Auto-generating Daily Rows for a Given Period
- Auto-generating Year, Month, and Day
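The three steps in this module amount to generating one row per day between two parameters and deriving the calendar attributes from each date. A minimal sketch of that idea in Python (the column names are illustrative assumptions):

```python
from datetime import date, timedelta

def generate_date_dimension(start, end):
    """Return one dimension row per day between start and end
    (inclusive), with year/month/day derived from the date."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20240101
            "date": current,
            "year": current.year,
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows
```

In the course itself this is done with Kettle's row-generation and calculator steps rather than hand-written code, but the output shape is the same.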
Creating The Fact Transformation
- Sourcing Raw Data for Fact Table
- Looking Up the Slowly Changing Dimension Type 1 Key
- Looking Up the Slowly Changing Dimension Type 2 Key
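The key difference between the two lookups in this module: a type 1 dimension has exactly one surrogate key per natural key, while a type 2 lookup must also match the fact's event date against the version's validity window. A sketch of both, with illustrative (assumed) column names:

```python
from datetime import date

def lookup_type1_key(dim_index, natural_key, unknown_key=0):
    """Type 1: the natural key maps to exactly one surrogate key."""
    return dim_index.get(natural_key, unknown_key)

def lookup_type2_key(dim_versions, natural_key, event_date, unknown_key=0):
    """Type 2: pick the version whose validity window contains the
    fact row's event date."""
    for row in dim_versions.get(natural_key, []):
        valid_to = row["valid_to"] or date.max  # None means still current
        if row["valid_from"] <= event_date <= valid_to:
            return row["surrogate_key"]
    return unknown_key
```

Kettle's Database Lookup and Dimension Lookup/Update steps implement these two behaviors; the `unknown_key` default mirrors the common convention of a 0-keyed "unknown" dimension member.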
Orchestration
- Loading Dimensions in Parallel
- Creating Master Jobs
Id-Based Change Data Capture
- Implementing Change Data Capture (CDC)
- Creating a CDC Job Flow
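ID-based change data capture, as covered in this module, extracts only the rows whose ID is higher than the maximum ID loaded in the previous run, then stores the new maximum for the next run. A minimal sketch of that loop (function and field names are illustrative assumptions):

```python
def extract_new_rows(source_rows, last_max_id):
    """ID-based CDC: pull only rows with an ID above the previous
    run's high-water mark, and return the new high-water mark."""
    new_rows = [r for r in source_rows if r["id"] > last_max_id]
    new_max_id = max((r["id"] for r in new_rows), default=last_max_id)
    return new_rows, new_max_id
```

In the Kettle job flow, the high-water mark is typically read from and written back to a control table, so each scheduled run picks up exactly where the last one stopped.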
Logging and Scheduling
- Setting up a Dedicated DB Schema
- Setting up Built-in Logging
- Scheduling on the Command Line
Diethard Steiner, currently working as an independent Senior Consultant in London, U.K., has specialized in open source business intelligence solutions for many years. Diethard is passionate about his work and regularly publishes tutorials on his blog, http://diethardsteiner.blogspot.in/, which has gained a loyal following over the years.
He has implemented end-to-end solutions (from data integration to reporting and dashboards) for several clients and projects, and has gained a deep understanding of the requirements and challenges of such solutions.
A test will be conducted at the end of the course. On completion of the test with a minimum of 70% marks, training.com will issue a certificate of successful completion from NIIT.
Five re-attempts will be provided if the candidate scores less than 70%.
A participation certificate will be issued if the candidate does not score 70% after five attempts.
Database Query fundamentals (Data Manipulation Language).
Fundamentals of database design and schema.
Basic data extraction from databases and marshalling data into another database.
Who should go for this Course?
Individuals aspiring to a career in Business Intelligence using the ETL (Extract, Transform & Load) technique; professionals involved in data mart processes who need to generate business-specific knowledge from the large data stores in business databases; and data mining and analytics professionals.
Where can I find my session schedule?
The session schedule will be available in the training.com Student portal - Learning Plan section. You can login to your training.com account to view the same.
What is your refund policy?
Upon registering for the course, if for some reason you are unable or unwilling to participate further, you can apply for a refund. You can initiate the refund any time before the start of the second session of the course by sending an email to support@training.com, with your enrolment details and the bank account details to which you want the amount transferred. Once you initiate a refund request, you will receive the amount within 21 days after confirmation and verification by our team, provided you have not downloaded any courseware after registration.
Why is it called Self Paced course?
Self Paced courses consist of several learning videos organized into Learning Modules and Sessions. The learner goes through the videos topic by topic, in the sequence of the course structure, to learn the concepts. Being self paced, there is no intervention from an external faculty member or additional mentor during learning.
Being a self paced course, how will my attendance be tracked and marked?
When you log in to your training.com account to watch the videos, attendance is marked automatically.
How will the assessment be conducted for my certification?
After each module, an online assessment with multiple-choice questions will be conducted. Five attempts are allowed to complete each assessment, and the minimum pass percentage is 70%. On successfully clearing the assessment, a verified certificate from NIIT will be awarded; otherwise, a certificate of participation will be issued.
What are the minimum system requirements to attend the course?
- Personal computer or Laptop with web camera
- Headphones with a noise-cancelling microphone
- Broadband connection with a minimum bandwidth of 4 Mbps
- It is recommended to run a System Health Check to examine the OS details, add-ins, plugins, camera, microphone, and other external devices.
Is there an official support desk for technical guidance during the training program?
Yes. For immediate technical support during the live online classroom sessions, you can call 91-9717992809 or 0124-4917203 between 9:00 AM and 8:00 PM IST. For all other queries, you can write to support@training.com and our team will be happy to help you.