
DATASHEET

BLUEPRINT FOR BIG DATA SUCCESS: A BEST PRACTICE SOLUTION PATTERN

Filling the Data Lake

Simplify and Accelerate Hadoop Data Ingestion with a Scalable Approach

What is it?
As organizations scale up data onboarding from just a few sources going into Hadoop to hundreds or more, IT time and resources can be monopolized by creating hundreds of hard-coded data movement procedures – and the process is often highly manual and error-prone. The Pentaho Filling the Data Lake blueprint provides a template-based approach to solving these challenges, and comprises:

• A flexible, scalable, and repeatable process to onboard a growing number of data sources into Hadoop data lakes

• Streamlined data ingestion from hundreds or thousands of disparate CSV files or database tables into Hadoop

• An automated, template-based approach to data workflow creation

• Simplified regular data movement at scale into Hadoop in the AVRO format
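To make the template-based idea concrete, here is a minimal, dependency-free sketch (not Pentaho code; the registry, field names, and JSON-lines output are assumptions for illustration – the actual blueprint drives a Pentaho transformation template via metadata injection and writes Avro into Hadoop):

```python
import csv
import io
import json

# Hypothetical per-source metadata registry; in the blueprint this role is
# played by metadata injected into a single transformation template.
SOURCE_REGISTRY = [
    {"name": "trades", "delimiter": ",", "columns": ["trade_id", "symbol", "qty"]},
    {"name": "accounts", "delimiter": "|", "columns": ["account_id", "owner"]},
]

def ingest(source, reader, writer):
    """One generic template: parse rows according to the source's metadata
    and emit one record per row. JSON lines stand in here for the Avro
    output the real blueprint would deliver into Hadoop."""
    for row in csv.reader(reader, delimiter=source["delimiter"]):
        writer.write(json.dumps(dict(zip(source["columns"], row))) + "\n")

# The same template handles every registered source; onboarding a new
# source means adding a metadata record, not writing a new job.
raw = io.StringIO("T1,ACME,100\nT2,GLOBX,250\n")
out = io.StringIO()
ingest(SOURCE_REGISTRY[0], raw, out)
print(out.getvalue())
```

The point of the sketch is the division of labor: the ingestion logic is written once, and everything source-specific (delimiter, column names) lives in metadata, which is what keeps hundreds of sources from turning into hundreds of hand-coded jobs.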

Why do it?
• Reduce IT time and cost spent building and maintaining repetitive big data ingestion jobs, allowing valuable staff to dedicate time to more strategic projects

• Minimize risk of manual errors by decreasing dependence on hard-coded data ingestion procedures

• Automate business processes for efficiency and speed, while maintaining data governance

• Enable more sophisticated analysis by business users with new and emerging data sources

Value of Pentaho
• Unique metadata injection capability accelerates time-to-value by automating many onboarding jobs with just a few templates

• Intuitive graphical user interface for big data integration means existing ETL developers can create repeatable data movement flows without coding – in minutes, not hours

• Ability to architect a governed process that is highly reusable

• Robust integration with the broader Hadoop ecosystem and semi-structured data
Example of how a Filling the Data Lake blueprint
implementation may look in a financial organization
This company uses metadata injection to move thousands of data sources into
Hadoop in a streamlined, dynamic integration process.

• Large financial services organization with thousands of input sources

• Reduces the number of ingest processes through metadata injection

• Delivers transformed data directly into Hadoop in the AVRO format
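Because the blueprint lands data in the AVRO format, each onboarded source also needs an Avro schema. A hedged sketch of generating one from the same per-source metadata (the function name and the all-nullable-string typing are assumptions for illustration; the record/fields layout follows the Avro specification):

```python
import json

def avro_schema(source_name, columns):
    """Build an Avro record schema for one source. Every column is modeled
    as a nullable string for simplicity; a real mapping would carry the
    actual column types in the source metadata."""
    return {
        "type": "record",
        "name": source_name,
        "fields": [{"name": col, "type": ["null", "string"]} for col in columns],
    }

# One schema per registered source, generated rather than hand-written,
# so schema maintenance scales the same way the ingest jobs do.
print(json.dumps(avro_schema("trades", ["trade_id", "symbol", "qty"])))
```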

[Diagram: Disparate data sources (RDBMS tables and CSV files) feed dynamic data integration processes, which apply dynamic transformations and deliver the data into Hadoop in the AVRO format through a reduced set of ingest procedures.]


Copyright ©2016 Pentaho Corporation. All rights reserved. Worldwide +1 (866) 660-7555. pentaho.com/contact