A Big Data Reference Architecture Needs to Consider the Following Domains
Core innovation
Enablers into an enterprise environment

7 Presentation & Application
Enable existing and new applications.

3 Data Integration & Governance
Integrate with existing systems. Move data into, within and out of the environment, while minimizing duplication and data movement.

2 Data Access
Unprecedented insights: Allow simultaneous access by, and timely insights for, all approved users across the entire data lake, using different processing engines and schema on read.

1 Data Management
Petabyte scale: Acquire all data in its original format and store it in one place, cost-effectively and for very long time periods.

6 Environment & Deployment Model
Run within your environment and in a public cloud.
Copyright juergen@analytix.is

Security
Provide a layered approach to security that differentiates internal and external users.

Operations
Deploy and manage a multi-tenant environment easily, using existing tools where possible.

Core Big Data Capabilities Required

Core innovations
Data Integration & Governance
Presentation
Reports & Dashboards
Clients
Extract, Transform, Load
Identity & Access Management
Real-Time Monitoring
(Existing or New) Application


OLAP
Web & Social Media
Video & Audio
Geo-location
Machine Learning & Prediction
SQL
Streaming & Complex Event Processing
Batch Processing
Data Connectors
Data Isolation & Multi-tenancy
Search & Discovery
Graph Processing
Data Masking

Data Management
Relational Database
(MPP) Data Warehouse
Distributed Storage
Operations
Data Access
Data Encryption
Security & Privacy
Text & Semantics
Real Time & Batch Ingestion
Life Cycle Management
Advanced Visualization
NoSQL Database*
In-memory Computing
Custodian Gateways

Physical Infrastructure
Store first, ask questions later (HDFS)
Parallel processing (MapReduce)
Commodity HW, cheap storage

* Includes key-value, document, graph and object databases.

Any data type, incl. unstructured

Real-time reasoning on new data

Google for Big Data

Friends & family social NW analysis

Predictions enable Prescriptions

Hadoop and Spark Deliver Many of the Core Innovative Capabilities Required

Spark Provides A Modern Development Environment On Top Of Hadoop
In-memory high-speed analytics engine
Advanced machine learning libraries
Unified programming model across all processing engines

Hadoop Provides The Enterprise-Wide Data Lake
Allows acquiring all data in its original format and storing it in one place, cost-effectively and for very long time periods
Allows different processing engines and schema on read
Mature multi-tenancy, operations, security and integration
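
As a rough illustration of "schema on read", the sketch below keeps raw records exactly as they arrived and lets each reader apply its own schema only when it reads them. It uses plain Python rather than any Hadoop or Spark API; the sample records, the `read_with_schema` helper and the two views are hypothetical names made up for this example.

```python
import json

# Raw records kept in their original format (hypothetical sample data).
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50", "ts": "2015-01-01"}',
    '{"user": "b", "amount": "7", "extra": "not used by these readers"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a cast function. Fields missing from a
    record become None; fields not named in the schema are ignored.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        }


# Two readers interpret the same stored data with different schemas.
billing_view = list(read_with_schema(RAW_EVENTS, {"user": str, "amount": float}))
audit_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ts": str}))

print(billing_view)
print(audit_view)
```

The stored data is never rewritten here; only the reader-side schema changes, which is the property that lets several processing engines share one copy of the data.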

Note: Both are open source technologies supported and embedded by a wide range of software and services vendors