
HADOOP & MAP REDUCE

BY
K.KARTHIKEYAN

HDFS

Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. HDFS stores file system metadata and application data separately.
The HDFS namespace is a hierarchy of files and directories. Files and directories are represented on the NameNode by inodes, which record attributes like permissions, modification and access times, and namespace and disk space quotas. The file content is split into large blocks, and each block of the file is independently replicated at multiple DataNodes.
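As a minimal sketch of this division of labor, the Java snippet below writes a file through the Hadoop FileSystem API: the NameNode records the inode metadata while the content is split into blocks and replicated to DataNodes. The cluster address, replication factor, and file path are illustrative assumptions, not values from these slides.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical cluster address and replication factor.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        conf.set("dfs.replication", "3"); // each block stored on 3 DataNodes

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/demo/sample.txt"); // hypothetical path

        // The NameNode records the inode; the file content is split into
        // blocks and replicated across DataNodes.
        try (FSDataOutputStream out = fs.create(file)) {
            out.writeUTF("hello hdfs");
        }
        fs.close();
    }
}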

HDFS keeps the entire namespace in RAM. The inode data and the list of blocks belonging to each file comprise the metadata of the name system, called the image. The persistent record of the image stored in the local host's native file system is called a checkpoint.

MAP REDUCE
MapReduce is a programming model and software framework, first developed by Google, intended to facilitate and simplify the processing of vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner.
MapReduce characteristics:
Very large-scale data.
Write-once, read-many data.
Map and reduce are the main operations (see the word-count sketch after this list).
All map tasks must complete before the reduce operation starts.

Map and reduce operations are typically performed by the same physical processor.
The number of map tasks and reduce tasks is configurable.
Operations are provisioned near the data.
Commodity hardware and storage are used.
The runtime takes care of splitting and moving data for operations.
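As a minimal sketch of the two main operations, the word-count classes below use the Hadoop Java API: map() emits a (word, 1) pair for every word in its split, and reduce(), which runs only after all map tasks complete, sums the counts per word. The class names are illustrative, not from the original slides.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// map(): emit (word, 1) for every word in the input split.
// Both classes are package-private and assumed to share a package
// with the driver sketch shown later.
class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

// reduce(): runs only after all maps finish; sums the counts per word.
class WordCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}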

Input: This is the input data / file to be processed.
Split: Hadoop splits the incoming data into smaller pieces called "splits".
Map: In this step, MapReduce processes each split according to the logic defined in the map() function. Each mapper works on one split at a time. Each mapper is treated as a task, and multiple tasks are executed across different TaskTrackers and coordinated by the JobTracker.
Combine: This is an optional step, used to improve performance by reducing the amount of data transferred across the network. The combiner typically runs the same logic as the reduce step and aggregates the output of the map() function before it is passed to the subsequent steps.
Shuffle & Sort: In this step, the outputs from all the mappers are shuffled, sorted to put them in order, and grouped before being sent to the next step.

Reduce: This step aggregates the outputs of the mappers using the reduce() function. The output of the reducer is sent to the next and final step. Each reducer is treated as a task, and multiple tasks are executed across different TaskTrackers and coordinated by the JobTracker.
Output: Finally, the output of the reduce step is written to a file in HDFS. The driver sketch below shows how these steps are wired together.
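The driver below is a minimal sketch of these steps using the Hadoop Job API: it registers the mapper (Map step), the optional combiner (Combine step), and the reducer (Reduce step), sets the configurable number of reduce tasks, and points the job at input and output paths in HDFS. The class names and paths carry over from the earlier hypothetical sketches.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCountMapper.class);     // Map step
        job.setCombinerClass(WordCountReducer.class);  // optional Combine step
        job.setReducerClass(WordCountReducer.class);   // Reduce step
        job.setNumReduceTasks(2);                      // configurable; illustrative value

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Hadoop splits the input; the final reduce output is written to HDFS.
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));    // hypothetical
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output")); // hypothetical

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}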
