2/28/2019 1
Module 1: Representation of Data and Descriptive Statistics
[Week 1-3 (7 lectures)]*
Books (Also consult Basic Statistics by Nagar and Das)
Let’s appreciate the need for
“quantity” as well as “quality”
Comparison between these two events is possible only when both quantitative
and qualitative information is available
Tracing the history of data representation
Mortality Table of John Graunt (1661)
Overcoming the problem of “can’t see the forest for the trees”
Scottish imports/exports by W. Playfair (1786)
1859: Florence Nightingale’s
polar area diagram
Not the numbers but the arrangement of
the numbers tells the story!
Click on the link!
Types of Data
What are the heights of the
students in IIT Mandi?
Height of Roll no. .....is 5.2 ft
Height of Roll no. ..... is 4.9 ft
and so on....
Arrangement makes life easy!
Tabular representation of data
A table is prepared to represent a summary of the data. The table that you
create from any raw data should depend on your research objective. The same
data can be tabulated in various ways to answer the particular research
question that you are trying to address. Before you proceed to create any
table, ask yourself the following questions.
Tables based on cardinal data?
Here you are the one to construct class intervals, which will act as categories.
Tables based on nominal data?
Here you know your categories.
Category     Classes (in hectare)  Classes (in acre)  Midpoint (xi)  No. of households (fi)  % of households
Landless     0                     0                  0              11                      46%
Marginal     <1                    <2.5               1.25           2                       8%
Small        1-2                   2.5-5              3.75           2                       8%
Semi medium  2-4                   5-10               7.5            0                       0%
Medium       4-10                  10-30              20             2                       8%
Large        >10                   >30                45             7                       29%
Total                                                                24                      100%
Observe that there is a logic behind defining the classes in this manner. [Note: In India,
the following categories of landholdings are generally used: Marginal: <1 ha; Small: 1.01–2 ha;
Semi-medium: 2–4 ha; Medium: 4–10 ha; Large: >10 ha. However, to use these categories
as your classes, you have to convert the landholdings from acres to hectares (ha), where
1 ha = 2.5 acres (approx.).]
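The conversion and classification described in the note can be sketched in Python as follows. This is only an illustration, not the procedure used in the course: the function name `land_class`, the choice to treat each class as including its upper bound, and the sample holdings are assumptions. The factor 1 ha = 2.5 acre and the household counts are taken from the note and the table above.

```python
ACRES_PER_HECTARE = 2.5  # approximate factor given in the note

# (upper bound in ha, label); assumption: each class includes its upper bound
CLASSES = [(1, "Marginal"), (2, "Small"), (4, "Semi-medium"), (10, "Medium")]

def land_class(acres):
    """Return the landholding class for a holding recorded in acres."""
    if acres == 0:
        return "Landless"
    hectares = acres / ACRES_PER_HECTARE
    for upper, label in CLASSES:
        if hectares <= upper:
            return label
    return "Large"

# Hypothetical holdings in acres (illustrative values, not from Data_1)
holdings = [0, 1.5, 4, 8, 20, 40]
labels = [land_class(a) for a in holdings]
print(labels)

# Percentage column, recomputed from the household counts in the table
counts = {"Landless": 11, "Marginal": 2, "Small": 2,
          "Semi medium": 0, "Medium": 2, "Large": 7}
n = sum(counts.values())
percent = {k: round(100 * f / n) for k, f in counts.items()}
print(n, percent)
```

Because each percentage is rounded separately, the rounded values need not add up to exactly 100.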
Representing 2 variables in one table
Table 2: Distribution of households according to their castes
(mentioned as 'category') in various villages (Based on Data_1)

                 Caste
Village          SC  ST  OBC  Gen  Total
Paschim Malipur  0   0   0    9    9
Sherpara         0   0   1    6    7
Madanpur         0   0   0    3    3
Gopalpur         3   0   0    1    4
Purushia         0   1   0    1    2
Total            3   1   1    20   25
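A two-variable table of this kind is a cross-tabulation: each household is counted once in the cell for its (village, caste) pair, and the totals are the row and column sums. The sketch below rebuilds one record per household from the counts in Table 2 and tabulates them again with `collections.Counter`. The variable names and the record layout are assumptions made for illustration; the raw file Data_1 itself is not reproduced here.

```python
from collections import Counter

villages = ["Paschim Malipur", "Sherpara", "Madanpur", "Gopalpur", "Purushia"]
castes = ["SC", "ST", "OBC", "Gen"]

# Cell counts copied from Table 2 (rows: villages, columns: castes)
table2 = {
    "Paschim Malipur": [0, 0, 0, 9],
    "Sherpara": [0, 0, 1, 6],
    "Madanpur": [0, 0, 0, 3],
    "Gopalpur": [3, 0, 0, 1],
    "Purushia": [0, 1, 0, 1],
}

# Expand the counts into one (village, caste) record per household,
# which is the shape raw survey data would normally have.
records = [
    (v, c)
    for v in villages
    for c, k in zip(castes, table2[v])
    for _ in range(k)
]

# Cross-tabulate: count each (village, caste) pair, then the margins.
cells = Counter(records)
row_totals = Counter(v for v, _ in records)
col_totals = Counter(c for _, c in records)

for v in villages:
    print(v, [cells[(v, c)] for c in castes], row_totals[v])
print("Total", [col_totals[c] for c in castes], len(records))
```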
Table 3: Distribution of households according to caste, religion
and monthly expenditure

             SC            ST            OBC           Gen           Total
             H   M   T     H   M   T     H   M   T     H   M   T     H   M   T
10000-15000  0   0   0     1   0   1     0   0   0     1   1   2     2   1   3
>15000       0   0   0     0   0   0     0   0   0     0   1   1     0   1   1
Total        3   0   3     1   0   1     0   1   1     10  9   19    14  10  24

Observe that there is a column that displays the total number of households under each
category. H: Hindu, M: Muslim, T: Total. Since one data point is missing under the
variable monthly expenditure, the total count is 24.
Parts of a table labelled on the slide: Stub, Title, Caption, Body of the Table.
Frequency Distribution

Category  Classes (in acre)  Midpoint (xi)  No. of households (fi)  Relative frequency (fi/N)  Cum. freq. Fi (< type)  Cum. freq. Fi (> type)  Freq. density (fi/class length)
Landless  0                  0              11                      0.46                       11                      24                      -
Marginal  0-2.5              1.25           2                       0.08                       13                      13                      0.8
Small     2.5-5              3.75           2                       0.08                       15                      11                      0.8
Semi-med  5-10               7.5            0                       0.00                       15                      9                       0
Medium    10-30              20             2                       0.08                       17                      9                       0.1
Large     30-60              45             7                       0.29                       24                      7                       0.23
Total                                       N=24                    1
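Every derived column of the frequency distribution follows from the class boundaries and the frequencies fi. The sketch below recomputes them under the definitions in the column headings: relative frequency fi/N, cumulative frequency of the "< type" and "> type", and frequency density fi/class length. The variable names are illustrative, and treating the zero-length Landless class as having no density is an assumption that matches the "-" in the table.

```python
# Classes and frequencies copied from the frequency distribution table
labels = ["Landless", "Marginal", "Small", "Semi-med", "Medium", "Large"]
bounds = [(0, 0), (0, 2.5), (2.5, 5), (5, 10), (10, 30), (30, 60)]  # acres
freq = [11, 2, 2, 0, 2, 7]

n = sum(freq)
rel_freq = [f / n for f in freq]

# "< type": households in this class and all classes below it
less_than = []
running = 0
for f in freq:
    running += f
    less_than.append(running)

# "> type": households in this class and all classes above it
more_than = [n - c + f for c, f in zip(less_than, freq)]

# Frequency density = frequency / class length (None for a zero-length class)
density = [f / (hi - lo) if hi > lo else None
           for f, (lo, hi) in zip(freq, bounds)]

for row in zip(labels, freq, rel_freq, less_than, more_than, density):
    print(row)
```

Frequency density matters when class lengths differ, as they do here: the Medium and Small classes both hold 2 households, but the Medium class is eight times as long, so its density is one eighth of the Small class's.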
Diagrammatic representation of data
Figure 1: Distribution of households according to ownership of agricultural land
[Bar/column diagram of No. of households by class (Landless, Marginal, Small, Semi medium, Medium, Large), and a pie chart of the percentage share of each class]
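Both diagrams in Figure 1 can be derived from the class frequencies alone. In a bar diagram the bar height is the frequency, and in a pie chart each class gets a sector whose angle is proportional to its frequency, 360 × fi/N degrees. The sketch below uses the household counts from the landholding table. The text rendering of the bars and the variable names are assumptions made for illustration, not the way the slide's charts were produced.

```python
# Household counts per landholding class, copied from the table
freq = {"Landless": 11, "Marginal": 2, "Small": 2,
        "Semi medium": 0, "Medium": 2, "Large": 7}
n = sum(freq.values())

# Bar diagram: one bar per class, bar length equal to the frequency
for label, f in freq.items():
    print(f"{label:<12}{'#' * f} {f}")

# Pie chart: sector angle in degrees, proportional to the frequency
angles = {label: 360 * f / n for label, f in freq.items()}
for label, angle in angles.items():
    print(f"{label:<12}{angle:6.1f} degrees")
```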
Figure 2: Distribution of households according to their castes in various villages
[Bar diagram and two stacked bar diagrams (one by count, one by percentage) of No. of households by village (Paschim Malipur, Sherpara, Madanpur, Gopalpur, Purushia), with the castes SC, ST, OBC and Gen as series]
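A percentage (100%) stacked bar diagram rescales each village's bar so that the caste segments add up to 100, which lets villages of different sizes be compared by composition. A minimal sketch of that rescaling, using the counts from Table 2, is given below. The dictionary layout and the names are assumptions made for illustration.

```python
castes = ["SC", "ST", "OBC", "Gen"]

# Cell counts copied from Table 2 (rows: villages, columns: castes)
table2 = {
    "Paschim Malipur": [0, 0, 0, 9],
    "Sherpara": [0, 0, 1, 6],
    "Madanpur": [0, 0, 0, 3],
    "Gopalpur": [3, 0, 0, 1],
    "Purushia": [0, 1, 0, 1],
}

# Segment heights for the percentage stacked bar of each village
shares = {}
for village, counts in table2.items():
    total = sum(counts)
    shares[village] = [100 * c / total for c in counts]

for village, pct in shares.items():
    print(village, dict(zip(castes, (round(p, 1) for p in pct))))
```

The count-based stacked bar keeps the raw counts as segment heights instead, so the bar heights there equal the row totals of Table 2.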
Figure 3: Distribution of households according to caste, religion and monthly expenditure
[Bar diagram of No. of households by caste (SC, ST, OBC, Gen) and religion (H, M, T), with the monthly expenditure classes <5000, 5000-10000, 10000-15000 and >15000 as series]
• Histogram is another way of representation
– isto-s – ‘mast’/ something set upright
– gram-ma – something written/graphics
– Histogram
Not really!!
This is a column diagram.
Also remember, the term ‘histogram’ was coined by the
statistician Karl Pearson while talking about the geometry of
statistics (1892).
• Have a careful look at the monthly expenditure data
• Bin the range into a series of intervals (contiguous but non-overlapping) and
identify the frequency corresponding to each interval
• Bins may extend below the lowest value and above the highest value
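The binning steps above can be sketched as follows. The interior edges 5000, 10000 and 15000 are assumed from the expenditure classes that appear in Table 3 and Figure 3, the expenditure values are hypothetical (the raw data is not reproduced here), and the rule that a value equal to an edge goes into the upper bin is also an assumption.

```python
import bisect

# Interior bin edges, assumed from the expenditure classes in Table 3
edges = [5000, 10000, 15000]
labels = ["<5000", "5000-10000", "10000-15000", ">15000"]

# Hypothetical monthly expenditures (illustrative values, not from Data_1)
expenditure = [3200, 4800, 5000, 7600, 9900, 12000, 15500, 21000]

counts = [0] * len(labels)
for x in expenditure:
    # bisect_right sends a value equal to an edge to the upper bin,
    # so every value falls in exactly one bin (bins do not overlap).
    counts[bisect.bisect_right(edges, x)] += 1

for label, c in zip(labels, counts):
    print(f"{label:<12}{c}")
```

The first and last bins are open-ended, so they cover values below the lowest and above the highest observation without any change to the edges.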
Frequency Table
Total 24 0.0048
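The total of 0.0048 is what a frequency-density column (f_i divided by class length) would add up to if all expenditure classes had the same length h = 5000. That common length is an assumption here, suggested by the class midpoints 2500, 7500, 12500 and 17500 on the frequency curve. Under that assumption:

```latex
\sum_i \frac{f_i}{h} \;=\; \frac{1}{h}\sum_i f_i \;=\; \frac{N}{h} \;=\; \frac{24}{5000} \;=\; 0.0048
```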
[Figure: frequency curve; vertical axis: Proportion of households (0.0000 to 0.0050); horizontal axis: class midpoints 2500, 7500, 12500, 17500]
Central tendency and dispersion
Type of data    Measure of central tendency    Measure of dispersion
Correlation:
Chalk and Talk!
Food for thought…
Questions?
Cartoon courtesy:
The Cartoon Guide to Statistics
By Larry Gonick and Woollcott Smith