
Basic Concepts of Statistics

Prof. dr. Liliana Dugulean ldugul@unitbv.ro

Discipline Definition
Statistics is the science of data. It involves collecting, classifying, summarizing, organizing, analyzing and interpreting numerical information.

Applications of Statistics
Descriptive Statistics uses numerical and graphical methods:
to look for patterns in a data set,
to summarize the information revealed in a data set,
to present the information in a convenient form.

Inferential Statistics uses sample data:
to make estimates, decisions, predictions, or other generalizations about a larger set of data.

Basic Concepts of Statistics


The population (collectivity) = the phenomenon to be studied (events, people, objects, transactions)
The experimental unit (statistical unit) = the constituent element of the population (simple or complex)
A variable (characteristic) = a property of an individual experimental unit
The value (measurement)
The frequency = the number of units with the same value of the characteristic

Fundamental Elements of Statistics


A sample = a subset of the units of a population
A statistical inference = an estimate, prediction or some other generalization about a population based on information contained in a sample
A measure of reliability = a statement (usually quantified) about the degree of uncertainty associated with a statistical inference

Types of Data
Quantitative data are measurements that are recorded on a naturally occurring numerical scale.
Qualitative data can only be classified into categories.
The statistical methods for describing, reporting and analyzing data depend on the data type (quantitative or qualitative).

Describing Qualitative Data


A class is one of the categories into which qualitative data can be classified.
The class frequency is the number of observations in the data set falling into a particular class.
The class relative frequency is the class frequency divided by the total number of observations in the data set; multiplied by 100, it gives the class percentage.
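As a minimal sketch of these definitions, the class frequencies and class percentages could be computed as below. The counts are the academic staff figures used in the bar graph example of this lecture; the function names `class_frequencies` and `class_percentages` are illustrative assumptions, not part of the course material.

```python
from collections import Counter


def class_frequencies(observations):
    """Count how many observations fall into each class."""
    return Counter(observations)


def class_percentages(frequencies):
    """Class frequency divided by the total number of observations, times 100."""
    total = sum(frequencies.values())
    return {cls: 100 * f / total for cls, f in frequencies.items()}


# Staff counts taken from the survey table; the raw observations are
# rebuilt from them only to illustrate the counting step.
counts = {"Assistant": 82, "Lecturer": 82, "Assistant professor": 49, "Professor": 57}
observations = [cls for cls, n in counts.items() for _ in range(n)]

freq = class_frequencies(observations)
pct = class_percentages(freq)

for cls in counts:
    print(f"{cls:20s} {freq[cls]:4d} {pct[cls]:5.1f}%")
```

The percentages always sum to 100, which is a quick check that no observation was left unclassified.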

Summary of Graphical Descriptive Methods for qualitative data:


Bar graph
Pie chart
Pareto diagram: a column graph with the categories of the qualitative variable (the columns) arranged by height in descending order from left to right.

Bar graph
Function               fi
Assistant              82
Lecturer               82
Assistant professor    49
Professor              57
Total                 270

The distribution of academic staff in a survey, at Transilvania University, in 2009


[Bar graph: x-axis: functions (Assistant, Lecturer, Ass. Professor, Professor); y-axis: number, 0 to 90]

Pie chart
Function               fi
Assistant              30.4%
Lecturer               30.4%
Assistant professor    18.1%
Professor              21.1%
Total                 100%

The structure of the academic staff survey, at Transilvania University, in 2009

[Pie chart: Assistant 31%, Lecturer 30%, Ass. Professor 18%, Professor 21%]

Pareto diagram
The Pareto diagram of the academic staff distribution in a survey, at Transilvania University, in 2009
[Pareto diagram: x-axis: functions (Assistant 30%, Lecturer 30%, Professor 21%, Ass. Professor 18%); left y-axis: 0% to 35%; right y-axis: 0% to 100%]

The 40 Best Paid Executives


Degree        fi    fi*
Bachelors      8    20%
Law            4    10%
Masters        4    10%
MBA           20    50%
None           2     5%
PhD            2     5%
Total         40   100%

Sorted in descending order of frequency, with the cumulative relative frequency fi*c:

Degree        fi    fi*    fi*c
MBA           20    50%     50%
Bachelors      8    20%     70%
Law            4    10%     80%
Masters        4    10%     90%
None           2     5%     95%
PhD            2     5%    100%
Total         40   100%

Source: Forbes, May 8, 2006
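
The table behind a Pareto diagram (classes in descending order of frequency, with relative and cumulative relative frequencies) could be built roughly as follows. This is a sketch using the executive degree counts from this slide; the function name `pareto_table` and the row layout are assumptions made for illustration.

```python
def pareto_table(frequencies):
    """Return rows (class, fi, relative %, cumulative %) in descending order of fi."""
    total = sum(frequencies.values())
    # Python's sort is stable, so classes with equal frequency keep their input order.
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    rows = []
    cumulative = 0
    for cls, f in ordered:
        cumulative += f
        rows.append((cls, f, 100 * f / total, 100 * cumulative / total))
    return rows


# Degree counts of the 40 executives, in the order given in the first table.
degrees = {"Bachelors": 8, "Law": 4, "Masters": 4, "MBA": 20, "None": 2, "PhD": 2}
table = pareto_table(degrees)

for cls, f, rel, cum in table:
    print(f"{cls:10s} {f:3d} {rel:5.0f}% {cum:5.0f}%")
```

The bars of the diagram follow the relative frequency column, and the cumulative column gives the rising line read on the right-hand axis.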

The 40 Best Paid Executives


The Pareto diagram for degrees of 40 CEOs, in 2005
[Pareto diagram: x-axis: degrees (MBA 50%, Bachelors 20%, Law 10%, Masters 10%, None 5%, PhD 5%); left y-axis: frequency (%), 0% to 60%; right y-axis: 0% to 100%]

Source: Forbes, May 8, 2006

The Pareto Principle


Vilfredo Pareto (1848-1923)
Born in Paris; studied engineering and mathematics at the University of Turin
University of Lausanne, Switzerland (1896): in "Cours d'économie politique" he proved that the distribution of income and wealth in society is not random
The pattern appears throughout history in all societies: approximately 80% of the total wealth in a society lies with only 20% of the families.
"The vital few and the trivial many": the Pareto principle in economics

Graphical Methods for Describing Quantitative Data


Dot plots
Stem-and-leaf displays
Histograms

The Use of a Diagram

A diagram can be used:
to summarize large sets of data (structure), or
to focus attention on some aspect of the data, or
to display a trend in the data over time.

A good diagram enables the viewer to grasp at a single glance the relevant features of the data, features that wouldn't be obvious from the raw numbers themselves.
The power that diagrams have to give us an instant impression of the data can also be abused: diagrams can be constructed to give the impression that the data have a feature that they don't really have.
The following examples show some common ways of pictorially representing (and misrepresenting) data.

Hong Kong's soaring population

Example
Years    Newspaper A (thou. pieces)    Newspaper B (thou. pieces)
1990     510                           1911
1991     621                           1829
1992     624                           1636
1993     654                           1555
1994     732                           1490

Scale origins
Using two Y axes, one for each series
[Line chart: "Evolution of the number of copies for newspapers A and B, during 1990-1994"; x-axis: years, 1990 to 1994; one y-axis from 0 to 2500, the other from 500 to 750; series: nr. of pieces A (thou.), nr. of pieces B (thou.)]

Using two Y axes (a)
[Line chart: "Evolution of the number of copies for newspapers A and B, during 1990-1994"; x-axis: years, 1990 to 1994; one y-axis from 0 to 2500, the other from 0 to 800; series: nr. of pieces A (thou.), nr. of pieces B (thou.)]

Using two Y axes (b)
[Line chart: "Evolution of the number of copies for newspapers A and B, during 1990-1994"; x-axis: years, 1990 to 1994; one y-axis from 1250 to 1950, the other from 400 to 750; series: nr. of pieces A (thou.), nr. of pieces B (thou.)]

Using two Y axes (c)
[Line chart: "Evolution of the number of copies for newspapers A and B, during 1990-1994"; x-axis: years, 1990 to 1994; one y-axis from 1450 to 1950, the other from 500 to 750; series: nr. of pieces A (thou.), nr. of pieces B (thou.)]

Correct graph
[Line chart: "Evolution of the number of copies for newspapers A and B, during 1990-1994"; x-axis: years, 1990 to 1994; single y-axis: nr. of pieces (thou.), 0 to 2500]

Comparative evolution of newspapers A and B, during 1990-1994

Using plane images


[Figure: "Doubling the production"]
[Figure: "The mobile phone revolution", 1996 and 1999]

Using spatial images


2 errors: the dimensions and the inflation

Tricky comparisons
Absolute values
[Chart: x-axis: years, 1930 to 1984; y-axis: billions of $, 0 to 900]

Governmental expenditure evolution in the U.S.A. during 1930-1984
(Wonnacott, 4th edition, p. 64)

Correct graph
Relative values (%)
[Chart: x-axis: years, 1930 to 1984; y-axis: percent of GNP (%), 0 to 60]

Governmental expenditure evolution in the U.S.A. (% of GNP) during 1930-1984
(Wonnacott, 4th edition, p. 64)

An effective campaign?
In 1956, the U.S. state of Connecticut began a severe crackdown on speeding drivers. The following graph shows the annual number of traffic fatalities before and after the crackdown.
