
Regression Model of Gini Coefficient
2014R2-DSME2021A: APPLIED ECONOMETRICS FOR BUSINESS DECISIONS
BY
CHEN TIAN YANG

1155029138

TAI WAI LOK

1155032652

XIAN RAN

1155028999

ZHENG JIA MIN

1155029178

What: About Gini Coefficient

Commonly used as a measure of inequality
- A wealthy nation can have the same Gini Coefficient as a poor nation (e.g. Canada and India)

Ranges from 0 (or 0%, maximal equality) to 1 (or 100%, absolute inequality)
After tax effect: Below 0.49
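
To make the 0-to-1 range concrete, here is a minimal sketch of how a Gini coefficient can be computed from a list of incomes. The function name `gini` and the sample incomes are illustrative assumptions, not part of the original slides; the formula is the standard rank-weighted form of the mean absolute difference.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted sum: the i-th smallest income is weighted by its rank i.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical samples: everyone earns the same vs. one person earns everything.
equal = gini([10, 10, 10, 10])
unequal = gini([0, 0, 0, 40])
```

With only four people the "one person earns everything" case gives (n - 1) / n rather than exactly 1; the value approaches 1 as the population grows.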

Why: The example of the US's income inequality

"A dangerous and growing inequality has jeopardized middle-class's basic bargain, that if you work hard, you have a chance to get ahead."
"The defining challenge of our time"
(President Obama, April 2014)

A pressing issue:
1. Significant increase since 1970
2. Ranked at the 30th percentile globally
3. Rationales for such a trend?

How: Assumptions LINE

Assumption 1: Linearity
The expected value of the dependent variable is a straight-line function of each independent variable, holding the others fixed.
Assumption 2: Independence of errors
The error terms are independent (i.e. no correlation between consecutive errors).
Assumption 3: Normality
The errors follow a normal probability distribution.
Assumption 4: Equal variance
Constant variance of the errors to

Dependent variable (Y) and independent variables (X)

Dependent variable (Y):
Y: Gini coefficient

Quantitative variables (X1 to X9):
X1: GDP (PPP) Per Capita
X2: Education Index
X3: Population density
X4: Unemployment
X5: Government expenditure (as a % of GDP)
X6: Inflation rate
X7: Corruption Perception Index
X8: History of sovereignty
X9: GDP (PPP)

Qualitative variables / Dummy variables (X10 to X12):
X10: European regions
X11: Americas
X12: Asian regions

List of independent variables
X1: GDP (PPP) Per Capita

As a quantitative variable:
Data from World Bank, 2014

Explanation:
Measured by the purchasing power parity (PPP) value of all final goods and services produced within a country in a given year, divided by the average (or mid-year) population for the same year

Predictions:
The greater the economic development, the higher the Gini Coefficient
E.g. Hong Kong and Singapore (GC above 0.5)
E.g. Slovak Republic and Slovenia (GC below 0.5)

List of independent variables
X2: Education Index

As a quantitative variable:
Data from UN Human Development Reports, 2014
Proxy for distribution of educational attainment

Explanation:
"Given that human choices are infinite, it was also recognized that at all levels of development, the three essential ones (include) to acquire knowledge. If these essential choices are not available, many other opportunities remain inaccessible." (UN, 2014)

Predictions:
The greater the education index, the smaller the Gini Coefficient
E.g. Denmark (EI = 0.873; GC = 0.26)
E.g. South Africa (EI = 0.695; GC = 0.631)

List of independent variables
X5: Government expenditure (as a % of GDP)

As a quantitative variable:
Data from OECD, 2014

Explanation:
Fiscal policy and public expenditure envisage the government objective of reducing income inequality (University of Karachi, 2011)
Discernible impacts on changes in minimum wages, money supply, price reform, social welfare, etc.

Predictions:
The greater the government expenditure as a share of GDP, the smaller the Gini Coefficient
E.g. France (Gov. spending = 56.1%; GC = 0.317)
E.g. Singapore (Gov. spending = 13.8%; GC = 0.463)

Other possible factors of Gini Coefficient?

Factor: Item
Demographic: Population size; Age structure; Geographical distribution; Ethnic mix
Economic: Inflation rate; Interest rate; Trade deficit / surplus; Gross domestic product
Socio-cultural: Workforce diversity; Women in the workforce
Physical: Energy consumption; Environmental footprint; Renewable energy
Global: Newly industrialized

Multiple facets of Gini coefficient:
Precise reasons not well understood
- Low R-square?
In all likelihood
- Unexplained factors?
Interaction of multiple factors
- Interaction terms?

Results of simple regression: Scatter plots

[Figure: scatter-plot panels for X1 to X9]

Statistical information:
Mean, mode, median, range, s.d.

Variable                                   Mean          S.d.          Range         Median        Mode
Y   Gini coefficient                       0.349868517   0.098179957   0.459532542   0.32025777    #N/A
X1  GDP (PPP) Per Capita 2013              35017.925     17467.62962   84998.5       33982.65      #N/A
X2  Education Index 2013                   0.792636364   0.098806857   0.454         0.8125        0.794
X3  Population density                     442.0397727   1465.963724   7537.2        104           #N/A
X4  Unemployment                           8.688636364   4.678611199   21.6          7.75          7.9
X5  Government expenditure (% of GDP)      18.24090909   5.366793603   26.7          19.35         18.1
X6  Inflation rate                         3.102272727   1.883896091   10            2.75          2.8
X7  Corruption Perceptions Index 2013      63.65909091   18.65468727   63            68.5          81
X8  History of sovereignty (2013)          158.9318182   207.9954603   1102          75            68
X9  GDP (PPP) 2013                         1.85937E+12   3.52034E+12   1.67546E+13   4.61776E+11   #N/A

Regression Model
Dependent variable
Gini coefficient
Potentially independent variables
GDP (PPP)
History of sovereignty
Education index
Population density
Unemployment rate
Government expenditure (% of GDP)
Inflation rate
Corruption perceptions index
GDP (PPP) Per Capita

Testing the utility of the model
F statistic
$E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_9 X_9$
$H_0: \beta_1 = \beta_2 = \cdots = \beta_9 = 0$
$H_a$: At least one of the coefficients is nonzero

F-test
Regression Analysis

Regression Statistics
Multiple R           0.7207
R Square             0.5194
Adjusted R Square    0.3922
Standard Error       0.0765
Observations         44

ANOVA
              df    SS        MS        F         Significance F
Regression     9    0.2153    0.0239    4.0832    0.0013
Residual      34    0.1992    0.0059
Total         43    0.4145
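
The F statistic, R Square and Adjusted R Square in this output all follow from the sums of squares and degrees of freedom. The sketch below recomputes them from the rounded ANOVA figures on the slide; the variable names are illustrative, and small differences in the last digit come from rounding in the printed table.

```python
# Values read from the F-test slide (rounded to 4 decimals there).
n = 44                  # observations
k = 9                   # independent variables in the full model
ss_regression = 0.2153
ss_residual = 0.1992
ss_total = ss_regression + ss_residual

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)

# Overall F statistic for H0: all slope coefficients are zero.
f_stat = ms_regression / ms_residual

# Share of variation explained, then penalised for the number of regressors.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"F = {f_stat:.4f}, R^2 = {r_squared:.4f}, adj. R^2 = {adj_r_squared:.4f}")
```

The gap between R Square and Adjusted R Square is large here because nine regressors are fitted to only 44 observations.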

Potentially independent variables
PHStat: Stepwise regression
Select from 9 independent variables

GDP (PPP)
History of sovereignty
Education index
Population density
Unemployment rate
Government expenditure (% of GDP)
Inflation rate
Corruption perceptions index
GDP (PPP) Per Capita

Stepwise regression 1
Step 1 $H_0: \beta_1 = 0$
Stepwise Regression Analysis
Table of Results for Forward Selection

Education index entered.

              df    SS        MS        F          Significance F
Regression     1    0.1310    0.1310    19.4003    0.0001
Residual      42    0.2835    0.0068
Total         43    0.4145
Stepwise regression 1
Step 2 $H_0: \beta_2 = 0$
Population density entered.

              df    SS        MS        F          Significance F
Regression     2    0.1683    0.0841    14.0124    0.0000
Residual      41    0.2462    0.0060
Total         43    0.4145

                     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept             0.7703        0.0959            8.0290   0.0000     0.5765      0.9640
Education index      -0.5416        0.1198           -4.5212   0.0001    -0.7835     -0.2997
Population density                                    2.493

Stepwise regression 1
Step 3 $H_0: \beta_3 = 0$
Unemployment entered.

              df    SS        MS        F          Significance F
Regression     3    0.2032    0.0677    12.8209    0.0000
Residual      40    0.2113    0.0053
Total         43    0.4145

                     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept             0.7033        0.0937            7.5069   0.0000     0.5139      0.8926
Education index      -0.5294        0.1125           -4.7076   0.0000    -0.7567     -0.3021
Population density    0.0000        0.0000            3.2505   0.0023     0.0000      0.0000

No other variables could be entered into the model. Stepwise ends.

Stepwise regression 2
Education index entered.

              df    SS        MS        F          Significance F
Regression     1    0.1310    0.1310    19.4003    0.0001
Residual      42    0.2835    0.0068
Total         43    0.4145

                     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept             0.7926        0.1013            7.8262   0.0000     0.5882      0.9970
Education index      -0.5585        0.1268           -4.4046   0.0001    -0.8145     -0.3026

Population density entered.

Stepwise regression 2
Unemployment entered.

              df    SS        MS        F          Significance F
Regression     3    0.2032    0.0677    12.8209    0.0000
Residual      40    0.2113    0.0053
Total         43    0.4145

                     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept             0.7033        0.0937            7.5069   0.0000     0.5139      0.8926
Education index      -0.5294        0.1125           -4.7076   0.0000    -0.7567     -0.3021
Population density    0.0000        0.0000            3.2505   0.0023     0.0000      0.0000
Unemployment          0.0063        0.0025            2.5702   0.0140     0.0014      0.0113
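
One way to see why Unemployment qualifies to enter at the last step: when a single variable is added, the partial F statistic for that variable equals the square of its t statistic in the larger model. The sketch below checks this with the rounded figures from the stepwise tables; it illustrates the general identity and is not a reproduction of PHStat's internal procedure.

```python
# Regression sums of squares before and after Unemployment enters.
ssr_two_variables = 0.1683     # Education index + Population density
ssr_three_variables = 0.2032   # ... + Unemployment

# Residual mean square of the larger (three-variable) model.
sse_three_variables = 0.2113
df_residual = 40
mse_three_variables = sse_three_variables / df_residual

# Partial F: extra variation explained by the one added variable,
# measured against the residual mean square of the larger model.
partial_f = (ssr_three_variables - ssr_two_variables) / mse_three_variables

# t statistic reported for Unemployment in the three-variable model.
t_unemployment = 2.5702

print(f"partial F = {partial_f:.3f}, t^2 = {t_unemployment ** 2:.3f}")
```

The two numbers agree up to the rounding of the printed tables, which is consistent with the reported P-value of 0.0140 for Unemployment.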

Regression model
$E(Y) = 0.7033 - 0.5294 X_1 + 0.000026 X_2 + 0.0063 X_3$
$X_1$: Education index
$X_2$: Population density
$X_3$: Unemployment rate

Dummy Variables

Country                 Europe   Americas   Asia-Pacific
Australia               0        0          1
Austria                 1        0          0
Belgium                 1        0          0
Brazil                  0        1          0
Canada                  0        1          0
Chile                   0        1          0
China                   0        0          1
Colombia                0        1          0
Czech Republic          1        0          0
Denmark                 1        0          0
Estonia                 1        0          0
Finland                 1        0          0
France                  1        0          0
Germany                 1        0          0
Greece                  1        0          0
Hong Kong SAR, China    0        0          0
Hungary                 1        0          0
Iceland                 1        0          0
India                   0        0          1
Indonesia               0        0          1
Ireland                 1        0          0
Israel                  0        0          1
Italy                   1        0          0
Japan                   0        0          1
Korea, Rep.             0        0          1
Latvia                  1        0          0
Luxembourg              1        0          0
Mexico                  0        1          0
Netherlands             1        0          0
New Zealand             0        0          1
Norway                  1        0          0
Poland                  1        0          0
Portugal                1        0          0
Russian Federation      1        0          0
Singapore               0        0          0
Slovak Republic         1        0          0
Slovenia                1        0          0
South Africa            0        0          0
Spain                   1        0          0
Sweden                  1        0          0
Switzerland             1        0          0
Turkey                  1        0          0
United Kingdom          1        0          0
United States           0        1          0
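
The coding above can be sketched as a small helper that turns a region label into the three 0/1 indicators. The function name, the region labels and the "Other" label for the all-zero baseline are illustrative assumptions; the slides only show the resulting 0/1 columns.

```python
REGIONS = ("Europe", "Americas", "Asia-Pacific")


def region_dummies(region):
    """Return (Europe, Americas, Asia-Pacific) indicators for one country.

    Any region not listed in REGIONS gets all zeros and so acts as the
    baseline category against which the three dummies are compared.
    """
    return tuple(1 if region == name else 0 for name in REGIONS)


# A few countries from the table; "Other" is a hypothetical baseline label.
sample_regions = {
    "Austria": "Europe",
    "Brazil": "Americas",
    "China": "Asia-Pacific",
    "South Africa": "Other",
}
encoded = {country: region_dummies(region)
           for country, region in sample_regions.items()}
```

Keeping one category as an all-zero baseline avoids a set of dummies that always sums to 1, which would be perfectly collinear with the intercept.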

Countries Sample
[Bar chart: number of sample countries by region]
Europe: 27
Americas: 6
Asia-Pacific: 8

Regression Output
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.828957
R Square             0.68717
Adjusted R Square    0.663707
Standard Error       0.056935
Observations         44

ANOVA
              df    SS          MS          F           Significance F
Regression     3    0.284825    0.094942    29.28828    3.47E-10
Residual      40    0.129665    0.003242
Total         43    0.41449

New Model
Education index
Population density
Unemployment rate
Dummy variables

Regression output
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.916107301
R Square             0.839252587
Adjusted R Square    0.813185439
Standard Error       0.04243539
Observations         44

ANOVA
              df    SS             MS             F              Significance F
Regression     6    0.347861865    0.057976977    32.19579628    2.95621E-13
Residual      37    0.066628207    0.001800762
Total         43    0.414490071

Further analysis
VIF (Variance Inflation Factor)
>> multicollinearity

DW (Durbin-Watson Test)
>> autocorrelation
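
As a rough illustration of what the Durbin-Watson statistic measures, the sketch below computes it from a list of residuals. The slides report no residuals or DW value, so the function name and the two residual series are hypothetical examples only.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residual series (not from the slides).
alternating = durbin_watson([1, -1, 1, -1])      # sign flips every step
drifting = durbin_watson([1, 1, 1, -1, -1, -1])  # long runs of one sign
```

Values near 2 suggest no first-order autocorrelation; values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation. The test is most meaningful when the observations have a natural order, such as time.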

VIF test
Higher VIF value means higher multicollinearity.
By experience, when VIFi >= 10, there is serious multicollinearity.

Sample of VIF test in Excel:
Regression Analysis
Education Index 2013 and all other X

Regression Statistics
Multiple R           0.4005
R Square             0.1604
Adjusted R Square    0.0974
Standard Error       0.0939
Observations         44
VIF                  1.1911

VIF test result

We wish VIFi < 10; if so, the variable can be kept in our model.
VIF1, VIF2 and VIF3 are small, so these variables can be kept in the regression model.
VIF4, VIF5 and VIF6 show that the influence of multicollinearity is notable. Two of these variables should be removed because they are not independent.

Keep the dummy variable Europe
Regression Analysis

Regression Statistics
Multiple R           0.8425
R Square             0.7098
Adjusted R Square    0.6801
Standard Error       0.0555
Observations         44

ANOVA
              df    SS        MS        F          Significance F
Regression     4    0.2942    0.0736    23.8495    0.0000
Residual      39    0.1203    0.0031
Total         43    0.4145

Keep the dummy variable Europe
Regression Analysis: each variable regressed on all other X

Variable               Multiple R   R Square   Adjusted R Square   Standard Error   Observations   VIF
Education Index 2013   0.4005       0.1604     0.0974              0.0939           44             1.1911
Population density     0.3488       0.1217     0.0558              1424.4905        44             1.1385
Unemployment           0.3288       0.1081     0.0412              4.5812           44             1.1212
Europe                 0.4923       0.2423     0.1855              0.4445           44             1.3198
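
Each VIF reported here is 1 / (1 - R Square), where R Square comes from regressing that variable on all the other independent variables. The sketch below recomputes the four values from the R Square figures on the slide; the dictionary and function names are illustrative.

```python
# R Square of each auxiliary regression (variable vs. all other X),
# as printed on the slides.
auxiliary_r_squared = {
    "Education index": 0.1604,
    "Population density": 0.1217,
    "Unemployment": 0.1081,
    "Europe": 0.2423,
}


def vif(r_squared):
    """Variance inflation factor from an auxiliary-regression R Square."""
    return 1.0 / (1.0 - r_squared)


vifs = {name: vif(r2) for name, r2 in auxiliary_r_squared.items()}

# Rule of thumb used in the slides: VIF of 10 or more signals a problem.
serious = [name for name, value in vifs.items() if value >= 10]

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.4f}")
```

All four values are close to 1, so under the rule of thumb above none of the retained variables shows serious multicollinearity.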

Conclusion
Regression Analysis

Regression Statistics
Multiple R           0.8425
R Square             0.7098
Adjusted R Square    0.6801
Standard Error       0.0555
Observations         44

ANOVA
              df    SS        MS        F          Significance F
Regression     4    0.2942    0.0736    23.8495    0.0000
Residual      39    0.1203    0.0031
Total         43    0.4145

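Conclusion
Final model:
$Y = 0.5955 - 0.3286 X_1 + 0.000018 X_2 + 0.0084 X_3 - 0.1073 X_4$
$X_1$: Education index, $X_2$: Population density, $X_3$: Unemployment rate, $X_4$: Europe (dummy variable)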

Prediction of China Gini Index in 2014
Y prediction = 0.432
Y actual = 0.469 (by National Bureau of Statistics of China)
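
A prediction like this is obtained by plugging a country's values into the final model. The sketch below wraps the final-model coefficients in a function and compares one prediction with the reported actual value. The slides do not list the inputs used for China, so the inputs below are placeholders chosen only so that the sketch lands near the reported 0.432; they are assumptions, not the authors' data.

```python
def predict_gini(education_index, population_density, unemployment_rate, europe):
    """Final model from the slides; `europe` is the 0/1 dummy variable."""
    return (0.5955
            - 0.3286 * education_index
            + 0.000018 * population_density
            + 0.0084 * unemployment_rate
            - 0.1073 * europe)


# Placeholder inputs (assumed for illustration, not taken from the slides).
assumed_inputs = {
    "education_index": 0.610,
    "population_density": 145.0,
    "unemployment_rate": 4.1,
    "europe": 0,
}

predicted = predict_gini(**assumed_inputs)
actual = 0.469  # reported on the slide
error = actual - predicted

# Effect of the Europe dummy alone, all else equal.
europe_effect = (predict_gini(0.8, 100.0, 8.0, 1)
                 - predict_gini(0.8, 100.0, 8.0, 0))

print(f"predicted = {predicted:.3f}, actual = {actual:.3f}, error = {error:.3f}")
```

The dummy shifts the prediction by its coefficient, so under this model a European country is predicted to have a Gini coefficient about 0.107 lower than an otherwise identical non-European one.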

Limitation
Only for OECD countries.
Not for long-term prediction.
