Remarks:
Work in teams (2 persons/team). Each team should solve a different application (with a
different database). Print the text of the application (requirements and database).
Copy the database into Excel.
Process the data in Excel (using Data/Data Analysis/Regression) and get the Output
tables.
Print the Output tables.
Write down the explanations for each requirement of the application (handwritten on
paper or typed in a Word file).
Deadline: last seminar in January.
TEAM 1
Aim of study: the relationship between the turnover value of supermarkets, number of families in
the neighborhood and commercial area of the supermarkets.
Data recorded for 30 supermarkets owned by a businessman:
Turnover value (million lei) Number of families Commercial area (hundred m2)
483 68 46
411 83 33
422 48 38
410 68 26
369 38 22
198 10 21
209 35 26
197 55 14
156 25 10
85 28 12
187 43 20
43 15 5
211 33 28
120 23 9
62 24 26
176 45 10
117 20 8
273 56 36
408 82 31
419 47 36
407 67 24
366 37 20
295 40 22
397 55 30
253 27 15
421 45 38
330 35 19
272 16 16
386 57 20
327 32 18
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 95%
confidence level? (critical value: 3,35).
c. Test the significance of the model parameters (critical value: 2,05).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of turnover value is explained by the regression
model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between the Number of families
and Turnover value. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the turnover value for a supermarket with 33 hundred m2, placed in a
neighborhood where 60 resident families live.
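Teams that want to cross-check the Excel Regression output for questions a and h could use a minimal Python sketch such as the one below. It uses only the standard library and assumes the 30 rows are entered exactly as listed in the table above. The helper names (`parse_rows`, `cross`, `fit_two_predictors`) are illustrative and not part of the assignment. The sketch solves the least-squares normal equations for two predictors in closed form, so its coefficients should agree with Excel's up to rounding.

```python
# Team 1 data as listed above: turnover (million lei), families, area (hundred m2).
RAW = """
483 68 46
411 83 33
422 48 38
410 68 26
369 38 22
198 10 21
209 35 26
197 55 14
156 25 10
85 28 12
187 43 20
43 15 5
211 33 28
120 23 9
62 24 26
176 45 10
117 20 8
273 56 36
408 82 31
419 47 36
407 67 24
366 37 20
295 40 22
397 55 30
253 27 15
421 45 38
330 35 19
272 16 16
386 57 20
327 32 18
"""


def parse_rows(raw):
    """Split whitespace-separated rows into columns of floats."""
    rows = [[float(v) for v in line.split()] for line in raw.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Least-squares b0, b1, b2 for y = b0 + b1*x1 + b2*x2."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


turnover, families, area = parse_rows(RAW)
b0, b1, b2 = fit_two_predictors(turnover, families, area)

# Question h: 60 resident families, 33 hundred m2 of commercial area.
predicted = b0 + b1 * 60 + b2 * 33
print(f"turnover = {b0:.3f} + {b1:.3f}*families + {b2:.3f}*area")
print(f"predicted turnover: {predicted:.1f} million lei")
```

Each partial coefficient is read as the change in turnover (million lei) for a one-unit change in that variable while the other variable is held constant.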
TEAM 2
Aim of study: the propensity of a mobile phone company's clients to give up the company's
services, depending on the average monthly bill value and the seniority in company
service.
Data recorded for 40 clients of the mobile phone company:
Propensity to give up the company services (points) The average monthly bill value (lei) Seniority in company service (years)
64,12 69,06 1,26
64,85 81,09 2,2
68,48 82,23 2,58
58,77 85,59 3,45
83,45 172,78 1,38
60,77 50,85 3,23
57,48 78,82 2,75
71,99 93 1,44
56,79 46,26 3,25
53,69 62,45 4,49
55,66 49,77 3,45
54,84 60,34 3,13
63,72 47,9 2,42
78,29 162,64 1,23
62,23 59,45 2,28
61,65 70,17 2,1
73,47 170,86 1,57
66,03 84,55 2,1
63,96 59,33 3,36
56,65 70,68 3,2
60,47 58,89 3,87
71,47 72,38 1,6
71,04 69,78 3,02
67,39 61,74 2,52
70,94 100,93 1,38
62,74 65,21 2,4
51,09 84,49 4,36
64,98 48,43 2,5
68,06 98,12 3,02
66,04 71,42 2,66
61,12 69,4 2,67
66,41 56,23 2,96
59,25 55,03 2,96
70,41 65,83 1,18
77,37 184,14 1,46
62,63 37 3,3
61,02 51,95 1,92
72,11 166,92 1,34
56,81 44,72 2,51
52,65 61,93 3,42
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 95%
confidence level? (critical value: 3,25).
c. Test the significance of the model parameters (critical value: 2,026).
d. Find and interpret the confidence intervals for the model parameters.
e. Compute and interpret the coefficient of determination.
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Find out the Correlation Matrix (use Data/Data Analysis/Correlation). Explain the main
diagonal values.
h. Predict the propensity to give up the company services for a client with 4 years seniority
in the company service and a 65 lei monthly bill.
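The correlation matrix in question g can also be reproduced outside Excel. The sketch below is one possible way, using only the Python standard library. It keeps the decimal commas of the table above and converts them while parsing. The names `parse_rows` and `correl` are illustrative; `correl` computes the Pearson coefficient, the quantity that Excel's CORREL function returns.

```python
from math import sqrt

# Team 2 data as listed above, decimal commas kept:
# propensity (points), average monthly bill (lei), seniority (years).
RAW = """
64,12 69,06 1,26
64,85 81,09 2,2
68,48 82,23 2,58
58,77 85,59 3,45
83,45 172,78 1,38
60,77 50,85 3,23
57,48 78,82 2,75
71,99 93 1,44
56,79 46,26 3,25
53,69 62,45 4,49
55,66 49,77 3,45
54,84 60,34 3,13
63,72 47,9 2,42
78,29 162,64 1,23
62,23 59,45 2,28
61,65 70,17 2,1
73,47 170,86 1,57
66,03 84,55 2,1
63,96 59,33 3,36
56,65 70,68 3,2
60,47 58,89 3,87
71,47 72,38 1,6
71,04 69,78 3,02
67,39 61,74 2,52
70,94 100,93 1,38
62,74 65,21 2,4
51,09 84,49 4,36
64,98 48,43 2,5
68,06 98,12 3,02
66,04 71,42 2,66
61,12 69,4 2,67
66,41 56,23 2,96
59,25 55,03 2,96
70,41 65,83 1,18
77,37 184,14 1,46
62,63 37 3,3
61,02 51,95 1,92
72,11 166,92 1,34
56,81 44,72 2,51
52,65 61,93 3,42
"""


def parse_rows(raw):
    """Split rows on whitespace and convert decimal commas to floats."""
    rows = [[float(v.replace(",", ".")) for v in line.split()]
            for line in raw.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def correl(u, v):
    """Pearson correlation coefficient of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


names = ["propensity", "bill", "seniority"]
columns = parse_rows(RAW)
matrix = [[correl(a, b) for b in columns] for a in columns]

for name, row in zip(names, matrix):
    print(f"{name:>10}", "  ".join(f"{r:6.3f}" for r in row))
```

The main diagonal holds the correlation of each variable with itself, which is why every diagonal entry equals 1.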
TEAM 3
Aim of study: the relationship between the number of TV sets owned by the clients of an
electrical household appliances store, household size and the average monthly income.
Data recorded for 25 clients:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Test the validity of the regression model, at 5% significance level (critical value: 3,44).
c. Test the significance of the model parameters (critical value: 2,07).
d. Find and interpret the confidence intervals for the model parameters.
e. Measure the relative influence of the household size and average monthly income on the
variability of the number of TV sets, using the coefficient of determination. What percentage
of the total variability of the number of TV sets is not explained by the regression model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between the Household size and
the Number of TV sets. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the number of TV sets owned by a client who has an average monthly income of 75
hundred lei and comes from a 4-person household.
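The critical values quoted in questions b and c follow from the sample size. As a sketch, assuming the usual linear model with n = 25 observations and k = 2 explanatory variables (the symbols below are notational assumptions, not part of the assignment text), the two tests can be written as:

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i, \qquad i = 1, \dots, 25 \\
H_0 &: \beta_1 = \beta_2 = 0, \qquad
F = \frac{SSR / k}{SSE / (n - k - 1)} \sim F_{2,\,22}, \qquad
\text{reject } H_0 \text{ if } F > 3{,}44 \\
H_0 &: \beta_j = 0, \qquad
t_j = \frac{b_j}{s_{b_j}} \sim t_{22}, \qquad
\text{reject } H_0 \text{ if } |t_j| > 2{,}07
\end{aligned}
```

Here SSR and SSE are the regression and residual sums of squares from the ANOVA table of the Excel output, and b_j and s_{b_j} are the estimated coefficient and its standard error.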
TEAM 4
Aim of study: the relationship between the sales value in the previous year, advertising
expenditure and the number of competitors, for the supermarkets owned by a businessman.
Data recorded for 24 supermarkets owned by the businessman:
Number of competitors Advertising expenditure (1000s Eur) Sales value (10000s Eur)
6 2,29 8,71
1 4,9 12,07
1 5,75 12,74
5 3,61 9,82
2 4,62 11,51
2 4,69 12,23
4 6,41 11,84
1 6,47 12,25
3 3,43 11,1
5 8,39 10,97
6 2,15 8,75
6 1,54 7,75
5 2,67 10,5
5 1,24 6,71
7 1,77 7,6
3 4,46 12,46
6 1,83 8,47
2 5,15 12,27
1 7,25 12,57
6 1,72 8,87
4 3,04 11,15
3 4,92 11,86
4 4,85 11,07
5 3,13 10,38
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 99%
confidence level? (critical value: 5,78).
c. Test the significance of the model parameters (critical value: 2,83).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of sales value is not explained by the regression
model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Find and display the Correlation Matrix (use Data/Data Analysis/Correlation). Explain
the values on the main diagonal.
h. Predict the sales value for a supermarket, if the advertising expenditures are 9,5 thousand
Eur and there are 5 other similar supermarkets in the neighborhood.
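Questions b and e both rest on the decomposition of the total variation into explained and residual parts. The following Python sketch (standard library only) rebuilds that decomposition for the table above and forms the F statistic. The helper names are illustrative, and `F_CRITICAL` is simply the critical value quoted in question b. It is intended as a cross-check of the ANOVA block in the Excel output, not as a replacement for it.

```python
# Team 4 data as listed above, decimal commas kept:
# competitors, advertising (1000s Eur), sales (10000s Eur).
RAW = """
6 2,29 8,71
1 4,9 12,07
1 5,75 12,74
5 3,61 9,82
2 4,62 11,51
2 4,69 12,23
4 6,41 11,84
1 6,47 12,25
3 3,43 11,1
5 8,39 10,97
6 2,15 8,75
6 1,54 7,75
5 2,67 10,5
5 1,24 6,71
7 1,77 7,6
3 4,46 12,46
6 1,83 8,47
2 5,15 12,27
1 7,25 12,57
6 1,72 8,87
4 3,04 11,15
3 4,92 11,86
4 4,85 11,07
5 3,13 10,38
"""

F_CRITICAL = 5.78  # critical value quoted in question b


def parse_rows(raw):
    """Split rows on whitespace and convert decimal commas to floats."""
    rows = [[float(v.replace(",", ".")) for v in line.split()]
            for line in raw.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


competitors, advertising, sales = parse_rows(RAW)
n, k = len(sales), 2

# Closed-form least squares for sales = b0 + b1*competitors + b2*advertising.
s11, s22 = cross(competitors, competitors), cross(advertising, advertising)
s12 = cross(competitors, advertising)
s1y, s2y = cross(competitors, sales), cross(advertising, sales)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(sales) - b1 * mean(competitors) - b2 * mean(advertising)

fitted = [b0 + b1 * c + b2 * a for c, a in zip(competitors, advertising)]
sst = cross(sales, sales)                                # total variation
ssr = sum((f - mean(sales)) ** 2 for f in fitted)        # explained variation
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))   # residual variation

f_stat = (ssr / k) / (sse / (n - k - 1))
r_squared = ssr / sst

print(f"R^2 = {r_squared:.4f}, unexplained share = {1 - r_squared:.2%}")
print(f"F = {f_stat:.2f}; exceeds the critical value {F_CRITICAL}: {f_stat > F_CRITICAL}")
```

The unexplained share asked for in question e is 1 minus the coefficient of determination.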
TEAM 5
Aim of study: the behavior of the menswear sales value of a clothing factory, depending on
the number of catalogs mailed to the customers and on the amounts spent on advertising the
products manufactured.
Data recorded for 30 retail stores of the clothing factory:
Amounts spent on advertising (thousand $) Number of catalogs mailed Menswear sales value (thousand $)
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 5%
significance level? (critical value: 3,35).
c. Test the significance of the model parameters (critical value: 2,05).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of sales value is not explained by the regression
model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between Number of catalogs
mailed and Menswear sales value. Use correlation coefficient (“CORREL” function in
Excel).
h. Predict the sales value for a retail store, if the amount spent on advertising was 4000
thousand $ and there were 5000 catalogs mailed to the customers.
TEAM 6
Aim of study: the relationship between the seniority at current job, household income and
credit-card debt for a bank's customers.
Data recorded for 30 customers of the bank, randomly drawn:
Household income (lei) Seniority at current job (years) Credit-card debt (lei)
12240 23 200,6
10370 6 95,2
4420 0 17
8840 22 195,5
7310 17 100,3
4420 3 73,1
4590 8 68
2720 1 40,8
5440 0 363,8
11730 9 120,7
10880 25 161,5
9860 12 523,6
6290 2 34
27720 16 1731,2
5170 11 231,2
9350 15 146,2
20300 15 452,2
4760 2 304,3
4250 5 66,3
11390 20 651,1
6460 12 22,1
3230 3 231,2
4250 0 472,6
2720 0 30,6
3910 4 42,5
10880 24 668,1
4930 6 292,4
17000 22 629
8330 9 139,4
6970 13 496,4
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 1%
significance level? (critical value: 5,48).
c. Test the significance of the model parameters (critical value: 2,77).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of credit-card debt is explained by the regression
model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Find the Correlation Matrix (use Data/Data Analysis/Correlation). Explain the main
diagonal values.
h. Predict the credit-card debt for a customer with 15 years seniority at the current job and a
household income of 7500 lei.
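For questions c and d, the t statistics and confidence intervals in the Excel output can be cross-checked with a short Python sketch (standard library only). It assumes the table above is entered as listed and uses the critical value 2,77 quoted in question c, which corresponds to the 1% significance level. An interval at another confidence level would need a different critical value. The variable and helper names are illustrative.

```python
from math import sqrt

# Team 6 data as listed above, decimal commas kept:
# household income (lei), seniority (years), credit-card debt (lei).
RAW = """
12240 23 200,6
10370 6 95,2
4420 0 17
8840 22 195,5
7310 17 100,3
4420 3 73,1
4590 8 68
2720 1 40,8
5440 0 363,8
11730 9 120,7
10880 25 161,5
9860 12 523,6
6290 2 34
27720 16 1731,2
5170 11 231,2
9350 15 146,2
20300 15 452,2
4760 2 304,3
4250 5 66,3
11390 20 651,1
6460 12 22,1
3230 3 231,2
4250 0 472,6
2720 0 30,6
3910 4 42,5
10880 24 668,1
4930 6 292,4
17000 22 629
8330 9 139,4
6970 13 496,4
"""

T_CRITICAL = 2.77  # critical value quoted in question c


def parse_rows(raw):
    """Split rows on whitespace and convert decimal commas to floats."""
    rows = [[float(v.replace(",", ".")) for v in line.split()]
            for line in raw.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


income, seniority, debt = parse_rows(RAW)
n, m1, m2 = len(debt), mean(income), mean(seniority)

# Closed-form least squares for debt = b0 + b1*income + b2*seniority.
s11, s22, s12 = cross(income, income), cross(seniority, seniority), cross(income, seniority)
s1y, s2y = cross(income, debt), cross(seniority, debt)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(debt) - b1 * m1 - b2 * m2

# Residual variance with n - 3 degrees of freedom (three estimated parameters).
sse = sum((y - (b0 + b1 * x1 + b2 * x2)) ** 2
          for y, x1, x2 in zip(debt, income, seniority))
s2 = sse / (n - 3)

# Standard errors: diagonal of s2 * (X'X)^-1, written out for two predictors.
se_b1 = sqrt(s2 * s22 / det)
se_b2 = sqrt(s2 * s11 / det)
se_b0 = sqrt(s2 * (1 / n + (m1 ** 2 * s22 - 2 * m1 * m2 * s12 + m2 ** 2 * s11) / det))

results = {}
for name, coef, se in [("intercept", b0, se_b0), ("income", b1, se_b1),
                       ("seniority", b2, se_b2)]:
    t_stat = coef / se
    low, high = coef - T_CRITICAL * se, coef + T_CRITICAL * se
    results[name] = (coef, se, t_stat, low, high)
    verdict = "significant" if abs(t_stat) > T_CRITICAL else "not significant"
    print(f"{name:>9}: b = {coef:.4f}, t = {t_stat:.3f}, "
          f"CI = [{low:.4f}, {high:.4f}] ({verdict})")
```

A parameter is judged significant when the absolute value of its t statistic exceeds the critical value, which is equivalent to its confidence interval not containing zero.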
TEAM 7
Aim of study: the relationship between the sales value of mobile phones, the number of phone
lines opened for customers’ orders and the number of service units.
Data recorded for 30 mobile phone stores:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 10%
significance level? (critical value: 2,51).
c. Test the significance of the model parameters (critical value: 1,703).
d. Find and interpret the confidence intervals for the model parameters.
e. To what extent is the total variability of sales value explained by the regression model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between Sales value and
Number of service units. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the sales value if there are 30 service units and 40 phone lines opened for
customers’ orders.
TEAM 8
Aim of study: the behavior of Number of visits to health care providers (rate per 10,000
inhabitants), depending on the Health care funding (amount per 100 inhabitants) and Reported
diseases (rate per 10,000 inhabitants).
Data recorded for 20 towns, in the previous year:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Test the significance (validity) of the regression model, at 5% significance level (critical
value: 3,59).
c. Test the significance of the model parameters (critical value: 2,11).
d. Find and interpret the confidence intervals for the model parameters.
e. To what extent is the total variability of Visits to health care providers explained by the
regression model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Obtain the Correlation Matrix (use Data/Data Analysis/Correlation). Explain the values
on the main diagonal.
h. Predict the visits to health care providers if the health care funding amounted to 200 units
per 100 inhabitants and 210 disease cases per 10,000 inhabitants were reported.
TEAM 9
Aim of study: the behavior of Sale price of houses, depending on the Number of days between
announcing the house sale and closing the deal, on one hand, and on Living area, on the other
hand.
Data recorded for 25 houses, sold by a real estate agency:
Sale price (thousand dollars) Number of days between announcing the house sale and closing the deal House living area (m2)
274 85 244
265 61 151
254 1 247
229 13 158
250 16 209
335 91 298
321 54 261
300 13 306
325 96 282
210 18 210
416 62 411
342 18 394
347 133 365
284 103 250
290 104 259
294 46 366
235 74 233
250 10 185
290 13 269
247 34 289
232 15 290
278 66 240
222 73 162
265 55 218
300 85 273
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 95%
confidence level? (critical value: 3,44).
c. Test the significance of the model parameters (critical value: 2,074).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of selling price is determined by the influence of the
two explanatory variables?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between Sale price and House
living area. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the selling price of a house with a living area of 300 m2 that was sold 50 days after
the sale announcement.
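For questions f and g, one possible indicator is the multiple correlation coefficient (reported as "Multiple R" in the Excel Regression output). With exactly two explanatory variables it can be obtained from the three pairwise correlation coefficients, as the Python sketch below illustrates using only the standard library. The helper names are illustrative, and `F_CRITICAL` is the critical value quoted in question b, reused here on the assumption that the significance of the multiple correlation is tested with the same F distribution.

```python
from math import sqrt

# Team 9 data as listed above:
# sale price (thousand dollars), days until the deal closed, living area (m2).
RAW = """
274 85 244
265 61 151
254 1 247
229 13 158
250 16 209
335 91 298
321 54 261
300 13 306
325 96 282
210 18 210
416 62 411
342 18 394
347 133 365
284 103 250
290 104 259
294 46 366
235 74 233
250 10 185
290 13 269
247 34 289
232 15 290
278 66 240
222 73 162
265 55 218
300 85 273
"""

F_CRITICAL = 3.44  # critical value quoted in question b


def parse_rows(raw):
    """Split whitespace-separated rows into columns of floats."""
    rows = [[float(v) for v in line.split()] for line in raw.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def correl(u, v):
    """Pearson correlation coefficient of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


price, days, area = parse_rows(RAW)
n, k = len(price), 2

r_y1 = correl(price, days)   # sale price vs. number of days
r_y2 = correl(price, area)   # sale price vs. living area (question g)
r_12 = correl(days, area)    # correlation between the two explanatory variables

# Coefficient of determination for two predictors, from pairwise correlations.
r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
multiple_r = sqrt(r_squared)

# F statistic for testing the significance of the multiple correlation.
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(f"r(price, area) = {r_y2:.3f}")
print(f"multiple R = {multiple_r:.3f}, R^2 = {r_squared:.3f}")
print(f"F = {f_stat:.2f}; exceeds the critical value {F_CRITICAL}: {f_stat > F_CRITICAL}")
```

The sign of `r_y2` gives the direction of the relationship in question g and its absolute value gives the strength. The multiple R is never smaller than the absolute value of either pairwise correlation with the dependent variable.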
TEAM 10
Aim of study: the behavior of extra weight, depending on the number of cigarettes smoked per
day (over the past 30 days) and on the age at which the first cigarette was smoked.
Data recorded for 30 smokers randomly drawn:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 90%
confidence level? (critical value: 2,45).
c. Test the significance of the model parameters (critical value: 1,687).
d. Find and interpret the confidence intervals for the model parameters.
e. Compute and interpret the coefficient of determination.
f. Analyze the direction and the strength of the relationship between the three variables,
using an appropriate statistical indicator. Test its significance.
g. Get the Correlation Matrix (use Data/Data Analysis/Correlation). Explain the values on
the main diagonal.
h. Predict a person’s extra weight, if he started to smoke when he was 15 years old and
smoked 3 cigarettes per day over the last 30 days.
TEAM 11
Aim of study: the behavior of additional income earned by employees after completing a training
course, depending on their age and household size.
Data recorded for 35 employees randomly selected:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 95%
confidence level? (critical value: 3,3).
c. Test the significance of the model parameters (critical value: 2,037).
d. Find and interpret the confidence intervals for the model parameters.
e. Compute and interpret the coefficient of determination.
f. Analyze the direction and the strength of the relationship between the three variables,
using an appropriate statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between Additional income and
Household size. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the additional income that might be earned by an employee aged 35, whose
household includes 4 persons.
TEAM 12
Aim of study: the behavior of monthly income depending on work experience and expertise level
(ranging from 1 to 40), for a company's employees.
Data recorded for 40 employees:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Is there enough evidence to conclude that the regression model is valid, at 95%
confidence level? (critical value: 3,25).
c. Test the significance of the model parameters (critical value: 2,026).
d. Find and interpret the confidence intervals for the model parameters.
e. What percentage of the total variability of monthly income is not explained by the regression
model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Analyze the direction and the strength of the relationship between Monthly income and
Work experience. Use correlation coefficient (“CORREL” function in Excel).
h. Predict the monthly income of an employee with 15 years work experience and an
expertise level of 25.
TEAM 13
Aim of study: the relationship between the insured persons’ intelligence (measured by the score
obtained in an intelligence test), age (years old) and the number of major car crashes, over the
last 10 years.
Data recorded for 50 clients of an auto insurance company:
Process the data in Excel (Data/Data Analysis/Regression) and answer the following questions:
a. Identify the variables, the linear regression equation and interpret the partial regression
coefficients.
b. Test the validity of the regression model, at 5% significance level (critical value: 3,195).
c. Test the significance of the model parameters (critical value: 2,012).
d. Find and interpret the confidence intervals for the model parameters.
e. Measure the relative influence of the insured persons’ intelligence and age on the
variability of the number of car accidents, using the coefficient of determination. What
percentage of the total variability of the number of car accidents is not explained by the
regression model?
f. Measure the strength of the relationship between the three variables, using an appropriate
statistical indicator. Test its significance.
g. Get the Correlation Matrix (use Data/Data Analysis/Correlation). Explain the values on
the main diagonal.
h. Predict the number of car accidents for a client aged 25, who obtained 60 points on the
intelligence test.