Uploaded by Muwaga Musa Iganga Musa


Stat 112

1. When average Scholastic Achievement Test (SAT) scores were first published on a state-by-state basis in the United States in 1982, the huge variation in the scores was a source of great pride for some states and of consternation for others. Average scores ranged from a low of 790 (out of a possible 1,600) in South Carolina to a high of 1,088 in Iowa. Two researchers set out to determine how certain variables are associated with state SAT differences.[1] The variable SAT is the average total SAT (verbal + quantitative) score in the state, and the two explanatory variables considered are the following:

Takers: percentage of the total eligible students (high school seniors) in the state who took the exam
Expend: total state expenditure on secondary schools, expressed in hundreds of dollars per student

Response SAT
Whole Model

[Plot: Actual by Predicted (SAT Actual vs. SAT Predicted)]

Summary of Fit

RSquare                       0.808786
RSquare Adj                   0.800472
Root Mean Square Error        31.93721
Mean of Response              948.449
Observations (or Sum Wgts)    49

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       2        198456.79       99228.4   97.2841   <.0001
Error      46         46919.33        1020.0
C. Total   48        245376.12

Parameter Estimates

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   932.41448    22.16843      42.06   <.0001
EXPEND      4.2985226    1.025343       4.19   0.0001
TAKERS      -3.07411     0.2206       -13.94   <.0001

Effect Tests

Source   Nparm   DF   Sum of Squares
EXPEND       1    1         17926.44
TAKERS       1    1        198071.21

[Plot: Residual by Predicted (SAT Residual vs. SAT Predicted)]

[1] B. Powell and L.C. Steelman, "Variations in State SAT Performance: Meaningful or Misleading?", Harvard Educational Review 54(4), 1984: 389-412.

For questions (a)-(e), assume the ideal multiple linear regression model holds.

(a) For Pennsylvania, SAT=885, TAKERS=50 and EXPEND=27.98. What would you predict Pennsylvania's average SAT score to be based on knowing its TAKERS and EXPEND, but not knowing its SAT? What is the residual for Pennsylvania?

(b) Is there strong evidence that the multiple regression model provides better predictions of SAT than just using the sample mean of SAT to predict SAT? Use a test at the .05 level to justify your answer.

(c) Find an approximate 95% confidence interval for the coefficient on TAKERS.

(d) Is there strong evidence that total state expenditure (EXPEND) helps to predict a state's average SAT score once TAKERS has been taken into account? Use a test at the .05 level to justify your answer.

(e) The two states with the largest Cook's distances are Alaska and South Carolina, with Cook's distances of 2.06 and 0.18 respectively and leverages of 0.44 and 0.09 respectively. For each state (Alaska, South Carolina), answer whether it would be justified to delete the state from the analysis and report that we omitted the state and that our conclusions hold only for a reduced range of explanatory variables, not including the explanatory variables of the state.
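The arithmetic behind parts (a), (c) and (d) can be sketched from the reported estimates alone. This is a minimal illustration, not an official solution: the names are made up for the example, and the multiplier 2 is a rule-of-thumb stand-in for the exact t quantile.

```python
# Coefficients copied from the Parameter Estimates output for problem 1.
INTERCEPT = 932.41448
B_EXPEND = 4.2985226
B_TAKERS = -3.07411
SE_EXPEND = 1.025343
SE_TAKERS = 0.2206


def predict_sat(takers: float, expend: float) -> float:
    """Fitted SAT from the two-variable regression."""
    return INTERCEPT + B_EXPEND * expend + B_TAKERS * takers


# (a) Pennsylvania: TAKERS = 50, EXPEND = 27.98, observed SAT = 885.
pa_pred = predict_sat(takers=50, expend=27.98)
pa_resid = 885 - pa_pred

# (c) Approximate 95% interval: estimate +/- 2 standard errors.
# The multiplier 2 is an assumption standing in for the exact
# t quantile with 46 degrees of freedom.
ci_takers = (B_TAKERS - 2 * SE_TAKERS, B_TAKERS + 2 * SE_TAKERS)

# (d) t ratio for EXPEND given TAKERS; compare |t| with roughly 2.
t_expend = B_EXPEND / SE_EXPEND

print(f"predicted SAT: {pa_pred:.2f}, residual: {pa_resid:.2f}")
print(f"approx. 95% CI for TAKERS: ({ci_takers[0]:.3f}, {ci_takers[1]:.3f})")
print(f"t ratio for EXPEND: {t_expend:.2f}")
```

Under these inputs the prediction is about 899, so Pennsylvania's residual is about -14, and the interval for TAKERS lies entirely below zero.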

(f) Suppose we want to use either Takers or Log(Takers) in the multiple regression. Based on the information below, which of these two forms would you choose? Explain.

Bivariate Fit of SAT By TAKERS

[Plot: SAT vs. TAKERS with fitted line]

Linear Fit

SAT = 1020.3062 - 2.7599621 TAKERS

Summary of Fit

RSquare                       0.735838
RSquare Adj                   0.730335
Root Mean Square Error        36.79525
Mean of Response              947.94
Observations (or Sum Wgts)    50

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model       1        181024.09        181024   133.7066   <.0001
Error      48         64986.73          1354
C. Total   49        246010.82

Parameter Estimates

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   1020.3062    8.139082     125.36   <.0001
TAKERS      -2.759962    0.238686     -11.56   <.0001

[Plot: Residual vs. TAKERS for the linear fit]

SAT = 1112.2477 - 59.018822 Log(TAKERS)

Summary of Fit

RSquare                       0.810762
RSquare Adj                   0.80682
Root Mean Square Error        31.14298
Mean of Response              947.94
Observations (or Sum Wgts)    50

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model       1        199456.33        199456   205.6494   <.0001
Error      48         46554.49           970
C. Total   49        246010.82

Parameter Estimates

Term          Estimate     Std Error   t Ratio   Prob>|t|
Intercept     1112.2477    12.27496      90.61   <.0001
Log(TAKERS)   -59.01882    4.11554      -14.34   <.0001

[Plot: Residual vs. TAKERS for the Log(TAKERS) fit]
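One way to compare the two candidate forms numerically is to recompute R-squared and the root mean square error from the reported sums of squares. The sketch below does only that; the dictionary layout is an illustrative choice, and the numeric summaries are just part of the comparison, since the residual plots should also be checked for curvature.

```python
# Sums of squares copied from the two Analysis of Variance tables.
SS_TOTAL = 246010.82
N_OBS = 50

fits = {
    "TAKERS": {"ss_error": 64986.73},
    "Log(TAKERS)": {"ss_error": 46554.49},
}

for name, fit in fits.items():
    # R^2 = 1 - SSE / SST; RMSE = sqrt(SSE / (n - 2)) with one predictor.
    fit["r2"] = 1 - fit["ss_error"] / SS_TOTAL
    fit["rmse"] = (fit["ss_error"] / (N_OBS - 2)) ** 0.5
    print(f"{name:12s} R^2 = {fit['r2']:.4f}  RMSE = {fit['rmse']:.2f}")

# The form with the smaller RMSE fits the observed scores more closely.
better = min(fits, key=lambda k: fits[k]["rmse"])
print("smaller RMSE:", better)
```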

2. The number of car accidents on a particular stretch of highway seems to be related to the number of vehicles that travel over it and the speed at which they are traveling. A city alderman has decided to ask the county sheriff to provide him with statistics covering the last few years, with the intention of examining these data statistically so that he can (if possible) introduce new speed laws that will reduce traffic accidents. Using the number of accidents as the response variable, he obtains estimates of the number of cars passing along a stretch of road (subtracted from the mean number of cars passing along a stretch of the road) and their average speeds (in miles per hour, subtracted from the mean average speed) for 60 randomly selected days.

(a) JMP output from simple linear regressions of (i) Accidents on Speed and (ii) Cars on Speed is shown below. Would you expect the estimated coefficient on Speed to increase, decrease or stay the same in a multiple linear regression of Accidents on Speed and Cars, as compared to the estimated coefficient on Speed in the simple linear regression of Accidents on Speed? Justify your answer using the omitted variable bias formula.

(i) Accidents on Speed

RSquare                       0.021001
RSquare Adj                   0.004122
Root Mean Square Error        2.430355
Mean of Response              7.033333
Observations (or Sum Wgts)    60

Parameter Estimates

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   -8.018052    13.49733      -0.59   0.5548
Speed       0.2508495    0.224888       1.12   0.2693

(ii) Cars on Speed

RSquare                       0.003515
RSquare Adj                   -0.01367
Root Mean Square Error        1.222004
Mean of Response              9.935
Observations (or Sum Wgts)    60

Parameter Estimates

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   13.003931    6.786575       1.92   0.0603
Speed       -0.051147    0.113076      -0.45   0.6527
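The omitted variable bias formula ties the two Speed slopes together: the simple-regression slope equals the multiple-regression Speed coefficient plus the multiple-regression Cars coefficient times the slope of Cars on Speed. The sketch below rearranges that identity. The Cars coefficient in the two-variable regression is not given in the output, so the values tried here are purely hypothetical.

```python
# Slopes copied from the two simple regressions in the output above.
B_SPEED_SIMPLE = 0.2508495        # Accidents on Speed
DELTA_CARS_ON_SPEED = -0.051147   # Cars on Speed


def implied_multiple_speed_coef(b_cars: float) -> float:
    """Speed coefficient implied by the omitted variable bias identity.

    b_cars is the Cars coefficient in the multiple regression of
    Accidents on Speed and Cars. It is NOT reported in the output,
    so any value passed here is hypothetical.
    """
    return B_SPEED_SIMPLE - b_cars * DELTA_CARS_ON_SPEED


# Hypothetical illustration: if more cars mean more accidents (b_cars > 0),
# the negative Cars-on-Speed slope pulls the simple-regression slope down,
# so the multiple-regression Speed coefficient would be larger.
for b_cars in (0.0, 0.5, 1.0):
    print(b_cars, round(implied_multiple_speed_coef(b_cars), 4))
```

The direction of the change depends only on the sign of the product of the Cars coefficient and the Cars-on-Speed slope, which is the point the question asks you to argue.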

(b) JMP output from a multiple linear regression of Accidents on Cars, Speed and Cars*Speed is shown below. Is there strong evidence of an interaction between Cars and Speed? Justify your answer using a test at the .05 level.

Response Accidents

Summary of Fit

RSquare                       0.743622
RSquare Adj                   0.729887
Root Mean Square Error        1.265725
Mean of Response              7.033333
Observations (or Sum Wgts)    60

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       3        260.21801       86.7393   54.1424   <.0001
Error      56         89.71533        1.6021
C. Total   59        349.93333

Parameter Estimates

Term         Estimate     Std Error   t Ratio   Prob>|t|
Intercept    7.1405117    0.163638      43.64   <.0001
Cars         0.4158119    0.136049       3.06   0.0034
Speed        0.0644162    0.118519       0.54   0.5889
Cars*Speed   1.0763228    0.087791      12.26   <.0001
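With an interaction term in the model, the slope on Speed is no longer a single number: it is the Speed coefficient plus the Cars*Speed coefficient times the (centered) Cars value. The sketch below recomputes the interaction t ratio and evaluates that slope at a few hypothetical traffic levels; the Cars values and the "close to 2" critical value are assumptions for illustration.

```python
# Estimates copied from the Parameter Estimates table above.
B_SPEED = 0.0644162
B_INTERACTION = 1.0763228
SE_INTERACTION = 0.087791

# t ratio for the Cars*Speed term. With 56 error degrees of freedom the
# two-sided .05 critical value is close to 2 (an approximation, not a
# table lookup).
t_interaction = B_INTERACTION / SE_INTERACTION


def speed_slope(cars_centered: float) -> float:
    """Change in expected accidents per 1 MPH at a given centered Cars value."""
    return B_SPEED + B_INTERACTION * cars_centered


# Hypothetical centered traffic levels: one unit below, at, and above the mean.
for cars in (-1.0, 0.0, 1.0):
    change = -5 * speed_slope(cars)  # effect of a 5 MPH reduction
    print(f"Cars = {cars:+.0f}: slope = {speed_slope(cars):.3f}, "
          f"5 MPH cut changes accidents by {change:.2f}")
```

Because the interaction estimate is positive, the fitted effect of a speed change grows with the number of cars on the road.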

(c) The alderman proposes decreasing the speed limit by 5 MPH. The number of cars on the road is higher on average on weekdays than on weekends. Assuming that the average number of cars will not be changed by decreasing the speed limit and that there are no confounding variables, would you expect the decrease in the speed limit to have a larger impact on the number of accidents during the weekends or the weekdays?

3. Car designers have been experimenting with ways to improve gas mileage for many years. An important element in this research is the way in which a car's speed affects how quickly fuel is burned. Competitions whose objective is to drive the farthest on the smallest amount of gas have determined that low speeds and high speeds are inefficient. Designers would like to know which speed burns gas most efficiently. As an experiment, 50 identical cars are driven at different speeds and the gas mileage is measured.

(a) JMP output from a simple linear regression model of Mileage on Speed is shown below. Comment on the regression diagnostics: the residual plot, the histogram of the residuals and the boxplot of the Cook's distances. If you see any problems, suggest what you would do next in the analysis to try to address those problems.

Bivariate Fit of Mileage By Speed

[Plot: Mileage vs. Speed with fitted line]

Linear Fit

Mileage = 23.266776 - 0.0012701 Speed

Summary of Fit

RSquare                       0.000028
RSquare Adj                   -0.02081
Root Mean Square Error        7.102586
Mean of Response              23.202
Observations (or Sum Wgts)    50

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       1           0.0672        0.0672   0.0013    0.9710
Error      48        2421.4426       50.4467
C. Total   49        2421.5098

Parameter Estimates

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   23.266776    2.039431      11.41   <.0001
Speed       -0.00127     0.034802      -0.04   0.9710

[Plots: residual plot, histogram of the residuals, boxplot of the Cook's distances]

(b) JMP output for a quadratic regression of mileage on speed and speed squared is shown below. Is there strong evidence that the quadratic regression provides better predictions of mileage based on speed than the simple linear regression? Justify your answer using a test at the .05 level.

[Plot: Actual by Predicted (Mileage Actual vs. Mileage Predicted), P<.0001 RSq=0.71 RMSE=3.8637]

Summary of Fit

RSquare                       0.710249
RSquare Adj                   0.697919
Root Mean Square Error        3.863732
Mean of Response              23.202
Observations (or Sum Wgts)    50

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       2        1719.8740       859.937   57.6040   <.0001
Error      47         701.6358        14.928
C. Total   49        2421.5098

Parameter Estimates

Term            Estimate     Std Error   t Ratio   Prob>|t|
Intercept       9.3413673    1.70707        5.47   <.0001
Speed           0.8021188    0.077207      10.39   <.0001
Speed Squared   -0.007876    0.000734     -10.73   <.0001

[Plot: Residual by Predicted (Mileage Residual vs. Mileage Predicted)]

[Plot: Leverage plot for Speed, P<.0001]

[Plot: Leverage plot for Speed Squared, P<.0001]
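Whether the quadratic term improves on the simple linear fit can be checked with a partial F statistic built from the two error sums of squares, which for a single added term matches the squared t ratio on Speed Squared. The sketch below recomputes both from the reported output; the variable names are illustrative, and the statistic still has to be compared with an F(1, 47) critical value (roughly 4 at the .05 level, quoted here as an approximation).

```python
# Error sums of squares copied from the two Analysis of Variance tables.
SSE_LINEAR = 2421.4426      # Mileage on Speed, 48 error df
SSE_QUADRATIC = 701.6358    # Mileage on Speed and Speed Squared, 47 error df
DF_QUADRATIC = 47
TERMS_ADDED = 1

# Partial F: drop in SSE per added term, divided by the quadratic
# model's mean squared error.
mse_quadratic = SSE_QUADRATIC / DF_QUADRATIC
partial_f = (SSE_LINEAR - SSE_QUADRATIC) / TERMS_ADDED / mse_quadratic

# With one added term, the partial F equals the squared t ratio of
# Speed Squared (reported as -10.73), up to rounding in the output.
t_speed_squared = -0.007876 / 0.000734

print(f"partial F = {partial_f:.1f}, t^2 = {t_speed_squared ** 2:.1f}")
```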

(c) Suppose you are low on gas. Which speed does the quadratic regression model suggest is best to drive at: 20 MPH, 50 MPH or 70 MPH? Justify your answer.
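A quick way to compare the three candidate speeds is to plug each into the fitted quadratic and also locate the vertex of the parabola. This is a minimal sketch using the reported estimates; the function and variable names are made up for the example.

```python
# Quadratic fit copied from the Parameter Estimates table.
B0, B1, B2 = 9.3413673, 0.8021188, -0.007876


def predicted_mileage(speed: float) -> float:
    """Fitted mileage at a given speed (MPH) under the quadratic model."""
    return B0 + B1 * speed + B2 * speed ** 2


candidates = [20, 50, 70]
predictions = {s: predicted_mileage(s) for s in candidates}
best_speed = max(predictions, key=predictions.get)

# Vertex of the fitted parabola: the speed where predicted mileage peaks.
peak_speed = -B1 / (2 * B2)

for s, m in predictions.items():
    print(f"{s} MPH: predicted mileage {m:.2f}")
print(f"best candidate: {best_speed} MPH, fitted peak near {peak_speed:.1f} MPH")
```

Since the coefficient on Speed Squared is negative, the fitted curve opens downward and the candidate closest to the vertex gives the highest predicted mileage.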
