
Basic Econometrics of Outcomes Research
Jim Lightwood
May 10, 2011
UCSF Epi 211
Performance Measurement
Econometrics of Outcome Measurement, Epi 211

Outline of Lecture
Cost Analysis
  Basics of cost analysis in US health care
Dealing with Uncertainty: When and How
  Variance Estimation Concepts
  Sensitivity Analysis
Statistical Methods
  Summary statistics
  Regression Methods
  Predictive Data Mining
  Simulation models
Questions and Discussion (hopefully, last 20 minutes)

Important Issues
Description, versus prediction of change due to intervention
  Describe situation with no intervention
  Predict what will change if you do intervene
Point estimation versus interval estimation
  Is point estimate of parameter (mean, median) enough?
  Do you need interval estimates (confidence interval of mean or distribution, interquartile range)?

Cost (Price), Real Output, and Expenditure
Need to keep three distinct concepts in mind
  Cost per unit (Price): value of a unit of a real good or service
    (e.g., the cost of treating one person for stroke)
  Real Input: independently measurable unit of real production that goes into providing a unit of service
    (e.g., measurable resources used as inputs for treatment following stroke, attributable to the stroke)
  Real Expenditure per unit = Cost per unit * Real Input per unit
  Total Real Expenditure: sum of Real Expenditure per unit over all units attributable to disease or risk factor
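
The identity Real Expenditure = Cost per unit * Real Input can be sketched in a few lines. All resource names, quantities, and prices below are hypothetical and chosen only to make the arithmetic visible; they are not from the lecture.

```python
# Hypothetical inputs for one stroke episode:
# (resource, real input in units, cost per unit in dollars).
episode_inputs = [
    ("inpatient day", 5, 2000.0),
    ("imaging study", 2, 800.0),
    ("rehab session", 10, 150.0),
]

def real_expenditure(units, cost_per_unit):
    """Real expenditure for one resource = cost per unit * real input."""
    return units * cost_per_unit

# Expenditure per episode: sum over the resources used in that episode.
expenditure_per_episode = sum(real_expenditure(u, c) for _, u, c in episode_inputs)

# Total real expenditure: sum over all episodes attributable to the disease.
episodes_attributable = 40  # assumed count, for illustration only
total_real_expenditure = expenditure_per_episode * episodes_attributable

print(expenditure_per_episode, total_real_expenditure)
```

Keeping price and quantity separate makes it possible to revalue the same real inputs at a different set of unit costs.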

Total Costs
Fixed Costs (FC)
  Fixed over period of analysis
  Overhead costs
Variable Costs (VC)
  Flow costs of goods and services as function of flow of units produced
Total Costs (FC + VC)
  Sum of FC and VC
  Must not decrease as units produced increases
[Figure: Variable Cost, Fixed Cost, and Total Cost (dollars, 0 to 1800) versus episodes of care per year (0 to 100)]

Marginal and Average Costs
Marginal Total Cost
  Change in total cost of producing one more unit of service
Average Variable Cost
  (Variable Cost / Units of service produced)
Average Total Cost
  (Total Cost / Units of service produced)
[Figure: Marginal Cost, Average Variable Cost, and Average Total Cost (dollars, 0 to 20) versus episodes of care per year (0 to 100)]
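
A minimal sketch of the three cost measures defined above. The fixed cost and the quadratic variable cost function are assumptions made up for illustration; they are not the curves plotted on the slide.

```python
FIXED_COST = 500.0  # assumed fixed cost per year, dollars

def variable_cost(q):
    """Assumed variable cost of producing q episodes of care."""
    return 5.0 * q + 0.05 * q ** 2

def total_cost(q):
    """Total cost = fixed cost + variable cost."""
    return FIXED_COST + variable_cost(q)

def marginal_cost(q):
    """Change in total cost from producing one more unit, starting at q."""
    return total_cost(q + 1) - total_cost(q)

def average_variable_cost(q):
    return variable_cost(q) / q

def average_total_cost(q):
    return total_cost(q) / q

for q in (20, 50, 100):
    print(q, marginal_cost(q), average_variable_cost(q), average_total_cost(q))
```

With these assumed numbers, average total cost falls as the fixed cost is spread over more episodes, while marginal cost rises with volume.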

Market Economics Rationale for Using Marginal Cost = Price
In competitive market, over long run
  Firm will produce until marginal cost = average total cost
  Market price will be equal to marginal cost
Use market price to measure cost of unit of resource
How useful is this advice for US health care?
[Figure: Marginal Cost, Average Variable Cost, and Average Total Cost (dollars, 0 to 20) versus episodes of care per year (0 to 100)]

Curses of High Fixed Costs
Increase FC to 10,000
Suppose your hospital is in catchment area with 200 cases per year
  Can MC pricing work?
If there is joint production (fixed costs are allocated between departments)
  Allocation of fixed costs of, say, hospital to services is arbitrary
Focus on long run average total costs
[Figure: Marginal Cost, Average Variable Cost, and Average Total Cost (dollars, 0 to 100) versus episodes of care per year (0 to 200)]

Difference Between Charges and Costs
Charges are usually the only publicly available cost data from providers
  Costs of any kind are usually proprietary, and a function of institution-specific, arbitrary algorithms
Charges are NOT usually the actual transaction prices
  Analogous to sticker price at car dealers: starting point for negotiation
What to do?

Deriving Costs from Charges
Usual approach is to derive average long run cost from charges
Critical role of cost-to-charge ratio
Usual algorithm for cost-to-charge ratio
  Assume health care cost = health care revenue for institution
  Count all funds that come in
  Subtract incidental services, profits and retained earnings to get total revenues for patient care
  Count observed outlays for health care inputs
  Divide total revenue for patient care by observed outlays
  May be available on institution, service, unit level
Cost-to-charge ratios for California hospitals recently around 50%; Maryland: 87%
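
Once a cost-to-charge ratio is in hand, applying it is a single multiplication. The sketch below uses the two statewide ratios quoted above; the billed charge is a hypothetical figure, and the function name is introduced here for illustration.

```python
# Approximate cost-to-charge ratios quoted on the slide.
COST_TO_CHARGE = {"California": 0.50, "Maryland": 0.87}

def estimated_cost(charge, ratio):
    """Approximate average cost implied by a billed charge."""
    return charge * ratio

billed_charge = 20000.0  # hypothetical charge for one hospital stay, dollars
for state, ratio in COST_TO_CHARGE.items():
    print(state, estimated_cost(billed_charge, ratio))
```

The same charge implies very different costs depending on the ratio, which is why the source of the ratio (institution, service, or unit level) matters.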

Problems with estimated costs
Estimates of value of total expenditure using variable costs depend on institutional factors that may vary widely
  Some variation due to differences in institution-specific cost accounting
Estimates using cost derived from cost-to-charge ratio depend on institutional and regional health care market factors
Comparative studies show that there is significant heterogeneity in estimates of cost of diagnosis, or disease treatment
  Stroke is the only case I've seen where different estimates seem to be randomly distributed around a single overall point estimate

Practical Tips on Cost
Some surveys and data sets provide estimates of average cost
  Medical Expenditure Panel Survey
  California OSHPD (institutional level for hospitals through financial reports)
Need to read algorithms for cost estimates carefully
  Derived from charges or from institutional cost accounting system?
Do analysis in terms of relative charges
  But these may not be comparable when comparing institutions from different health care markets

Dealing with Uncertainty: When and How
[Figure: Histogram of Cost with fitted normal curve; x-axis: Cost (400 to 1600), y-axis: Frequency (0 to 35)]
Statistic              Dollars
mean                   1047
standard deviation     278
standard error         20
median                 1039
first quartile         940
third quartile         1180
interquartile range    240
range                  750

Dealing with Uncertainty: When and How
For social decision making (country, world)
  Focus on point estimate of the mean or median cost
  Extreme economic cost-benefit analysis position is that ONLY point estimate of the mean is important
For planning for individual organization
  Standard error, or interquartile range
For organizational risk management, financial planning over short to medium term, individual patient level
  Standard deviation, or range
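
The statistics named above can be computed with the Python standard library, as in the sketch below. The cost data are simulated under an assumed normal distribution with 200 episodes; they are not the lecture data. As a side check, a standard deviation of 278 with about 200 observations gives a standard error near 278 / sqrt(200), or about 19.7, consistent with the 20 reported in the table.

```python
import math
import random
import statistics

random.seed(0)
# Simulated episode costs (assumption: roughly normal, 200 episodes).
costs = [random.gauss(1050, 280) for _ in range(200)]

mean = statistics.mean(costs)
sd = statistics.stdev(costs)            # spread of individual episodes
se = sd / math.sqrt(len(costs))         # uncertainty of the estimated mean
q1, median, q3 = statistics.quantiles(costs, n=4)
iqr = q3 - q1
value_range = max(costs) - min(costs)

print(f"mean={mean:.0f} sd={sd:.0f} se={se:.0f}")
print(f"median={median:.0f} IQR={iqr:.0f} range={value_range:.0f}")
```

The standard error shrinks as the sample grows, while the standard deviation and range describe patient-level variation that does not go away with more data.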

Point and Interval Estimates in a Regression Context
Point estimate of conditional mean
  Regression line
Interval estimate for regression line
  Confidence interval for regression line
Interval estimate for individual observations
  Prediction or forecast interval for individual observations
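
For a simple regression fitted by ordinary least squares on n observations, the two intervals take the standard textbook forms sketched below for a 95% level. The notation is introduced here for illustration and is not from the slides: x_0 is the point of interest, s is the residual standard deviation, and t is the critical value of the t distribution with n - 2 degrees of freedom.

```latex
\hat{y}_0 = \hat{\alpha} + \hat{\beta} x_0, \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2

% Confidence interval for the regression line (conditional mean) at x_0
\hat{y}_0 \pm t_{n-2,\,0.975}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}

% Prediction (forecast) interval for an individual observation at x_0
\hat{y}_0 \pm t_{n-2,\,0.975}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The extra 1 under the square root is the variance of an individual observation around the line, so the prediction interval stays wide even when n is large.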

Relevant Parts of Regression Output
The regression equation is
Cost1 = 514 + 9.63 Age

Predictor     Coef   SE Coef      T      P
Constant    513.97     31.22  16.46  0.000
Age         9.6325    0.5392  17.86  0.000

S = 100.596   R-Sq = 61.7%   R-Sq(adj) = 61.5%

Analysis of Variance
Source           DF       SS       MS       F      P
Regression        1  3229214  3229214  319.11  0.000
Residual Error  198  2003665    10120
Total           199  5232879

More Usual Cost Distribution
Statistic              Dollars
mean                   63225
standard deviation     79301
standard error         5607
median                 19906
first quartile         4579
third quartile         104674
interquartile range    100095
range                  286774

Natural Log of More Usual Cost Distribution
Statistic              ln(Dollars)
mean                   9.9691
standard deviation     1.6619
standard error         0.1175
median                 9.8986
first quartile         8.4290
third quartile         11.5586
interquartile range    3.1296
range                  5.1802

Pros and Cons of Transformations
Pros
  If you can demonstrate cost difference in transformed data with nice distribution, there will be cost difference on original scale, but perhaps only with VERY large samples
Cons
  Government agencies and CFOs do not pay bills in natural logarithm dollars, or square root dollars

Pros and Cons of Transformations (cont.)
A Very Important CON of Transformations
$$\operatorname{mean}(\ln(y)) = \alpha + \beta\,\operatorname{mean}(x),$$
but
$$\operatorname{mean}(y) \neq \exp(\alpha + \beta\,\operatorname{mean}(x))$$

Pros and Cons of Transformations (cont.)
A Very Important PRO of Transformations
$$\operatorname{percentile}_z(\ln(y)) = \alpha + \beta\,\operatorname{percentile}_z(x),$$
and
$$\operatorname{percentile}_z(y) = \exp(\alpha + \beta\,\operatorname{percentile}_z(x))$$

Pros and Cons of Transformations (cont.)
A Very Important PRO of Transformations
$$\operatorname{median}(\ln(y)) = \alpha + \beta\,\operatorname{median}(x),$$
and
$$\operatorname{median}(y) = \exp(\alpha + \beta\,\operatorname{median}(x))$$
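
The con and the pro can both be checked numerically. The sketch below simulates right-skewed costs under an assumed lognormal distribution (not the lecture data) and compares back-transformed statistics with their raw-scale counterparts. An odd sample size is used so that the median is a single observed value.

```python
import math
import random
import statistics

random.seed(1)
# Simulated right-skewed costs (assumption: lognormal), 1001 observations.
costs = [random.lognormvariate(8.0, 1.5) for _ in range(1001)]
log_costs = [math.log(c) for c in costs]

mean_cost = statistics.mean(costs)
# Back-transforming the mean of the logs gives the geometric mean, not the mean.
back_transformed_mean = math.exp(statistics.mean(log_costs))

median_cost = statistics.median(costs)
# The log is monotonic, so the median survives the round trip.
back_transformed_median = math.exp(statistics.median(log_costs))

print(f"mean(cost)           = {mean_cost:,.0f}")
print(f"exp(mean(ln cost))   = {back_transformed_mean:,.0f}")
print(f"median(cost)         = {median_cost:,.0f}")
print(f"exp(median(ln cost)) = {back_transformed_median:,.0f}")
```

The back-transformed mean always falls below the raw mean for non-constant data, while the back-transformed median matches the raw median up to floating-point rounding.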

Sensitivity Analysis: What Variation is Relevant?
[Figure: cost per episode of treatment versus episodes of treatment per unit time, with curves for Protocol 1, Protocol 2, and Protocol 3]
Estimate cost of treatment protocol given observed outcome and cost data from observational trial
Estimate both outcome and point on cost curve from observations

Sensitivity Analysis: What Variation is Relevant?
[Figure: cost per episode of treatment versus episodes of treatment per unit time, marking the forecasted cost]
Estimate cost of treatment with expansion of volume, given same capital plant

Usual Assumption on Marginal and Average Costs for Social Costs

Generalization of Treatment Groups: Breast Cancer
[Figure: odds ratio (OR) of recovery versus age for Tx1 and Tx2, with reference line at 1 and three question marks]

Generalization of Treatment Groups: Importance of estimating treatment effect, RCT, or natural experiment
[Figure: odds ratio (OR) of recovery versus age for Tx1 and Tx2, with reference line at 1]

Generalization of Treatment Groups: Importance of estimating treatment effect, observational data
[Figure: odds ratio (OR) of recovery versus age for Tx1 and Tx2, with reference line at 1]

Estimation with predetermined groupings
Continuous variables (cost, length of stay, etc.)
  Use summary statistics
    Mean, standard deviation, standard error
    Median, interquartile range, standard error of median
    Graphical methods (box plots, histograms, etc.)
  Look for
    Normal distribution
    If not normal, try monotonic transformations to
      Normal distribution
      Symmetric distribution, if normality not achievable
  For hypothesis tests
    Normality or large sample: t tests, trimmed t tests
    Normality not achievable: rank sum or median tests

Estimation with predetermined groupings
Continuous variables with censoring (survival, time to relapse, etc.)
  Use Kaplan-Meier analysis
  Estimate RR, OR, depending on study design
  Look for
    Assumptions on independent censoring, selection to treatment group, met
    If assumptions not met, sensitivity analysis
  For hypothesis tests
    Log-rank test
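
A minimal sketch of the Kaplan-Meier product-limit estimator for a single group, written in plain Python so that each step is visible. The follow-up times and event indicators are hypothetical, and the function name is introduced here for illustration; in practice a survival analysis package would also supply confidence bands and the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event (e.g., relapse) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Observed events exactly at time t (censored subjects do not count).
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up data in months; 0 marks a censored subject.
times = [2, 3, 3, 5, 8, 8, 12, 12]
events = [1, 1, 0, 1, 1, 0, 0, 0]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 3))
```

Censored subjects leave the risk set without lowering the survival estimate, which is the reason the independent censoring assumption matters.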

Estimation with predetermined groupings
Categorical variables with censoring (response, recovery, etc.)
  Use 2x2 tables
  Estimate relative risk, odds ratio (depending on study design)
  Look for
    Assumptions on selection to treatment met
    If assumptions not met, do sensitivity analysis
  Hypothesis tests: chi-square, or Fisher exact for independence
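
The two effect measures from a 2x2 table can be sketched as follows. The cell counts are hypothetical, and the cell labels a, b, c, d are a convention adopted here for illustration.

```python
def relative_risk(a, b, c, d):
    """a, b = events / non-events in treated; c, d = events / non-events in control."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical counts: recovery (event) by treatment group.
a, b = 30, 70   # treated: recovered, not recovered
c, d = 20, 80   # control: recovered, not recovered
print(relative_risk(a, b, c, d), odds_ratio(a, b, c, d))
```

With these counts the odds ratio is further from 1 than the relative risk, which is the usual pattern when the outcome is not rare.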

Regression methods
Estimation for
  Large number of observations
  Social cost or outcome analysis
    Most important to estimate the conditional mean
    Uncertainty, variance estimation less important
If results of a simple technique are not sensitive to changes in distribution of outcomes that will occur after any intervention,
  then start with simple techniques: Ordinary Least Squares, Instrumental regression, etc.

Example 1 for Simple Regression Analysis
[Figure: scatter plot of cost per episode of care (0 to 40,000) versus age (30 to 90)]

Example 1 for Simple Regression Analysis
[Figure: fitted line plot of Cost (10,000 to 40,000) versus Age (30 to 80)]
Cost = -28573 + 1923 Age - 15.95 Age**2

Example 2 for Simple Regression Analysis
[Figure: scatter plot of annual cost per enrollee (0 to 12,000) versus age (30 to 90)]

Example 2 for Simple Regression Analysis
[Figure: fitted line plot of Annual_Cost (-4,000 to 12,000) versus Enrollee (30 to 80)]
Annual_Cost = 6555 - 320.7 Enrollee + 4.291 Enrollee**2

Two-part estimation
Part one: Logistic (or Probit) regression to estimate probability of incurring positive cost
Part two: for those who have positive cost, regression estimate of log cost
$$\ln\!\left(\frac{p_i}{1 - p_i}\right) = \alpha_1 + \beta_1\,\text{age}_i + \gamma_1\,\text{smoking}_i + \varepsilon_{1,i}$$
$$\ln(\text{cost}_i \mid \text{cost}_i > 0) = \alpha_2 + \beta_2\,\text{age}_i + \gamma_2\,\text{smoking}_i + \delta_2\,\text{sex}_i + \varepsilon_{2,i}$$
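
The two parts combine into an expected cost as P(cost > 0) times the expected cost among those with positive cost. The sketch below shows that combination only; every coefficient is hypothetical, and the retransformation factor stands in for the smearing factor covered later in the lecture.

```python
import math

def prob_positive_cost(age, smoker, coef):
    """Part one: logistic model for P(cost > 0)."""
    logit = coef["const"] + coef["age"] * age + coef["smoking"] * smoker
    return 1.0 / (1.0 + math.exp(-logit))

def expected_positive_cost(age, smoker, female, coef, retransform):
    """Part two: log-cost regression among users, retransformed to dollars."""
    ln_cost = (coef["const"] + coef["age"] * age
               + coef["smoking"] * smoker + coef["sex"] * female)
    return retransform * math.exp(ln_cost)

# All coefficients below are hypothetical, chosen only for illustration.
part_one = {"const": -1.0, "age": 0.03, "smoking": 0.5}
part_two = {"const": 5.0, "age": 0.04, "smoking": 0.3, "sex": 0.1}
RETRANSFORM = 1.5  # assumed retransformation (smearing) factor

p = prob_positive_cost(50, 1, part_one)
cost_if_positive = expected_positive_cost(50, 1, 0, part_two, RETRANSFORM)
expected_cost = p * cost_if_positive
print(round(p, 3), round(cost_if_positive, 2), round(expected_cost, 2))
```

Separating the two parts lets the large mass of zero-cost people be modeled without distorting the log-cost regression, which is undefined at zero.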

How to deal with this con of transformations
A Very Important Con of Transformations
$$\operatorname{mean}(\ln(y)) = \alpha + \beta\,\operatorname{mean}(x),$$
but
$$\operatorname{mean}(y) \neq \exp(\alpha + \beta\,\operatorname{mean}(x))$$

Lognormal Cost Data
[Figure: scatter plot of annual cost per enrollee (0 to 35,000) versus age (20 to 70)]

Lognormal Cost Data
[Figure: scatter plot of Ln(annual cost per enrollee) (0 to 12) versus age (20 to 70)]

Linear Regression on Raw Cost Data
[Figure: fitted line plot of cost (-20,000 to 40,000) versus age (25 to 65), showing regression line, 95% CI, and 95% PI]
cost = -10396 + 315.9 age
S = 5377.55   R-Sq = 28.2%   R-Sq(adj) = 27.4%

Linear Regression on Log of Cost Data
[Figure: fitted line plot of lncost (3 to 12) versus age (25 to 65), showing regression line, 95% CI, and 95% PI]
lncost = 3.155 + 0.09241 age
S = 1.01160   R-Sq = 48.7%   R-Sq(adj) = 48.2%

What happens when we try to predict mean cost using ln(cost) regression
Calculate predicted mean of cost from regression of log cost
  = 3.155 + 0.09241 * mean(age)
  = 7.38
Predicted cost = exp(7.38) = 1608.02? (using the unrounded value 7.3828)
Mean of observations
  age         45.75
  ln(cost)    7.38
  cost        4058.15
Predicted means from regression
  ln(cost)                  7.38
  cost (= exp(ln(cost)))    1608.02

Smearing Estimator
Take the residuals from the estimated regression
Take the exponential of the residuals
The smearing factor is the mean of the exponential of the residuals
$$y_i = \hat{\alpha} + \hat{\beta} x_i + \hat{\varepsilon}_i$$
$$\exp(\hat{\varepsilon}_i) = \exp\!\left(y_i - (\hat{\alpha} + \hat{\beta} x_i)\right)$$
$$SF = \frac{1}{N}\sum_{i=1}^{N} \exp(\hat{\varepsilon}_i) = \frac{1}{N}\sum_{i=1}^{N} \exp\!\left(y_i - (\hat{\alpha} + \hat{\beta} x_i)\right)$$
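
The three steps above can be sketched end to end with a hand-rolled least squares fit. The data are simulated under the assumption that ln(cost) is linear in age with normal errors of constant variance; they are not the lecture data, and the variable names are introduced here for illustration.

```python
import math
import random

random.seed(2)
# Simulated data (assumption): ln(cost) = 3.0 + 0.09 * age + normal error.
ages = [random.uniform(25, 65) for _ in range(500)]
ln_costs = [3.0 + 0.09 * a + random.gauss(0.0, 1.0) for a in ages]

# Ordinary least squares for ln(cost) = alpha + beta * age.
n = len(ages)
mean_age = sum(ages) / n
mean_ln = sum(ln_costs) / n
sxx = sum((a - mean_age) ** 2 for a in ages)
sxy = sum((a - mean_age) * (y - mean_ln) for a, y in zip(ages, ln_costs))
beta = sxy / sxx
alpha = mean_ln - beta * mean_age

# Smearing factor: mean of the exponentiated residuals.
residuals = [y - (alpha + beta * a) for a, y in zip(ages, ln_costs)]
smearing_factor = sum(math.exp(r) for r in residuals) / n

# Predicted cost in dollars for each person: naive and smeared.
naive = [math.exp(alpha + beta * a) for a in ages]
smeared = [smearing_factor * c for c in naive]
observed_mean = sum(math.exp(y) for y in ln_costs) / n

print(f"smearing factor      = {smearing_factor:.3f}")
print(f"observed mean cost   = {observed_mean:,.0f}")
print(f"naive mean predicted = {sum(naive) / n:,.0f}")
print(f"smeared mean         = {sum(smeared) / n:,.0f}")
```

Because least squares residuals average to zero, the smearing factor is always above 1, so the smeared prediction always exceeds the naive one. How close it gets to the observed mean depends on whether the residual spread is the same across the range of x.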

What happens when we try to predict mean cost using ln(cost) regression
Calculate predicted mean of cost from regression of log cost
  = 3.155 + 0.09241 * mean(age)
  = 7.38
Predicted cost = exp(7.38) = 1608.02
SF * predicted cost = 1.66 * 1608.02 = 2669.05?
Mean of observations
  age         45.75
  ln(cost)    7.38
  cost        4058.15
Predicted means from regression
  ln(cost)    7.38
  cost        1608.02
  SF          1.66
  SF * cost   2669.05

What happens when we try to predict mean cost using ln(cost) regression
Trial solution
  Stratify by age where you see a change in distribution of cost
  Note that variance of cost increases at around age 55
  Calculate ln(cost) and SF stratified by age at 55
  Result is much closer approximation
Mean observed cost
  <= 55: 2187.04
  > 55: 9383.63
Mean of observations
  cost               4058.15
  age <= 55          41.00
  age > 55           59.27
Predicted mean from regression
  lncost <= 55       6.94
  lncost > 55        8.63
  SF <= 55           1.66
  SF > 55            1.66
  proportion > 55    0.26
Predicted cost using stratified SF
  cost <= 55         1721.47
  cost > 55          9298.72
  cost               3691.56
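
The overall predicted cost in the table is the stratum predictions weighted by the share of the sample in each age group. The sketch below repeats that arithmetic with the table's own numbers; the function and variable names are introduced here for illustration.

```python
PROPORTION_OVER_55 = 0.26  # from the table

# Stratum means copied from the table.
predicted = {"<=55": 1721.47, ">55": 9298.72}
observed = {"<=55": 2187.04, ">55": 9383.63}

def pooled_mean(stratum_means, share_over_55):
    """Weight each stratum mean by its share of the sample."""
    return ((1.0 - share_over_55) * stratum_means["<=55"]
            + share_over_55 * stratum_means[">55"])

pooled_predicted = pooled_mean(predicted, PROPORTION_OVER_55)
pooled_observed = pooled_mean(observed, PROPORTION_OVER_55)
print(f"{pooled_predicted:.2f} {pooled_observed:.2f}")
```

The pooled prediction comes to about 3691.56 and the pooled observed mean to about 4058.15, matching the table. The remaining gap comes mostly from the younger stratum, where 1721.47 is predicted against 2187.04 observed.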

Exploratory Analysis: Classification and Regression Trees

Regression Results on Ln Cost
The regression equation is
lncost = 3.15 + 0.0924 age

Predictor       Coef   SE Coef     T      P
Constant      3.1547    0.4500  7.01  0.000
age         0.092412  0.009585  9.64  0.000

S = 1.01160   R-Sq = 48.7%   R-Sq(adj) = 48.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       1   95.125  95.125  92.96  0.000
Residual Error  98  100.287   1.023
Total           99  195.412

Simulation Estimates
scost_old = exp(3.155 + 0.09241 * age)

params          old_tx     new_tx
for lncost
  constant      3.155      4.000
  age coeff     0.09241    0.0500
age dist
  low           25         25
  high          64         64
  mean age      44.5       44.5

simulated cost
  scost_old     scost_new    dcost
  1432.598      505.223      927.3752
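
The table's point values follow from evaluating both ln(cost) equations at the mean age, and a Monte Carlo run extends this to a distribution of dcost. The sketch below makes two assumptions that are not stated on the slide: age is drawn uniformly between the low and high values, and both protocols share a single normal error with standard deviation 1.0116 (the S from the ln(cost) regression). The lecture's own simulation setup may differ.

```python
import math
import random
import statistics

OLD_TX = {"constant": 3.155, "age_coeff": 0.09241}
NEW_TX = {"constant": 4.000, "age_coeff": 0.0500}
AGE_LOW, AGE_HIGH = 25, 64

def simulated_cost(params, age, error=0.0):
    """Cost implied by the ln(cost) equation for one simulated person."""
    return math.exp(params["constant"] + params["age_coeff"] * age + error)

# Deterministic check at the mean age, as in the table.
mean_age = (AGE_LOW + AGE_HIGH) / 2
scost_old = simulated_cost(OLD_TX, mean_age)
scost_new = simulated_cost(NEW_TX, mean_age)
dcost = scost_old - scost_new
print(round(scost_old, 3), round(scost_new, 3), round(dcost, 3))

# Monte Carlo sketch under the assumptions stated in the lead-in.
random.seed(3)
RESIDUAL_SD = 1.0116
dcost_draws = []
for _ in range(100_000):
    age = random.uniform(AGE_LOW, AGE_HIGH)
    error = random.gauss(0.0, RESIDUAL_SD)
    dcost_draws.append(simulated_cost(OLD_TX, age, error)
                       - simulated_cost(NEW_TX, age, error))

print(f"mean dcost   = {statistics.mean(dcost_draws):,.2f}")
print(f"median dcost = {statistics.median(dcost_draws):,.2f}")
```

The deterministic values agree with the table to within rounding (about 1432.6, 505.2, and 927.4). The simulated distribution will not reproduce the reported forecast statistics exactly, since it depends on the assumed age and error distributions.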

Simulation Estimates
Forecast: dcost
Statistic                Forecast values
Trials                   100,000
Mean                     2,347.73
Median                   805.8
Mode                     -
Standard Deviation       4,583.05
Variance                 21,004,330.22
Skewness                 4.84
Kurtosis                 47.61
Coeff. of Variability    1.95
Minimum                  3,922.10
Maximum                  131,110.03
Mean Std. Error          14.49
