*insheet using ../Data_raw/mpdta.csv, clear
insheet using "https://ayumu-tanaka.github.io/teaching/mpdta.csv",clear
tsset countyreal year
set seed 123
(6 vars, 2,500 obs)
Panel variable: countyreal (strongly balanced)
Time variable: year, 2003 to 2007
Delta: 1 unit
このデータは、did
パッケージに付属しているものである。Callaway
and Sant’Anna
(2021)によるデータセットであり、米国の最低賃金の引き上げが若年労働者の雇用に与える影響を検証するために作成されたものである。
人工的なデータセットである。
Callaway and Sant’Anna (2021)論文のデータそのものは公開されておらず、このデータセットと年次が違うなど論文の結果を再現することはできない。
含まれている変数は以下の通りである。
year: 年次。2003年から2007年までの5年間。
countyreal: 群のID
lpop: 群の人口
lemp: 群の若年雇用者数
firsttreat: 最初の介入の年次
treat: 処置群ダミー
* 前処理
* 時変処置ダミー作成 --- CS推定に必要
g treatit = year>= firsttreat
replace treatit = 0 if treat==0
* Never-treatedダミー作成 --- SA推定に必要
g never_treated = firsttreat==0
* Always-treatedダミー作成 --- CS推定の際に除去
su year
g always_treated = firsttreat==r(min)
* eventstudyinteractのヘルプファイルに従って、相対時間変数を作成 --- TWFE/SA推定に必要
* 相対時間
g relative_time = year - firsttreat
replace relative_time = 0 if never_treated == 1
* 相対時間ダミー
* プレ処置期間[-4, -2]
forvalues k = 4(-1)2 {
gen g_`k' = relative_time == -`k'
}
* ポスト処置期間[0, 3]
forvalues k = 0/3 {
gen g`k' = relative_time == `k'
}
save "../Data_output/mpdta2.dta", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
(1,545 real changes made)
Variable | Obs Mean Std. dev. Min Max
-------------+---------------------------------------------------------
year | 2,500 2005 1.414496 2003 2007
(1,545 real changes made)
file ../Data_output/mpdta2.dta saved
reghdfe lemp g_4 g_3 g_2 g0 g1 g2 g3, absorb(countyreal year) vce(cluster countyreal)
coefplot, keep(g_4 g_3 g_2 g0 g1 g2 g3) vertical xlabel(, angle(45)) addplot(line @b @at) ///
title("TWFE") xtitle(Relative time) ytitle(Effect) yline(0) ///
ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
caption(N = `e(N)'. reghdfe & coefplot.) ///
recast(line) ciopts(recast(rcap)) ///
order(g_4 g_3 g_2 g0 pre1 g0 g1 g2 g3) ///
coeflabels( ///
g_4="-4" ///
g_3="-3" ///
g_2="-2" ///
g_1="-1" ///
g0="0" ///
g1="1" ///
g2="2" ///
g3="3" ///
)
qui graph save "../Figures/TWFE.gph", replace
qui graph export "../Figures/TWFE.png",replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
(MWFE estimator converged in 2 iterations)
HDFE Linear regression Number of obs = 2,500
Absorbing 2 HDFE groups F( 7, 499) = 3.60
Statistics robust to heteroskedasticity Prob > F = 0.0009
R-squared = 0.9933
Adj R-squared = 0.9915
Within R-sq. = 0.0103
Number of clusters (countyreal) = 500 Root MSE = 0.1388
(Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
| Robust
lemp | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
g_4 | .0035493 .0228286 0.16 0.877 -.0413027 .0484013
g_3 | .0246235 .0176794 1.39 0.164 -.0101117 .0593587
g_2 | .0233548 .0134367 1.74 0.083 -.0030447 .0497543
g0 | -.0181439 .0109822 -1.65 0.099 -.039721 .0034332
g1 | -.0434724 .0175769 -2.47 0.014 -.0780063 -.0089384
g2 | -.1317948 .0288374 -4.57 0.000 -.1884526 -.0751371
g3 | -.0922468 .0323362 -2.85 0.005 -.1557787 -.028715
_cons | 5.784483 .0086872 665.87 0.000 5.767416 5.801551
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
countyreal | 500 500 0 *|
year | 5 1 4 |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
eventstudyinteract lemp g_4 g_3 g_2 g0 g1 g2 g3, ///
absorb(countyreal year) cohort(firsttreat) control_cohort(never_treated) ///
vce(cluster countyreal)
matrix C = e(b_iw)
mata st_matrix("A",sqrt(diagonal(st_matrix("e(V_iw)"))))
matrix C = C \ A'
matrix list C
coefplot matrix(C[1]), se(C[2]) vertical xlabel(, angle(45)) addplot(line @b @at) ///
title("Sun and Abraham (2021)") xtitle(Relative time) ytitle(Effect) yline(0) ///
ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
caption(N = `e(N)'. eventstudyinteract & coefplot.) ///
recast(line) ciopts(recast(rcap)) ///
order(g_4 g_3 g_2 g0 pre1 g0 g1 g2 g3) ///
coeflabels( ///
g_4="-4" ///
g_3="-3" ///
g_2="-2" ///
g_1="-1" ///
g0="0" ///
g1="1" ///
g2="2" ///
g3="3" ///
)
qui graph save "../Figures/SA.gph", replace
qui graph export "../Figures/SA.png",replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
(obs=955)
IW estimates for dynamic effects Number of obs = 2,500
Absorbing 2 HDFE groups F(12, 499) = 2.88
Prob > F = 0.0008
R-squared = 0.9933
Adj R-squared = 0.9915
Root MSE = 0.1389
(Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
| Robust
lemp | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
g_4 | .0033064 .0245551 0.13 0.893 -.0449378 .0515505
g_3 | .0250218 .0181951 1.38 0.170 -.0107266 .0607702
g_2 | .0244587 .0142963 1.71 0.088 -.0036296 .052547
g0 | -.0199318 .0118761 -1.68 0.094 -.0432652 .0034016
g1 | -.0509574 .016964 -3.00 0.003 -.084287 -.0176277
g2 | -.1372587 .0365895 -3.75 0.000 -.2091471 -.0653703
g3 | -.1008114 .0345043 -2.92 0.004 -.1686029 -.0330198
------------------------------------------------------------------------------
C[2,7]
g_4 g_3 g_2 g0 g1 g2 g3
r1 .00330635 .02502181 .02445872 -.01993181 -.05095736 -.1372587 -.10081137
c1 .0245551 .01819507 .01429625 .01187613 .01696401 .03658948 .03450428
xthdidregress
を使ってCallaway
and Sant’Anna (2021)の推定を行う。aipw
オプションを使って、Augmented Inverse Probability
Weighting (AIPW)推定を行う。Doubly robust methodと呼ばれる、Callaway and
Sant’Anna
(2021)が提案している3種類の推定方法の中で最も望ましいとされているものである。
drop if always_treated==1 //allways-treated 除く
* CS doubly robust method
* The simultaneous confidence intervals are computed using the multiplier bootstrap algorithm in Callaway and Sant'Anna (2021)
xthdidregress aipw (lemp) (treatit), group(firsttreat) vce(cluster countyreal)
estat aggregation, dynamic(-3(1)3) graph sci ///
graph( ///
xlabel(-4(1)3, angle(45)) xtitle(Relative Time) ///
ytitle(Effect) title("Callaway and Sant'Anna (2021)") ///
ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
caption("N = `e(N)'. xthdidregress aipw." "The simultaneous confidence intervals.") legend(off) ///
lcolor(green) lpattern(solid) ciopts(lcolor(blue) recast(rcap) ) ///
)
qui graph save "../Figures/CS_official.gph", replace
qui graph export "../Figures/CS_official.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
(0 observations deleted)
note: variable _did_cohort, containing cohort indicators formed by treatment variable
treatit and group variable firsttreat, was added to the dataset.
Computing ATET for each cohort and time:
Cohort 2004 (4): .... done
Cohort 2006 (4): .... done
Cohort 2007 (4): .... done
Treatment and time information
Time variable: year
Time interval: 2003 to 2007
Control: _did_cohort = 0
Treatment: _did_cohort > 0
-------------------------------
| _did_cohort
------------------+------------
Number of cohorts | 4
------------------+------------
Number of obs |
Never treated | 1545
2004 | 100
2006 | 200
2007 | 655
-------------------------------
Heterogeneous-treatment-effects regression Number of obs = 2,500
Number of panels = 4
Estimator: Augmented IPW
Panel variable: countyreal
Treatment level: firsttreat
Control group: Never treated
(Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
| Robust
Cohort | ATET std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
2004 |
year |
2004 | -.0105032 .023251 -0.45 0.651 -.0560744 .035068
2005 | -.0704231 .0309848 -2.27 0.023 -.1311521 -.0096941
2006 | -.1372587 .0364357 -3.77 0.000 -.2086713 -.0658461
2007 | -.1008114 .0343592 -2.93 0.003 -.1681542 -.0334685
-------------+----------------------------------------------------------------
2006 |
year |
2004 | .0065201 .0233268 0.28 0.780 -.0391996 .0522398
2005 | -.0027508 .0195586 -0.14 0.888 -.0410849 .0355833
2006 | -.0045946 .0177552 -0.26 0.796 -.0393942 .0302049
2007 | -.0412245 .0202292 -2.04 0.042 -.080873 -.001576
-------------+----------------------------------------------------------------
2007 |
year |
2004 | .0305066 .0150336 2.03 0.042 .0010414 .0599719
2005 | -.0027259 .0163958 -0.17 0.868 -.0348611 .0294094
2006 | -.0310871 .0178775 -1.74 0.082 -.0661264 .0039522
2007 | -.0260544 .0166554 -1.56 0.118 -.0586985 .0065897
------------------------------------------------------------------------------
Duration of exposure ATET Number of obs = 2,500
Replications = 999
(Std. err. adjusted for 500 clusters in countyreal)
--------------------------------------------------------------
| Observed Bootstrap Simultaneous
Exposure | ATET std. err. [95% conf. interval]
-------------+------------------------------------------------
-3 | .0305066 .0155754 -.0108387 .0718519
-2 | -.0005631 .0131466 -.035461 .0343348
-1 | -.0244587 .013223 -.0595595 .010642
0 | -.0199318 .0118013 -.0512588 .0113952
1 | -.0509574 .0169269 -.0958903 -.0060244
2 | -.1372587 .0363963 -.2338736 -.0406437
3 | -.1008114 .0339439 -.1909165 -.0107063
--------------------------------------------------------------
Note: Exposure is the number of periods since the first
treatment time.
Note: Simultaneous confidence intervals provide inference
across all aggregations simultaneously.
csdid
パッケージを使って、Callaway and Sant’Anna
(2021)の推定を行う。
* never-treatedを0に設定。
replace firsttreat=0 if never_treated==1
* (1) 推定 doubly robust method
csdid lemp , ivar(countyreal) time(year) ///
gvar(firsttreat) notyet long2 agg(event) method(dripw) wboot reps(5)
* (2) イベント時間集計
estat event, window(-4 3)
* (3) イベント・スタディ・プロット with csdid_plot
* csdid_plotで作図すると、x軸の範囲が適切にならない場合ある。
csdid_plot, title("Callaway and Sant'Anna (2021)") xtitle(Relative Time) style(rcap) legend(off) ytitle(Effect) ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) caption(N = `e(N)'. csdid & csdid_plot.)
qui graph save "../Figures/CS_csdid_plot.gph", replace
qui graph export "../Figures/CS_csdid_plot.png", replace
* (4) イベント・スタディ・プロット with coefplot
coefplot, vertical keep(Tm4 Tm3 Tm2 Tm1 Tp0 Tp1 Tp2 Tp3) xlabel(, angle(45)) ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) recast(line) ciopts(recast(rcap)) yline(0) addplot(line @b @at) title("Callaway and Sant'Anna (2021)") xtitle(Relative Time) ytitle(Effect) caption("N = `e(N)'. csdid & coefplot.") ///
coeflabels( ///
Tm4="-4" ///
Tm3="-3" ///
Tm2="-2" ///
Tm1="-1" ///
Tp0="0" ///
Tp1="1" ///
Tp2="2" ///
Tp3="3" ///
)
qui graph save "../Figures/CS_coefplot.gph", replace
qui graph export "../Figures/CS_coefplot.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
(0 real changes made)
............
Difference-in-difference with Multiple Time Periods
Number of obs = 2,500
Outcome model : least squares
Treatment model: inverse probability
---------------------------------------------------------------------
| Coefficient Std. err. t [95% conf. interval]
------------+--------------------------------------------------------
Pre_avg | .0181773 .0168533 1.08 -.0263013 .0626558
Post_avg | -.0773993 .020939 -3.70 -.1326605 -.0221381
Tm4 | .0033064 .0248261 0.13 -.0622135 .0688262
Tm3 | .0269566 .0174021 1.55 -.0189703 .0728834
Tm2 | .0242689 .0144381 1.68 -.0138355 .0623733
Tp0 | -.0189222 .0117155 -1.62 -.0498412 .0119968
Tp1 | -.0535893 .0174318 -3.07 -.0995944 -.0075842
Tp2 | -.1362743 .035155 -3.88 -.2290538 -.0434948
Tp3 | -.1008114 .0357803 -2.82 -.1952412 -.0063816
---------------------------------------------------------------------
Control: Not yet Treated
See Callaway and Sant'Anna (2021) for details
Test will be based on asymptotic VCoV
If you want aggregations based on WB, use option saverif() ad csdid_stats
ATT by Periods Before and After treatment
Event Study:Dynamic effects
------------------------------------------------------------------------------
| Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
Pre_avg | .0181773 .016605 1.09 0.274 -.014368 .0507225
Post_avg | -.0773993 .0195602 -3.96 0.000 -.1157366 -.0390621
Tm4 | .0033064 .0244519 0.14 0.892 -.0446184 .0512311
Tm3 | .0269566 .0175797 1.53 0.125 -.007499 .0614121
Tm2 | .0242689 .0144637 1.68 0.093 -.0040794 .0526172
Tp0 | -.0189222 .0120446 -1.57 0.116 -.0425291 .0046847
Tp1 | -.0535893 .0169464 -3.16 0.002 -.0868036 -.020375
Tp2 | -.1362743 .0354034 -3.85 0.000 -.2056637 -.0668849
Tp3 | -.1008114 .0343592 -2.93 0.003 -.1681542 -.0334685
------------------------------------------------------------------------------
graph combine "../Figures/CS_official.gph" ///
"../Figures/CS_coefplot.gph" ///
"../Figures/CS_csdid_plot.gph" ///
"../Figures/TWFE.gph" ///
"../Figures/SA.gph" ///
, c(3)
qui graph export "../Figures/ES_combined.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof
> ile.do ...
処置前期間
xthdidregress
を用いた場合のCallaway and
Sant’Anna
(2021)推定量は、-3から3の期間について、推定値が得られる。-4の時点の推定値は得られない。csdid
パッケージでlong2
オプションを用いた場合のCallaway
and Sant’Anna
(2021)推定量は、-4から3の期間について、推定値が得られる。ただし、-1の期間は基準年となっており、推定値が得られない。共変量
対照群
Callaway, B., Sant’Anna, P.H.C., 2021. Difference-in-Differences with multiple time periods. Journal of Econometrics 225, 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001
Sun, L., Abraham, S., 2021. Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics 225, 175–199. https://doi.org/10.1016/j.jeconom.2020.09.006