1 はじめに

  • この資料は、最低賃金の引き上げが若年労働者の雇用に与える影響を検証するものである。

2 データの読み込み


*insheet using ../Data_raw/mpdta.csv, clear 
insheet using "https://ayumu-tanaka.github.io/teaching/mpdta.csv",clear

tsset countyreal year
set seed 123
(6 vars, 2,500 obs)


Panel variable: countyreal (strongly balanced)
 Time variable: year, 2003 to 2007
         Delta: 1 unit
  • このデータは、didパッケージに付属しているものである。Callaway and Sant’Anna (2021)によるデータセットであり、米国の最低賃金の引き上げが若年労働者の雇用に与える影響を検証するために作成されたものである。

  • 人工的なデータセットである。

  • Callaway and Sant’Anna (2021)論文のデータそのものは公開されておらず、このデータセットと年次が違うなど論文の結果を再現することはできない。

  • 含まれている変数は以下の通りである。

  • year: 年次。2003年から2007年までの5年間。

  • countyreal: 群のID

  • lpop: 群の人口

  • lemp: 群の若年雇用者数

  • firsttreat: 最初の介入の年次

  • treat: 処置群ダミー

3 前処理

  • 前処理として、推定に必要な変数を作成する。

* 前処理
* 時変処置ダミー作成 --- CS推定に必要
g treatit = year>= firsttreat
    replace treatit = 0 if treat==0

* Never-treatedダミー作成 --- SA推定に必要
g never_treated = firsttreat==0

* Always-treatedダミー作成 --- CS推定の際に除去
su year
g always_treated = firsttreat==r(min)

* eventstudyinteractのヘルプファイルに従って、相対時間変数を作成 --- TWFE/SA推定に必要
    * 相対時間
    g relative_time = year - firsttreat
    replace relative_time = 0 if never_treated == 1
    * 相対時間ダミー
        * プレ処置期間[-4, -2]
            forvalues k = 4(-1)2 {
               gen g_`k' = relative_time == -`k'
            }
            * ポスト処置期間[0, 3]
            forvalues k = 0/3 {
                 gen g`k' = relative_time == `k'
            }

save "../Data_output/mpdta2.dta", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...



(1,545 real changes made)

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        year |      2,500        2005    1.414496       2003       2007



(1,545 real changes made)



file ../Data_output/mpdta2.dta saved

4 TWFE

  • まずは、Two-way fixed effects (TWFE) 推定を行う。
reghdfe lemp g_4 g_3 g_2 g0 g1 g2 g3, absorb(countyreal year) vce(cluster countyreal)
coefplot, keep(g_4 g_3 g_2 g0 g1 g2 g3) vertical xlabel(, angle(45)) addplot(line @b @at) ///
    title("TWFE") xtitle(Relative time) ytitle(Effect) yline(0) ///
        ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
  caption(N = `e(N)'. reghdfe & coefplot.) ///
  recast(line) ciopts(recast(rcap)) ///
    order(g_4 g_3 g_2 g0 pre1 g0 g1 g2 g3) ///
    coeflabels( ///
            g_4="-4" ///
            g_3="-3" ///
            g_2="-2" ///
            g_1="-1" ///
            g0="0"  ///
            g1="1"  ///
            g2="2"  ///
            g3="3"  ///     
            ) 
qui graph save "../Figures/TWFE.gph", replace
qui graph export "../Figures/TWFE.png",replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...


(MWFE estimator converged in 2 iterations)

HDFE Linear regression                            Number of obs   =      2,500
Absorbing 2 HDFE groups                           F(   7,    499) =       3.60
Statistics robust to heteroskedasticity           Prob > F        =     0.0009
                                                  R-squared       =     0.9933
                                                  Adj R-squared   =     0.9915
                                                  Within R-sq.    =     0.0103
Number of clusters (countyreal) =        500      Root MSE        =     0.1388

                           (Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
             |               Robust
        lemp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         g_4 |   .0035493   .0228286     0.16   0.877    -.0413027    .0484013
         g_3 |   .0246235   .0176794     1.39   0.164    -.0101117    .0593587
         g_2 |   .0233548   .0134367     1.74   0.083    -.0030447    .0497543
          g0 |  -.0181439   .0109822    -1.65   0.099     -.039721    .0034332
          g1 |  -.0434724   .0175769    -2.47   0.014    -.0780063   -.0089384
          g2 |  -.1317948   .0288374    -4.57   0.000    -.1884526   -.0751371
          g3 |  -.0922468   .0323362    -2.85   0.005    -.1557787    -.028715
       _cons |   5.784483   .0086872   665.87   0.000     5.767416    5.801551
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
  countyreal |       500         500           0    *|
        year |         5           1           4     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

5 Sun and Abraham (2021)

  • 次に、Sun and Abraham (2021)の推定を行う。
eventstudyinteract lemp g_4 g_3 g_2 g0 g1 g2 g3, ///
    absorb(countyreal year) cohort(firsttreat) control_cohort(never_treated) ///
    vce(cluster countyreal)
    
matrix C = e(b_iw)
mata st_matrix("A",sqrt(diagonal(st_matrix("e(V_iw)"))))
matrix C = C \ A'
matrix list C
coefplot matrix(C[1]), se(C[2]) vertical xlabel(, angle(45))  addplot(line @b @at) ///
    title("Sun and Abraham (2021)") xtitle(Relative time) ytitle(Effect) yline(0) ///
        ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
  caption(N = `e(N)'. eventstudyinteract & coefplot.) ///
  recast(line) ciopts(recast(rcap)) ///
    order(g_4 g_3 g_2 g0 pre1 g0 g1 g2 g3) ///
    coeflabels( ///
            g_4="-4" ///
            g_3="-3" ///
            g_2="-2" ///
            g_1="-1" ///
            g0="0"  ///
            g1="1"  ///
            g2="2"  ///
            g3="3"  ///     
            ) 

qui graph save "../Figures/SA.gph", replace
qui graph export "../Figures/SA.png",replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...


(obs=955)

IW estimates for dynamic effects                        Number of obs =  2,500
Absorbing 2 HDFE groups                                 F(12, 499)    =   2.88
                                                        Prob > F      = 0.0008
                                                        R-squared     = 0.9933
                                                        Adj R-squared = 0.9915
                                                        Root MSE      = 0.1389
                           (Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
             |               Robust
        lemp | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         g_4 |   .0033064   .0245551     0.13   0.893    -.0449378    .0515505
         g_3 |   .0250218   .0181951     1.38   0.170    -.0107266    .0607702
         g_2 |   .0244587   .0142963     1.71   0.088    -.0036296     .052547
          g0 |  -.0199318   .0118761    -1.68   0.094    -.0432652    .0034016
          g1 |  -.0509574    .016964    -3.00   0.003     -.084287   -.0176277
          g2 |  -.1372587   .0365895    -3.75   0.000    -.2091471   -.0653703
          g3 |  -.1008114   .0345043    -2.92   0.004    -.1686029   -.0330198
------------------------------------------------------------------------------





C[2,7]
           g_4         g_3         g_2          g0          g1          g2          g3
r1   .00330635   .02502181   .02445872  -.01993181  -.05095736   -.1372587  -.10081137
c1    .0245551   .01819507   .01429625   .01187613   .01696401   .03658948   .03450428

6 CS - Stata 公式 xthdidregress

  • 最後に、Stataの公式コマンドであるxthdidregressを使ってCallaway and Sant’Anna (2021)の推定を行う。
  • aipwオプションを使って、Augmented Inverse Probability Weighting (AIPW)推定を行う。Doubly robust methodと呼ばれる、Callaway and Sant’Anna (2021)が提案している3種類の推定方法の中で最も望ましいとされているものである。

drop if always_treated==1 //allways-treated 除く

* CS doubly robust method
* The simultaneous confidence intervals are computed using the multiplier bootstrap algorithm in Callaway and Sant'Anna (2021)
xthdidregress aipw (lemp) (treatit), group(firsttreat) vce(cluster countyreal) 
estat aggregation, dynamic(-3(1)3) graph  sci ///
    graph( ///
    xlabel(-4(1)3, angle(45)) xtitle(Relative Time) ///
    ytitle(Effect) title("Callaway and Sant'Anna (2021)") ///
    ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) ///
    caption("N = `e(N)'. xthdidregress aipw." "The simultaneous confidence intervals.") legend(off) ///
    lcolor(green) lpattern(solid) ciopts(lcolor(blue) recast(rcap) ) ///
    )
    qui graph save "../Figures/CS_official.gph", replace
    qui graph export "../Figures/CS_official.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...


(0 observations deleted)

note: variable _did_cohort, containing cohort indicators formed by treatment variable
      treatit and group variable firsttreat, was added to the dataset.

Computing ATET for each cohort and time:
Cohort 2004 (4): .... done
Cohort 2006 (4): .... done
Cohort 2007 (4): .... done

Treatment and time information

Time variable: year
Time interval: 2003 to 2007
Control:       _did_cohort = 0
Treatment:     _did_cohort > 0
-------------------------------
                  | _did_cohort
------------------+------------
Number of cohorts |           4
------------------+------------
Number of obs     |
    Never treated |        1545
             2004 |         100
             2006 |         200
             2007 |         655
-------------------------------

Heterogeneous-treatment-effects regression            Number of obs    = 2,500
                                                      Number of panels =     4
Estimator:       Augmented IPW
Panel variable:  countyreal
Treatment level: firsttreat
Control group:   Never treated

                           (Std. err. adjusted for 500 clusters in countyreal)
------------------------------------------------------------------------------
             |               Robust
Cohort       |       ATET   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
2004         |
        year |
       2004  |  -.0105032    .023251    -0.45   0.651    -.0560744     .035068
       2005  |  -.0704231   .0309848    -2.27   0.023    -.1311521   -.0096941
       2006  |  -.1372587   .0364357    -3.77   0.000    -.2086713   -.0658461
       2007  |  -.1008114   .0343592    -2.93   0.003    -.1681542   -.0334685
-------------+----------------------------------------------------------------
2006         |
        year |
       2004  |   .0065201   .0233268     0.28   0.780    -.0391996    .0522398
       2005  |  -.0027508   .0195586    -0.14   0.888    -.0410849    .0355833
       2006  |  -.0045946   .0177552    -0.26   0.796    -.0393942    .0302049
       2007  |  -.0412245   .0202292    -2.04   0.042     -.080873    -.001576
-------------+----------------------------------------------------------------
2007         |
        year |
       2004  |   .0305066   .0150336     2.03   0.042     .0010414    .0599719
       2005  |  -.0027259   .0163958    -0.17   0.868    -.0348611    .0294094
       2006  |  -.0310871   .0178775    -1.74   0.082    -.0661264    .0039522
       2007  |  -.0260544   .0166554    -1.56   0.118    -.0586985    .0065897
------------------------------------------------------------------------------


Duration of exposure ATET                Number of obs = 2,500
                                         Replications  =   999

           (Std. err. adjusted for 500 clusters in countyreal)
--------------------------------------------------------------
             |   Observed   Bootstrap         Simultaneous
    Exposure |       ATET   std. err.     [95% conf. interval]
-------------+------------------------------------------------
          -3 |   .0305066   .0155754     -.0108387    .0718519
          -2 |  -.0005631   .0131466      -.035461    .0343348
          -1 |  -.0244587    .013223     -.0595595     .010642
           0 |  -.0199318   .0118013     -.0512588    .0113952
           1 |  -.0509574   .0169269     -.0958903   -.0060244
           2 |  -.1372587   .0363963     -.2338736   -.0406437
           3 |  -.1008114   .0339439     -.1909165   -.0107063
--------------------------------------------------------------
Note: Exposure is the number of periods since the first
      treatment time.
Note: Simultaneous confidence intervals provide inference
      across all aggregations simultaneously.

7 CS - csdid

  • csdidパッケージを使って、Callaway and Sant’Anna (2021)の推定を行う。

* never-treatedを0に設定。
replace firsttreat=0 if never_treated==1

* (1) 推定 doubly robust method
csdid lemp , ivar(countyreal) time(year) ///
    gvar(firsttreat) notyet long2 agg(event) method(dripw) wboot reps(5)

* (2) イベント時間集計
estat event, window(-4 3)

* (3) イベント・スタディ・プロット with csdid_plot
* csdid_plotで作図すると、x軸の範囲が適切にならない場合ある。
csdid_plot, title("Callaway and Sant'Anna (2021)") xtitle(Relative Time) style(rcap) legend(off) ytitle(Effect) ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1))  caption(N = `e(N)'. csdid & csdid_plot.)
qui graph save "../Figures/CS_csdid_plot.gph", replace
qui graph export "../Figures/CS_csdid_plot.png", replace


* (4) イベント・スタディ・プロット with coefplot
coefplot, vertical keep(Tm4 Tm3 Tm2 Tm1 Tp0 Tp1 Tp2 Tp3) xlabel(, angle(45)) ylabel(-0.25(.1)0.1) yscale(range(-0.25 0.1)) recast(line) ciopts(recast(rcap))  yline(0) addplot(line @b @at) title("Callaway and Sant'Anna (2021)") xtitle(Relative Time) ytitle(Effect) caption("N = `e(N)'. csdid & coefplot.")  ///
    coeflabels( ///
            Tm4="-4" ///
            Tm3="-3" ///
            Tm2="-2" ///
            Tm1="-1" ///
            Tp0="0"  ///
            Tp1="1"  ///
            Tp2="2"  ///
            Tp3="3"  ///
            ) 
qui graph save "../Figures/CS_coefplot.gph", replace
qui graph export "../Figures/CS_coefplot.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...


(0 real changes made)

............
Difference-in-difference with Multiple Time Periods

                                                         Number of obs = 2,500
Outcome model  : least squares
Treatment model: inverse probability
---------------------------------------------------------------------
            | Coefficient  Std. err.      t      [95% conf. interval]
------------+--------------------------------------------------------
    Pre_avg |   .0181773   .0168533     1.08    -.0263013    .0626558
   Post_avg |  -.0773993    .020939    -3.70    -.1326605   -.0221381
        Tm4 |   .0033064   .0248261     0.13    -.0622135    .0688262
        Tm3 |   .0269566   .0174021     1.55    -.0189703    .0728834
        Tm2 |   .0242689   .0144381     1.68    -.0138355    .0623733
        Tp0 |  -.0189222   .0117155    -1.62    -.0498412    .0119968
        Tp1 |  -.0535893   .0174318    -3.07    -.0995944   -.0075842
        Tp2 |  -.1362743    .035155    -3.88    -.2290538   -.0434948
        Tp3 |  -.1008114   .0357803    -2.82    -.1952412   -.0063816
---------------------------------------------------------------------
Control: Not yet Treated

See Callaway and Sant'Anna (2021) for details

Test will be based on asymptotic VCoV
If you want aggregations based on WB, use option saverif() ad csdid_stats
ATT by Periods Before and After treatment
Event Study:Dynamic effects
------------------------------------------------------------------------------
             | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     Pre_avg |   .0181773    .016605     1.09   0.274     -.014368    .0507225
    Post_avg |  -.0773993   .0195602    -3.96   0.000    -.1157366   -.0390621
         Tm4 |   .0033064   .0244519     0.14   0.892    -.0446184    .0512311
         Tm3 |   .0269566   .0175797     1.53   0.125     -.007499    .0614121
         Tm2 |   .0242689   .0144637     1.68   0.093    -.0040794    .0526172
         Tp0 |  -.0189222   .0120446    -1.57   0.116    -.0425291    .0046847
         Tp1 |  -.0535893   .0169464    -3.16   0.002    -.0868036    -.020375
         Tp2 |  -.1362743   .0354034    -3.85   0.000    -.2056637   -.0668849
         Tp3 |  -.1008114   .0343592    -2.93   0.003    -.1681542   -.0334685
------------------------------------------------------------------------------
  • csdid_plot

  • coefplot

8 まとめ

  • グラフを結合する。
graph combine "../Figures/CS_official.gph" ///
              "../Figures/CS_coefplot.gph" ///
              "../Figures/CS_csdid_plot.gph" ///
              "../Figures/TWFE.gph" ///
              "../Figures/SA.gph" ///
              , c(3) 
qui graph export "../Figures/ES_combined.png", replace
Running /Users/ayumu/Documents/GitHub/DiD_notes/DiD_Ex/DiD_Ex09-Stata-Wage/Scripts/prof

> ile.do ...

  • 以上の結果から、最低賃金の引き上げが若年労働者の雇用に与える影響は、TWFE推定、Sun and Abraham (2021)推定、Callaway and Sant’Anna (2021)推定のいずれでも、有意にマイナスであることが示された。

処置前期間

  • Stata 公式 xthdidregressを用いた場合のCallaway and Sant’Anna (2021)推定量は、-3から3の期間について、推定値が得られる。-4の時点の推定値は得られない。
  • TWFEとSun and Abraham (2021)推定量は、-4から3の期間について、推定値が得られる。ただし、-1の期間は基準年となっており、推定値が得られない。
  • csdidパッケージでlong2オプションを用いた場合のCallaway and Sant’Anna (2021)推定量は、-4から3の期間について、推定値が得られる。ただし、-1の期間は基準年となっており、推定値が得られない。

共変量

  • Callaway and Sant’Anna (2021)は、共変量を加えることができるが、Sun and Abraham (2021)は、共変量を加えることができない。

対照群

  • Callaway and Sant’Anna (2021)は、never-treatedとnot-yet-treatedを対照群として使える。
  • Sun and Abraham (2021)は、never-treatedを対照群として使うことができる。Sun and Abraham (2021)は、never-treatedがない場合、最初の処置が最も遅かったコホートを対照群として使うことができる。

参考文献