Associate Professor, College of Economics, Aoyama Gakuin University, Tokyo, Japan.
ipolate
levelsof
)misstable
、mdesc
labmask
)while
を使った繰り返しunicode
collect
substr
ds
、lookfor
encode
を使えない場合の対処策geoplot
graph combine
ustrpos
graph combine
/ grc1leg
: 複数のグラフの結合。kountry
twoway (scatter weight price, sort), title(Scatter plot for `: variable label weight')
sysuse auto,clear
foreach v of var weight length {
twoway (scatter `v' price, sort), title(Scatter plot for `: variable label `v'')
}
Including variable labels in graph titles
ipolate
ipolate
のオプションにepolateを付けると、内挿だけではなく外挿もできる。
by KSFAffiliateCode: ipolate L1 year, gen(L) epolate
sysuse auto,clear
histogram mpg if foreign==0, bin(10) fcolor(blue%30) barwidth(2) addplot((histogram mpg if foreign==1, bin(10) fcolor(red%30))) legend(order(1 "domestic" 2 "foreign"))
modern
scheme-modern
は、以下のようにインストール、セットできる。net install scheme-modern, from("https://raw.githubusercontent.com/mdroste/stata-scheme-modern/master/")
set scheme modern
sysuse auto,clear
twoway (scatter price mpg), scheme(modern)
graph export modern.png
plotplain
sysuse auto,clear
twoway (scatter price mpg), scheme(plotplain)
graph export plotplain.png
levelsof
)sysuse auto,clear
levelsof foreign, local(type)
foreach i of local type {
su mpg if foreign==`i'
scalar mean_`i'=r(mean)
}
di mean_0
di mean_1
misstable
、mdesc
use https://www.stata-press.com/data/r18/nlswork.dta, clear
misstable summarize
mdesc
boxpanel
を用いる方法* パッケージのインストール
ssc install boxpanel
* データの読み込み
use https://www.stata-press.com/data/r17/nlswork.dta,clear
boxpanel ln_wage year, joinmedian
stripplot
を使う方法* パッケージのインストール
*ssc install stripplot
* データ読み込み
use https://www.stata-press.com/data/r18/nlswork.dta, clear
*平均値の作成
bys year: egen meanln_wage=mean(ln_wage)
*グラフ作成
*ここで、addplot(connect mean rep78,sort)で、平均値をつなげるグラフを追加している
*オプションのboxは箱を作成している。
stripplot ln_wage, over(year) box ms(none) vertical addplot(connect meanln_wage year,sort )
graph export stripplot.png,replace
<img src=”img/stripplot.png” width=70%>
plot_confidently
を使う方法use https://www.stata-press.com/data/r18/nlswork.dta, clear
ssc install plot_confidently
plot_confidently ln_wage, over(year) ///
graphopts(vertical xtitle(year) ytitle(ln_wage) mlabgap(relative3p5) ///
xlabel(0 "1968" 5 "1973" 10 "1978" 15 "1983"))
use https://www.stata-press.com/data/r18/nlswork.dta, clear
ssc install plot_confidently
* ラベルの値を定義しておかないとエラーになる
label define collgrad 0 "High School" 1 "Colledge"
label values collgrad collgrad
* グラフ
plot_confidently ln_wage, over(year) by(collgrad) ///
graphopts(vertical xtitle(year) ytitle(ln_wage) mlabgap(relative3p5) ///
xlabel(0 "1968" 5 "1973" 10 "1978" 15 "1983"))
graph export plot_confidently2.png,replace
<img src=”img/plot_confidently2.png” width=70%>
use https://www.stata-press.com/data/r18/nlswork.dta, clear
bys year: egen meanln_wage=mean(ln_wage)
format %12.1f meanln_wage
twoway (connected meanln_wage year, sort mlabel(meanln_wage) mlabposition(12) mlabgap(relative1p5)), xlabel(68(5)88)
twoway (connected meanln_wage year, sort mlabel(meanln_wage) mlabposition(12) mlabgap(relative1p5))
use https://www.stata-press.com/data/r18/nlswork.dta, clear
bys year: egen meanln_wage=mean(ln_wage)
format %12.1f meanln_wage
bys year: egen ln_wage75=pctile(ln_wage), p(75)
bys year: egen ln_wage25=pctile(ln_wage), p(25)
twoway (connected meanln_wage year, mlabel(meanln_wage) mlabposition(1) mlabgap(relative1p5)) (rcap ln_wage25 ln_wage75 year, sort), legend(order(1 "Mean" 2 "25th and 75th percentiles") position(6))
graph export percentile.png,replace
use https://www.stata-press.com/data/r18/nlswork.dta, clear
bys year: egen meanln_wage=mean(ln_wage)
format %12.1f meanln_wage
* 信頼区間をciで計算する。
g ln_wageub=.
g ln_wagelb=.
forval i = 68/88 {
ci means ln_wage if year==`i'
replace ln_wageub=`r(ub)' if year==`i'
replace ln_wagelb=`r(lb)' if year==`i'
}
twoway (connected meanln_wage year, mlabel(meanln_wage) mlabposition(1) mlabgap(relative1p5)) (rcap ln_wagelb ln_wageub year, sort), legend(order(1 "Mean" 2 "95% CI") position(6))
graph export ci.png,replace
use https://www.stata-press.com/data/r18/nlswork.dta, clear
collapse (mean) mean= ln_wage (sd)sd=ln_wage (count) n=ln_wage, by(year collgrad)
format %12.2f mean
g ub= mean+(1.96*sd)/sqrt(n)
g lb= mean-(1.96*sd)/sqrt(n)
rename collgrad group
global group0 "Control"
global group1 "Treatment"
g year2=year+0.5 //グループ別のグラフが重ならないように横軸をずらす
* 簡単なコード
twoway (connected mean year if group ==0, mlabel(mean) mlabposition(12) mlabgap(relative3p)) (rcap lb ub year if group ==0, sort) ///
(connected mean year2 if group ==1, mlabel(mean) mlabposition(12) mlabgap(relative3p)) (rcap lb ub year2 if group ==1, sort) ///
, legend(order(1 "$group0: Mean" 2 "$group0: 95%CI" 3 "$group1: Mean" 4 "$group1: 95%CI") position(6) rows(2)) xline(1998)
graph export ci2.png
* 色を揃えたコード
twoway (connected mean year if group ==0, mlabel(mean) mlabposition(12) mlabgap(relative3p) mlabcolor(blue) mcolor(blue) lcolor(blue)) (rcap lb ub year if group ==0, sort mcolor(blue) lcolor(blue)) (connected mean year2 if group ==1, mlabel(mean) mlabposition(12) mlabgap(relative3p) mlabcolor(black) mcolor(black) lcolor(black)) (rcap lb ub year2 if group ==1, sort mcolor(black) lcolor(black)) , legend(order(1 "$group0: Mean" 2 "$group0: 95%CI" 3 "$group1: Mean" 4 "$group1: 95%CI") position(6) rows(2)) xline(1998)
help string_functions
ustrregexrf
)replace KSFNationNameAlph = ustrregexrf(KSFNationNameAlph, "Kingdom of ", "")
replace KSFNationNameAlph = ustrregexrf(KSFNationNameAlph, "Republic of ", "")
labmask
)labmask KSFNationCode, values(KSFNationNameAlph)
while
を使った繰り返しlocal i=0
while `i'<=2 {
local i=`i'+1
by KSFNationCode, sort:replace KSFNationNameAlph=KSFNationNameAlph[_n+1] if year<2018
}
unicode
そのため、DTAファイルとして保存したあと、エンコードをUnicodeに変換する必要がある。
* 最初にデータを空にする必要がある
clear
* エンコード変更のために作業フォルダをデータがあるフォルダ(例/Users/XXX/Desktop)とする
cd /Users/XXX/Desktop
* 元々のエンコードがShift_JISであることを設定
unicode encoding set Shift_JIS
* エンコード変更(全てのDTAファイル)
unicode translate *.dta
collect
コード例)
* 2x2の表を作成
table ( year ) ( foreign_owned ) () if provinceorstateaddressofincorp=="CA", statistic(frequency)
* タイトル追加
collect title "Foreign owned firms in CA, USA \label{tab:CA}"
* Notes追加
collect notes `"Notes: A foreign-owned firm is defined as one whose global ultimate owner belongs to a foreign country."'
* 縦線削除
collect style cell, border( right, pattern(nil) )
* 表頭のラベル変更
collect label dim foreign_owned "Foregin owned firm", modify
collect label levels foreign_owned 0 "No", modify
collect label levels foreign_owned 1 "Yes", modify
* 表を出力
collect export "../Tables/CA.tex", replace tableonly
例1)
* 欠けている月次に変数の値を埋める
sort stock_code ym
foreach var in ymd y res yyyymm isin firmnameJ firmname {
bys stock_code: replace `var' = `var'[_n-1] if missing(`var')
}
例2)
* パネルデータで欠けている値の補完
sort iso_code year
foreach var in country countyjp eu_accession eu_exit {
local i=0
while `i'<=75 {
local i=`i'+1
* 後方補完
bys iso_code : replace `var' = `var'[_n-1] if missing(`var')
* 前方補完
bys iso_code : replace `var' = `var'[_n+1] if missing(`var')
}
}
assert
: 間違いがあれば、以後のプログラムを停止する. sysuse bplong,clear
(Fictional blood-pressure data)
. replace sex = 3 in 1
(1 real change made)
. assert sex == 0 | sex == 1
1 contradiction in 240 observations
assertion is false
r(9);
substr
substr("abcdef",-3,2) = "de"
usubstr("médiane",-3,2) = "an" //Unicode
udsubstr("東京都",2,4) = "京都" //全角文字
udsubstr("東京都",2,4) = "京都" //全角文字
udsubstr("東京都",-2,4) = "京都" //全角文字
ds
、lookfor
ds, has(varl 文字列*)
lookfor 文字列
*文字列ABCの部分を削除
ds, has(varl 文字列*)
foreach var in `r(varlist)' {
local variable_label : variable label `var'
local variable_label : subinstr local variable_label "文字列ABC" ""
label variable `var' "`variable_label'"
}
lookfor 文字列ABC //文字列ABCを含む変数を探す
foreach var in `r(varlist)' {
local variable_label1 : variable label `var' //既存変数ラベルの保存
local variable_label2 "`variable_label1' 追加文字列XYZ" //新規変数ラベルの作成
label variable `var' "`variable_label2'" //変数ラベルの付与
}
例)すべての年次に観測がある個体のみを残す。
* Balanced Panel
bys id: egen ny=count(year)
su ny
keep if ny==r(max)
drop ny
例)パッケージ(xtpattern
)を使って、すべての年次に観測がある個体のみを残す。
tsset id year
xtpattern, gen(pat)
tab pat
destring pat, gen(patn) ignore(.)
su patn
keep if patn==r(max)
drop pat patn
tsset id year
例)欠損年次のある個体の欠損を埋める。
tsset id year
fillin id year
tab _fillin
encode
を使えない場合の対処策例)以下のように企業ID(文字変数: bvdidnumber)の値の種類が多すぎると、encodeはエラーになる。
. encode bvdidnumber,gen(id)
too many values
その場合、group()関数で企業IDから数値変数を作成できる。
* Firm ID作成
egen long id = group(bvdidnumber)
tsset id year
geoplot
コード例
ssc install geoplot, replace
help geoplot
local url http://fmwww.bc.edu/repec/bocode/i/
geoframe create regions `url'Italy-RegionsData.dta, id(id) coord(xcoord ycoord) shpfile(Italy-RegionsCoordinates.dta)
geoframe create country `url'Italy-OutlineCoordinates.dta
geoframe create capitals `url'Italy-Capitals.dta, coord(xcoord ycoord)
geoframe create lakes `url'Italy-Lakes.dta, feature(water)
geoframe create rivers `url'Italy-Rivers.dta, feature(water)
geoplot (area regions) (line country, lwidth(medthick)), tight
graph combine
sysuse auto
scatter mpg weight , name(g1)
scatter mpg price , name(g2)
graph combine g1 g2, l1(Miles per gallon) b1(Interesting predictors)
sysuse auto, clear
su mpg
global mpg_mean_neat = strltrim("`: display %10.3f r(mean)'")
di $mpg_mean_neat
hist mpg, title(Mean=$mpg_mean_neat)
graph export hist.png,replace
ustrpos
企業名変数=name、従業員数=Lのとき
browse if ustrpos(name, "アサヒビール")>0
browse L if ustrpos(name, "トヨタ自動車(株)")>0
foreach v of varlist B C{
local x : variable label `v' //変数B,Cの変数ラベル(1990,2000)をローカルxに保存
rename `v' year`x' //変数B,Cの変数名をyear1990,year2000に変更
}
ds , has(type string) //文字列を含む変数を特定
foreach var of varlist `r(varlist)' {
replace `var' = "" if `var' == "n.a."
replace `var' = "" if `var' == "n.s."
}
参考) Replacing “NA” with missing
* 文字列変数のリストr(varlist)を取得
ds , has(type string)
* 文字列変数の"n.a."という文字列を削除して、変数を数値変数に変換
foreach var of varlist `r(varlist)' {
destring `var', ignore("n.a.") replace
}
ds
foreach v of varlist `r(varlist)' {
lab var `v' "`v'"
}
sysuse auto,clear
describe, replace clear
export excel using varlist.xlsx,replace first(variables)
Exporting variable names and corresponding labels
graph combine
/ grc1leg
: 複数のグラフの結合。legendが同じ時は、grc1legが便利。それ以外は、graph combineで良い。
コード例
grc1leg "StataGraph/AMCE_task1.gph" ///
"StataGraph/AMCE_nfalse0.gph" ///
,rows(1) cols(2) title("") legendfrom(StataGraph/AMCE_task1.gph)
コード例
graph combine "StataGraph/AMCE_task1.gph" ///
"StataGraph/AMCE_nfalse0.gph" ///
,rows(1) cols(2) title("")
Stata: 複数のグラフでlegendが同じ時
一つだけlegendを表示するには、grc1legをインストールする必要あり。
optionのlegendfrom(../Figures/Unmatched/eventdd_procure_employment.gph)でどのグラフのlegendを使うか指定する。
新しいgrc1leg2もあるよう。
公式のgr combineではできない。
コード例
*net install grc1leg,from( http://www.stata.com/users/vwiggins/)
grc1leg "../Figures/Unmatched/eventdd_procure_employment.gph" ///
"../Figures/Unmatched/eventdd_procure_sales.gph" ///
"../Figures/Unmatched/eventdd_procure_productivity.gph" ///
"../Figures/matched/eventdd_procure_employment_psmatch.gph" ///
"../Figures/matched/eventdd_procure_sales_psmatch.gph" ///
"../Figures/matched/eventdd_procure_productivity_psmatch.gph" ///
,rows(3) cols(3) title("Transaction type: Procure") legendfrom(../Figures/Unmatched/eventdd_procure_employment.gph)
\begin{figure}[htbp]
\begin{tabular}{cc}
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig/Fig04_KOR_all.jpg}
\subcaption{Korea}
\label{KOR}
\end{minipage} &
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig/Fig04_USA_all.jpg}
\subcaption{USA}
\label{USA}
\end{minipage}
\end{tabular}
\caption{Evolving distribution of Japanese firms' ownership ratio in Korea and USA, 1990--2018. \label{fig:distribution}}
\footnotesize{Source: Authors' compilation based on Toyo Keizai Inc.'s OJC data.}
\footnotesize{Notes: The line inside the box indicates the median, while the lower and upper hinges of the box indicate the 25th and 75th percentiles, respectively. The lower and upper adjacent lines of the whiskers show the lower and upper adjacent values that are the furthest observations that are within 1.5 times the interquartile range. The dots indicate the outside values.}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[htbp]
\begin{tabular}{cc}
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig2_trend_average_ratio.eps}
\subcaption{All}
\label{composite}
\end{minipage} &
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig2_trend_average_ratio_CHN.eps}
\subcaption{China}
\label{Gradation}
\end{minipage} \\
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig2_trend_average_ratio_THA.eps}
\subcaption{Thailand}
\label{fill}
\end{minipage} &
\begin{minipage}[t]{0.45\hsize}
\centering
\includegraphics[width = 6 cm]{fig2_trend_average_ratio_USA.eps}
\subcaption{USA}
\label{transform}
\end{minipage}
\end{tabular}
\caption{Number of new foreign subsidiaries and their average ownership ratio in manufacturing, 1989--2016. \label{fig:trend3}}
\footnotesize{Source: Authors' compilation based on Toyo Keizai Inc.'s OJC data.}
\end{figure}
%des03_trend.do, des04_trend_by_country.do
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
kountry
ssc install kountry
use "kountry.dta",clear
*ISO2桁コードからISO3桁コード(_ISO3C_)の生成
kountry CountryISOCode,from(iso2c) to(iso3c)
*ISO2桁コードから国名(NAMES_STD)の生成
kountry CountryISOCode,from(iso2c)
ssc install kountry
kountry Country, from(other) stuck //国名コード番号_ISO3N_ 作成
rename _ISO3N_ ison
kountry ison, from(iso3n) to(iso3c) //国名コード番号から国名コード作成
rename _ISO3C_ countrycode
kountry: A Stata utility for merging cross-country data from multiple sources
g mnum=.
replace mnum=1 if mname=="January"
replace mnum=2 if mname=="February"
replace mnum=3 if mname=="March"
replace mnum=4 if mname=="April"
replace mnum=5 if mname=="May"
replace mnum=6 if mname=="June"
replace mnum=7 if mname=="July"
replace mnum=8 if mname=="August"
replace mnum=9 if mname=="September"
replace mnum=10 if mname=="October"
replace mnum=11 if mname=="November"
replace mnum=12 if mname=="December"
*insheet using https://stopcovid19.metro.tokyo.lg.jp/data/130001_tokyo_covid19_patients.csv,clear
insheet using https://ayumu-tanaka.github.io/Notes/130001_tokyo_covid19_patients.csv,clear
**YEAR 年次
gen y=substr(公表_年月日,1,4)
destring y,replace
*Date 日付
gen ymd=date(公表_年月日,"YMD")
format ymd %td
*Month 月次
gen m = mofd(ymd)
format m %tm
*Days 経過日数1
gen days=date(公表_年月日,"YMD")
label variable days "days since 01jan1960"
*Days since the first Covid 経過日数2
g cdays=days-21937
参考)Changing Daily Data into Monthly Data
g m=monthly(ym,"YM")
format m %tm
gen date = dofm(m)
format date %d
gen yr=year(date)
gen month=month(date)
format month %tm
参考)HOW CAN I EXTRACT MONTH AND YEAR COMPONENT FROM A VARIABLE WITH %TM FORMAT?
gen week_days = dow(daily_date )
browse if m==tm(2021m7)
browse if ymd==td(13sep2021)
参考) Statalist
tsfill, full
** 年、月別の変数から月別の変数を生成する。**
*年(2020)と月(1)から年月(2020m1)作成
gen ym = ym(year, m)
format ym %tm
参考) Generating Monthly variable from Year and Month separate Variables.
** 月次変数の作成例 **
g ym1="2018m01"
g ym=monthly(ym,"YM")
format ym %tm
drop ym1
参考) チート・シート
split Week, parse("~") limit(1) gen(startdate)
gen ymd=date(startdate1,"YMD")
format ymd %td
gen w=wofd(ymd)
format w %tw
tsset ymd
参考)How to declare weekly data as time series data in Stata 15
substr
g m=substr(myear, 1, strpos(myear,"2")-1)
display age(td(25jan2020), td(27dec2021))
display datediff(td(24jan2020), td(27dec2021), "month")
display datediff(td(24jan2020), td(27dec2021), "day")
参考)New date and time functions
generate stata`var' = `var' + td(30dec1899)
format stata`var' %td
gen year_of_birth = year(month_year_of_birth)
sysuse auto,clear
estpost correlate price mpg rep78, matrix listwise
esttab using correlationresults.tex, replace unstack not noobs compress b(2) ///
label star(* 0.1 ** 0.05 *** 0.01) nonote ///
note(Note: \sym{*} \(p<0.1\), \sym{**} \(p<0.05\), \sym{***} \(p<0.01\))
latexでthreeparttableパッケージを使用。
\usepackage{threeparttable}
esttab で以下のようにpostfootを利用。
esttab modelA modelB ///
using "../Tables/table_baseline.tex",b(3) se(3) bracket ///
drop(*cty_* *indcode* *year* _cons) ///
mtitle(All OECD Non-OECD All OECD Non-OECD) compress booktab replace label ///
title(Baseline fractional logit results \label{baseline}) ///
postfoot("\hline\hline \end{tabular} \begin{tablenotes} \footnotesize \item Notes: Robust standard errors are clustered by parent firm. Dep. var.: Parent firms' ownership ratio of foreign subsidiaries ($ t $). Columns (1)--(5) are estimated by the fractional logit model. Host countries' log GDP, log per capita GDP, level of IPR protection, and financial development are included in the estimation. * 10\% level, ** 5\% level, and *** 1\% level. \end{tablenotes} \end{table}") ///
scalars( "N Obs." "NA N of Subsidiaries" "NP N of Parent firms" "NB N of Banks" "NC N of Countries" "ymean Mean of Dep. Var." "FEC Country FE" "FEI Parent Industry FE" "FEY Year FE") sfmt(3) ///
star(* 0.1 ** 0.05 *** 0.01) nonotes eqlabels(" ") ///
mgroups("Baseline" "Extended model", pattern(1 0 0 1 0 0) ///
prefix(\multicolumn{@span}{c}{) suffix(}) span erepeat(\cmidrule(lr){@span}))
Export a LaTeX three part table using esttab
esttab model1 using "../Tables/TableXX_reg2018.tex",b(3) se(3) bracket ///
drop(_cons ) ///
compress booktab replace label ///
title(Gravity equations \label{regression1}) ///
postfoot("\tabnotes{6}{ Notes: Robust standard errors are clustered by the host country. * 10\% level, ** 5\% level, and *** 1\% level.}") ///
sfmt(3) ///
star(* 0.1 ** 0.05 *** 0.01) nonotes eqlabels(" ")
ドキュメントクラスは以下のいずれかにする
\documentclass{jsarticle}
\documentclass[uplatex]{jsarticle}
latexmkrcという名前のファイルに以下の内容をコピー。
$latex = 'uplatex';
$bibtex = 'upbibtex';
$dvipdf = 'dvipdfmx %O -o %D %S';
$makeindex = 'mendex -U %O -o %D %S';
$pdf_mode = 3;