Instrumental Variables in R: Example 2

2022-01-18  多美丽

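This post is mainly about using instrumental variables in R, the data processing involved, and what the models mean. (The data-processing steps in this example come up very often.)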

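There are two examples. Example 1 covers 6 questions and uses the fertil2 dataset (shipped with the wooldridge package for R). Example 2 covers 3 questions and uses the Stata-format dataset eitc.dta. This post covers Example 2; Example 1 is in the previous post.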

Example 2

1.1 Questions

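This example uses the dataset eitc.dta. The dependent variable is children and educ is an explanatory variable (whether it is endogenous is discussed in Questions 1-6 of Example 1); there are other explanatory variables as well, such as age.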



1.2 My answers

Answers

1.3 R code

> library(tidyverse)   # ggplot(), %>%, mutate(), and friends
> library(scales)      # Format numbers with functions like comma(), percent(), and dollar()
> library(broom)       # Convert models to data frames
> library(wooldridge)  # Econometrics-related datasets like injury
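> library(stargazer)     # Regression tables
> library(foreign)       # read.dta() for older Stata file formats
> library(readstata13)   # read.dta13() for Stata 13+ files
> eitc = read.dta13("C:\\Users\\LENOVO\\Desktop\\eitc.dta")  # adjust the path to your own copy of eitc.dta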
> 
> head(eitc,2)
  state year urate children nonwhite      finc       earn age ed work   unearn
1    11 1991   7.6        0        1 18714.394 18714.3943  26 10    1 0.000000
2    12 1991   7.2        1        0  4838.568   471.3656  22  9    1 4.367203

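# Question 7: What are the average values of work, family income (finc), earnings (earn), nonwhite, education (ed), and age at each level of children? How do these groups differ?
# 7.1 Split the number of children into 3 categories: 0, 1, and 2+.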
> eitc <- eitc %>%  mutate(children_cat = case_when(
+   children == 0 ~ "0",
+   children == 1 ~ "1",
+   children >= 2 ~ "2+"
+ ))

# 7.2 Keep the rows with 0 children and compute the means of work, finc, etc.
> eitc %>%
+   filter(children == 0) %>%
+   summarize(mean_0_work = mean(work),
+             mean_0_finc = mean(finc),
+             mean_0_earn = mean(earn),
+             mean_0_nonwhite = mean(nonwhite),
+             mean_0_ed = mean(ed),
+             mean_0_age =mean(age)
+             )
  mean_0_work mean_0_finc mean_0_earn mean_0_nonwhite mean_0_ed mean_0_age
1   0.5744896    18559.86    13760.26        0.515944  8.548676   38.49823

# 7.3 Keep the rows with 1 child and compute the means of work, finc, etc.
> eitc %>%
+   filter(children == 1) %>%
+   summarize(mean_1_work = mean(work),
+             mean_1_finc = mean(finc),
+             mean_1_earn = mean(earn),
+             mean_1_nonwhite = mean(nonwhite),
+             mean_1_ed = mean(ed),
+             mean_1_age =mean(age)
+   )
  mean_1_work mean_1_finc mean_1_earn mean_1_nonwhite mean_1_ed mean_1_age
1   0.5376063    13941.57    9928.279       0.5964683  8.992479   33.75899

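# 7.4 Keep the rows with exactly 2 children and compute the means of work, finc, etc.
# Note: this excludes women with 3 or more children; filter(children_cat == "2+") would cover the whole "2+" group.
> eitc %>%
+   filter(children == 2) %>%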
+   summarize(mean_2_work = mean(work),
+             mean_2_finc = mean(finc),
+             mean_2_earn = mean(earn),
+             mean_2_nonwhite = mean(nonwhite),
+             mean_2_ed = mean(ed),
+             mean_2_age =mean(age)
+   )
  mean_2_work mean_2_finc mean_2_earn mean_2_nonwhite mean_2_ed mean_2_age
1   0.4782972    12357.29    7487.978       0.6527546  9.082638   32.26002
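
The three filter-and-summarize steps for Question 7 can also be done in one pass by grouping on the category variable. The following is only a minimal sketch: `toy` is a made-up stand-in for `eitc` (its values are invented so the snippet runs on its own), and `group_means` is a hypothetical name. With the real data, the `across()` call would list work, finc, earn, nonwhite, ed, and age.

```r
library(dplyr)

# Toy stand-in for eitc: invented values, used only so the sketch is runnable
toy <- data.frame(
  children = c(0, 0, 1, 2, 3),
  work     = c(1, 0, 1, 0, 1),
  earn     = c(100, 200, 300, 400, 500)
)

group_means <- toy %>%
  mutate(children_cat = case_when(
    children == 0 ~ "0",
    children == 1 ~ "1",
    children >= 2 ~ "2+"
  )) %>%
  group_by(children_cat) %>%                        # one row per category
  summarize(across(c(work, earn), mean), n = n())   # group means plus group size

print(group_means)
```

Because all groups land in one table, they are easier to compare side by side, and the "2+" group includes everyone with two or more children.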
> 
> 
> # Question 8: Create a new variable named any_kids (TRUE or 1 if children > 0) and a time variable named after_1993 (TRUE or 1 if year > 1993).
> any_kids = (eitc$children > 0)*1
> eitc = cbind(eitc,any_kids)
> 
> after_1993 = (eitc$year > 1993)*1
> eitc = cbind(eitc,after_1993)
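
The two indicators asked for in Question 8 can also be created in a single `mutate()` call, which avoids the separate `cbind()` steps. This is a minimal sketch on a made-up data frame (`toy` and its values are assumptions for illustration, not the real eitc data):

```r
library(dplyr)

# Toy stand-in for eitc: invented values, used only so the sketch is runnable
toy <- data.frame(
  children = c(0, 1, 3, 0),
  year     = c(1991, 1993, 1994, 1996)
)

toy <- toy %>%
  mutate(
    any_kids   = as.integer(children > 0),   # 1 if the woman has at least one child
    after_1993 = as.integer(year > 1993)     # 1 for years strictly after 1993
  )

print(toy)
```

Note that 1993 itself gets `after_1993 = 0`, because the condition in the question is `year > 1993`.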
> 
> 
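> # Question 9: Create a new dataset showing, for each year, the average proportion of employed women (work) in the treatment and control groups (i.e., with and without children).
> # The two summaries below pool all years; adding group_by(year) before summarize() would give one row per year.
> eitc %>%
+   filter(any_kids == 1) %>%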
+   summarize(mean_any_kid_1_work = mean(work)
+   )
  mean_any_kid_1_work
1           0.4664279
> 
> eitc %>%
+   filter(any_kids == 0) %>%
+   summarize(mean_any_kid_0_work = mean(work)
+   )
  mean_any_kid_0_work
1           0.5744896
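
Question 9 asks for the employment share by year and by group, so one possible approach is to group on both `year` and `any_kids` and store the result as a new data frame. A minimal sketch, assuming a made-up `toy` data frame in place of eitc (`work_by_year` and `mean_work` are hypothetical names):

```r
library(dplyr)

# Toy stand-in for eitc: invented values, used only so the sketch is runnable
toy <- data.frame(
  year     = c(1991, 1991, 1991, 1991, 1994, 1994, 1994, 1994),
  any_kids = c(0, 0, 1, 1, 0, 0, 1, 1),
  work     = c(1, 0, 0, 0, 1, 1, 1, 0)
)

work_by_year <- toy %>%
  group_by(year, any_kids) %>%                              # one cell per year and group
  summarize(mean_work = mean(work), .groups = "drop")       # share of women working

print(work_by_year)
```

The result has one row per year-and-group combination, which is the shape needed to compare the treatment and control groups over time.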

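These are my own answers, and I do not know what the correct ones are. If any readers who know this material could comment and help, I would be very grateful. Let's keep learning together.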
