The baselinetable command: exporting descriptive-statistics tables for papers to Excel

2019-08-02  Stata连享会

Author: He Qinghong (何庆红), China Center for Health Economic Research, Peking University


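As the saying goes, "garbage in, garbage out." In empirical work, describing the data is an essential step. This post introduces baselinetable, a concise and convenient user-written command that produces descriptive statistics for continuous and categorical variables and presents them as either a one-way table or a two-way table.

As the examples in this post show, baselinetable combines continuous and categorical variables in a single table, can split the table by a grouping variable, can report the number of missing observations, and can export the result to Excel or Word.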

1. Installation

Type the following in the Stata Command window to install baselinetable automatically:

ssc install baselinetable, replace
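
If the command may already be installed, a guarded install avoids downloading it again. The lines below are a minimal sketch using only official Stata commands (capture, which, ssc, help); they assume an internet connection is available when the install is needed.

```stata
* Install only if baselinetable is not already on the ado-path
capture which baselinetable
if _rc {
    ssc install baselinetable
}
* Open the help file to confirm the installed syntax and options
help baselinetable
```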

2. Syntax

The basic syntax of baselinetable is:

baselinetable [insertrow["string"]] varname [(rowvariable_options)] [[insertrow] varname...] [if] [in] [, main_options]
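
To make the syntax diagram concrete, here is a small sketch that combines a row-variable option, an if qualifier, and a main option. It uses only elements that appear in this post (the cts option and by()) plus the [if] slot from the diagram; the restriction age < 40 is an arbitrary choice for illustration.

```stata
sysuse nlsw88, clear
* wage(cts): row-variable option marking wage as continuous
* union:     no option, shown as a categorical variable in this post's examples
* if age < 40: restricts the sample (illustrative condition only)
* by(married): main option that adds one column per group
baselinetable wage(cts) union if age < 40, by(married)
```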

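Specifying statistics

In the examples in this post, a variable followed by (cts) is treated as continuous and reported as mean (standard deviation); a variable listed without an option is treated as categorical and reported as count (percentage) for each category.

Specifying variable names

Rows are labeled with each variable's variable label, and the categories of a categorical variable are labeled with its value labels, as the output in Section 3 shows.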


3. Examples

We use Stata's official example dataset "nlsw88.dta" to illustrate how the command works.

sysuse "nlsw88.dta", clear 

First, a simple one-way table:

. baselinetable wage(cts) hours(cts) age(cts) union race  // one-way table
  +------------------------------------+
  |                     | N=2246       |
  |---------------------+--------------|
  | hourly wage         | 7.8 (5.8)    |
  |---------------------+--------------|
  | usual hours worked  | 37.2 (10.5)  |
  |---------------------+--------------|
  | age in current year | 39.2 (3.1)   |
  |---------------------+--------------|
  | union worker        |              |
  |---------------------+--------------|
  |      nonunion       | 1417 (75.5%) |
  |---------------------+--------------|
  |      union          | 461 (24.5%)  |
  |---------------------+--------------|
  | race                |              |
  |---------------------+--------------|
  |      white          | 1637 (72.9%) |
  |---------------------+--------------|
  |      black          | 583 (26.0%)  |
  |---------------------+--------------|
  |      other          | 26 (1.2%)    |
  +------------------------------------+
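
The entries in this table can be cross-checked with official commands. The sketch below assumes, based on the fsum and tabstat output later in this post, that the (cts) rows are mean (SD) and the categorical rows are count (percentage among nonmissing observations).

```stata
sysuse nlsw88, clear
* Continuous variable: mean and standard deviation, as in the "hourly wage" row
quietly summarize wage
display "wage: " %3.1f r(mean) " (" %3.1f r(sd) ")"
* Categorical variables: counts and percentages among nonmissing observations
tabulate union
tabulate race
```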

Next, add a grouping variable to report a two-way contingency table:

. baselinetable wage(cts) hours(cts) age(cts) union race, by( married , totalcolumn) 
  +-----------------------------------------------------------------+
  |                     | married     |              |              |
  |---------------------+-------------+--------------+--------------|
  |                     | single      | married      | Total        |
  |---------------------+-------------+--------------+--------------|
  |                     | N=804       | N=1442       | N=2246       |
  |---------------------+-------------+--------------+--------------|
  | hourly wage         | 8.1 (6.3)   | 7.6 (5.4)    | 7.8 (5.8)    |
  |---------------------+-------------+--------------+--------------|
  | usual hours worked  | 39.2 (9.1)  | 36.1 (11.1)  | 37.2 (10.5)  |
  |---------------------+-------------+--------------+--------------|
  | age in current year | 39.2 (3.0)  | 39.1 (3.1)   | 39.2 (3.1)   |
  |---------------------+-------------+--------------+--------------|
  | union worker        |             |              |              |
  |---------------------+-------------+--------------+--------------|
  |      nonunion       | 475 (72.4%) | 942 (77.1%)  | 1417 (75.5%) |
  |---------------------+-------------+--------------+--------------|
  |      union          | 181 (27.6%) | 280 (22.9%)  | 461 (24.5%)  |
  |---------------------+-------------+--------------+--------------|
  | race                |             |              |              |
  |---------------------+-------------+--------------+--------------|
  |      white          | 487 (60.6%) | 1150 (79.8%) | 1637 (72.9%) |
  |---------------------+-------------+--------------+--------------|
  |      black          | 309 (38.4%) | 274 (19.0%)  | 583 (26.0%)  |
  |---------------------+-------------+--------------+--------------|
  |      other          | 8 (1.0%)    | 18 (1.2%)    | 26 (1.2%)    |
  +-----------------------------------------------------------------+
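
The percentages in the grouped table are column percentages: for example, 475 nonunion workers out of the 656 single women with nonmissing union status gives 72.4%. A sketch of the same comparison with official commands:

```stata
sysuse nlsw88, clear
* Column percentages of union status and race within each marital group
tabulate union married, column
tabulate race married, column
* Group means and standard deviations of the continuous variables
tabstat wage hours age, by(married) stat(mean sd) nototal
```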

Third, add the reportmissing option to report the number of missing observations:

. baselinetable wage(cts) hours(cts) age(cts) union race,  ///
        by( married , totalcolumn) reportmissing

  +-----------------------------------------------------------------+
  |                     | married     |              |              |
  |---------------------+-------------+--------------+--------------|
  |                     | single      | married      | Total        |
  |---------------------+-------------+--------------+--------------|
  |                     | N=804       | N=1442       | N=2246       |
  |---------------------+-------------+--------------+--------------|
  | hourly wage         | 8.1 (6.3)   | 7.6 (5.4)    | 7.8 (5.8)    |
  |---------------------+-------------+--------------+--------------|
  | usual hours worked  | 39.2 (9.1)  | 36.1 (11.1)  | 37.2 (10.5)  |
  |---------------------+-------------+--------------+--------------|
  |      MISSING        | 3           | 1            | 4            |
  |---------------------+-------------+--------------+--------------|
  | age in current year | 39.2 (3.0)  | 39.1 (3.1)   | 39.2 (3.1)   |
  |---------------------+-------------+--------------+--------------|
  | union worker        |             |              |              |
  |---------------------+-------------+--------------+--------------|
  |      nonunion       | 475 (72.4%) | 942 (77.1%)  | 1417 (75.5%) |
  |---------------------+-------------+--------------+--------------|
  |      union          | 181 (27.6%) | 280 (22.9%)  | 461 (24.5%)  |
  |---------------------+-------------+--------------+--------------|
  |      MISSING        | 148         | 220          | 368          |
  |---------------------+-------------+--------------+--------------|
  | race                |             |              |              |
  |---------------------+-------------+--------------+--------------|
  |      white          | 487 (60.6%) | 1150 (79.8%) | 1637 (72.9%) |
  |---------------------+-------------+--------------+--------------|
  |      black          | 309 (38.4%) | 274 (19.0%)  | 583 (26.0%)  |
  |---------------------+-------------+--------------+--------------|
  |      other          | 8 (1.0%)    | 18 (1.2%)    | 26 (1.2%)    |
  +-----------------------------------------------------------------+
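
The MISSING rows appear only for hours and union because the other variables have no missing values. The counts can be checked with official commands, as in this short sketch:

```stata
sysuse nlsw88, clear
* Overall missing counts; only variables with missing values are listed
misstable summarize wage hours age union race
* Missing counts by marital status, matching the MISSING rows
tabulate married if missing(hours)
tabulate married if missing(union)
```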


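4. Comparing baselinetable with other commands

Commands with functionality similar to baselinetable include summarize, tabulate, table, tabstat, and fsum. The summarize command (usually abbreviated su or sum) computes summary statistics in a one-way list. The table command tabulates statistics, especially for categorical variables, in one-way, two-way, three-way, and higher-dimensional tables; for details, see the post 「Stata:今天你 "table" 了吗?」.

The examples below compare these commands.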

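4.1 The fsum command

The user-written fsum command has a very concise syntax and rich output, including N, mean, sd, min, max, median, and many other statistics (every returned result that summarize varlist, detail stores in memory can be reported by fsum). Compared with the official Stata commands summarize and tabstat, it gives a more detailed breakdown of categorical variables. More conveniently, it can display user-defined variable labels, and it adjusts the report format automatically so that the output comes closer to the content and layout most journals require.

The following examples show its main usage.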

*- Basic table

. fsum wage hours age union race, cat(union) mcat(race) 

      Variable |        N     Mean       SD      Min      Max
---------------+---------------------------------------------
          wage |     2246     7.77     5.76     1.00    40.75  
         hours |     2242    37.22    10.51     1.00    80.00  
           age |     2246    39.15     3.06    34.00    46.00  
         union |     1878     0.25     0.43     0.00     1.00  
 nonunion (%)  |     1417    75.45
    union (%)  |      461    24.55
          race |     2246     1.28     0.48     1.00     3.00  
    white (%)  |     1637    72.89
    black (%)  |      583    25.96
    other (%)  |       26     1.16

*- The `bysort' prefix produces statistics by group

. bysort married: fsum wage hours age union race

-------------------------------------------------------------
-> married = single

      Variable |        N     Mean       SD      Min      Max 
---------------+---------------------------------------------
          wage |      804     8.08     6.34     1.15    40.20  
         hours |      801    39.24     9.10     2.00    80.00  
           age |      804    39.22     3.05    34.00    46.00  
         union |      656     0.28     0.45     0.00     1.00  
          race |      804     1.40     0.51     1.00     3.00  

-------------------------------------------------------------
-> married = married

      Variable |        N     Mean       SD      Min      Max  
---------------+---------------------------------------------
          wage |     1442     7.59     5.40     1.00    40.75  
         hours |     1441    36.10    11.06     1.00    80.00  
           age |     1442    39.12     3.07    34.00    45.00  
         union |     1222     0.23     0.42     0.00     1.00  
          race |     1442     1.21     0.44     1.00     3.00 

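4.2 The tabstat command

This is currently the most commonly used official Stata command for reporting basic statistics. It can report a range of statistics, and both the content and the layout of its output are fairly flexible.

*- One-way table: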

. tabstat wage hours  age  union race, stat(N mean sd min max) col(stat)

    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
        wage |      2246  7.766949  5.755523  1.004952  40.74659
       hours |      2242  37.21811  10.50914         1        80
         age |      2246  39.15316  3.060002        34        46
       union |      1878  .2454739  .4304825         0         1
        race |      2246  1.282725  .4754413         1         3
----------------------------------------------------------------

*- Add a grouping variable to report a two-way table:

. tabstat wage hours  age  union race, by(married) stat(N mean sd min max) nototal long col(stat)

married     variable |         N      mean        sd       min       max
---------------------+--------------------------------------------------
single          wage |       804  8.080765  6.336071  1.151368  40.19808
               hours |       801  39.23845  9.099001         2        80
                 age |       804  39.21891  3.049911        34        46
               union |       656  .2759146  .4473151         0         1
                race |       804  1.404229  .5109335         1         3
---------------------+--------------------------------------------------
married         wage |      1442  7.591978  5.399229  1.004952  40.74659
               hours |      1441  36.09507  11.06107         1        80
                 age |      1442   39.1165  3.066058        34        45
               union |      1222  .2291326  .4204468         0         1
                race |      1442  1.214979  .4402987         1         3
------------------------------------------------------------------------
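
By default tabstat prints many decimal places, which makes it harder to compare with the one-decimal baselinetable output. A sketch using tabstat's format() option (the display format %9.1f is an illustrative choice):

```stata
sysuse nlsw88, clear
* Same grouped table for the continuous variables, rounded to one decimal place
tabstat wage hours age, by(married) stat(n mean sd) format(%9.1f) nototal long col(stat)
```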


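5. Exporting baselinetable results

baselinetable provides two options that quickly export the on-screen results to an Excel workbook or a Word document:

*- Method 1: export the table to an Excel workbook
*  (race, age, ht, lwt and smoke are not in nlsw88; they match Stata's lbw
*   example data, which is assumed here)
webuse lbw, clear
baselinetable race age(cts) ht lwt(cts), by(smoke, totalcolumn) exportexcel(table1)
*- Method 2: export the table to a Word document
putdocx begin
baselinetable race age(cts) ht lwt(cts), by(smoke, totalcolumn) putdocxtab(table1)
putdocx save mydoc, replace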
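
For comparison, a similar (much plainer) table of means and standard deviations can be written to Excel with the official putexcel command. This is only a sketch of the manual alternative, not what baselinetable does internally; it assumes Stata 14 or later, and the file name summary_check.xlsx is hypothetical.

```stata
sysuse nlsw88, clear
* summary_check.xlsx is a hypothetical file name chosen for this sketch
putexcel set "summary_check.xlsx", replace
putexcel A1 = "Variable"
putexcel B1 = "Mean"
putexcel C1 = "SD"
local row = 2
foreach v of varlist wage hours age {
    quietly summarize `v'
    * One row per variable: name, mean, standard deviation
    putexcel A`row' = "`v'"
    putexcel B`row' = `r(mean)'
    putexcel C`row' = `r(sd)'
    local ++row
}
```

Writing each cell by hand in this way shows why a single exportexcel() option is convenient when the table mixes continuous and categorical rows.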

