The Stata table command and percentage contingency tables

2022-05-08  冬之心

title: "table"
author: "wintryheart"
date: "2022/5/6"
output: html_document



knitr::opts_chunk$set(engine="stata", engine.path="C:\\Program Files\\Stata17\\StataMP-64.exe", error=TRUE, cleanlog=TRUE, comment=NA)
library(Statamarkdown)

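Notes:

  1. My computer originally had Stata 15 installed, so the Statamarkdown package found Stata 15 when it was installed.
  2. I later installed Stata 17, but the package does not seem to provide a function for changing the Stata engine.
  3. So I had to change engine.path through knitr, but that seemed to work only when running a single chunk; knitting the whole markdown file failed. I had no fix at the time.

I eventually found the solution; see "3 Stata Engine Path" in Using Statamarkdown (wisc.edu).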

library(Statamarkdown)
stataexe <- "C:/Program Files/Stata17/StataMP-64.exe"
knitr::opts_chunk$set(engine.path=list(stata=stataexe))
knitr::opts_chunk$set(echo = TRUE, error=TRUE, cleanlog=TRUE, comment=NA)

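Preface

1. Common table-related commands

2. My own needs

Sometimes I need multi-way percentage contingency tables. I used to build them with by + tabulate, which outputs a series of two-way tables. These do not fit well on a PPT slide, so I had to merge them into a single table by hand.

Other table commands mostly handle continuous variables well and are not suited to percentage contingency tables of categorical or ordinal variables.

Then I found the table command, which finally lets me output a multi-way contingency table as a single table.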
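As a rough illustration of that contrast, here is a minimal sketch on Stata's bundled auto data. The variable mpg2 and the use of foreign as the grouping variable are assumptions borrowed from the examples later in this post, not necessarily the workflow the author had in mind.

```stata
// Minimal sketch; mpg2 is a hypothetical two-level variable (mpg <= 20 vs. mpg > 20)
sysuse auto, clear
generate mpg2 = 1
replace mpg2 = 2 if mpg > 20

// Old approach: one two-way table of row percentages per level of foreign
bysort foreign: tabulate rep78 mpg2, row nofreq

// Stata 17 table: the same row percentages collected in a single table
table (foreign rep78) (mpg2), statistic(percent, across(mpg2)) nototals
```

The first command prints a separate table for each level of foreign, while the second stacks the levels of foreign and rep78 in the rows of one table.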

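Introduction to the table command

[R] table -- Table of frequencies, summaries, and command results

table is a very flexible command, used mainly for descriptive statistics and contingency tables.

For the old version of the table command, see 《Stata:今天你 “table” 了吗?》.

This post covers the new table command in Stata 17.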

Syntax

table (rowspec) (colspec) [(tabspec)] [if] [in] [weight] [, options]

rowspec, colspec, and tabspec may each be empty, a variable name, or a keyword.

Common keywords:

Keyword   Description
result    requested statistics
var       variables from statistic() option
across    index across() specifications
colname   column names for matrix statistics
rowname   row names for matrix statistics
command   index option command()
statcmd   index options statistic() and command()
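To make the two most common keywords concrete, the following sketch places var and result explicitly in the layout. The choice of mpg and weight from the auto dataset is only an illustration.

```stata
sysuse auto, clear

// var indexes the variables named in statistic(); result indexes the statistics
table (var) (result), statistic(mean mpg weight) statistic(sd mpg weight)

// Swapping the two keywords transposes the table
table (result) (var), statistic(mean mpg weight) statistic(sd mpg weight)
```

The first call puts one variable per row and one statistic per column; the second does the reverse.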

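Basic concepts

1. Layout

The layout of a table is made up of its rows, its columns, and the table dimension (separate tables). rowspec, colspec, and tabspec are collectively called the table's "layout".

For example, we can specify a variable name to define the rows and place the statistics in the columns, or the other way around.

2. Keywords

Because a table can contain so many different statistics, we can specify keywords to uniquely identify the results collected from commands and the statistics computed by table.

If we omit a necessary keyword from the layout, table fills one in automatically.

3. Rules for using keywords

The rules that determine whether a keyword is necessary to uniquely identify the values in the table are as follows:

  1. If multiple statistics are specified, result is necessary in the layout.
  2. If multiple variables are specified in option statistic() and option command() is not specified, var is necessary in the layout.
  3. If ratio statistics are specified with more than one across(), across is necessary in the layout.
  4. If option command() is specified, colname is necessary in the layout. In addition, if multiple variables are specified in option statistic(), colname is necessary instead of the var required by rule 2.
  5. If multiple command() options are specified and option statistic() is not, command is necessary in the layout.
  6. If both command() and statistic() are specified, statcmd is necessary in the layout.

If we do not directly specify a necessary keyword in rowspec, colspec, or tabspec, the missing keyword is added to the layout automatically, as follows:

  1. If rowspec is empty, the missing keywords are placed in rowspec.
  2. If rowspec is not empty but colspec is empty, the missing keywords are placed in colspec.
  3. If both rowspec and colspec are non-empty but tabspec is empty, result is the only missing keyword, and there is only one statistic (result), then result is placed in tabspec.
  4. Otherwise, the missing keywords are appended to rowspec.

The following demonstrates how keywords are placed in the layout: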

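sysuse auto
describe
// Put the statistics in the rows
table (result) rep78, statistic(mean mpg) statistic(sd mpg)
// Omit the keyword; equivalent to the command above
table () rep78, statistic(mean mpg) statistic(sd mpg)
// rowspec is non-empty and colspec is empty, so the statistics go in the columns
table rep78, statistic(mean mpg) statistic(sd mpg)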

Output:
Contains data from C:\PROGRA~1\Stata17\ado\base/a/auto.dta
 Observations:            74                  1978 automobile data
    Variables:            12                  13 Apr 2020 17:45
                                              (_dta has notes)
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear ratio
foreign         byte    %8.0g      origin     Car origin
-------------------------------------------------------------------------------
Sorted by: foreign


-------------------------------------------------------------------------------------
                   |                         Repair record 1978                      
                   |         1          2          3          4          5      Total
-------------------+-----------------------------------------------------------------
Mean               |        21     19.125   19.43333   21.66667   27.36364   21.28986
Standard deviation |  4.242641   3.758324   4.141325    4.93487   8.732385   5.866408
-------------------------------------------------------------------------------------


-------------------------------------------------------------------------------------
                   |                         Repair record 1978                      
                   |         1          2          3          4          5      Total
-------------------+-----------------------------------------------------------------
Mean               |        21     19.125   19.43333   21.66667   27.36364   21.28986
Standard deviation |  4.242641   3.758324   4.141325    4.93487   8.732385   5.866408
-------------------------------------------------------------------------------------


---------------------------------------------------
                   |      Mean   Standard deviation
-------------------+-------------------------------
Repair record 1978 |                               
  1                |        21             4.242641
  2                |    19.125             3.758324
  3                |  19.43333             4.141325
  4                |  21.66667              4.93487
  5                |  27.36364             8.732385
  Total            |  21.28986             5.866408
---------------------------------------------------
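The same placement rules apply to the var keyword. In the sketch below, statistic() names two variables, so var is necessary; because rowspec is non-empty and colspec is empty, table should put var in the columns, making the two calls equivalent. The variables mpg and weight are chosen only for illustration.

```stata
sysuse auto, clear

// var is omitted: rowspec is non-empty and colspec is empty, so var goes in colspec
table foreign, statistic(mean mpg weight)

// The same layout with var written out explicitly
table (foreign) (var), statistic(mean mpg weight)
```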

Main options

totals(totals) and nototals control which totals are displayed in the table. By default, all totals are reported.
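A small sketch of the three variants on the auto dataset follows. The reading of totals(rep78) as "one total per level of rep78" is inferred from the output of the totals(foreign#weight4) call in the practice section of this post, so treat it as an assumption and compare the printed tables.

```stata
sysuse auto, clear

// Default: all totals are reported
table rep78 foreign

// Suppress every total
table rep78 foreign, nototals

// Keep only the totals for each level of rep78
table rep78 foreign, totals(rep78)
```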

statistic(statspec) specifies the statistics to display. There are three kinds:

  1. Frequency statistics: stat(freqstat)
  2. Summary statistics: stat(sumstat varlist)
  3. Ratio statistics: stat(ratiostat [varlist] [, ratio_options])
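The three kinds can be sketched with one call each. The variables are taken from the auto dataset and the particular statistics (frequency, mean, median, percent) are just examples.

```stata
sysuse auto, clear

// Frequency statistic: takes no variable list
table foreign, statistic(frequency)

// Summary statistics: each needs a variable list
table foreign, statistic(mean mpg) statistic(median mpg)

// Ratio statistic: percentages across the levels of rep78,
// so each foreign column sums to 100
table rep78 foreign, statistic(percent, across(rep78))
```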

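command(cmdspec) specifies a Stata command from which to collect results. It can be repeated to collect results from several Stata commands.
command() can report results that the Stata command stores in r() and e(). (Check the command's help file to see which results are stored.)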
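As a hedged sketch of collecting r() results: summarize stores r(mean) and r(sd), and the expression list before the colon names the results to collect, following the same pattern as the r(C) example in the quick-start section. The layout is left for table to fill in, so the exact arrangement of the output is not guaranteed here.

```stata
sysuse auto, clear

// Inspect which results summarize leaves behind in r()
summarize mpg
return list

// Collect the mean and standard deviation of mpg for each level of foreign
table foreign, command(r(mean) r(sd): summarize mpg)
```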

Quick start

sysuse auto, clear

// Create a new categorical variable
gen mpg2=1
replace mpg2=2 if mpg>20

// One-way table
table mpg2
table mpg2, stat(freq) stat(percent)

// Two-way tables
// Joint (cell) percentages of the cross-tabulation
table rep78 foreign, stat(percent) nformat(%5.2f)
// Row percentages: within each row, the percentages over the levels of a factor variable sum to 100%
table rep78, stat(fvpercent foreign)
table rep78, stat(fvpercent foreign mpg2) 

// Two ways to build a three-way table
table (rep78) (foreign mpg2)
table (rep78) (foreign) (mpg2)
// Factor-variable interaction
table rep78, stat(fvpercent foreign#mpg2) nototal


// Table of correlations. pwcorr stores the correlation matrix in r(C).
table (rowname) (colname), command(r(C): pwcorr mpg weight displacement)

// Table of regression coefficients
table colname, command(regress mpg weight foreign)


Output:
(36 real changes made)


--------------------
        |  Frequency
--------+-----------
mpg2    |           
  1     |         38
  2     |         36
  Total |         74
--------------------


------------------------------
        |  Frequency   Percent
--------+---------------------
mpg2    |                     
  1     |         38     51.35
  2     |         36     48.65
  Total |         74    100.00
------------------------------


-------------------------------------------------
                   |           Car origin        
                   |  Domestic   Foreign    Total
-------------------+-----------------------------
Repair record 1978 |                             
  1                |      2.90               2.90
  2                |     11.59              11.59
  3                |     39.13      4.35    43.48
  4                |     13.04     13.04    26.09
  5                |      2.90     13.04    15.94
  Total            |     69.57     30.43   100.00
-------------------------------------------------


----------------------------------------
                   |      Car origin    
                   |  Domestic   Foreign
-------------------+--------------------
Repair record 1978 |                    
  1                |    100.00      0.00
  2                |    100.00      0.00
  3                |     90.00     10.00
  4                |     50.00     50.00
  5                |     18.18     81.82
  Total            |     69.57     30.43
----------------------------------------


--------------------------------------------------------
                   |      Car origin            mpg2    
                   |  Domestic   Foreign       1       2
-------------------+------------------------------------
Repair record 1978 |                                    
  1                |    100.00      0.00   50.00   50.00
  2                |    100.00      0.00   62.50   37.50
  3                |     90.00     10.00   66.67   33.33
  4                |     50.00     50.00   33.33   66.67
  5                |     18.18     81.82   36.36   63.64
  Total            |     69.57     30.43   52.17   47.83
--------------------------------------------------------


------------------------------------------------------------------------
                   |                      Car origin                    
                   |      Domestic          Foreign           Total     
                   |        mpg2             mpg2              mpg2     
                   |   1    2   Total   1    2   Total    1    2   Total
-------------------+----------------------------------------------------
Repair record 1978 |                                                    
  1                |   1    1       2                     1    1       2
  2                |   5    3       8                     5    3       8
  3                |  20    7      27        3       3   20   10      30
  4                |   6    3       9        9       9    6   12      18
  5                |        2       2   4    5       9    4    7      11
  Total            |  32   16      48   4   17      21   36   33      69
------------------------------------------------------------------------


mpg2 = 1
------------------------------------------------
                   |          Car origin        
                   |  Domestic   Foreign   Total
-------------------+----------------------------
Repair record 1978 |                            
  1                |         1                 1
  2                |         5                 5
  3                |        20                20
  4                |         6                 6
  5                |                   4       4
  Total            |        32         4      36
------------------------------------------------

mpg2 = 2
------------------------------------------------
                   |          Car origin        
                   |  Domestic   Foreign   Total
-------------------+----------------------------
Repair record 1978 |                            
  1                |         1                 1
  2                |         3                 3
  3                |         7         3      10
  4                |         3         9      12
  5                |         2         5       7
  Total            |        16        17      33
------------------------------------------------

mpg2 = Total
------------------------------------------------
                   |          Car origin        
                   |  Domestic   Foreign   Total
-------------------+----------------------------
Repair record 1978 |                            
  1                |         2                 2
  2                |         8                 8
  3                |        27         3      30
  4                |         9         9      18
  5                |         2         9      11
  Total            |        48        21      69
------------------------------------------------


-------------------------------------------------------------
                   |                 Car origin              
                   |  Domestic   Domestic   Foreign   Foreign
                   |                    mpg2                 
                   |         1          2         1         2
-------------------+-----------------------------------------
Repair record 1978 |                                         
  1                |     50.00      50.00      0.00      0.00
  2                |     62.50      37.50      0.00      0.00
  3                |     66.67      23.33      0.00     10.00
  4                |     33.33      16.67      0.00     50.00
  5                |      0.00      18.18     36.36     45.45
-------------------------------------------------------------


--------------------------------------------------------------------------------
                       |  Mileage (mpg)   Weight (lbs.)   Displacement (cu. in.)
-----------------------+--------------------------------------------------------
Mileage (mpg)          |              1       -.8071749                -.7056426
Weight (lbs.)          |      -.8071749               1                 .8948958
Displacement (cu. in.) |      -.7056426        .8948958                        1
--------------------------------------------------------------------------------


----------------------------
              |  Coefficient
--------------+-------------
Weight (lbs.) |    -.0065879
Car origin    |    -1.650029
Intercept     |      41.6797
----------------------------

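Contingency tables in practice

sysuse auto, clear

// Describe continuous variables
// foreign is categorical (a factor variable); the other variables are continuous
table, stat(mean price-gear_ratio) stat(fvpercent foreign)

// Describe the continuous variables by category of foreign
table (result) (foreign), stat(mean price-gear_ratio) stat(sd price-gear_ratio)
// For summary statistics of continuous variables, table is less convenient than summarize and tabstat.


// Percentage contingency tables

// Create two new categorical variables
egen mpg2=cut(mpg), group(2)  // 2 groups of roughly equal size
egen weight4=cut(weight), group(4)  // 4 groups of roughly equal size

// mpg2 is the dependent variable, weight4 the independent variable, and foreign the control variable
// We want to see how one variable changes as another changes,
// that is, we need conditional (relative) percentages.

// stat(percent) on its own gives the joint (cell) percentages.
// Method 1: use the across() option to get conditional percentages.
// Note: within each category of the independent variable, the percentages over all categories of the dependent variable must sum to 100%.
table (mpg2) (weight4), stat(percent, across(mpg2))
// As weight increases, the share of high-mpg cars (mpg2=1) falls and the share of low-mpg cars (mpg2=0) rises.
// Add foreign as a control variable
table (foreign weight4) (mpg2) (result), statistic(percent, across(mpg2)) stat(freq) totals(foreign#weight4)

// Method 2: use fvpercent to get conditional percentages.
// fvpercent gives the percentage distribution over the categories of a factor variable.
// We can compute conditional percentages with fvpercent and then compare them.
table weight4, stat(fvpercent mpg2) nototal
// Or put the categories of the dependent variable in the rows.
table () (weight4), stat(fvpercent mpg2) nototal
// Add the control variable
table () (foreign weight4), stat(fvpercent mpg2) nototal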

Output:
Mean                     |          
  Price                  |  6165.257
  Mileage (mpg)          |   21.2973
  Repair record 1978     |  3.405797
  Headroom (in.)         |  2.993243
  Trunk space (cu. ft.)  |  13.75676
  Weight (lbs.)          |  3019.459
  Length (in.)           |  187.9324
  Turn circle (ft.)      |  39.64865
  Displacement (cu. in.) |  197.2973
  Gear ratio             |  3.014865
Factor variable percent  |          
  Car origin=Domestic    |     70.27
  Car origin=Foreign     |     29.73
------------------------------------


----------------------------------------------------------
                         |            Car origin          
                         |  Domestic    Foreign      Total
-------------------------+--------------------------------
Mean                     |                                
  Price                  |  6072.423   6384.682   6165.257
  Mileage (mpg)          |  19.82692   24.77273    21.2973
  Repair record 1978     |  3.020833   4.285714   3.405797
  Headroom (in.)         |  3.153846   2.613636   2.993243
  Trunk space (cu. ft.)  |     14.75   11.40909   13.75676
  Weight (lbs.)          |  3317.115   2315.909   3019.459
  Length (in.)           |  196.1346   168.5455   187.9324
  Turn circle (ft.)      |  41.44231   35.40909   39.64865
  Displacement (cu. in.) |  233.7115   111.2273   197.2973
  Gear ratio             |  2.806538   3.507273   3.014865
Standard deviation       |                                
  Price                  |  3097.104   2621.915   2949.496
  Mileage (mpg)          |  4.743297   6.611187   5.785503
  Repair record 1978     |   .837666   .7171372   .9899323
  Headroom (in.)         |  .9157578   .4862837   .8459948
  Trunk space (cu. ft.)  |  4.306288   3.216906   4.277404
  Weight (lbs.)          |  695.3637   433.0035   777.1936
  Length (in.)           |  20.04605   13.68255   22.26634
  Turn circle (ft.)      |  3.967582   1.501082   4.399354
  Displacement (cu. in.) |  85.26299   24.88054   91.83722
  Gear ratio             |  .3359556   .2969076   .4562871
----------------------------------------------------------




-----------------------------------------------------
        |                    weight4                 
        |       0        1        2        3    Total
--------+--------------------------------------------
mpg2    |                                            
  0     |            21.05    76.47    90.00    47.30
  1     |  100.00    78.95    23.53    10.00    52.70
  Total |  100.00   100.00   100.00   100.00   100.00
-----------------------------------------------------


Percent
---------------------------------------
            |            mpg2          
            |       0        1    Total
------------+--------------------------
Car origin  |                          
  Domestic  |                          
    weight4 |                          
      0     |           100.00   100.00
      1     |           100.00   100.00
      2     |   75.00    25.00   100.00
      3     |   90.00    10.00   100.00
  Foreign   |                          
    weight4 |                          
      0     |           100.00   100.00
      1     |   44.44    55.56   100.00
      2     |  100.00            100.00
---------------------------------------

Frequency
------------------------------
            |        mpg2     
            |   0    1   Total
------------+-----------------
Car origin  |                 
  Domestic  |                 
    weight4 |                 
      0     |        6       6
      1     |       10      10
      2     |  12    4      16
      3     |  18    2      20
  Foreign   |                 
    weight4 |                 
      0     |       12      12
      1     |   4    5       9
      2     |   1            1
------------------------------


-------------------------
        |       mpg2     
        |      0        1
--------+----------------
weight4 |                
  0     |   0.00   100.00
  1     |  21.05    78.95
  2     |  76.47    23.53
  3     |  90.00    10.00
-------------------------


----------------------------------------
       |              weight4           
       |       0       1       2       3
-------+--------------------------------
mpg2=0 |    0.00   21.05   76.47   90.00
mpg2=1 |  100.00   78.95   23.53   10.00
----------------------------------------


-------------------------------------------------------------------
       |                          Car origin                       
       |              Domestic                      Foreign        
       |              weight4                       weight4        
       |       0        1       2       3        0       1        2
-------+-----------------------------------------------------------
mpg2=0 |    0.00     0.00   75.00   90.00     0.00   44.44   100.00
mpg2=1 |  100.00   100.00   25.00   10.00   100.00   55.56     0.00
-------------------------------------------------------------------
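One way to sanity-check these conditional percentages is to reproduce them with tabulate, whose row option reports percentages within each row. This is a sketch added for cross-checking rather than part of the original workflow; it rebuilds mpg2 and weight4 exactly as above.

```stata
sysuse auto, clear
egen mpg2 = cut(mpg), group(2)
egen weight4 = cut(weight), group(4)

// Row percentages of mpg2 within each level of weight4
tabulate weight4 mpg2, row nofreq

// The same check within each level of the control variable
bysort foreign: tabulate weight4 mpg2, row nofreq
```

The row percentages from the first tabulate call should line up with the fvpercent table by weight4, and the by-group tables with the version that adds foreign.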