Experience making Table 1 (with Stata and R)

2019-12-04  liang_rujiang

INTRODUCTION

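Table 1 is the table that describes the basic characteristics of the study subjects, and it appears in almost every study. Its required part is descriptive statistics (central tendency, dispersion, counts, proportions); its optional part is a comparison between groups (t-test, F-test). Stata has many packages that can produce this, but their output still differs somewhat from the tables typical of medical papers and needs manual adjustment.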
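As a rough illustration of those ingredients, here is a minimal base-R sketch. It uses the built-in mtcars data and treats am (transmission) as the grouping variable; both are assumptions made here for reproducibility and are not the data used in this post.

```r
# Illustrative only: mtcars stands in for a real study data set.
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Continuous variable: central tendency and dispersion
mpg_mean   <- mean(df$mpg)
mpg_sd     <- sd(df$mpg)
mpg_median <- median(df$mpg)
mpg_iqr    <- IQR(df$mpg)      # width of the interquartile range

# Categorical variable: counts and proportions
am_n    <- table(df$am)
am_prop <- prop.table(am_n)

# Optional part: compare the two groups.
# Note: t.test() defaults to the Welch (unequal variance) test.
mpg_test <- t.test(mpg ~ am, data = df)

cat(sprintf("mpg, mean (SD): %.2f (%.2f)\n", mpg_mean, mpg_sd))
cat(sprintf("mpg, median (IQR width): %.2f (%.2f)\n", mpg_median, mpg_iqr))
print(am_n)
print(round(am_prop, 3))
cat(sprintf("t = %.3f, p = %.4f\n", mpg_test$statistic, mpg_test$p.value))
```

Every number in a finished Table 1 is one of these quantities; the packages discussed in this post mainly save the work of collecting and formatting them.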

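When using Stata,

The process above is tedious, and copying numbers by hand can introduce errors; iebaltab can help to some extent. Run the examples below and look at the output. First, of course, install the package: ssc install ietoolkit

Some observations:

Overall, in my experience the R package tableone is more powerful.
In my experience so far there was no good way to display the t-statistic; if you know one, please share it in the comments. (I have since found one myself, see below; first install the package with ssc install asdoc. Update at 4:30 a.m. on the 14th: t2docx also looks useful, but it only runs on Stata 15 and above. I use 14.1 and cannot test it, so I will not discuss it.)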

* Write the t-statistic of each two-sample t-test as one row of the asdoc table
sysuse auto, clear
asdoc, row(t-value)
foreach i of varlist price-weight {
    ttest `i', by(foreign)
    asdoc, row(`r(t)')
}

A complete and somewhat nicer example

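sysuse auto, clear
* Remove any earlier output file so the new table starts from scratch
cap rm Myfile.doc
* Group means and standard deviations, by foreign
asdoc tabstat price-weight, by(foreign) stat(mean sd) dec(3)

* Header row, then one row with the t-value and p-value of each variable
asdoc, row(t-value, p-value)
foreach i of varlist price-weight {
    ttest `i', by(foreign)
    asdoc, row(`r(t)', `r(p)') dec(3)
}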

The output looks like this:

[Figure: 图片.png]

After a little manual editing:

[Figure: 图片.png]
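For readers who want the same columns of t-values and p-values outside Stata, the loop can be sketched in base R. This is not the asdoc workflow itself. It uses the built-in mtcars data with am as the grouping variable (both are assumptions for illustration), and it sets var.equal = TRUE because Stata's ttest assumes equal variances unless told otherwise.

```r
# Illustrative only: mtcars and the variable list are stand-ins.
vars <- c("mpg", "disp", "hp", "wt")

rows <- lapply(vars, function(v) {
  # Build the formula v ~ am and run a pooled-variance two-sample t-test
  res <- t.test(reformulate("am", response = v), data = mtcars,
                var.equal = TRUE)
  data.frame(variable = v,
             t_value  = unname(res$statistic),
             p_value  = res$p.value)
})

# One row per variable, analogous to the rows written by asdoc
ttab <- do.call(rbind, rows)
ttab$t_value <- round(ttab$t_value, 3)
ttab$p_value <- round(ttab$p_value, 3)
print(ttab)
```

The resulting data frame can be bound next to a table of group means and SDs, which mirrors what the Stata example does in two steps.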

EXAMPLES WITH IEBALTAB IN STATA

set more off
sysuse auto, clear

des
fmiss
drop rep78
des

iebaltab price headroom length, grpvar(foreign) save(temp) replace onerow pt std format(%7.2f)
* onerow (onerow): displays the number of observations in an additional row at the
* bottom of the table, provided each group has the same number of observations
* for all variables in balancevarlist.
* pt (pttest): shows p-values instead of the difference in means between the
* groups in the t-test columns.
* std (stdev): displays standard deviations in parentheses instead of standard errors.

gen grp = mod(_n, 3)
tab1 grp

iebaltab price headroom length, grpvar(grp) save(temp) replace pt std onerow
iebaltab price headroom length, grpvar(grp) save(temp) replace pt std onerow co(1)
* control(groupcode)  One group is tested against all other groups in t-tests and F-tests. 
* Default is all groups against each other.
iebaltab price headroom length, grpvar(grp) save(temp) replace pt std onerow co(1) ftest
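* I am not sure what ftest does; as far as I can tell from the help file, it adds
* an F-test for joint significance of all balance variables.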
iebaltab price headroom length, grpvar(grp) save(temp) replace pt std onerow co(1) feqt pf
* With both feqt and pf, I get the p-values of the F-test; with feqt alone, I get the F-statistics.

/* a template example: replace the folder and the variable names with your own */
global project_folder "C:\Users\project\baseline\results"
iebaltab outcome_variable, grpvar(treatment_variable) save("$project_folder\balancetable.xlsx")
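As I understand it, the feqt/pf options above report, for each variable, an F-test of equal means across all groups. A one-way ANOVA is the same kind of test, so it can serve as a cross-check outside Stata. The base-R sketch below uses mtcars with the three-level cyl variable as the grouping variable; these are illustrative stand-ins and not the auto data, and the numbers are not meant to match the iebaltab output.

```r
# Illustrative only: three groups defined by the number of cylinders (4, 6, 8).
df <- mtcars
df$cyl <- factor(df$cyl)

vars <- c("mpg", "hp", "wt")

f_tab <- do.call(rbind, lapply(vars, function(v) {
  fit <- aov(reformulate("cyl", response = v), data = df)
  tab <- summary(fit)[[1]]            # ANOVA table; first row is the cyl term
  data.frame(variable = v,
             F_value  = tab[["F value"]][1],
             p_value  = tab[["Pr(>F)"]][1])
}))

print(f_tab)
```

With exactly two groups this F-test is equivalent to the pooled-variance t-test (F equals t squared), which is why the t-test and F-test columns tell the same story in the two-group case.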

EXAMPLES WITH TABLEONE IN R

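library(tidyverse)  # provides %>%, summarise(), filter(), gather(), discard(), map()

# row_data is the raw data frame of the study (not included in this post)
data <- row_data
# Look at the distribution of each continuous variable first
data$age %>% hist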

data$for_duration %>% hist
data$for_income %>% hist

data$sbp %>% hist
data$dbp %>% hist

data %>% summarise(age_mean = mean(age),
                   age_sd = sd(age),
                   income_median = median(for_income),
                   income_iqr = IQR(for_income),
                   duration_median = median(for_duration),
                   duration_iqr = IQR(for_duration),
                   sbp_mean = mean(sbp),
                   sbp_sd = sd(sbp),
                   dbp_mean = mean(dbp),
                   dbp_sd = sd(dbp)) %>% 
    gather()

# Counts and proportions of one categorical column, given by name
cat_des <- function(df, chr) {
    out <- vector("list", length = 2)
    out[[1]] <- table(df[[chr]])
    out[[2]] <- prop.table(table(df[[chr]]))
    out
}

data %>% 
    discard(is.numeric) %>%
    names() %>%
    map(~cat_des(data, .)) %>% 
    map(2)
cat_des(data, "bloodlevel")
cat_des(data, "adherence")


# # -----------------------------------------------------------------------

data %>% filter(adherence == 'nonad') %>% 
     summarise(age_mean = mean(age),
                  age_sd = sd(age),
                  income_median = median(for_income),
                  income_iqr = IQR(for_income),
                  duration_median = median(for_duration),
                  duration_iqr = IQR(for_duration),
                  sbp_mean = mean(sbp),
                  sbp_sd = sd(sbp),
                  dbp_mean = mean(dbp),
                  dbp_sd = sd(dbp)) %>% 
    gather()

data %>%  
    discard(is.numeric) %>%
    names() %>%
    map(~cat_des(filter(data, adherence == "nonad"), .)) %>% 
    map(2)
cat_des(filter(data, adherence == "nonad"), "bloodlevel")

# VERY IMPORTANT 02May2019 ------------------------------------------------
library(tableone)
vars <- setdiff(names(data), "adherence")
vars # inspect the names, then set the order of the characteristics by hand
vars <- c("age", "gender", "education", "urbanity", "for_income", 
          "t", "for_duration", "cliniccheck", "sbp", "dbp", "diabete")
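# Table stratified by adherence, with a test comparing the groups
tableone <- CreateTableOne(data = data, vars = vars, strata = "adherence")
# quote = TRUE wraps every cell in quotes, which makes pasting into a spreadsheet easier
print(tableone, nonnormal = c("for_income", "for_duration"),
      explain = TRUE, showAllLevels = TRUE, catDigits = 2, quote = TRUE)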

# Overall table for the whole sample (no stratification)
vars <- c("age", "gender", "education", "urbanity", "for_income", 
          "t", "for_duration", "cliniccheck", "sbp", "dbp", "diabete")
CreateTableOne(data = data, vars = vars) %>% print(
    nonnormal = c("for_income", "for_duration"),
    showAllLevels = TRUE, catDigits = 2, quote = TRUE)
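As far as I understand the tableone documentation, variables listed in nonnormal are summarised as median [IQR] and compared with a rank-based test (Kruskal-Wallis) instead of mean (SD) and a t-test or ANOVA. The base-R sketch below mimics that by hand, which can be handy for checking a single cell of the printed table. The mtcars data, the grouping by am, and the choice of hp as the skewed variable are assumptions for illustration; this is not tableone's internal code.

```r
# Illustrative only: mtcars stands in for the study data.
df <- mtcars
df$am <- factor(df$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Format one cell as "median [Q1, Q3]"
med_iqr <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
  sprintf("%.2f [%.2f, %.2f]", q[2], q[1], q[3])
}

# One cell per group
cells <- sapply(split(df$hp, df$am), med_iqr)

# Rank-based comparison between the groups
kw <- kruskal.test(hp ~ am, data = df)

print(cells)
cat(sprintf("Kruskal-Wallis p = %.3f\n", kw$p.value))
```

If a hand-computed cell and the printed table disagree, the usual suspects are missing values and the number of digits requested in print().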
