【R>>survminer】生存分析+best-sep
2021-06-07 本文已影响0人
高大石头
肿瘤研究中,生存分析是逃不掉的一关,如果用R语言进行分析,survminer包是必备的法宝,下面就共同学习下survminer包的Cheat Sheet吧。
rm(list = ls())
library(survival)
library(survminer)
head(lung)
## inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
## 1 3 306 2 74 1 1 90 100 1175 NA
## 2 3 455 2 68 1 0 90 90 1225 15
## 3 3 1010 1 56 1 0 90 90 NA 15
## 4 5 210 2 57 1 1 90 60 1150 11
## 5 1 883 2 60 1 0 100 90 NA 0
## 6 12 1022 1 74 1 1 50 80 513 0
变量含义:
-
time:生存时间
-
status:生存状态
-
age:年龄
-
sex:性别
-
ph.ecog:ECOG分数
-
ph.karno:Karnofsky分数
-
meal.cal:每顿消耗的卡路里
-
wt.loss:最近六个月消耗的体重
1.生存曲线
简单版
fit <- survfit(Surv(time,status)~sex,data=lung)
ggsurvplot(fit,
pval = T,
conf.int = T)
image.png
备注:fun
参数有三个选项,pct
默认选项,生存概率百分比;event
累计生存事件百分比,cumhaz
累计生存事件风险。
ggsurvplot(fit,
pval = T,
fun = "event",
legend.title = "Sex",
legend.labs = c("Male","Female"),
conf.int = F)+
labs(title = "fun = event")
image.png
ggsurvplot(fit,
pval = T,
fun = "cumhaz",
legend.title = "Sex",
legend.labs = c("Male","Female"),
conf.int = F)+
labs(title = "fun = cumhaz")
image.png
美化版
ggsurvplot(fit,
pval = TRUE,
conf.int = T, #置信区间
fun = "pct",
risk.table = TRUE,
size = 1,
linetype = "strata",
palette = c("#E7B800","#2E9FDF"),
legend = "bottom", #legend位置
legend.title = "Sex",
legend.labs = c("Male","Female"))
image.png
2.Cox风险模型的假设诊断
核心函数:
-
cox.zph()
-
ggcoxzph()
-
ggcoxdiagnostics()
:展示残差结果
fit <- coxph(Surv(time,status)~age+sex,data = lung)
ftest <- cox.zph(fit)
ftest
## chisq df p
## age 0.209 1 0.65
## sex 2.608 1 0.11
## GLOBAL 2.771 2 0.25
ggcoxzph(ftest)
image.png
ggcoxdiagnostics(fit,
type = "deviance",
ox.scale = "linear.predictions")
image.png
3.Cox风险模型汇总
lung$age <- ifelse(lung$age > 70, ">70","<= 70")
fit <- coxph(Surv(time,status)~age+sex+ph.ecog,data = lung)
fit
## Call:
## coxph(formula = Surv(time, status) ~ age + sex + ph.ecog, data = lung)
##
## coef exp(coef) se(coef) z p
## age>70 0.3065 1.3587 0.1873 1.636 0.101747
## sex -0.5666 0.5675 0.1681 -3.370 0.000752
## ph.ecog 0.4700 1.5999 0.1129 4.163 3.15e-05
##
## Likelihood ratio test=31.59 on 3 df, p=6.387e-07
## n= 227, number of events= 164
## (因为不存在,1个观察量被删除了)
ggforest(fit)
image.png
ggadjustedcurves()显示的是选择因素如何影响估算的生存率,与KM方法不太相同。
lung$sex <- ifelse(lung$sex==1,"MALE","FEMALE")
fit <- coxph(Surv(time,status)~age+sex+ph.ecog,data = lung)
ggadjustedcurves(fit,variable = "sex",data = lung,method = "average")
image.png
rm(list = ls())
lung$age3 <- cut(lung$age,c(35,55,65,85))
fit <- coxph(Surv(time,status)~age+sex+ph.ecog,data = lung)
ggadjustedcurves(fit,variable = "age3",data = lung,method = "average")
image.png
4.最佳分割点
res.cut <- surv_cutpoint(lung, time = "time", event="status",
variables = "meal.cal")
plot(res.cut)
## $meal.cal
image.png
res.cat <- surv_categorize(res.cut)
head(res.cat)
## time status meal.cal
## 1 306 2 high
## 2 455 2 high
## 3 1010 1 <NA>
## 4 210 2 high
## 5 883 2 <NA>
## 6 1022 1 high
fit <- survfit(Surv(time,status)~meal.cal,data = res.cat)
ggsurvplot(fit,pval = T)
image.png
参考资料:
survminer package [Alboukadel Kassambara, Marcin Kosinski 2017]