WDL - Lesson 2
2021-03-24
MR来了
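This section covers how to set up Cromwell and run a simple hello world WDL script.
The content is based on the Cromwell tutorial "Five Minute Introduction to Cromwell".
It is divided into the following parts:
- 2.1 Setting up Cromwell
- 2.2 Writing a hello world WDL script
- 2.3 Validating and running your first WDL script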
2.1 Setting up Cromwell
This section uses cromwell-57.jar and womtool-57.jar, both downloaded from the Cromwell version 57 release.
cd your_workspace
mkdir cromwell
cp your_path/Downloads/cromwell-57.jar cromwell/
cp your_path/Downloads/womtool-57.jar cromwell/
cd cromwell/
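Before moving on, it can help to confirm that both jars are in place and that Java is available, since both tools are launched with java -jar. The following is a minimal sketch, assuming the version 57 file names used above; the function name check_cromwell_setup is hypothetical and not part of Cromwell.

```shell
# Hypothetical helper: report whether the files used in this section are present.
# Usage: check_cromwell_setup [directory]   (defaults to the current directory)
check_cromwell_setup() {
  local dir="${1:-.}"
  local missing=0
  local jar
  for jar in cromwell-57.jar womtool-57.jar; do
    if [ -f "$dir/$jar" ]; then
      echo "found: $jar"
    else
      echo "missing: $jar"
      missing=1
    fi
  done
  # Both tools are started with `java -jar`, so java must be on PATH.
  if command -v java >/dev/null 2>&1; then
    echo "found: java"
  else
    echo "missing: java"
    missing=1
  fi
  # Return 0 only when nothing is missing.
  return "$missing"
}
```

Running check_cromwell_setup . inside the cromwell/ directory prints one line per item and returns a non-zero status if anything is missing.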
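2.2 Writing a hello world WDL script
A WDL script generally consists of six parts:
- workflow: the workflow definition
- task: the definition of a task that the workflow contains
- call: invokes (triggers) the execution of a task within the workflow
- command: the command line that the task executes on the compute node
- runtime: the task's runtime parameters on the compute node, including CPU, memory, Docker image, and so on
- output: the output definition of a task or workflow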
workflow myWorkflow {
  call myTask
}

task myTask {
  command {
    echo "hello world"
  }
  output {
    String out = read_string(stdout())
  }
}
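Save the script above as myWorkflow.wdl. There is no need to worry about the details yet; a rough understanding is enough for now, and later sections explain each part in detail.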
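One way to create the file straight from the terminal is a shell here-document. This is only a convenience sketch: the WDL text is the same hello world script shown above, and any text editor works equally well.

```shell
# Write the hello world workflow to myWorkflow.wdl in the current directory.
# The quoted 'EOF' stops the shell from expanding anything inside the WDL text.
cat > myWorkflow.wdl <<'EOF'
workflow myWorkflow {
  call myTask
}

task myTask {
  command {
    echo "hello world"
  }
  output {
    String out = read_string(stdout())
  }
}
EOF

# Quick sanity check: count the top-level workflow and task definitions (prints 2).
grep -c -E '^(workflow|task) ' myWorkflow.wdl
```

The count is only a rough check that the file was written as intended; it does not replace a proper syntax check of the WDL.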
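2.3 Validating and running your first WDL script
This section uses run mode; server mode is covered later, once you have some WDL basics.
womtool.jar checks whether a WDL script has problems, although its error messages are not as detailed or clear as those of perl or python. womtool.jar also has other uses, including generating a JSON template.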
## Validate the WDL with womtool-57.jar
java -jar womtool-57.jar validate myWorkflow.wdl
## Alternatively (recommended), install womtool through conda
conda install -c bioconda womtool
womtool validate myWorkflow.wdl
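As an illustration of the JSON template feature, womtool has an inputs subcommand that prints a JSON skeleton of the inputs a workflow expects. The sketch below assumes womtool-57.jar sits in the current directory; the wrapper name make_inputs_template and the .inputs.json output name are hypothetical choices, not womtool conventions. The hello world workflow declares no inputs, so there is little to fill in for it, but the same call becomes useful once a workflow takes parameters.

```shell
# Hypothetical wrapper around `womtool inputs`, which prints a JSON template
# of the inputs a workflow expects.
# Usage: make_inputs_template <workflow.wdl> [womtool jar]
make_inputs_template() {
  local wdl="$1"
  local jar="${2:-womtool-57.jar}"
  if [ ! -f "$wdl" ]; then
    echo "WDL file not found: $wdl" >&2
    return 1
  fi
  if [ ! -f "$jar" ]; then
    echo "womtool jar not found: $jar" >&2
    return 1
  fi
  # Derive the output name from the WDL name,
  # e.g. myWorkflow.wdl -> myWorkflow.inputs.json
  local out="${wdl%.wdl}.inputs.json"
  java -jar "$jar" inputs "$wdl" > "$out" && echo "wrote $out"
}
```

For example, make_inputs_template myWorkflow.wdl would write the template to myWorkflow.inputs.json next to the WDL file.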
cromwell.jar runs the WDL script.
## Run the workflow (optionally run chmod 755 cromwell-57.jar first)
java -jar cromwell-57.jar run myWorkflow.wdl
## The output is very long; focus on the outputs and id parts
[2021-03-24 12:14:22,69] [info] Running with database db.url = jdbc:hsqldb:mem:27d3f637-78fb-4c04-8616-dcb1af4f205f;shutdown=false;hsqldb.tx=mvcc
[2021-03-24 12:14:38,22] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2021-03-24 12:14:38,26] [info] [RenameWorkflowOptionsInMetadata] 100%
[2021-03-24 12:14:38,54] [info] Running with database db.url = jdbc:hsqldb:mem:e01dfafb-a775-44c0-bd5a-2e892cc29103;shutdown=false;hsqldb.tx=mvcc
[2021-03-24 12:14:39,51] [info] Slf4jLogger started
[2021-03-24 12:14:39,88] [info] Workflow heartbeat configuration:
{
"cromwellId" : "cromid-7316809",
"heartbeatInterval" : "2 minutes",
"ttl" : "10 minutes",
"failureShutdownDuration" : "5 minutes",
"writeBatchSize" : 10000,
"writeThreshold" : 10000
}
[2021-03-24 12:14:39,97] [info] Metadata summary refreshing every 1 second.
[2021-03-24 12:14:40,05] [warn] 'docker.hash-lookup.gcr-api-queries-per-100-seconds' is being deprecated, use 'docker.hash-lookup.gcr.throttle' instead (see reference.conf)
[2021-03-24 12:14:40,06] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2021-03-24 12:14:40,07] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2021-03-24 12:14:40,08] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2021-03-24 12:14:40,76] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2021-03-24 12:14:40,91] [info] SingleWorkflowRunnerActor: Version 57
[2021-03-24 12:14:40,93] [info] SingleWorkflowRunnerActor: Submitting workflow
[2021-03-24 12:14:41,01] [info] Unspecified type (Unspecified version) workflow f28db7de-821c-4f82-8084-7a808bc0c781 submitted
[2021-03-24 12:14:41,05] [info] SingleWorkflowRunnerActor: Workflow submitted f28db7de-821c-4f82-8084-7a808bc0c781
[2021-03-24 12:14:41,07] [info] 1 new workflows fetched by cromid-7316809: f28db7de-821c-4f82-8084-7a808bc0c781
[2021-03-24 12:14:41,10] [info] WorkflowManagerActor Starting workflow f28db7de-821c-4f82-8084-7a808bc0c781
[2021-03-24 12:14:41,12] [info] WorkflowManagerActor Successfully started WorkflowActor-f28db7de-821c-4f82-8084-7a808bc0c781
[2021-03-24 12:14:41,12] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2021-03-24 12:14:41,24] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2021-03-24 12:14:41,56] [info] MaterializeWorkflowDescriptorActor [f28db7de]: Parsing workflow as WDL draft-2
[2021-03-24 12:14:44,79] [info] MaterializeWorkflowDescriptorActor [f28db7de]: Call-to-Backend assignments: myWorkflow.myTask -> Local
[2021-03-24 12:14:45,78] [info] Not triggering log of token queue status. Effective log interval = None
[2021-03-24 12:14:46,51] [info] WorkflowExecutionActor-f28db7de-821c-4f82-8084-7a808bc0c781 [f28db7de]: Starting myWorkflow.myTask
[2021-03-24 12:14:46,80] [info] Assigned new job execution tokens to the following groups: f28db7de: 1
[2021-03-24 12:14:47,04] [info] BackgroundConfigAsyncJobExecutionActor [f28db7demyWorkflow.myTask:NA:1]: echo "hello world"
[2021-03-24 12:14:47,18] [info] BackgroundConfigAsyncJobExecutionActor [f28db7demyWorkflow.myTask:NA:1]: executing: /bin/bash /Users/liji/cromwell/cromwell-executions/myWorkflow/f28db7de-821c-4f82-8084-7a808bc0c781/call-myTask/execution/script
[2021-03-24 12:14:50,14] [info] BackgroundConfigAsyncJobExecutionActor [f28db7demyWorkflow.myTask:NA:1]: job id: 28617
[2021-03-24 12:14:50,16] [info] BackgroundConfigAsyncJobExecutionActor [f28db7demyWorkflow.myTask:NA:1]: Status change from - to Done
[2021-03-24 12:14:51,65] [info] WorkflowExecutionActor-f28db7de-821c-4f82-8084-7a808bc0c781 [f28db7de]: Workflow myWorkflow complete. Final Outputs:
{
"myWorkflow.myTask.out": "hello world"
}
[2021-03-24 12:14:51,73] [info] WorkflowManagerActor WorkflowActor-f28db7de-821c-4f82-8084-7a808bc0c781 is in a terminal state: WorkflowSucceededState
[2021-03-24 12:14:58,13] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
"outputs": {
"myWorkflow.myTask.out": "hello world"
},
"id": "f28db7de-821c-4f82-8084-7a808bc0c781"
}
[2021-03-24 12:15:00,18] [info] Workflow polling stopped
[2021-03-24 12:15:00,20] [info] 0 workflows released by cromid-7316809
[2021-03-24 12:15:00,20] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2021-03-24 12:15:00,21] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2021-03-24 12:15:00,21] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2021-03-24 12:15:00,22] [info] JobExecutionTokenDispenser stopped
[2021-03-24 12:15:00,22] [info] Aborting all running workflows.
[2021-03-24 12:15:00,24] [info] WorkflowStoreActor stopped
[2021-03-24 12:15:00,24] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2021-03-24 12:15:00,24] [info] WorkflowLogCopyRouter stopped
[2021-03-24 12:15:00,25] [info] WorkflowManagerActor All workflows finished
[2021-03-24 12:15:00,25] [info] WorkflowManagerActor stopped
[2021-03-24 12:15:00,76] [info] Connection pools shut down
[2021-03-24 12:15:00,76] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2021-03-24 12:15:00,76] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2021-03-24 12:15:00,77] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2021-03-24 12:15:00,77] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2021-03-24 12:15:00,77] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2021-03-24 12:15:00,77] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2021-03-24 12:15:00,77] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2021-03-24 12:15:00,77] [info] SubWorkflowStoreActor stopped
[2021-03-24 12:15:00,77] [info] JobStoreActor stopped
[2021-03-24 12:15:00,77] [info] CallCacheWriteActor stopped
[2021-03-24 12:15:00,78] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2021-03-24 12:15:00,78] [info] KvWriteActor Shutting down: 0 queued messages to process
[2021-03-24 12:15:00,78] [info] IoProxy stopped
[2021-03-24 12:15:00,78] [info] DockerHashActor stopped
[2021-03-24 12:15:00,79] [info] ServiceRegistryActor stopped
[2021-03-24 12:15:00,88] [info] Database closed
[2021-03-24 12:15:00,88] [info] Stream materializer shut down
[2021-03-24 12:15:00,89] [info] WDL HTTP import resolver closed
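Because the log is long, it can be convenient to save it to a file and pull out only the status line and the final result. The following is a rough sketch, assuming the log format shown above; the function name summarize_cromwell_log and the log file name are hypothetical.

```shell
# Hypothetical helper: print only the workflow status line and the final
# JSON result (the lines from "outputs" through "id") from a saved log.
# Usage: summarize_cromwell_log <logfile>
summarize_cromwell_log() {
  local log="$1"
  # Status line, e.g. "... workflow finished with status 'Succeeded'."
  grep "workflow finished with status" "$log"
  # Print from the line containing '"outputs": {' through the line containing '"id":'.
  sed -n '/"outputs": {/,/"id":/p' "$log"
}
```

To use it, capture the run first, for example with java -jar cromwell-57.jar run myWorkflow.wdl 2>&1 | tee cromwell.log, and then call summarize_cromwell_log cromwell.log. The filter relies on plain text matching, so it may need adjusting if the log layout differs in other Cromwell versions.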