NOTE_NLP table
Incomplete
The NOTE_NLP table will encode all output of NLP on clinical notes. Each row represents a single extracted term from a note.
Field | Required | Type | Description |
---|---|---|---|
note_nlp_id | Yes | integer | A unique identifier for each term extracted from a note. |
note_id | Yes | integer | A foreign key to the note in the NOTE table from which the term was extracted. |
section_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies representing the note section in which the term was found. |
snippet | No | varchar(250) | A small window of text surrounding the term. |
offset | No | varchar(50) | Character offset of the extracted term in the input note. |
lexical_variant | Yes | varchar(250) | Raw text extracted by the NLP tool. |
note_nlp_concept_id | Yes | integer | A foreign key to the predefined Concept in the Standardized Vocabularies reflecting the normalized concept for the extracted term. The domain of the term is recorded in the Concept table. |
note_nlp_source_concept_id | Yes | integer | A foreign key to a Concept that refers to the code in the source vocabulary used by the NLP system. |
nlp_system | No | varchar(250) | Name and version of the NLP system that extracted the term. Useful for data provenance. |
nlp_date | Yes | date | The date of the note processing. Useful for data provenance. |
nlp_datetime | No | datetime | The date and time of the note processing. Useful for data provenance. |
term_exists | No | varchar(1) | A summary modifier that signifies presence or absence of the term for a given patient. Useful for quick querying. |
term_temporal | No | varchar(50) | An optional time modifier associated with the extracted term. For now only “past” or “present” is allowed; the values will be standardized later. |
term_modifiers | No | varchar(2000) | A compact description of all the modifiers of the specific term extracted by the NLP system (e.g. “son has rash” → “negated=no,subject=family, certainty=undef,conditional=false,general=false”). |
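
To make the field list more concrete, here is a minimal Python sketch of one NOTE_NLP row and of splitting the `term_modifiers` string into key/value pairs. The class name `NoteNlpRow`, the helper `parse_term_modifiers`, and all ID and date values are hypothetical and chosen only for illustration. The `key=value,key=value` layout is assumed from the single example in the table, since the allowable modifiers have not been standardized yet.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Dict, Optional


@dataclass
class NoteNlpRow:
    """One extracted term, mirroring the NOTE_NLP fields (hypothetical class)."""
    # Required fields
    note_nlp_id: int
    note_id: int
    section_concept_id: int
    lexical_variant: str
    note_nlp_concept_id: int
    note_nlp_source_concept_id: int
    nlp_date: date
    # Optional fields
    snippet: Optional[str] = None
    offset: Optional[str] = None
    nlp_system: Optional[str] = None
    nlp_datetime: Optional[datetime] = None
    term_exists: Optional[str] = None
    term_temporal: Optional[str] = None
    term_modifiers: Optional[str] = None


def parse_term_modifiers(text: str) -> Dict[str, str]:
    """Split a 'key=value,key=value' string into a dict (assumed layout)."""
    modifiers: Dict[str, str] = {}
    for part in text.split(","):
        if "=" not in part:
            continue  # skip empty or malformed fragments
        key, value = part.split("=", 1)
        modifiers[key.strip().lower()] = value.strip()
    return modifiers


# All IDs and the date below are placeholders, not real concept IDs.
row = NoteNlpRow(
    note_nlp_id=1,
    note_id=100,
    section_concept_id=0,
    lexical_variant="rash",
    note_nlp_concept_id=0,
    note_nlp_source_concept_id=0,
    nlp_date=date(2020, 1, 1),
    snippet="son has rash",
    term_modifiers="negated=no,subject=family, certainty=undef,"
                   "conditional=false,general=false",
)

print(parse_term_modifiers(row.term_modifiers))
```

Keeping the modifiers in a dictionary after parsing makes it easier to apply rules to individual modifiers, while the stored column stays a single compact string.
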
Conventions
No. | Convention Description |
---|---|
1 | Term_exists is a flag that indicates whether the patient actually has or had the condition. Any of the following modifiers makes Term_exists false: Negation = true; Subject = [anything other than the patient]; Conditional = true; Rule_out = true; Uncertain = very low certainty or any lower certainty. A complete lack of modifiers makes Term_exists true. Modifiers that are present must have these values for Term_exists to be true: Negation = false; Subject = patient; Conditional = false; Rule_out = false; Uncertain = true, high, moderate, or even low (low is arguable). |
2 | Term_temporal indicates whether a condition is “present” or only in the “past”. Either of the following makes it past: History = true; Concept_date = anything before the time of the report. |
3 | Term_modifiers concatenates all modifiers for the different types of entities (conditions, drugs, labs, etc.) into one string. Lab values are saved as one of the modifiers. A list of allowable modifiers (e.g., signature for medications) and their possible values will be standardized later. |
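
The three conventions can be sketched in Python as follows. This is an illustration under stated assumptions, not the OHDSI implementation: the function names, the lowercase modifier keys (`negation`, `subject`, `conditional`, `rule_out`, `uncertain`), and the `"Y"`/`"N"` encoding of the one-character `term_exists` field are all assumptions, because the conventions do not fix them.

```python
from datetime import date
from typing import Dict, Optional

# Convention 1: values each modifier must have, when present,
# for term_exists to stay true. Key names are assumed.
ALLOWED = {
    "negation": {"false"},
    "subject": {"patient"},
    "conditional": {"false"},
    "rule_out": {"false"},
    "uncertain": {"true", "high", "moderate", "low"},
}


def term_exists(modifiers: Dict[str, str]) -> str:
    """Return 'Y' or 'N' (assumed encoding) following convention 1."""
    for name, allowed in ALLOWED.items():
        value = modifiers.get(name)
        # A missing modifier never makes the term absent.
        if value is not None and value.strip().lower() not in allowed:
            return "N"
    return "Y"


def term_temporal(history: bool,
                  concept_date: Optional[date],
                  report_date: date) -> str:
    """Return 'past' or 'present' following convention 2."""
    if history:
        return "past"
    if concept_date is not None and concept_date < report_date:
        return "past"
    return "present"


def join_modifiers(modifiers: Dict[str, str]) -> str:
    """Concatenate modifiers into one string, as in convention 3."""
    return ",".join(f"{key}={value}" for key, value in modifiers.items())


# Hypothetical example: a family member's condition mentioned in the note.
family = {"negation": "false", "subject": "family"}
print(term_exists(family))                      # N
print(term_exists({}))                          # Y
print(term_temporal(False, None, date(2020, 1, 1)))  # present
print(join_modifiers(family))                   # negation=false,subject=family
```

Checking only the modifiers that are present follows the rule that a complete lack of modifiers makes Term_exists true. Any `uncertain` value outside the accepted set, such as "very low", is treated as making the term absent.
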
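This series introduces what is arguably the standard healthcare big-data format best suited to clinical research and health economics today (this claim has not been rigorously verified, though related studies have been published in professional journals). It is the closest thing to a mature infrastructure among real-world research approaches. If you are interested, an introductory video is available on Bilibili.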
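OHDSI (Observational Health Data Sciences and Informatics) is a worldwide, non-profit research collaborative that develops open-source solutions for comprehensive analysis of medical big data. It aims to increase the value of clinical data through large-scale analytics and data mining, and to enable collaboration across disciplines and industries. To date, more than a hundred organizations from dozens of countries and regions, including the United States, Canada, Australia, and the United Kingdom, have joined the OHDSI global collaborative network. Participants include universities, hospitals, and companies, such as the medical schools of Stanford, Harvard, and Duke, as well as Johnson & Johnson, Novartis, Oracle, and IBM. The network covers clinical data on more than 600 million people, and its collaborative studies have produced over a hundred published papers.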
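We invite anyone in China who is interested in this work and willing to learn together to join the Chinese interest group, share what they know, and help fill gaps in cross-industry and cross-disciplinary knowledge. In the early stage the group will focus on translating, discussing, and studying OHDSI's open-source work on GitHub, and will set up a mechanism for mutual help and accountability in studying healthcare big data, medical statistics, bioinformatics, and related fields. If you are interested, see the following document, which includes the WeChat group QR code: OHDSI Chinese Interest Group Consensus & Introduction to OHDSI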
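OHDSI is committed to open source and openness in order to accelerate medical data research worldwide. The original content of this article comes from GitHub (https://github.com/OHDSI/CommonDataModel/wiki). If there is any conflict of interest, please leave a comment on this page and the content will be removed.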