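A Guide to Working with DICOM Data
DICOM data
Anyone who has worked with medical imaging will be familiar with DICOM. It is the standard storage format for medical images such as CT scans and X-rays.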
Environment for working with DICOM data
Language: Python
Library: pydicom
Introduction to pydicom's basic data types
- Dataset
dataset.Dataset is the main object you will work with directly. Dataset behaves like Python's dict (early pydicom releases derive it from dict, newer ones wrap a dict), so it offers most of the methods of dict and overrides some of them. In other words, it is a collection of key:value pairs, where the key is the DICOM (group,element) tag (as a Tag object, described below), and the value is a DataElement instance (also described below).
- DataElement
The dataelem.DataElement class is not usually used directly in user code, but is used extensively by dataset.Dataset. dataelem.DataElement is a simple object which stores the following things:
- tag – a DICOM tag (as a Tag object)
- VR – DICOM value representation – various number and string formats, etc.
- VM – value multiplicity. This is 1 for most DICOM tags, but can be multiple, e.g. for coordinates. You do not have to specify this; the DataElement class keeps track of it based on value.
- value – the actual value. A regular value like a number or string (or list of them), or a Sequence.
- Tag
The Tag class is derived from Python’s int, so in effect, it is just a number with some extra behaviour:
- Tag enforces that the DICOM tag fits in the expected 4-byte (group,element)
- A Tag instance can be created from an int or from a tuple containing the (group,element) separately:
>>> from pydicom.tag import Tag
>>> t1=Tag(0x00100010) # all of these are equivalent
>>> t2=Tag(0x10,0x10)
>>> t3=Tag((0x10, 0x10))
>>> t1
(0010, 0010)
>>> t1==t2, t1==t3
(True, True)
- Tag has properties group and element (or elem) to return the group and element portions
- The is_private property checks whether the tag represents a private tag (i.e. if group number is odd).
- Sequence
Sequence is derived from Python’s list. The only added functionality is to make string representations prettier. Otherwise all the usual methods of list like item selection, append, etc. are available.
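Basic data operations
Reading DICOM data
import pydicom
meta_data = pydicom.dcmread("path/to/dicom/file")
pydicom turns the DICOM file it reads into a Dataset object, so commonly used DICOM metadata can be read directly as attributes of that object. Some of the most frequently used attributes are introduced below.
# dir() lists the attributes available on meta_data
print(dir(meta_data))
########### Output ##########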
['AccessionNumber', 'AcquisitionDate', 'AcquisitionNumber',
'AcquisitionTime', 'BitsAllocated', 'BitsStored', 'Columns',
'ContentDate', 'ContentTime', 'ContrastBolusAgent',
'ContrastBolusRoute', 'ContributingEquipmentSequence',
'ConvolutionKernel', 'DataCollectionDiameter', 'ExposureTime',
'FrameOfReferenceUID', 'GantryDetectorTilt', 'HighBit',
'ImageOrientationPatient', 'ImagePositionPatient', 'ImageType',
'InstanceCreationDate', 'InstanceCreationTime', 'InstanceNumber',
'InstitutionAddress', 'InstitutionName', 'KVP', 'Manufacturer',
'ManufacturerModelName', 'Modality', 'OperatorsName', 'PatientAge',
'PatientBirthDate', 'PatientID', 'PatientName', 'PatientPosition',
'PatientSex', 'PatientWeight', 'PhotometricInterpretation', 'PixelData',
'PixelRepresentation', 'PixelSpacing', 'PositionReferenceIndicator',
'ReconstructionDiameter', 'ReferringPhysicianName',
'RelatedSeriesSequence', 'RescaleIntercept', 'RescaleSlope', 'Rows',
'SOPClassUID', 'SOPInstanceUID', 'SamplesPerPixel', 'ScanOptions',
'SeriesDate', 'SeriesDescription', 'SeriesInstanceUID', 'SeriesNumber',
'SeriesTime', 'SliceLocation', 'SliceThickness', 'SpecificCharacterSet',
'StationName', 'StudyDate', 'StudyDescription', 'StudyID',
'StudyInstanceUID', 'StudyTime', 'TemporalPositionIndex',
'WindowCenter', 'WindowWidth', 'XRayTubeCurrent', '__contains__',
'__delattr__', '__delitem__', '__dir__', '__enter__', '__eq__', '__exit__',
'__format__', '__getattr__', '__getattribute__', '__getitem__', '__init__',
'__iter__', '__len__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__', '_character_set',
'_dataset_slice', '_pretty_str', '_slice_dataset', 'add', 'add_new', 'clear',
'convert_pixel_data', 'data_element', 'decode', 'decompress', 'dir',
'elements', 'ensure_file_meta', 'fix_meta_info', 'formatted_lines', 'get',
'get_item', 'group_dataset', 'is_original_encoding', 'iterall', 'keys',
'pixel_array', 'pop', 'popitem', 'remove_private_tags', 'save_as',
'set_original_encoding', 'setdefault', 'top', 'trait_names', 'update',
'values', 'walk']
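Commonly used DICOM attributes
- PixelData - the image data stored in the DICOM file (raw bytes)
- PixelSpacing - the physical size of each pixel, given as (row spacing, column spacing), in mm
- SliceThickness - the thickness of each slice, in mm
- SliceLocation - the position of this file's slice along the Z axis.
PS: for a folder that holds all the slices of one case, this attribute can be used to sort the slices.
- Rows - the number of pixel rows (image height)
- Columns - the number of pixel columns (image width)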
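Getting the DICOM image
# Raw pixel bytes
pixel_bytes = meta_data.PixelData
# pixel_bytes usually cannot be inspected directly
# pixel_array decodes the bytes into a NumPy matrix of stored pixel values
# (for CT, apply RescaleSlope and RescaleIntercept to obtain CT values in HU)
pix = meta_data.pixel_array
# Display the DICOM image
import numpy as np
import matplotlib.pyplot as plt
# In a Jupyter notebook, uncomment the following line
# %matplotlib inline
plt.figure()
plt.imshow(pix, cmap="gray")
plt.show()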
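For CT, the stored values usually need a linear rescale before they can be read as CT values (Hounsfield units), and a display window before they look reasonable on screen. The sketch below uses NumPy only. The function names to_hounsfield and apply_window are hypothetical, the sample numbers are made up, and the window is a simplified linear clip, not the exact windowing formula from the DICOM standard.

```python
import numpy as np

def to_hounsfield(stored, slope, intercept):
    """Map stored pixel values to CT values: HU = stored * slope + intercept."""
    return stored.astype(np.float64) * float(slope) + float(intercept)

def apply_window(hu, center, width):
    """Clip to [center - width/2, center + width/2] and scale to the range 0..1."""
    low, high = center - width / 2.0, center + width / 2.0
    return (np.clip(hu, low, high) - low) / (high - low)

# Made-up stored values standing in for meta_data.pixel_array
stored = np.array([[0, 1024], [1064, 2048]], dtype=np.int16)

hu = to_hounsfield(stored, slope=1, intercept=-1024)
display = apply_window(hu, center=40, width=400)
print(hu)
print(display)
```

With a real dataset the inputs would be meta_data.pixel_array, meta_data.RescaleSlope and meta_data.RescaleIntercept, and the window could come from WindowCenter and WindowWidth when those are single values.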
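Modifying the DICOM pixel matrix
# Operate on pix here. PS: pix is a numpy.ndarray, so any NumPy operation can be applied to it
# Next, save the result back into the dataset
# What has to be updated is the raw binary pixel data in the DICOM dataset
# (pix must keep its original dtype so the bytes still match BitsAllocated)
meta_data.PixelData = pix.tobytes()
# This unpacking assumes a single-frame, single-channel (2D) image
meta_data.Rows, meta_data.Columns = pix.shape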
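The reason the dtype matters when writing back can be seen with NumPy alone. This sketch uses a made-up 3 x 4 array as a stand-in for meta_data.pixel_array and checks that the byte string round-trips to the same matrix.

```python
import numpy as np

# Stand-in for meta_data.pixel_array: a 3 x 4 image of 16-bit signed values
pix = np.arange(-4, 8, dtype=np.int16).reshape(3, 4)

# Example edit: clamp negative values to zero, then restore the original dtype
modified = np.clip(pix, 0, None).astype(pix.dtype)

raw = modified.tobytes()
rows, cols = modified.shape
bytes_per_pixel = modified.dtype.itemsize  # 2 bytes corresponds to BitsAllocated = 16

# Decoding the bytes with the same dtype and shape recovers the matrix
restored = np.frombuffer(raw, dtype=modified.dtype).reshape(rows, cols)
print(len(raw), rows * cols * bytes_per_pixel)
```

If an operation silently promotes the array to float64 or int64, tobytes() produces more bytes per pixel than the header declares, so casting back with astype before assigning PixelData is a cheap safeguard.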
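Modifying other attributes in the metadata
- Add
meta_data.add(data_element)  # data_element is a DataElement instance
- Delete
delattr(meta_data, 'AttributeName')
del meta_data.AttributeName
- Modify
meta_data.AttributeName = new_value
meta_data[tag] = data_element  # data_element must be a DataElement whose tag matches the key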