This project utilizes the pydicom and fastcore libraries. It borrows ideas (and some code) from the fastai.medical.imaging library (source).
The metadata preprocessing and series selection algorithm are recreated from the paper by Gauriau et al. (reference below), in which a Random Forest classifier is trained to predict the sequence type (e.g. T1, T2, FLAIR, ...) of series of images from brain MRI. Such a tool may be used to select the appropriate series of images for input into a machine learning pipeline.
Reference: Gauriau R, et al. Using DICOM Metadata for Radiological Image Series Categorization: a Feasibility Study on Large Clinical Brain MRI Datasets. Journal of Digital Imaging. 2020 Jan; 33:747–762. (link to paper)
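To make the idea concrete, here is a minimal sketch of a Random Forest classifier that predicts a sequence type from a few numeric acquisition parameters. It is not the model or the feature set from the paper. It assumes scikit-learn is installed, and the three features and all training values are hypothetical toy numbers chosen only to illustrate the workflow.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy features per series:
# [RepetitionTime (ms), EchoTime (ms), InversionTime (ms)]
# These values are illustrative only and are NOT taken from the paper.
X_train = [
    [500, 10, 0],       # T1-like: short TR, short TE
    [600, 12, 0],
    [4000, 100, 0],     # T2-like: long TR, long TE
    [4500, 95, 0],
    [9000, 120, 2500],  # FLAIR-like: long TR, long TE, inversion pulse
    [8500, 125, 2400],
]
y_train = ["T1", "T1", "T2", "T2", "FLAIR", "FLAIR"]

# Fix the seed so repeated runs build the same forest
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Classify a new, unseen series described by the same three features
new_series = [[550, 11, 0]]
predicted = clf.predict(new_series)[0]
print(predicted)
```

In practice the feature matrix would come from the preprocessed DICOM metadata rather than hand-typed lists, and the model would be trained on a large labeled dataset.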
1. git clone the repository
2. cd into the repo
3. pip install . (include the -e flag for an editable install)
Read a DICOM file:
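from pathlib import Path
from pydicom.data import get_testdata_file

# dcmread() is not a built-in Path method; it is assumed to be patched in by this library, so import the library first
path = Path(get_testdata_file("MR_truncated.dcm"))
ds = path.dcmread()
ds.file_meta  # file meta information header of the dataset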
Import a select subset of DICOM metadata into a pandas.DataFrame. The subset is defined in dicomtools.core and is based on the metadata used for the series selection algorithm in the paper referenced above.
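import pandas as pd
# from_dicoms() is assumed to be patched onto pd.DataFrame by this library
df = pd.DataFrame.from_dicoms([path]).drop('fname', axis=1)
df.T  # transpose so each metadata field is shown on its own row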