This project uses the pydicom and fastcore libraries. It borrows ideas (and some code) from the fastai.medical.imaging library (source).
The metadata preprocessing and series selection algorithm are recreated from the paper by Gauriau et al. (reference below), in which a Random Forest classifier is trained to predict the sequence type (e.g. T1, T2, FLAIR, ...) of series of images from brain MRI. Such a tool may be used to select the appropriate series of images for input into a machine learning pipeline.
Reference: Gauriau R, et al. Using DICOM Metadata for Radiological Image Series Categorization: a Feasibility Study on Large Clinical Brain MRI Datasets. Journal of Digital Imaging. 2020 Jan; 33:747–762. (link to paper)
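The general idea described above, a Random Forest predicting sequence type from metadata, can be sketched with scikit-learn. This is a minimal illustration and not the paper's pipeline or this project's implementation. The three features (repetition time, echo time, inversion time) and all numeric values are hand-written toy assumptions, not data from the paper or from any real scanner.

```python
from sklearn.ensemble import RandomForestClassifier

# Toy, hand-written values (NOT from the paper or any real dataset).
# Columns: [RepetitionTime (ms), EchoTime (ms), InversionTime (ms)]
X = [
    [500, 10, 0], [600, 12, 0], [450, 9, 0],                    # T1-like
    [4000, 100, 0], [4500, 95, 0], [5000, 110, 0],              # T2-like
    [9000, 120, 2500], [8500, 125, 2400], [10000, 130, 2600],   # FLAIR-like
]
y = ["T1"] * 3 + ["T2"] * 3 + ["FLAIR"] * 3

# Fit a small forest; random_state makes the run repeatable.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# Predict the sequence type of one hypothetical unseen series.
new_series = [[550, 11, 0]]
predicted = clf.predict(new_series)[0]
print(predicted)
```

In practice the feature set would come from the preprocessed DICOM metadata rather than from three hand-picked columns.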
1. git clone the repository
2. cd into the repo
3. pip install . (include the -e flag for an editable install)
Read a DICOM file:
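from pathlib import Path
from pydicom.data import get_testdata_file

# Path.dcmread is not part of pathlib; it is assumed to be patched in by this package's imports
path = Path(get_testdata_file("MR_truncated.dcm"))
ds = path.dcmread()
ds.file_meta  # file meta information header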
Import a select subset of DICOM metadata into a pandas.DataFrame. The subset is defined in dicomtools.core and is based on the metadata used for the series selection algorithm in the paper referenced above.
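import pandas as pd  # from_dicoms is assumed to be patched onto DataFrame by this package
df = pd.DataFrame.from_dicoms([path]).drop('fname', axis=1)
df.T  # transpose so each metadata field appears on its own row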