Collaborative Research Project 12
Principal investigators: Paul Kinahan (University of Washington), Andrey Fedorov (Harvard University), and Dan Sullivan (Duke University)
Determining COVID-19 image data quality, provenance, and harmonization.
Updated January 19, 2024
It is generally agreed that the development of accurate medical image analysis algorithms is critically dependent on access to large volumes of images. However, success in developing and validating algorithms also depends on the quality (bias and variance) of the data, and on how the relevant image quality parameters and their provenance are defined. This becomes even more important when considering heterogeneous sources of data, such as those contributed to MIDRC, as well as the variety of methods that can be implemented in each of the processing steps of the MIDRC data ingestion pipeline.
Additionally, given this large variety of input data (primarily DICOM images) and of collection methods, a needed but not yet fully defined area of work is developing an understanding of the true data characteristics and their impact on final results and analysis methods. It is necessary to track the meta-information on the data itself (i.e., its provenance), including what is known or not known about the quantitative aspects of the image data at each step of ingestion into the MIDRC pipeline.
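As an illustration only, the idea of tracking what is known or not known about an image at each ingestion step could be sketched as a small provenance record. This is not the MIDRC implementation: the class names, the step names ("received", "de-identified"), and the example attributes are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceStep:
    """One ingestion step (hypothetical structure, not a MIDRC schema)."""
    step: str                                      # assumed step name
    known: dict = field(default_factory=dict)      # attributes with a known value
    unknown: list = field(default_factory=list)    # attributes recorded as not known


@dataclass
class ImageProvenance:
    """Ordered history of provenance steps for a single image."""
    image_id: str
    steps: list = field(default_factory=list)

    def record(self, step, known=None, unknown=None):
        # Append a new step; copy the inputs so later caller changes do not leak in.
        self.steps.append(ProvenanceStep(step, dict(known or {}), list(unknown or [])))

    def latest(self, attribute):
        # Walk the history backwards and report the most recent statement
        # about the attribute as (status, value, step name).
        for s in reversed(self.steps):
            if attribute in s.known:
                return ("known", s.known[attribute], s.step)
            if attribute in s.unknown:
                return ("unknown", None, s.step)
        return ("untracked", None, None)


# Usage with made-up values.
prov = ImageProvenance("example-image-001")
prov.record("received", known={"Modality": "CT"}, unknown=["SliceThickness"])
prov.record("de-identified", known={"SliceThickness": 2.5})
print(prov.latest("SliceThickness"))
print(prov.latest("KVP"))
```

Recording "unknown" explicitly, rather than leaving an attribute out, distinguishes information that was checked and found missing from information that was never examined.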
Closely related to providing useful and thoughtfully curated information about MIDRC imaging data to end users is the use of agreed-upon terminology with defined meanings. This collaborative research project (CRP) is currently working to define MIDRC data quality standards as they apply to the study description parameters of contributed data modalities and body parts, and to their associated MIDRC Data Explorer labels. CRP 12 examines structured DICOM data objects and processes, thoroughly investigates the internal MIDRC data harmonization processes, and develops consistent terminology for structured DICOM data objects.
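To make the terminology question concrete, the following minimal sketch maps a DICOM Modality code and a free-text study description onto a small set of labels. CT, CR, and DX are standard DICOM modality codes, but the mapping tables, the label names, and the "Unmapped" fallback are assumptions made for illustration. They do not represent the actual MIDRC Data Explorer vocabulary or harmonization rules.

```python
import re

# Hypothetical label tables; a real vocabulary would be agreed on and maintained separately.
MODALITY_LABELS = {"CT": "CT", "CR": "XR", "DX": "XR"}
BODY_PART_KEYWORDS = {
    "chest": "Chest",
    "thorax": "Chest",
    "abdomen": "Abdomen",
    "abd": "Abdomen",
}


def harmonize(modality, study_description):
    """Return illustrative labels for a modality code and a study description."""
    modality_label = MODALITY_LABELS.get(modality.strip().upper(), "Unmapped")
    # Split the description into lowercase alphabetic tokens.
    tokens = re.findall(r"[a-z]+", study_description.lower())
    # Collect every label whose keyword appears; a set removes duplicates.
    body_parts = sorted({BODY_PART_KEYWORDS[t] for t in tokens if t in BODY_PART_KEYWORDS})
    return {"modality": modality_label, "body_parts": body_parts or ["Unmapped"]}


# Usage with made-up study descriptions.
print(harmonize("dx", "XR CHEST 2 VIEWS PA/LAT"))
print(harmonize("CT", "CT ABD PELVIS W CONTRAST"))
```

Returning an explicit "Unmapped" value instead of guessing keeps descriptions that the vocabulary does not yet cover visible for later review.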
Current plans include:
Develop structured DICOM data objects and processes for long COVID and beyond XR and CT.
Link to the MIDRC harmonization steps described in Project TDP3.
Develop consistent terminology for structured DICOM data objects for long COVID and beyond XR and CT.
Members:
Andrey Fedorov, PhD, Brigham and Women’s
Paul Kinahan, PhD (lead), University of Washington
Zihan Li, University of Washington
Daniel Sullivan, MD, Duke University
William Shuman, MD, University of Washington
Emily Townley, AAPM