Annotations and Labeling Working Group.
Last updated November 2023
Focuses on a state-of-the-art annotation pipeline that automates the labeling process using natural language processing, large language models, and computer vision to generate pre-labeled imaging data upon ingestion into MIDRC.
Annotations are available in the MIDRC data explorer under the ‘Annotations’ tab.
• Annotations (labels) are often required for supervised learning in medical imaging.
• When created by trained experts, annotations can serve as “ground truth” for training AI models and subsequently testing their performance in supervised learning approaches.
• Labels can be generated by human experts, by semi-automated image analysis software, or by machine learning algorithms.
• Labels can be created at many levels: patient, exam, series, image, or selected pixels.
• As images and clinical data are ingested into MIDRC, a proportion of the image sets will be earmarked for labeling/markup prior to publication. The annotations will be linked to MIDRC’s public imaging data and will be incorporated into sequestered test/benchmarking data.
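As an illustration of the label levels and label sources listed above, the following minimal Python sketch shows one way such annotations could be represented. It is not MIDRC's actual data model: the names LabelLevel, LabelSource, Annotation, and expert_labels, the identifiers, and the label values are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class LabelLevel(Enum):
    """Granularity at which a label applies (levels taken from the list above)."""
    PATIENT = "patient"
    EXAM = "exam"
    SERIES = "series"
    IMAGE = "image"
    PIXEL = "pixel"


class LabelSource(Enum):
    """How a label was produced (sources taken from the list above)."""
    HUMAN_EXPERT = "human_expert"
    SEMI_AUTOMATED = "semi_automated"
    MACHINE_LEARNING = "machine_learning"


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record; field names are assumptions."""
    level: LabelLevel
    source: LabelSource
    target_id: str   # ID of the patient, exam, series, or image being labeled
    value: str       # the label itself (hypothetical values below)
    # Bounding region as (row, col, height, width); only used for pixel-level labels.
    region: Optional[Tuple[int, int, int, int]] = None

    def __post_init__(self) -> None:
        # A pixel-level label is meaningless without the pixels it refers to.
        if self.level is LabelLevel.PIXEL and self.region is None:
            raise ValueError("pixel-level annotations need a region")


def expert_labels(annotations: List[Annotation]) -> List[Annotation]:
    """Keep only expert-made labels, e.g. as candidate ground truth."""
    return [a for a in annotations if a.source is LabelSource.HUMAN_EXPERT]


# Hypothetical example records.
annotations = [
    Annotation(LabelLevel.EXAM, LabelSource.HUMAN_EXPERT, "exam-001", "finding_present"),
    Annotation(LabelLevel.IMAGE, LabelSource.MACHINE_LEARNING, "image-042", "finding_absent"),
    Annotation(LabelLevel.PIXEL, LabelSource.SEMI_AUTOMATED, "image-042", "mask",
               region=(10, 20, 64, 64)),
]

ground_truth = expert_labels(annotations)
```

Keeping the source alongside each label makes it straightforward to separate expert-made annotations, which can serve as ground truth, from pre-labels generated automatically.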