National Cancer Institute
Cancer Imaging Program

Lung Image Database Consortium (LIDC)

Preliminary clinical studies show that spiral CT scanning of the lungs can improve early detection of lung cancer in high-risk individuals. However, more clinical data are needed before public health recommendations can be made for population-based screening. Image processing algorithms have the potential to assist in lesion detection on spiral CT studies, and to assess the stability or change in lesion size on serial CT studies. The use of such computer-assisted algorithms could significantly enhance the sensitivity and specificity of spiral CT lung screening, as well as lower costs by reducing physician time needed for interpretation.

The intent of the Lung Image Database Consortium (LIDC) initiative is to support a consortium of institutions that will develop consensus guidelines for a spiral CT lung image resource and construct a database of spiral CT lung images. The investigators funded under this initiative are creating a set of guidelines and metrics for database use and developing a database as a test-bed and showcase for those methods. The database will be available to researchers and users through the Internet and will have wide utility as a research, teaching, and training resource.

Specifically, the LIDC initiative aims to provide:

  • a reference database for the relative evaluation of image processing or CAD algorithms and
  • a flexible query system that will provide investigators the opportunity to evaluate a wide range of technical parameters and de-identified clinical information within this database that may be important for research applications.

It is anticipated that this resource will stimulate further database development for image processing and CAD evaluation for applications that include cancer screening, diagnosis, image-guided intervention, and treatment. Therefore, the NCI encourages investigator-initiated grant applications that will utilize the database in their research. NCI also encourages investigator-initiated grant applications that provide tools or methodology that may improve or complement the mission of the LIDC.

See the Program Announcement: RFA: CA-01-001 LUNG IMAGE DATABASE RESOURCE FOR IMAGING RESEARCH 1 for more information.

Contact regarding programmatic issues:
Barbara Y. Croft, Ph.D.
Cancer Imaging Program, NCI
6130 Executive Blvd., Suite 6000
Bethesda, MD 20892
Telephone: 301-496-9531
Fax: 301-480-3507
E-mail: bc129b@nih.gov 2

Steering Committee

The consortium is guided by a steering committee consisting of two members from each of the five awarded institutions and two members from the Cancer Imaging Program.

LIDC Institutions

Cornell University 3
David Yankelevitz
dyankele@pop.med.cornell.edu 4

University of California, Los Angeles 5
Mike McNitt-Gray
mmcnittgray@mednet.ucla.edu 6

University of Chicago 7
Sam Armato
s-armato@uchicago.edu 8

University of Iowa 9
Geoffrey McLennan
Geoffrey-McLennan@Uiowa.edu 10

University of Michigan 11
Chuck Meyer
cmeyer@umich.edu 12

Mission & Goals

Mission

The mission of the Lung Image Database Consortium (LIDC) is the sharing of lung images, especially low-dose helical CT scans of adults screened for lung cancer, and of related technical and clinical data, for the development and testing of computer-aided cancer screening and diagnosis technology.

Principal Goals

To establish standard formats and processes for managing lung images and related technical and clinical data for use in the development and testing of computer-aided diagnostic algorithms.

To develop an image database as a web-accessible international research resource for the development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis using helical computed tomography (CT).

Implementation Objectives

1. The primary purpose of this project is to develop an image database for the evaluation of CAD methods for lung nodule detection and diagnosis using helical CT.
  1.1 The database is to contain helical CT images of representative cases selected from lung cancer screening studies or diagnostic studies. The database should enable the correlation of performance of CAD methods for detection and classification of lung nodules with spatial, temporal and pathological ground truth. The database is to be web-accessible by the imaging research community as soon as possible. It should provide a resource for the training and development of CAD methods. The Consortium will document evaluation metrics that are valid for various CAD tasks as reported in the literature and that can be used to assess investigator-developed CAD methods for lung nodule detection and classification (benign/malignant). These quantitative methods are intended to facilitate comparison of the relative performance of published CAD methods.
  1.2 The database is envisioned to be a single repository through which investigators can:
(a) define subsets of data for individual research purposes using a query system, and
(b) define consistent reference subsets of data to evaluate the relative performance of CAD methods using a specified or recommended query method.
  1.3 The fields in the relational database should have sufficient detail to allow a wide range of search parameters. All patient-identifying information will be removed and the data anonymized, so that such information cannot be tracked. Database fields should include, for example:
(a) specifications of the CT system used to generate the image and its image acquisition protocols;
(b) case parameters and reconstruction methods for representative normal and cancer cases, including serial studies that are important for evaluation of CAD methods;
(c) spatial and pathological ground truth data, to allow a cross-correlation with computed results;
(d) the possibility for use of a flexible query system to allow for the evaluation of other future performance parameters for CAD, other image processing methods and related observer studies.
2. Secondary goals for this database include:
  2.1 Storage of raw CT image data, to permit exploration of alternative image reconstruction methods or different reconstructed slice thicknesses that may affect the performance of CAD methods, or to permit the evaluation of CAD methods that incorporate the physical performance characteristics of the CT system,
  2.2 Storage of images acquired through other tomographic modalities such as PET, to explore improved means for classification of lung nodules, and
  2.3 Storage of digital chest images to permit the development and evaluation of CAD methods for this modality.
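
The relational fields and query-defined subsets described in objectives 1.2 and 1.3 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely for convenience. Every table name, column name, and data value below is a hypothetical placeholder chosen for illustration, not a field the Consortium has adopted.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE scan (
    scan_id            INTEGER PRIMARY KEY,
    case_code          TEXT NOT NULL,  -- anonymized case label, no patient identifiers
    scanner_model      TEXT,           -- 1.3(a) CT system specification
    tube_current_ma    REAL,           -- 1.3(a) acquisition protocol
    slice_thickness_mm REAL,           -- 1.3(b) reconstruction parameter
    recon_kernel       TEXT            -- 1.3(b) reconstruction method
);
CREATE TABLE nodule (
    nodule_id   INTEGER PRIMARY KEY,
    scan_id     INTEGER NOT NULL REFERENCES scan(scan_id),
    x_mm        REAL,                  -- 1.3(c) spatial ground truth
    y_mm        REAL,
    z_mm        REAL,
    diameter_mm REAL,
    pathology   TEXT                   -- 1.3(c) pathological ground truth
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO scan VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "CASE-0001", "VendorA-4slice", 40.0, 1.25, "sharp"),
            (2, "CASE-0002", "VendorB-4slice", 120.0, 5.0, "smooth"),
            (3, "CASE-0003", "VendorA-4slice", 40.0, 2.5, "sharp"),
        ],
    )
    conn.executemany(
        "INSERT INTO nodule VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 10.0, 20.0, 30.0, 6.0, "malignant"),
            (2, 1, 50.0, 60.0, 70.0, 4.0, "benign"),
            (3, 2, 15.0, 25.0, 35.0, 9.0, "malignant"),
            (4, 3, 12.0, 22.0, 32.0, 5.0, "benign"),
        ],
    )
    return conn


def select_subset(conn, max_slice_mm, pathology):
    """Return case codes whose scans meet a slice-thickness limit and
    contain at least one nodule with the requested pathology."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.case_code
        FROM scan AS s
        JOIN nodule AS n ON n.scan_id = s.scan_id
        WHERE s.slice_thickness_mm <= ? AND n.pathology = ?
        ORDER BY s.case_code
        """,
        (max_slice_mm, pathology),
    ).fetchall()
    return [row[0] for row in rows]
```

Saving a query such as `select_subset(conn, 2.5, "benign")` alongside published results is one way a "consistent reference subset" in the sense of 1.2(b) could be reproduced by other investigators.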

LIDC Committees

Below are the LIDC committees and corresponding members.

Spiral CT Scanning, Dose, and Quality Control Committee
Mike McNitt-Gray (Chair)
Sam Armato
Heber MacMahon
Ella Kazerooni
David Yankelevitz
Geoff McLennan

Inclusion/Exclusion Criteria Committee
Sam Armato (Chair)
Heber MacMahon
Ella Kazerooni
David Gur
Mike McNitt-Gray
Geoff McLennan

Spatial Ground Truth Issues Committee
Chuck Meyer (Chair)
David Gur
Eric Hoffman
Larry Clarke
Lori Dodd
Mike McNitt-Gray
Sam Armato
Tony Reeves

Pathologic Ground Truth Committee
Geoff McLennan (Chair)
Local pathologists

Database Committee
Chuck Meyer (Chair)
Carey Floyd
Denise Warzel
Matthew Brown
Roger Engelmann
Sam Armato

Common Metrics; Software Tools Committee
Sam Armato (Chair)
Mike McNitt-Gray
Tony Reeves

Past Funding Opportunities Committee
Eric Hoffman (Chair)
Tony Reeves

Internet Availability Committee
Eric Hoffman (Chair)
Tony Reeves
Chuck Meyer
Mike McNitt-Gray
Mike Vannier

Industry Liaison Committee
Claudia Henschke (Chair)
Deni Aberle
Larry Clarke
Eric Hoffman
Mike McNitt-Gray
Ella Kazerooni

FDA Liaison Committee
Nick Petrick
Bob Wagner
Geoff McLennan

ACRIN Liaison Committee
Carl Jaffe (Chair)
Deni Aberle
Geoff McLennan

Publications Committee
Geoff McLennan (Chair)
David Gur
Barbara Croft

Statistics Committee
Mike McNitt-Gray (Chair)
David Gur
Bob Wagner
Lori Dodd
Charlie Metz
Jim Sayre
Heang-Ping Chan
Sam Armato
Curt Langlotz
Geoff McLennan

NIH Group
Larry Clarke
Barbara Croft
Dan Sullivan
Carl Jaffe
Houston Baker
Terry Yoo
Abraham Levy

Reports, Presentations & Publications

Reports & Presentations: Lung Imaging 13
Reports and presentations produced for or as a result of CIP-supported workshops and other activities.

Publications: Lung Imaging 14
CIP publications that appeared in peer-reviewed journals or meeting proceedings.

Cornell University

Development of a CAD Assessment of a PN Database
David Yankelevitz, M.D.
Cornell University

Grant Number: U01CA091100

Lung cancer is the leading cause of cancer death worldwide in both men and women, with an estimated 164,000 or more new cases and over 156,000 deaths in 2000 in the United States alone. A principal reason for this high mortality is that lung cancer typically is first detected at an advanced stage, when the prospects for cure are quite low. However, in those cases where it is found at an early stage, the prospects for cure are quite high. Recognition of these facts is a primary driver behind the development of improved screening and diagnostic tools. We propose to form a collaborative group of institutions to develop a large, high-quality, internet-accessible spiral computed tomography (CT) image database of pulmonary nodules. This will serve as an important resource for researchers interested in developing improved methods for early detection and screening for lung cancer.

Specifically this proposal plans to 1) develop the criteria for inclusion of nodules within the database, 2) develop ground truth or pathologic diagnosis of each nodule, 3) populate the database with the appropriate nodule candidates as described above, 4) develop common data elements (CDEs) to classify each case, 5) develop criteria for measuring performance standards of various CADs, and 6) develop an overall management plan for the consortium. The database developed in this consortium will be an important resource for research and teaching purposes. It will represent a standard that can be used for testing new CAD systems. With the rapid advances in computer science and engineering, a high-quality database that is continually evolving will be an invaluable resource.

The design of this research proposal is somewhat novel in that we aim not only to collaborate with others on the design and content of the image database, but also to attach demographic and pathologic data to each case so that a broad research community can be served. Our overall management plan seeks to aggressively identify collaborative partners from a variety of sources including similar or related industry, for example the Visible Human Project. Working groups will include radiology, CAD development, and informatics, and our outreach efforts will include patient advocacy and early users of the database.

University of California, Los Angeles

Lung Imaging Database Resource for Imaging Research
Mike McNitt-Gray, Ph.D.
University of California, Los Angeles

Grant Number: U01CA091103

The aim of this research is to create a database resource for images that will be used in analyses related to the detection and characterization of lung cancer using spiral CT. There has been significant interest in the last few years in using spiral CT lung scanning for lung cancer screening of patients at high risk. Early detection and intervention may significantly reduce the mortality rates of lung cancer and improve patient prognoses. In addition, there is significant interest in the characterization of solitary or small multiple nodules detected using lung cancer screening and conventional thoracic CT exams. This is because the presence of nodules within the lungs is not a reliable indicator of cancer. In fact, 50-80 percent of nodules detected by current methods are benign; this percentage may even climb as smaller nodules are detected with very sensitive screening techniques under consideration.

Therefore, detection of suspicious objects in the lung parenchyma, while a very necessary step, is not sufficient for patient management. Additional imaging or processing of the CT images may provide information that is useful in establishing the diagnosis of the individual patient and determining the next step in patient management. However, research in this area has been limited by the difficulties in collecting cases on which image processing algorithms may be robustly developed and tested. This is because it is difficult to establish diagnostic truth for such key elements as lesion location and lesion diagnosis.

The establishment of a lung imaging database creates a resource for the development and evaluation of methods for detecting and characterizing lung cancer. When made available to researchers all over the world, this resource would significantly reduce development time because it would allow imaging researchers to focus on their areas of expertise without having to focus on case collection, establishing diagnostic truth, and all of the other infrastructure issues that detract from development. This database would also allow direct and objective comparisons of techniques because common metrics would be applied to identical cases. This will allow the image processing field to move forward and to move from design to clinical implementation much faster. The specific aims to accomplish this are:
SA-1 To develop the necessary consensus and standards for an image database resource related to the detection, characterization and evaluation of lung cancer using spiral CT imaging.
SA-2 To construct, populate and test the database of spiral CT lung image data and ancillary data, including the information necessary about diagnostic truth for each case.
SA-3 To provide a means for documentation and distribution of this database to researchers through the internet.
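
To make concrete what applying a common metric to identical cases could look like, here is a minimal sketch that scores a nodule-detection method against spatial ground truth. The matching rule (a detection counts as a hit if it lies within a fixed distance of a truth nodule that has not yet been matched), the 5 mm default tolerance, and the function names are all assumptions made for illustration. They are not metrics defined by the Consortium.

```python
import math


def score_case(truth, detections, tol_mm=5.0):
    """Return (true_positives, false_positives) for one scan.

    truth and detections are lists of (x, y, z) centers in millimeters.
    Each truth nodule can be matched by at most one detection; each
    detection is greedily paired with its nearest unmatched truth nodule
    within tol_mm (an assumed rule, not a Consortium standard).
    """
    unmatched = list(truth)
    tp = 0
    fp = 0
    for det in detections:
        best = None
        for t in unmatched:
            d = math.dist(det, t)
            if d <= tol_mm and (best is None or d < best[0]):
                best = (d, t)
        if best is None:
            fp += 1
        else:
            unmatched.remove(best[1])
            tp += 1
    return tp, fp


def summarize(cases, tol_mm=5.0):
    """Return (sensitivity, false positives per scan) over all cases.

    cases is a list of (truth, detections) pairs, one pair per scan.
    """
    total_tp = 0
    total_fp = 0
    total_truth = 0
    for truth, detections in cases:
        tp, fp = score_case(truth, detections, tol_mm)
        total_tp += tp
        total_fp += fp
        total_truth += len(truth)
    sensitivity = total_tp / total_truth if total_truth else float("nan")
    fp_per_scan = total_fp / len(cases) if cases else float("nan")
    return sensitivity, fp_per_scan
```

Because the same `cases` list and tolerance can be passed to any number of detection methods, the resulting sensitivity and false-positive figures are directly comparable, which is the property the paragraph above attributes to a shared database.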

University of Chicago

Standard Database for CT Lung Images
Samuel Armato, Ph.D.
University of Chicago

Grant Number: U01CA091090

The broad, long-term objective of this research project is to create a publicly available standard database of spiral computed tomography (CT) lung images. This lung image database will become an essential resource for the development of computer-aided diagnostic (CAD) techniques designed to help radiologists identify lung cancer in CT scans. The need for a standard lung image database is based on two recent developments. The first is the advancement of multi-slice CT scanners, which acquire images of multiple anatomic sections during each gantry rotation. Consequently, these scanners may generate an extensive amount of image data. The second development is the growing awareness among the American public and clinicians of the potential benefits of lung cancer screening using a low-dose spiral CT protocol. These developments are expected to dramatically increase the burden on radiologists. Moreover, primary interpretation from softcopy display will become a practical necessity.

What emerges from this scenario is a requirement for automated image processing methods that provide radiologists with quantitative information about suspicious abnormalities in the CT image data. Radiologists will then incorporate this information into their diagnostic decision-making process, with the expectation that cancer-detection sensitivity may be improved while decreasing both observer variability and interpretation time. Creation of a standard lung image database is critical to the endeavor of imaging research. This proposal addresses the important clinical and technical issues relevant to the creation of such a database.

The specific aims of the research are: (1) to identify the clinical requirements that must be imposed on a standard CT lung image database, (2) to address the technical issues and criteria involved with case selection for the CT lung image database, (3) to collect cases for the CT lung image database as a member of the Lung Image Database Consortium, (4) to develop strategies for the assessment of image processing and CAD methods using the CT lung image database, and (5) to investigate the effect of image reconstruction, multi-modality image registration, and registration of images acquired at different times on the utility of the CT lung image database.

As a member of the Consortium, we would demonstrate the flexibility necessary to reach consensus on the creation of a database that will serve as a standard resource for imaging research. The ideas presented in this proposal are expected to stimulate the efforts of the Consortium toward that goal.

University of Iowa

Lung Image Databases with Pathologic Correlates
Geoffrey McLennan, M.D.
University of Iowa

Grant Number: U01CA091085

This application is in response to a specific request to establish a generalized CT-derived database representing ground truth in lung cancer and is not hypothesis-driven. Our broad goal is to help in the building of this database, and through that effort assist with the methodical development of appropriate lung cancer screening tools and protocols. Our group, with recognized experience in cooperative national projects, and with a broad perspective, will provide for the consortium:

  • a well characterized group of study subjects with lung cancer, and with common lung cancer mimics such as histoplasmosis, supported by excellent radiologists and pathologists;
  • expertise in the development of CT imaging protocols;
  • a functional electronic transfer system for CT data sets from multiple sites, with analysis and archiving of such data sets;
  • expertise in DICOM standards, and in the issuing of web-based reports;
  • methods for temporal matching of CT data points, important in the longitudinal follow-up of patients, and in matching excised inflated lobe data and histopathological data to the original patient CT;
  • expertise in computational morphology (i.e., the mathematical description of complex structures, their visualization, and their derived CT images).

We intend to apply this to a subset of resected lung tumors to help define pathological and CT ground truth. We will also contribute expertise in image reconstruction algorithms, which is critically important for the identification and implementation of needed improvements in CT methods to maximize the chance of detection of subtle early lesions within the lung parenchyma and airways; data from two different CT manufacturers' multi-slice helical CT scanners; and mathematically derived virtual lung models, including early lung cancer development, for use in the design of scanning and reconstruction methods.

University of Michigan

Lung Image Database
Charles R. Meyer, Ph.D.
University of Michigan

Grant Number: U01CA091099

This grant application was awarded in response to RFA CA-01-001, the Lung Image Database Consortium (LIDC) Resource for Imaging Research. As a member of the LIDC, we will participate in formulating the multi-institutional lung imaging database acquisition and quality control specification, and begin collecting cases and populating a local database according to specifications resulting from the multi-institutional development of consensus guidelines.

Our clinical collaborators at the University of Michigan participating in this project have already had significant experience recruiting lung patients for another lung database project, the National Emphysema Treatment Trials (NETT). In this project Michigan ranked second in the number of patients screened for the study, and first in the enrollment of patients that passed the screen.

The Department of Radiology and the University Hospitals are committed to the acquisition of new generation CT and PET scanners over the duration of the LIDC project. In direct support of the goals of a previously funded P01 as well as those of the LIDC, we have purchased a 4-CPU Dell PowerEdge server running Red Hat Linux, configured with over 0.4 TB of RAID storage, all running on an uninterruptible power supply. The RAID and system disks are backed up onto an LTO tape via ARCserveIT backup software. We have tested the installation of the Apache web server, PHP scripting language, and MySQL database. The system also supports the execution of AVS5, an application development environment for the manipulation and visualization of 3D data.

For the LIDC project we will implement the scrubbing of unique patient identifiers from DICOM headers, the population of a SQL database, and storage of associated CT datasets. Appropriate security and encryption have been addressed as well. The LIDC database will be available for public sharing through direct Internet access from our lab. The user web-interface will support identification of subsets of CT scans using SQL that may be downloaded for training/testing of CAD algorithms.
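
The header-scrubbing step can be sketched as follows. To stay self-contained, the sketch represents a DICOM header as a plain Python dictionary keyed by standard attribute keywords rather than parsing real files. The list of keywords to remove is illustrative and is not a complete de-identification profile. Replacing PatientID with a keyed-hash pseudonym, so that serial studies of one patient remain linked, is an assumption of this sketch and not a description of the Michigan implementation.

```python
import hashlib
import hmac

# Illustrative subset of identifying DICOM attribute keywords.
# A production system would follow a full de-identification profile.
IDENTIFYING_KEYWORDS = frozenset({
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "OtherPatientIDs",
    "AccessionNumber",
    "InstitutionName",
    "ReferringPhysicianName",
})


def scrub_header(header, secret_key):
    """Return a copy of header with identifying attributes removed.

    header is a dict mapping attribute keywords to values (a stand-in
    for a parsed DICOM header). secret_key is a bytes value kept private
    by the site. If a PatientID is present it is replaced by a pseudonym
    derived with HMAC-SHA256, so the same patient always maps to the
    same anonymous label without exposing the original identifier.
    """
    cleaned = {
        keyword: value
        for keyword, value in header.items()
        if keyword not in IDENTIFYING_KEYWORDS
    }
    if "PatientID" in header:
        digest = hmac.new(
            secret_key,
            str(header["PatientID"]).encode("utf-8"),
            hashlib.sha256,
        ).hexdigest()
        cleaned["PatientID"] = "ANON-" + digest[:12]
    return cleaned
```

Technical attributes such as slice thickness pass through untouched, so the scrubbed header remains usable for the SQL-based subset queries described above.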



Table of Links

1 http://grants.nih.gov/grants/guide/rfa-files/RFA-CA-01-001.html
2 bc129b@nih.gov
3 http://dev1.cancer.gov/programsandresources/InformationSystems/LIDC/page6
4 dyankele@pop.med.cornell.edu
5 http://dev1.cancer.gov/programsandresources/InformationSystems/LIDC/page7
6 mmcnittgray@mednet.ucla.edu
7 http://dev1.cancer.gov/programsandresources/InformationSystems/LIDC/page8
8 s-armato@uchicago.edu
9 http://dev1.cancer.gov/programsandresources/InformationSystems/LIDC/page9
10 Geoffrey-McLennan@Uiowa.edu
11 http://dev1.cancer.gov/programsandresources/InformationSystems/LIDC/page10
12 cmeyer@umich.edu
13 http://dev1.cancer.gov/reportsandpublications/ReportsandPresentations/LungImaging
14 http://dev1.cancer.gov/reportsandpublications/publications/lungimaging