Last Updated: 10/28/16

New NCI Initiatives in Computer Aided Diagnosis

SPIE, Feb 2000 Proceedings

Laurence P. Clarke, Barbara Y. Croft, and Edward Staab

Cancer Imaging Program, NCI, Bethesda, MD 20892


The National Cancer Institute (NCI) is interested in supporting the development of an image database for lung cancer screening using spiral X-ray CT. A cooperative agreement is envisioned (NCI U01 support mechanism) that will involve applications from investigators who are interested in joining a consortium of institutions to construct such a database as a public resource. The intent is to develop standards for generating the database resource and to allow this database to be used for evaluating computer-aided diagnostic (CAD) software methods. Initial interest is focused on spiral CT of the lung because of the recent interest in using this imaging modality for lung cancer screening for patients at high risk, where early intervention may significantly reduce cancer mortality rates. The use of CAD methods is rapidly emerging for this large-scale cancer screening application, as these methods have the potential to improve the efficiency of screening. Lung imaging is a good physical model in that it involves the use of 3D CAD methods that require critical software optimization for both detection and classification (benign versus malignant). In addition, the detection of change in the CT images over time, or changes in lung nodule size, has the potential to provide either improved early cancer detection or improved classification.


Biomedical imaging is increasingly electronic in terms of image acquisition and display. Image detectors have become so sensitive that the amount of information acquired is far greater than can be displayed at any one time using contemporary display methods. Research on image processing methods is essential to fully exploit the information that has been acquired. Furthermore, the ability to extract quantitative information from images is increasingly important and requires image processing. Investigators working on image processing rarely have access to, or the resources necessary to create, large databases of images on which to develop and test their work. In addition, the comparison and evaluation of image processing techniques against each other require common data sets and standardized methods for evaluation. The need for medical image databases as research resources for medical image processing has been a frequently identified priority at various NIH workshops [National Cancer Institute (NCI), the June 1999 BECON Symposium, the National Library of Medicine (NLM) and the Human Brain Project]. Also, the largest professional radiology organization, the Radiological Society of North America (RSNA), has identified image databases as a critical need for research and education and will devote considerable resources (approximately $500K to $1M per year) to support infrastructure for such electronic image databases. Similarly, NSF has co-sponsored several image databases with NIH to encourage the participation of NSF investigators and research synergy between the federal agencies. Image databases are analogous in many ways to the tissue banks that NCI has funded for many years. Previous attempts to create groups of related images have not been successful, primarily because of a lack of standards or consensus on critical issues.

An important opportunity exists for NCI to respond to these database needs. Potential archives of images are being created as by-products of several NCI-funded therapeutic or diagnostic clinical trials. No provisions have yet been made to develop the standards or infrastructure necessary to make these collections of images available to researchers who could benefit from using them. This initiative would provide the means to take advantage of this opportunity.

An important example of software to be evaluated using these image databases is referred to as computer-aided diagnosis (CAD). Computer-aided diagnosis is a general term used for a variety of artificial intelligence techniques applied to medical images. CAD methods are being rapidly developed at several academic and industry sites, particularly for large-scale breast, lung, and colon cancer screening studies. X-ray imaging for breast, lung and colon cancer screening provides good physical and clinical models for the development of CAD methods, related image database resources, and the development of common metrics and methods for evaluation. For large-scale screening applications, CAD methods are important for: (a) improving the sensitivity of cancer detection, (b) reducing observer variation in image interpretation, (c) increasing the efficiency of reading large image arrays, (d) improving efficiency of screening by identifying suspect lesions or identifying normal images, and (e) facilitating remote reading by experts (e.g., telemammography).

Image processing tools are also being developed for temporal analysis of serial images, with the aim of detecting early subtle changes that might not be obvious to the reading physician. Temporal analysis requires additional consensus on the development of reference standards (electronic ground truth), software modules for registration of serial images and related image segmentation. In addition, CAD techniques can improve the specificity of cancer detection by assigning a quantitative estimate of the probability that a detected lesion is benign or malignant. Another promising application of CAD is predicting which cases are most suitable for a particular treatment option.
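As an illustration of the kind of quantity a temporal analysis tool might report, the following minimal Python sketch computes the volume of a segmented nodule at two time points and the change between them. It assumes that the serial scans have already been registered and that a binary nodule mask is available for each scan. The function names, the nested-list mask layout and the use of an exponential-growth doubling time are assumptions made for this example; the initiative does not prescribe any particular measure.

```python
from math import log


def nodule_volume_mm3(mask, spacing_mm):
    """Volume of a segmented nodule in cubic millimetres.

    mask: nested lists indexed [slice][row][column], 1 inside the nodule, 0 outside
          (hypothetical layout chosen for this sketch).
    spacing_mm: (slice spacing, row spacing, column spacing) in millimetres.
    """
    dz, dy, dx = spacing_mm
    voxel_count = sum(value for plane in mask for row in plane for value in row)
    return voxel_count * dz * dy * dx


def relative_change(volume_baseline, volume_followup):
    """Fractional change in volume between baseline and follow-up."""
    return (volume_followup - volume_baseline) / volume_baseline


def doubling_time_days(volume_baseline, volume_followup, interval_days):
    """Volume doubling time, ASSUMING exponential growth between the two scans.

    Returns None when the nodule has not grown, since a doubling time is
    undefined in that case.
    """
    if volume_followup <= volume_baseline:
        return None
    return interval_days * log(2) / log(volume_followup / volume_baseline)


if __name__ == "__main__":
    # Toy 2 x 2 x 2 mask with every voxel inside the nodule.
    mask = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
    print(nodule_volume_mm3(mask, (1.0, 0.5, 0.5)))
    print(relative_change(100.0, 150.0))
    print(doubling_time_days(100.0, 200.0, 90))
```

Any such measure depends on the segmentation and registration steps that precede it, which is one reason consensus on reference standards for serial images is needed before results from different tools can be compared.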

The proposed database should contain several image data sets. For example, follow-up high-resolution X-ray CT images should be included in addition to the screening images. Multimodality imaging (X-ray CT, MRI or PET/SPECT) may also be important for improved diagnostic image interpretation and may complement the evaluation of CAD methods. Implementation of multimodality image registration software for lung imaging is therefore also important. X-ray CT image reconstruction methods may also affect the performance of CAD methods. The collection of raw image data may therefore need to be considered, along with an examination of the specific impact of image processing methods on the CT images and on CAD performance. This initiative should therefore address serial and multimodality image registration as a secondary goal, using limited image data subsets. Similarly, the impact of image reconstruction methods should be a secondary goal, with the generation of image subsets, and subject to the ability to access raw CT data.

An example of the establishment of a successful image database as an international resource is the National Library of Medicine (NLM) Visible Human Project (VHP), initiated in 1989. This project was partly supported by NSF. It has proven to be far more important than imagined at its inception. It created a resource of CT and MRI images of male and female human cadavers, along with photographs of the corresponding cadaver sections. This resource is continually used around the world for research and educational purposes. However, only normal subjects were used for the VHP. The database does not contain pathology and therefore poses very different problems in terms of anatomical reference standards (ground truth) and software performance evaluation. Data analysis has been restricted to specific software modules, namely image registration, segmentation, and display of normal anatomy. Furthermore, results have primarily been evaluated using visual assessment, without the establishment of standard methods or common metrics for evaluation of software performance.

An example of an image database resource in development is the NIH collaborative effort called the Human Brain Project (HBP), begun in 1993, which seeks to create common tools to facilitate research on neuroinformatics. This project was also partly supported by NSF. Some of these tools are image processing algorithms for neuroimaging such as brain CT, PET and MRI. However, investigators funded individually by the HBP have recently recognized the same barriers and challenges as identified by the above-described workshops, and are now working on the establishment of standards for database generation and of common metrics and statistical methods for evaluating the performance of software methods. The HBP investigators are doing this through inter-institutional and international collaboration facilitated by NIH. In this NCI initiative, we propose a mechanism to promote the establishment of such standards from the outset. The solutions that the HBP develops for brain imaging will not transfer directly to other organ systems, but exchange of information would be encouraged between the investigators supported by this initiative and the HBP.


2.1 Specific goals

  • Provide an environment to develop consensus criteria for: (a) preparation and submission of cases that are representative of clinical practice, (b) spatial determination of reference standards ("ground truth") for lung nodule(s) electronically in 3D, including pathology review, (c) common metrics and software for statistical validation of the performance of CAD methods, (d) common metrics for evaluation of temporal analysis software tools and (e) common metrics for evaluation of serial and multimodality image registration.
  • Provide a common research resource to the medical imaging community to: (a) permit early identification of promising software methods from the diverse pool of emerging tools, (b) stimulate the development of more advanced 3D CAD methods including temporal analysis and related image registration methods, (c) accelerate research and development timelines for CAD, and (d) reduce the risk for diagnostic software development by academia and/or industry for lung cancer screening.
  • Allow Internet access to the database by the broad imaging research community. This effort would stimulate inter-disciplinary research collaboration, including researchers in academia, government and industry. Internet access may be accomplished by using the resources of the NCI CIT Services, which has agreed to develop the necessary infrastructure.
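To make the idea of common metrics for CAD performance concrete, the following minimal Python sketch scores CAD marks against a spatial reference standard and reports sensitivity and false positives per scan. It is an illustration under stated assumptions, not a metric adopted by the consortium: the reference standard and the CAD output are both assumed to be lists of 3D centroid coordinates in millimetres, a mark counts as a hit when it lies within a fixed distance tolerance of a reference nodule, and matching is greedy and one-to-one. The function names and the 5 mm default tolerance are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def score_scan(reference, detections, tolerance_mm):
    """Match CAD marks to reference nodules for one scan.

    reference:  list of (x, y, z) nodule centroids from the reference standard.
    detections: list of (x, y, z) CAD marks.
    Each reference nodule is matched to its nearest unused mark within
    tolerance_mm (greedy, one-to-one; an assumption of this sketch).
    Returns (true positives, false negatives, false positives).
    """
    unused = list(detections)
    true_positives = 0
    for nodule in reference:
        candidates = [mark for mark in unused if dist(nodule, mark) <= tolerance_mm]
        if candidates:
            nearest = min(candidates, key=lambda mark: dist(nodule, mark))
            unused.remove(nearest)
            true_positives += 1
    false_negatives = len(reference) - true_positives
    return true_positives, false_negatives, len(unused)


def summarize(scans, tolerance_mm=5.0):
    """Pool results over scans, given as (reference, detections) pairs."""
    tp = fn = fp = 0
    for reference, detections in scans:
        scan_tp, scan_fn, scan_fp = score_scan(reference, detections, tolerance_mm)
        tp += scan_tp
        fn += scan_fn
        fp += scan_fp
    nodule_count = tp + fn
    return {
        "sensitivity": tp / nodule_count if nodule_count else None,
        "false_positives_per_scan": fp / len(scans) if scans else None,
    }


if __name__ == "__main__":
    scans = [
        ([(10, 10, 10), (50, 50, 50)], [(11, 10, 10), (80, 80, 80)]),
        ([], [(1, 1, 1)]),  # a scan with no reference nodules and one CAD mark
    ]
    print(summarize(scans))
```

The choice of hit criterion and tolerance changes the reported numbers, which is why the goals above call for consensus on both the reference standard and the scoring method before different CAD systems are compared on the same database.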

2.2 NCI U01 Steering Committee.

The Steering Committee will be the main governing board of the proposed project. This committee will be composed of the principal investigator(s) of each funded group, a second member of each funded group, the NCI Project Scientist and an NCI Program staff representative. The principal investigators from each Group and the NCI Project Scientist will have one vote. The chairperson, who will be someone other than an NCI staff member, will be selected by the Steering Committee.

The Steering Committee will have primary responsibility for implementation of the overall goals of the Consortium, reviewing the scope of the original applications submitted and organizing research tasks for each participating site where necessary. The Steering Committee’s responsibilities will include efforts to reach consensus on the spiral CT imaging protocols in collaboration with the clinical trial groups, to agree on criteria for populating the image databases, to monitor the accrual of cases, to monitor image quality control, to review ground truth and related pathological confirmation to ensure commonality of methods at each site, to expand the databases if improved CT imaging sensors such as volume CT become available, and to reach a consensus on the metrics and statistical methods for software evaluation.

The Steering Committee will be responsible for the image format and for transfer of the images to NIH for general Internet access. The Committee will continue to monitor the implementation of the data release plan for the life of the awards. It will also be responsible for collaboration with other NCI-supported clinical trials where applicable (e.g., ACRIN), to establish a consensus on the CT imaging protocols and on access to image databases, and to ensure appropriate selection of imaging protocols. It will also be encouraged to interact with professional societies such as RSNA and ACR, with NSF, and with other NIH cooperative groups generating image databases (Human Brain Project, Visible Human Project) to ensure acceptance of the standards proposed for evaluation of the image databases. The Steering Committee will facilitate the conduct and monitoring of studies and the reporting of study results.

2.3 Conclusions.

The intent of this proposed initiative is to support a consortium of institutions to develop the necessary consensus and standards for a lung CT image database resource, and to construct a database of spiral CT lung images. The optimization of CAD methods for emerging imaging methods, including recent advances in functional and molecular imaging, will require many different databases of images from different organs and from different modalities. Each application poses different problems for optimization of software. However, the main obstacle has been the lack of a process to develop consensus and standards for assembling and evaluating the databases. The investigators funded under this proposed initiative should have the unique opportunity to create a set of standards and metrics for database evaluation, and a database that serves as a testbed and showcase to promote the use of CAD methods. The database(s) will have wide utility beyond the scope of this proposed project, for example, as a teaching and training resource for both human readers and CAD. The successful completion of this project should lead to the creation of more research databases and to a more scientific evaluation and comparison of digital image processing methods for emerging imaging modalities.

2.4 References.

  1. URL for the cited workshops: