TCGA by the Numbers

The Cancer Genome Atlas (TCGA) project was led by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It produced a tremendous amount of data about the genomic characteristics of many different types of cancer. Researchers within and outside the program have used the data to improve tumor classification, find clues for customizing cancer treatments, and much more.

TCGA produced over 2.5 petabytes of data. To put this into perspective, one petabyte of data is equal to 212,000 DVDs. TCGA data describes 33 different tumor types, including 10 rare cancers, based on paired tumor and normal tissue sets collected from 11,000 patients and characterized using seven different data types. TCGA results and findings have improved our understanding of the genomic underpinnings of cancer. For example, a TCGA study found the basal-like subtype of breast cancer to be similar to the serous subtype of ovarian cancer.

If you would like to reproduce some or all of this content, see Reuse of NCI Information for guidance about copyright and permissions. In the case of permitted digital reproduction, please credit the National Cancer Institute as the source and link to the original NCI product using the original product's title; e.g., “TCGA by the Numbers was originally published by the National Cancer Institute.”
