Using CTD² Data
The Cancer Target Discovery and Development (CTD²) Network is a “community resource project,” meaning members of the Network are required to release data to the broader research community. All data generated by this initiative are released in agreement with the data release policy developed by its members in concordance with National Institutes of Health data release policy. The release of CTD² data to the scientific community is intended to maximize the translational impact of these findings. In addition to using the raw data for investigational purposes, researchers outside the Network are encouraged to use CTD² datasets to develop novel methods and tools. Data users must acknowledge the CTD² Network.
Access Data
Data generated by CTD² Network members can be found on the CTD² Data Portal. Users must use data with discretion and acknowledge the CTD² Network.
- CTD² Data Portal
- Browse and search through summarized CTD² research findings with published evidence in an interactive environment.
Getting Help with CTD² Data
- What is the CTD² Data Portal?
- What Data are Available in the CTD² Data Portal?
- How do I Navigate the CTD² Data Portal?
- How Are Data Files Formatted?
- How Can I Analyze the Data?
- How Do I Acknowledge CTD² Data?
What is the CTD² Data Portal?
The CTD² Data Portal is an open-access data portal that serves as the single access point for downloading CTD² data. It is managed by NCI’s Data Coordinating Center (DCC). Along with each project dataset in the Data Portal, users will find links to a summary of the corresponding Center’s overall research goals, a description of technologies used to generate the data, and project contact information.
What data are available in the CTD² Data Portal?
The Network employs a variety of high-throughput and bioinformatics/computational methods to validate cancer targets identified in large-scale genomics data. While each Network Center has its own set of specialties, open collaboration across the groups is designed to maximize translational impact. For example, several Network Centers specialize in identifying small molecules that modulate validated cancer targets (e.g., for use as probes or therapeutics), while other groups specialize in testing these small molecules in animal models. Raw datasets from Network experiments are freely available to download and use at a researcher’s discretion.
As the Network continues to innovate, its experimental approaches evolve, as do the types of data made available through the Data Portal. Below are examples of approaches that Network members have applied in their research. This list is not comprehensive.
- small molecule screening
- protein–protein interaction identification
- RNA interference (RNAi) and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 screening
- genome-wide loss-of-function and gain-of-function screening
- targeted candidate gene validation
- judiciously applied mouse-based screening
How do I navigate the CTD² Data Portal?
Each project entry in the Data Portal includes the following fields:
- project title: displays project description
- institute: links to a broad description of each Center’s goals and aims
- experimental approaches: links to a summary of the experimental approach or directs users to a publication with corresponding methodology
- data files: direct users to a page where they can download the data
- contact: opens an email addressed to a project representative, so users can send specific inquiries
How are data files formatted?
In order for all data to be usable and uniform, the CTD² Network follows common data format guidelines defined by Network members.
- Data files are in the .GCT file format or a neutral format (e.g., CSV, tab-delimited)
- Metadata is documented either in file headers (e.g., GEO SOFT format) or in separate documentation (e.g., README files)
- When a submission includes many data types, the files are deposited as a compressed archive (e.g., .zip, .tar, .tgz) so that the whole package can be downloaded at once
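As an illustration of the formats listed above, the following is a minimal Python sketch of reading a .GCT file using only the standard library. It assumes the commonly documented GCT version 1.2 layout (a "#1.2" version line, a line giving the row and column counts, a header line beginning with Name and Description, then one tab-delimited row per feature). Individual CTD² files may differ, so check the accompanying README first. The function name read_gct and the sample data are hypothetical and are not part of any CTD² tool.

```python
import csv
import io


def read_gct(handle):
    """Parse a GCT v1.2 stream (assumed layout) into names, descriptions, samples, values."""
    # QUOTE_NONE keeps quote characters in descriptions from confusing the parser.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)

    version = next(reader)[0].strip()
    if version != "#1.2":
        raise ValueError(f"unsupported GCT version line: {version!r}")

    # Second line: number of data rows, then number of sample columns.
    n_rows, n_cols = (int(field) for field in next(reader)[:2])

    # Third line: "Name", "Description", then one label per sample.
    header = next(reader)
    sample_names = header[2:2 + n_cols]

    row_names, descriptions, matrix = [], [], []
    for record in reader:
        if not record:
            continue  # skip blank lines
        row_names.append(record[0])
        descriptions.append(record[1])
        # Empty cells are treated as missing values (NaN).
        matrix.append(
            [float(v) if v.strip() else float("nan") for v in record[2:2 + n_cols]]
        )

    if len(matrix) != n_rows:
        raise ValueError(f"expected {n_rows} rows, found {len(matrix)}")
    return row_names, descriptions, sample_names, matrix


# Hypothetical in-memory example; a real file would be opened with open(path, newline="").
EXAMPLE_GCT = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tsample_A\tsample_B\tsample_C\n"
    "GENE1\tfirst hypothetical gene\t1.5\t2.0\t0.25\n"
    "GENE2\tsecond hypothetical gene\t-0.5\t\t3.0\n"
)

rows, descs, samples, values = read_gct(io.StringIO(EXAMPLE_GCT))
print(samples)
print(rows)
```

Files in a neutral format such as CSV or tab-delimited text can be read with the same csv module by changing the delimiter. Compressed archives can typically be unpacked first with the standard library's shutil.unpack_archive.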
How can I analyze the data?
CTD² generates massive datasets that cannot be analyzed manually and may be of limited use to researchers with little bioinformatics support. Automated analytical tools allow deeper mining of the data. While CTD² does not endorse any specific data-mining tool, Network members have curated a list of tools they found useful for analyzing and visualizing the datasets. Visit the CTD² Analytical Tools page to learn more.
How do I acknowledge CTD² Data?
The CTD² Network requests that researchers who use CTD² data acknowledge it as follows:
“The results published here are in whole or part based upon data generated by Cancer Target Discovery and Development (CTD²) Network (https://ccg.cancer.gov/ccg/research/functional-genomics/ctd2/data-portal) established by the National Cancer Institute’s Center for Cancer Genomics.”
For a more detailed explanation of the publication guidelines, visit the CTD² Publication Guidelines page.