FASMIC: An Integrated Bioinformatics Resource for Functional Annotation of Somatic Mutations in Cancer
By Jun Li, Ph.D. and Han Liang, Ph.D.
Next-generation sequencing studies have identified thousands of somatic mutations in tumor cells. The availability of large-scale cancer molecular alteration data makes it possible to identify “driver” somatic mutations that play a critical role in the initiation and development of human cancer. Although various bioinformatic algorithms have been developed to computationally predict the functional effect of somatic mutations in protein-coding regions, the accuracy of these methods is still limited, and not only because somatic changes in non-coding regions are not evaluated. Some recent studies have systematically characterized the function of somatic mutations experimentally using pools of cDNAs. However, interpretation of those results is confounded by competition between cDNAs carrying different mutations in the pooled assays, which is a critical concern. To overcome this issue, we developed a sensitive, efficient, and systematic approach to annotate the functional effect of >1,000 somatic genomic aberrations [1]. To help the broader research community make better use of our results, we have built a user-friendly, open-access data portal, FASMIC (functional annotation of somatic mutations in cancer), which is publicly available at http://ibl.mdanderson.org/fasmic/.
Currently, FASMIC has curated >1,000 mutations, including 923 missense, 74 indel, 27 silent, and 25 nonsense mutations from 95 genes, 21 of which have >10 mutations each. A few clinically actionable cancer genes, such as EGFR, BRAF, PIK3CA, and ERBB2, have ~100 mutations per gene, which provides a good opportunity for researchers to study their allelic series. Overall, we have functionally annotated 1,049 mutations into different classes, including activating, inactivating, informative, non-inhibitory, inhibitory, and neutral mutations. The functional annotations were derived from two cell line models, Ba/F3 and MCF10A, based on the readout of growth factor-independent cell-viability assays. In addition to the functional annotation, we also quantified protein expression using Reverse Phase Protein Arrays (RPPA) in >250 MCF10A cell lines that stably expressed different mutations. This protein profiling dataset is a valuable resource for elucidating the mechanisms underlying the phenotypic effects of the selected mutations.
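As a quick consistency check on the counts above, the four variant classes can be tallied to confirm that they add up to the 1,049 annotated mutations. The short Python sketch below uses only the numbers quoted in this article; the variable names are illustrative and are not part of FASMIC.

```python
# Variant-class counts as quoted in this article (names are illustrative).
variant_counts = {
    "missense": 923,
    "indel": 74,
    "silent": 27,
    "nonsense": 25,
}

# Total number of curated mutations across the four classes.
total = sum(variant_counts.values())

# Share of each variant class among all curated mutations, in percent.
shares = {name: round(100 * n / total, 1) for name, n in variant_counts.items()}

print(total)
print(shares)
```

The tally matches the 1,049 annotated mutations, with missense mutations making up roughly 88% of the set.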
The best way to access our mutation functional annotation data is through FASMIC, a web-based platform that lets researchers query and visualize the data interactively. The main interface on the home page is a simple input box in which users can search for mutations of interest by gene symbol. The matching mutations are displayed in a table. Basic information for each mutation is shown in separate columns, including genomic position, amino acid change, functional annotation, and, more importantly, the availability status of the associated mutation data (green indicates that the information is available). After a mutation is selected using the green icons in the “Details” column, the associated data are presented below the table in six separate tabs:
(i) “MS/Summary” shows detailed profile information for the selected mutation, including genome build, genomic position, DNA change, variant type, variant classification, and functional annotations in different cell models.
(ii) “3D/3D View” is a 3D animation of the corresponding protein structure with the mutated residues highlighted in red. Users can rotate the structure and locate the selected mutation, which may help infer its functional impact from its effect on structural stability or its location at a protein-protein interaction site, and in turn help in understanding the function of the mutation at the system level.
(iii) “MF/Mut.Freq.” shows the frequency of the mutation across different cancer types in an interactive bar plot (downloadable using the menu icon). The mutation frequency data were obtained from The Cancer Genome Atlas project.
(iv) “FP/Func.Pred.” provides a table of the predicted functional effect. The functional scores were calculated using 12 popular prediction algorithms (listed in the figure). This module helps researchers assess how well the predictions align with the FASMIC annotation and provides additional evidence to support the annotated functions. Color represents the predicted functional level: red means disease causing, pink means moderate, and white means no effect.
(v) “PE/RPPA” is a sorted scatter plot showing the differential protein expression level of a mutant relative to the wild-type gene across >300 protein markers. This plot shows the potential functional consequence of the selected mutation.
(vi) “PM/Literature” lists the publications relevant to the queried mutation, helping researchers easily find previously published studies of the selected mutation.
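The sorted scatter plot in the “PE/RPPA” tab can be thought of as ranking protein markers by how much a mutant differs from wild type. The Python sketch below illustrates that idea under stated assumptions: it assumes expression values are on a log2 scale, so that a simple difference corresponds to a log2 ratio, and the function name, marker names, and numbers are made-up placeholders rather than FASMIC data. It is not the portal's actual implementation.

```python
from typing import Dict, List, Tuple


def rank_differential_expression(
    mutant: Dict[str, float], wild_type: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (marker, mutant - wild type) pairs sorted from lowest to highest.

    Assumes both inputs hold log2-scale expression values keyed by marker name.
    Markers missing from either input are skipped.
    """
    shared = mutant.keys() & wild_type.keys()
    diffs = [(marker, mutant[marker] - wild_type[marker]) for marker in shared]
    return sorted(diffs, key=lambda pair: pair[1])


# Hypothetical placeholder values, not real RPPA measurements.
mutant_expr = {"marker_A": 1.5, "marker_B": -0.25, "marker_C": 0.5}
wild_type_expr = {"marker_A": 0.5, "marker_B": 0.25, "marker_C": 0.5}

ranked = rank_differential_expression(mutant_expr, wild_type_expr)
for marker, diff in ranked:
    print(f"{marker}\t{diff:+.2f}")
```

Plotting the sorted differences against their rank gives a curve whose two ends highlight the markers most decreased and most increased relative to wild type.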
In summary, FASMIC is a comprehensive database of the functional impact of somatic mutations in cancer. It provides functional annotation along with protein expression, mutation frequency, 3D structure, function prediction, and literature data to help researchers explore each mutation in detail. This web resource will help identify potential driver mutations, discover novel biomarkers, improve prediction algorithms, and develop new drugs. We expect FASMIC to be a valuable resource for advancing precision cancer medicine and developing novel cancer therapies.
Reference
1. Ng PK, et al. Systematic Functional Annotation of Somatic Mutations in Cancer. Cancer Cell. 2018 Mar 12;33(3):450-462.e10. (PMID: 29533785)