Biomarkers to Distinguish Aggressive Cancers from Non-aggressive or Non-progressing Cancer

Abbreviated Name

Christopher Li Supplement 2012

Lead Investigator

Li, Christopher — Fred Hutchinson Cancer Center

Coordinating Investigator

Li, Christopher — Fred Hutchinson Cancer Center

Involved Investigators

Li, Christopher — Fred Hutchinson Cancer Center

Abstract

Distinguishing aggressive cancers from non-aggressive or non-progressing cancers is an issue of both clinical and public health importance, particularly for cancers with an available screening test. With respect to breast cancer, mammographic screening has been shown in randomized trials to reduce breast cancer mortality, but given the limitations of its sensitivity and specificity, some breast cancers are missed by screening. These so-called interval-detected breast cancers, diagnosed between regular screenings, are known to have a more aggressive clinical profile. In addition, of the cancers detected by mammography, some are indolent while others are more likely to recur despite treatment. The pilot study proposed herein is highly responsive to the EDRN supplement titled “Biomarkers to Distinguish Aggressive Cancers from Nonaggressive or Non-progressing Cancers” in that it addresses both of the research objectives related to these issues outlined in the notice for this supplement:

Aim 1: To identify biomarkers in tumor tissue related to risk of interval-detected vs. mammography screen-detected breast cancer, focusing on early-stage invasive disease. We will compare gene expression profiles of 50 screen-detected cancers to those of 50 interval-detected cancers using the whole-genome cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay. Through this approach we will advance our understanding of the molecular characteristics of interval-detected vs. screen-detected breast cancers and discover novel biomarkers that distinguish between them.

Aim 2: To identify biomarkers in tumor tissue related to risk of cancer recurrence among patients with screen-detected early-stage invasive breast cancer. Using the DASL assay, we will compare gene expression profiles from screen-detected early-stage breast cancers that either recurred within five years or did not recur within five years. These two groups of patients will be matched on multiple factors, including tumor stage and treatments received. Our goal with this comparison is to identify novel biomarkers that discriminate between tumors that recur and are more aggressive and those that are less aggressive and do not recur.

This project will evaluate well-characterized tumor tissue specimens using a robust high-dimensional laboratory approach and generate preliminary data that will motivate a larger-scale study of high translational relevance.

Aims

Aim 1: To identify biomarkers in tumor tissue related to risk of interval-detected vs. mammography screen-detected breast cancer, focusing on early-stage invasive disease. We will compare gene expression profiles of 50 screen-detected cancers to those of 50 interval-detected cancers using the whole-genome cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay. Through this approach we will advance our understanding of the molecular characteristics of interval-detected vs. screen-detected breast cancers and discover novel biomarkers that distinguish between them.

Aim 2: To identify biomarkers in tumor tissue related to risk of cancer recurrence among patients with screen-detected early-stage invasive breast cancer. Using the DASL assay, we will compare gene expression profiles from screen-detected early-stage breast cancers that either recurred within five years or did not recur within five years. These two groups of patients will be matched on multiple factors, including tumor stage and treatments received. Our goal with this comparison is to identify novel biomarkers that discriminate between tumors that recur and are more aggressive and those that are less aggressive and do not recur.

Analytic Method

Gene expression data will be pre-processed, normalized, and cleaned as described in our protocol (Appendix 1). We will perform two levels of analysis, gene-set level and gene level, to identify gene sets and genes associated with interval-detected vs. screen-detected disease and with recurrent vs. non-recurrent disease. In general, we will account for multiple testing by controlling the false discovery rate (FDR) [23]. We will use a 5% FDR when assessing statistical significance across all samples, and a 1% FDR (or lower) when performing subgroup analyses.

For our gene-level analysis, we will fit a linear regression model for each gene to identify genes showing differential expression in our comparisons of interest. Matching variables will be included as covariates in these models.

For gene set-level analyses, we will first rank genes from high to low based on their association in each comparison. Then, for each gene set, we will calculate an enrichment score that reflects how strongly the gene set is enriched for genes that differentiate between our comparison groups (http://www.broad.mit.edu/GSEA) [24,25]. The statistical significance of each enrichment score will be evaluated against two null distributions, formed by 1) permuting exposure status within each matched set and 2) permuting genes. Using both types of null distributions gives us gene sets associated with a given comparison as well as those particularly enriched with associated genes.

Performing the analysis in two tiers will help us identify not only the genes that individually are most likely to differentiate our comparison groups, but also those that have only a moderate effect individually yet, collectively as a gene set, may strongly discriminate between our comparison groups. This will enhance the power to detect the associated genes and gene sets.
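The two building blocks named above, FDR control and a permutation-based enrichment score, can be sketched in a few lines. This is an illustrative sketch only, under stated assumptions: the protocol does not name its FDR procedure, so the Benjamini-Hochberg step-up rule is assumed here; the enrichment score is the simple unweighted running-sum statistic rather than the weighted score used by the GSEA software; and only the gene-permutation null is shown (permuting exposure status within matched sets would require the expression data itself). All function names are hypothetical.

```python
import random

def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if rejected at the given FDR.

    Assumes the Benjamini-Hochberg step-up rule (not named in the protocol).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: step up on a set member, down otherwise.

    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for hit in hits:
        running += 1.0 / n_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def gene_permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Compare the observed score to scores of random gene sets of equal size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        if abs(enrichment_score(ranked_genes, random_set)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)

if __name__ == "__main__":
    # Toy p-values, tested at the 5% and 1% FDR levels mentioned in the text.
    toy_p = [0.5, 0.0005, 0.039, 0.008]
    print(benjamini_hochberg(toy_p, fdr=0.05))
    print(benjamini_hochberg(toy_p, fdr=0.01))
    # Toy ranked list in which the gene set sits entirely at the top.
    genes = [f"g{i}" for i in range(20)]
    print(gene_permutation_p_value(genes, {"g0", "g1", "g2", "g3"}))
```

In this sketch the per-gene p-values would come from the gene-level regression models, and the ranked gene list from the strength of each gene's association; both are replaced by toy inputs here.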
We will construct a panel of genes that discriminate between our comparison groups, based on significantly associated genes or gene sets, using regularization techniques, which have been shown to improve prediction and interpretation considerably compared to ordinary regression models without regularization. Specifically, we will use the elastic net [26] regularization method because it has a feature well suited to our data: it encourages genes in the same pathway to be selected as a group in the model [27]. We will use 10-fold cross-validation to determine the appropriate amount of regularization, and hence the panel of biomarkers associated with the outcome.

While microarray technology allows simultaneous evaluation of the expression levels of thousands of genes, only a fraction are expected to be associated with the exposures of interest. With respect to statistical power, the minimum detectable effect size (MDES) is determined by the number (m) of truly altered genes out of the total p (here 25,000) genes studied, in additio
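The elastic net panel selection with 10-fold cross-validation described above can be illustrated with a small self-contained sketch. It rests on several assumptions that are not taken from the protocol: a squared-error loss (a logistic loss may be more natural for a binary outcome such as recurrence), a coordinate-descent solver, a fixed L1/L2 mixing weight `alpha`, and a small hypothetical grid of penalty strengths. The function names are hypothetical as well.

```python
import random

def soft_threshold(z, g):
    """Shrink z toward zero by g; values within [-g, g] become exactly zero."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def fit_elastic_net(X, y, lam, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n) * sum (y - b0 - X.beta)^2 + lam * (alpha*|beta|_1 + (1-alpha)/2*|beta|_2^2).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    resid = [y[i] - b0 for i in range(n)]
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Re-centre the residuals to update the unpenalized intercept.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return b0, beta

def cv_error(X, y, lam, alpha=0.5, k=10, seed=0):
    """Mean squared prediction error over k held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    sse = 0.0
    for f in range(k):
        held = idx[f::k]
        held_set = set(held)
        train = [i for i in idx if i not in held_set]
        b0, beta = fit_elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for i in held:
            pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
            sse += (y[i] - pred) ** 2
    return sse / len(X)

def select_panel(X, y, lambdas, alpha=0.5, k=10):
    """Pick the penalty with the lowest CV error; the panel is the nonzero features."""
    best = min(lambdas, key=lambda lam: cv_error(X, y, lam, alpha, k))
    _, beta = fit_elastic_net(X, y, best, alpha)
    return best, [j for j, b in enumerate(beta) if b != 0.0]

if __name__ == "__main__":
    # Synthetic data: only features 0 and 1 carry signal.
    rng = random.Random(1)
    X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(40)]
    y = [2.0 * row[0] - 1.5 * row[1] + 0.1 * rng.gauss(0, 1) for row in X]
    print(select_panel(X, y, lambdas=[0.01, 0.1, 0.5]))
```

The ridge part of the penalty is what lets correlated features, such as genes in a shared pathway, enter the model together instead of one being picked arbitrarily, while the L1 part sets the coefficients of uninformative features to exactly zero.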

Publications

No publications available at this time for this protocol.

Biomarkers

No biomarkers available at this time for this protocol.

Data Collections

No data collections available at this time for this protocol.

Team Project

Start Date

Jul 1 2012

Estimated Finish Date

Jun 30 2014

Protocol ID

362

Protocol Type

Collaboration

Fields of Research

Genomics

Collaborative Group

Breast and Gynecologic Cancers Research Group

Cancer Types

Malignant neoplasm of breast