Molecular Classification of Cancer by Gene Expression Data
Classification of patient samples is a crucial aspect of cancer diagnosis and treatment. While cancer classification has improved in recent years, there has been no general systematic approach for identifying new cancer classes (class discovery) or for assigning tumors to known classes (class prediction). However, with the advent of new technologies for measuring the expression levels of many thousands of genes in tumor cells, there is hope for the development of new classification methods. This talk describes a simple but robust weighted-voting method for performing class prediction by computational analysis of gene expression data. As a test case, the method is shown to correctly classify bone marrow and blood samples from acute leukemia patients. The prediction method can also be used to validate newly proposed cancer classes, whether discovered by clustering expression data or by more traditional techniques. This application is illustrated by showing how expression analysis (with no prior medical knowledge) could have revealed the key distinctions among leukemias. Furthermore, this approach is not limited to cancer classification, but can be used to predict _any_ distinction evident at the gene expression level. Potential applications include discovering novel classes of disease, predicting patient responses to treatment, and determining drug toxicity.
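
The abstract does not spell out the weighted-voting procedure, so the following is only a minimal sketch of one common formulation, given here as an assumption rather than as the speaker's exact method. Each informative gene gets a signal-to-noise weight (difference of class means divided by the sum of class standard deviations) and casts a vote proportional to how far a new sample's expression lies from the midpoint between the class means. The function names `train` and `predict`, the class labels "A" and "B", the choice to rank genes by absolute weight, and the toy data are all hypothetical.

```python
from statistics import mean, stdev


def train(samples, labels, classes, n_genes):
    """Select the n_genes most class-correlated genes and store their weights.

    samples: list of expression vectors (one list of floats per sample)
    labels:  class label of each sample
    classes: pair (a, b) of the two class labels
    Each class needs at least two samples so a standard deviation exists.
    """
    a, b = classes
    model = []
    for g in range(len(samples[0])):
        xa = [s[g] for s, y in zip(samples, labels) if y == a]
        xb = [s[g] for s, y in zip(samples, labels) if y == b]
        spread = stdev(xa) + stdev(xb)
        if spread == 0:
            continue  # assumption: skip genes with no within-class variation
        weight = (mean(xa) - mean(xb)) / spread  # signal-to-noise weight
        boundary = (mean(xa) + mean(xb)) / 2     # midpoint between class means
        model.append((g, weight, boundary))
    # Assumption: keep the genes with the largest absolute weight.
    model.sort(key=lambda entry: abs(entry[1]), reverse=True)
    return model[:n_genes]


def predict(model, classes, sample):
    """Return (predicted class, prediction strength in [0, 1])."""
    a, b = classes
    votes_a = votes_b = 0.0
    for g, weight, boundary in model:
        vote = weight * (sample[g] - boundary)
        if vote > 0:
            votes_a += vote   # positive votes favor class a
        else:
            votes_b -= vote   # negative votes favor class b
    total = votes_a + votes_b
    strength = abs(votes_a - votes_b) / total if total else 0.0
    return (a if votes_a >= votes_b else b), strength


# Toy data (made up for illustration): 4 samples, 3 genes.
samples = [[5.0, 1.0, 3.0], [6.0, 2.0, 3.1], [1.0, 5.0, 2.9], [2.0, 6.0, 3.0]]
labels = ["A", "A", "B", "B"]
model = train(samples, labels, ("A", "B"), n_genes=2)
print(predict(model, ("A", "B"), [5.5, 1.5, 3.0]))
```

The prediction strength gives a margin of victory for the winning class. A practical classifier would typically decline to make a call when that margin falls below some threshold, which is one reason such a scheme can be robust to noisy genes.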