Finding genes associated with lung cancer in microarray data
Microarrays allow the simultaneous measurement of thousands of gene expression values in tissue samples. They are used in cancer research and could serve as a diagnostic tool. Because each microarray is currently expensive, microarray data sets tend to have many more attributes (genes) than examples (tissue samples). We experiment with techniques for extracting a small set of genes that can be further investigated by cancer researchers or used to develop a predictive test for classifying tissue samples. One technique combines Approximate Distance Clustering with Nearest Shrunken Mean; the other uses the more classical techniques of variance ratios, K-medians, and hierarchical clustering.
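
To make the variance-ratio step concrete, the following is a minimal sketch of ranking genes by a between-class to within-class variance ratio. It assumes the ratio is the usual F-like statistic; the exact definition used in this work is not given here, so that choice is an assumption. The function names (`variance_ratio`, `top_genes`), the gene names, and the toy expression values are all hypothetical and serve only as illustration.

```python
from statistics import mean


def variance_ratio(values, labels):
    """Between-class over within-class variance for one gene.

    Assumption: an F-like statistic; the source does not define the ratio.
    """
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes")
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in classes:
        group = [v for v, lab in zip(values, labels) if lab == c]
        m = mean(group)
        between += len(group) * (m - overall) ** 2
        within += sum((v - m) ** 2 for v in group)
    if within == 0.0:
        # No spread inside any class: perfectly separated or constant gene.
        return float("inf") if between > 0 else 0.0
    df_between = len(classes) - 1
    df_within = len(values) - len(classes)
    return (between / df_between) / (within / df_within)


def top_genes(expression, labels, k):
    """Return the k gene names with the largest variance ratio.

    expression maps a gene name to its expression values, one per sample,
    in the same order as labels.
    """
    scores = {g: variance_ratio(v, labels) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three genes measured on six tissue samples.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
expression = {
    "gene_a": [5.1, 4.9, 5.3, 1.0, 1.2, 0.9],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [3.0, 1.0, 2.0, 2.5, 1.5, 2.0],
}
print(top_genes(expression, labels, 1))
```

A per-gene score of this kind suits data with far more genes than samples, since each gene is scored independently and no model has to be fitted across all genes at once. The selected genes could then be passed on to a clustering step.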