Learning Expressive Models of Transcription Regulatory Cis-Elements from Sequence and Expression Data
The expression of genes is known to be a function, in part, of transcription factor binding elements in their promoter regions. The automatic characterization of particular transcriptional modules is an important open problem, which has received a great deal of attention from several groups, resulting in a variety of promoter models and learning algorithms.
I will discuss approaches that we have developed for the task of learning such modules from sequence and expression data. I will show the advantage of using very expressive models for this task: models that characterize not only the binding sites themselves, but also their spatial relationships to each other and to the location of the gene, as well as their logical relationship to the function of the module itself. I will also discuss methods for framing the task as a regression problem.
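
To make the idea of an expressive module model concrete, here is a minimal sketch. It is not the method presented in the talk. Every name in it (Module, find_sites, module_matches, fit_line), the exact-match motif representation, the parameter values, and the toy promoters and expression values are assumptions invented for illustration. The sketch encodes two binding sites, a limit on their distance from each other and from the gene start, and an AND/OR relationship between them, then regresses expression on whether each promoter matches the module.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical two-site module; all fields are illustrative assumptions."""
    motif_a: str
    motif_b: str
    max_gap: int         # max distance between the start positions of the two sites
    max_upstream: int    # max distance from a site's start to the gene start
    logic: str = "and"   # "and": both sites required; "or": either site suffices


def find_sites(promoter, motif, max_upstream):
    """Start positions of exact motif matches close enough to the gene start.

    Assumes the gene start lies immediately after the last base of `promoter`.
    """
    n = len(promoter)
    k = len(motif)
    return [i for i in range(n - k + 1)
            if promoter[i:i + k] == motif and n - i <= max_upstream]


def module_matches(module, promoter):
    """True if the promoter satisfies the module's spatial and logical constraints."""
    a = find_sites(promoter, module.motif_a, module.max_upstream)
    b = find_sites(promoter, module.motif_b, module.max_upstream)
    if module.logic == "or":
        return bool(a or b)
    # "and": both sites present and within max_gap of each other
    return any(abs(i - j) <= module.max_gap for i in a for j in b)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data, invented for illustration only.
promoters = [
    "TTGACGTAAACCGGTT",          # both sites, close together, near the gene
    "GACGTTTTTTTTTTTTTTTTCCGG",  # first site too far upstream
    "AAAATTTTAAAATTTT",          # no sites
    "ACCGGAGACGTA",              # both sites, close together, near the gene
]
expression = [2.0, 0.5, 0.3, 1.8]

module = Module(motif_a="GACG", motif_b="CCGG", max_gap=10, max_upstream=15)
features = [1.0 if module_matches(module, p) else 0.0 for p in promoters]
slope, intercept = fit_line(features, expression)
print(features, slope, intercept)
```

In this framing the fitted slope estimates how much expression changes when a promoter matches the module, so candidate modules could be compared by how well they predict expression rather than by a hard assignment of genes to classes. A realistic model would likely use weight matrices instead of exact strings and a richer regression than a single binary predictor.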