By Deborah Stungis
Expression array data from the functional genomics project at the Los Alamos National Laboratory is being analysed with data-mining techniques in search of relevant expression correlations. The data takes the form of an expression array, that is, a vector representation of trait manifestation caused by particular genes under varied experimental conditions. We are asked to find maximal-size subsets of genes whose expression is highly correlated.
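As a rough illustration of this task, the sketch below treats each gene as a vector of expression levels across experimental conditions and greedily grows a subset in which every pair of genes has a Pearson correlation at or above a threshold. The gene names, the values, the threshold of 0.9, and the greedy heuristic itself are assumptions made for illustration only; they are not the project's data or method, and a greedy pass does not guarantee a maximal-size subset.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return cov / (sx * sy)


def greedy_correlated_subset(profiles, threshold=0.9):
    """Heuristic: seed with the most correlated pair, then add every gene
    that correlates at or above the threshold with all genes already chosen."""
    genes = sorted(profiles)
    if len(genes) < 2:
        return genes

    def corr(g, h):
        return pearson(profiles[g], profiles[h])

    seed = max(combinations(genes, 2), key=lambda pair: corr(*pair))
    if corr(*seed) < threshold:
        return []
    subset = list(seed)
    for g in genes:
        if g not in subset and all(corr(g, h) >= threshold for h in subset):
            subset.append(g)
    return sorted(subset)


# Hypothetical expression profiles: one value per experimental condition.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with geneA
    "geneD": [1.2, 2.1, 2.9, 4.1],
}

print(greedy_correlated_subset(profiles, threshold=0.9))
```

An exact search for the largest such subset amounts to finding a maximum clique in the graph of above-threshold gene pairs, which is expensive on large arrays; this is one reason a heuristic is used in the sketch.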
We are interested in techniques that will help both to identify genes that interact in significant ways and to ‘filter out’ those that do not. Analysis methods proposed so far include Association Rule Mining and Fuzzy Clustering. Exploration of temporal data (i.e. tissue sampled over time) has also been discussed; in that case, general systems problem solving (GSPS) methods would be applied. A preliminary description of how GSPS could be used for this problem is available.
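To make the association-rule idea concrete, the sketch below shows one possible encoding, stated here as an assumption rather than as the project's chosen representation: each experimental condition becomes a "transaction" containing the genes whose expression level meets a cutoff, and a rule such as geneA -> geneB is scored by its support and confidence. The gene names, levels, and the cutoff of 2.0 are hypothetical.

```python
def to_transactions(levels, cutoff):
    """One transaction per condition: the set of genes at or above the cutoff."""
    n_conditions = len(next(iter(levels.values())))
    return [
        {gene for gene, values in levels.items() if values[i] >= cutoff}
        for i in range(n_conditions)
    ]


def support(transactions, itemset):
    """Fraction of transactions that contain every gene in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    s_ante = support(transactions, antecedent)
    if s_ante == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / s_ante


# Hypothetical expression levels: one value per experimental condition.
levels = {
    "geneA": [2.5, 3.0, 0.4, 2.2],
    "geneB": [2.1, 2.8, 0.3, 0.9],
    "geneC": [0.2, 2.4, 2.6, 0.5],
}

transactions = to_transactions(levels, cutoff=2.0)
print(support(transactions, {"geneA", "geneB"}))       # support of the pair
print(confidence(transactions, {"geneA"}, {"geneB"}))  # rule geneA -> geneB
print(confidence(transactions, {"geneB"}, {"geneA"}))  # rule geneB -> geneA
```

Rules with low support or low confidence would be candidates for the ‘filter out’ step; how to choose the cutoff and the two thresholds is left open here.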
The problem domain is still being defined. We are currently working with the functional genomics project to clarify the source of the data and how to represent it appropriately in the frameworks mentioned above.
Chen, J.J.W. et al. [1998]. "Profiling expression patterns and isolating differentially expressed genes by cDNA microarray system with colorimetry detection". Genomics, 51, 313-324.
Mannila, H., H. Toivonen, and A. I. Verkamo [1994]. "Efficient algorithms for discovering association rules". In Knowledge Discovery in Databases (KDD'94), pp. 181-192, Seattle, Washington, July 1994. AAAI Press.
Pickert, L., I. Reuter, F. Klawonn, and E. Wingender [1998]. "Transcription regulatory region analysis using signal detection and fuzzy clustering". Bioinformatics, 14(3), 244-251.
Zaki, M. J. and M. Ogihara [1998]. "Theoretical Foundations of Association Rules". 3rd SIGMOD'98 Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), pp. 7:1-7:8, Seattle, WA, June 1998.