Professor, College of Computing
Sham Navathe

Abstract: DNA microarrays can screen thousands of genes in a single experiment, identifying altered expression levels for hundreds of them. Many of these altered genes fall outside the investigator's field of expertise. For non-experts, interpreting such large quantities of information through traditional literature research (reading biomedical journals and searching related databases one gene at a time) is slow and error-prone. This inefficiency hampers the understanding and discovery of subtle functional relationships among the genes of interest. This talk will present our work on helping interpret the results of microarray experiments. Keywords are identified using a statistical algorithm and extracted from MEDLINE citations that contain specific gene names. The talk will outline our investigations into extracting meaningful keywords associated with genes and clustering those genes by similarity of function, drawing on the entire MEDLINE database. We have developed our own clustering algorithm, BEA_PARTITION, which has been shown to outperform common algorithms such as K-means and hierarchical clustering. Tests have been conducted on various datasets from microarray experiments on neurological and cardiovascular diseases. We showed the potential for discovering new functional information about genes that is not yet represented in public databases. We have further applied the extracted keywords as features to help classify literature related to epidemiology at the CDC. Current research is directed toward enriching our approach with biomedical ontologies. We hope that this research has broader utility and that it will be an asset to the growing research community using experimental DNA microarray data. This work is in collaboration with Ying Liu (UT-Dallas) and Profs. Dingledine (Pharmacology) and Ciliax (Neurology) of Emory University.