Uncovering Protein-Protein Interactions in the Bibliome

Alaa Abi-Haidar¹,⁶, Jasleen Kaur¹, Ana G. Maguitman², Predrag Radivojac¹, Andreas Retchsteiner³, Karin Verspoor⁴, Zhiping Wang⁵, Luis M. Rocha¹,⁶,*

¹School of Informatics, Indiana University, 1900 East Tenth Street, Bloomington IN 47408, USA
²Universidad Nacional del Sur, Bahia Blanca, Argentina
³Center for Genomics and Bioinformatics, Indiana University, USA
⁴Information Sciences Group, Los Alamos National Laboratory, USA
⁵Biostatistics, School of Medicine, Indiana University, USA
⁶FLAD Computational Biology Collaboratorium, Instituto Gulbenkian de Ciencia, Portugal
*To whom correspondence should be addressed: rocha@indiana.edu

Citation: A. Abi-Haidar, J. Kaur, A. Maguitman, P. Radivojac, A. Retchsteiner, K. Verspoor, Z. Wang, and L.M. Rocha [2007]. "Uncovering Protein-Protein Interactions in the Bibliome". Proceedings of the Second BioCreative Challenge Evaluation Workshop (ISBN 84-933255-6-2), pp. 247-255.

The pre-print is available in Adobe Acrobat (.pdf) format only. Supplemental materials are also available. Due to mathematical notation and graphics, only the abstract is presented here.

We participated in three of the Protein-Protein Interaction (PPI) sub-tasks of the Second BioCreative Challenge: Protein Interaction Article Sub-task 1 (IAS), Protein Interaction Pairs Sub-task 2 (IPS), and Protein Interaction Sentences Sub-task 3 (ISS). Our approach includes a feature detection method based on a spam-detection algorithm. For IAS we submitted three runs, each from a distinct classification method: the novel Variable Threshold Protein Mention Model, Support Vector Machines, and an integration method based on measures of uncertainty and a nearest-neighbor predictor on the eigenvector space obtained via the Singular Value Decomposition of the feature/abstract matrix. For IPS and ISS we used the features discovered from IAS abstracts, together with features from the IPS and ISS training data, to predict appropriate passages and pairs. We also used the number of protein mentions in a passage as a feature.

Keywords: Protein interaction, text mining, bibliome informatics, support vector machines, singular value decomposition, spam detection, uncertainty measures, proximity graphs, complex networks.
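
One feature named in the abstract, the number of protein mentions in a passage, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how protein mentions were recognized, so the dictionary lookup, the function names `count_protein_mentions` and `rank_passages`, and the toy passages below are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Tuple


def count_protein_mentions(passage: str, protein_names: Iterable[str]) -> int:
    """Count whole-token, case-insensitive occurrences of known protein names.

    Assumption: a plain dictionary lookup stands in for whatever protein
    name recognizer a real system would use.
    """
    total = 0
    for name in protein_names:
        # Lookarounds keep "RAD51" from matching inside "RAD51B".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        total += len(re.findall(pattern, passage, flags=re.IGNORECASE))
    return total


def rank_passages(passages: List[str],
                  protein_names: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (mention_count, passage) pairs, highest count first."""
    names = list(protein_names)
    scored = [(count_protein_mentions(p, names), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical dictionary and toy passages, not data from the paper.
PROTEIN_NAMES = ["BRCA1", "RAD51", "p53"]
PASSAGES = [
    "BRCA1 interacts with RAD51.",
    "Cells were cultured for 48 hours.",
    "We tested p53, BRCA1, RAD51, and p53 mutants.",
]
RANKED = rank_passages(PASSAGES, PROTEIN_NAMES)

for count, passage in RANKED:
    print(count, passage)
```

In a fuller pipeline such a count would be one feature among several (for example, alongside the word features learned from the IAS abstracts) rather than a ranking criterion on its own.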

For the full paper, please download the PDF version.

For more information, contact Luis Rocha at rocha@indiana.edu.
Last Modified: May 21, 2007