Uncovering Protein-Protein Interactions in the Bibliome


Alaa Abi-Haidar [1,6], Jasleen Kaur [1], Ana G. Maguitman [2], Predrag Radivojac [1], Andreas Retchsteiner [3], Karin Verspoor [4], Zhiping Wang [5], Luis M. Rocha [1,6,*]

[1] School of Informatics, Indiana University, 1900 East Tenth Street, Bloomington IN 47408, USA
[2] Universidad Nacional del Sur, Bahia Blanca, Argentina
[3] Center for Genomics and Bioinformatics, Indiana University, USA
[4] Information Sciences Group, Los Alamos National Laboratory, USA
[5] Biostatistics, School of Medicine, Indiana University, USA
[6] FLAD Computational Biology Collaboratorium, Instituto Gulbenkian de Ciencia, Portugal
[*] To whom correspondence should be addressed: rocha@indiana.edu

Citation: A. Abi-Haidar, J. Kaur, A. Maguitman, P. Radivojac, A. Retchsteiner, K. Verspoor, Z. Wang, and L.M. Rocha [2007]. "Uncovering Protein-Protein Interactions in the Bibliome". Proceedings of the Second BioCreative Challenge Evaluation Workshop (ISBN 84-933255-6-2), pp. 247-255.

The pre-print is available in Adobe Acrobat (.pdf) format only. Supplemental materials are also available. Due to mathematical notation and graphics, only the abstract is presented here.

Abstract.

We participated in three of the Protein-Protein Interaction (PPI) subtasks of the Second BioCreative Challenge: Protein Interaction Article Sub-task 1 (IAS), Protein Interaction Pairs Sub-task 2 (IPS), and Protein Interaction Sentences Sub-task 3 (ISS). Our approach includes a feature detection method based on a spam-detection algorithm. For IAS we submitted three runs from distinct classification methods: the novel Variable Threshold Protein Mention Model, Support Vector Machines, and an integration method based on measures of uncertainty and a nearest neighbor predictor on the eigenvector space obtained via the Singular Value Decomposition of the feature/abstract matrix. For IPS and ISS we used the features discovered from the IAS abstracts, as well as features from the IPS and ISS training data, to predict appropriate passages and pairs. We also used the number of protein mentions in a passage as a feature.
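
The abstract does not spell out how the spam-detection-inspired feature detection works. As a rough illustration only, and not the authors' actual method, one simple way to pick discriminating word features is to score each word by how differently it occurs in relevant (positive) and irrelevant (negative) abstracts. The scoring rule, the function names (doc_frequency, score_words, top_features), and the toy abstracts below are all assumptions made for this sketch.

```python
from collections import Counter

# Hypothetical sketch: the scoring rule and all names here are assumptions,
# not taken from the paper.

def doc_frequency(docs):
    """Return, for each word, the fraction of documents containing it."""
    counts = Counter()
    for text in docs:
        # Count each word at most once per document.
        counts.update(set(text.lower().split()))
    total = len(docs)
    return {word: count / total for word, count in counts.items()}

def score_words(positive_docs, negative_docs):
    """Score words by |p(word | positive) - p(word | negative)|."""
    p_pos = doc_frequency(positive_docs)
    p_neg = doc_frequency(negative_docs)
    vocabulary = set(p_pos) | set(p_neg)
    return {
        word: abs(p_pos.get(word, 0.0) - p_neg.get(word, 0.0))
        for word in vocabulary
    }

def top_features(positive_docs, negative_docs, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    scores = score_words(positive_docs, negative_docs)
    return sorted(scores, key=lambda word: (-scores[word], word))[:k]

# Toy abstracts, invented for illustration.
positive = ["a binds b", "c binds d", "e interacts with f"]
negative = ["a was purified", "c was expressed", "e was cloned"]

scores = score_words(positive, negative)
best = top_features(positive, negative, 2)
print(best)
```

Words that appear equally often in both classes (such as "a" in the toy data) score zero and are dropped, while words concentrated in one class rise to the top. A real system would need proper tokenization and a much larger labeled corpus.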

Keywords: Protein interaction, text mining, bibliome informatics, support vector machines, singular value decomposition, spam detection, uncertainty measures, proximity graphs, complex networks.



For more information contact Luis Rocha at rocha@indiana.edu.
Last Modified: May 21, 2007