Protein Protein Interaction Affix Dictionary (PPIAD) and BioMethod Lexicon

Introduction

Protein-protein interactions (PPIs) are a topic of central interest for biologists. A large number of experimentally detected PPIs are reported in the scientific literature, and scientists want to extract them for their experimental work. Doing this by hand is a gigantic task given the enormous volume of literature, so automated text mining methods have been developed. The recent BioCreative challenge revealed the difficulties involved in identifying the correct interactor proteins in full-text scientific articles. To address this problem, we use previously unexplored clues related to the affixes of the target proteins. For example, experimental biologists often use specific fusion proteins or protein tags such as -GST, -His and -Myc to detect and visualize interactions.

We carried out a detailed study of affixes and their association with interactor proteins. We also explored whether affixes can be used to detect PPI-related articles and interaction detection methods.

Materials and Methods

Construction of Lexical Resources

Construction of Protein Protein Interaction Affix Dictionary (PPIAD)

The PPIAD was constructed by examining 3000 previously referred interaction evidence passages. It comprises 89 suffixes, 176 prefixes and 12 affixes that occur as both, for a total of 277 interaction-relevant affixes. The affixes are divided into 36 classes: 26 super-affixes and 10 combined or sub-affixes. Each affix is also manually linked to experimental qualifiers, represented by the associated PSI-MI ontology concepts according to their definitions. [Download PPIAD]
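To make the shape of a dictionary entry concrete, here is a minimal Python sketch of how one PPIAD record could be represented. The field names, the class labels, the placeholder ontology identifiers and the entry named EXAMPLE are all assumptions made for illustration; they are not taken from the downloadable file, whose actual layout may differ.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Position(Enum):
    PREFIX = "prefix"
    SUFFIX = "suffix"
    BOTH = "both"


@dataclass(frozen=True)
class AffixEntry:
    """One dictionary record (hypothetical layout, not the real file format)."""
    affix: str
    position: Position
    affix_class: str          # one of the 36 classes; labels here are made up
    psi_mi_concepts: tuple    # linked PSI-MI concepts; placeholders only


def count_by_position(entries):
    """Count how many entries are prefixes, suffixes, or both."""
    return Counter(entry.position for entry in entries)


# Toy entries. GST, His and Myc are the tags named in the introduction;
# their classes and ontology links below are illustrative placeholders.
TOY_ENTRIES = [
    AffixEntry("GST", Position.SUFFIX, "toy-class-1", ("PSI-MI placeholder",)),
    AffixEntry("His", Position.SUFFIX, "toy-class-2", ("PSI-MI placeholder",)),
    AffixEntry("Myc", Position.SUFFIX, "toy-class-3", ("PSI-MI placeholder",)),
    AffixEntry("EXAMPLE", Position.PREFIX, "toy-class-4", ("PSI-MI placeholder",)),
]

position_counts = count_by_position(TOY_ENTRIES)

# The totals reported for the real dictionary are internally consistent.
REPORTED_TOTAL = 89 + 176 + 12   # suffixes + prefixes + both
REPORTED_CLASSES = 26 + 10       # super-affixes + combined or sub-affixes
```

Running the same count over the real dictionary should reproduce the reported split of 89 suffixes, 176 prefixes and 12 affixes that are both.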

Construction of BioMethod Lexicon

To overcome the limited scope and lexical coverage of the terms in the PSI-MI ontology, we built the BioMethod Lexicon, a collection of experimental method terms important for protein interaction and gene regulation relations, and characterized the co-mentions of method terms with affix tag classes. [Download Dictionary of PPI Detection Method]
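One simple way to characterize co-mentions is to count, sentence by sentence, how often a method term appears together with an affix class. The Python sketch below assumes that each sentence has already been annotated with the method terms and affix classes it contains. The function name, the toy annotations and the use of tag names as class labels are hypothetical, and the analysis behind the lexicon may have been done differently.

```python
from collections import Counter
from itertools import product


def count_co_mentions(annotated_sentences):
    """Count sentences in which a method term and an affix class co-occur.

    Each item is a pair (method_terms, affix_classes) for one sentence.
    """
    counts = Counter()
    for method_terms, affix_classes in annotated_sentences:
        # Each (term, class) pair is counted once per sentence.
        for pair in product(set(method_terms), set(affix_classes)):
            counts[pair] += 1
    return counts


# Toy annotations, invented for illustration only.
TOY_ANNOTATIONS = [
    ({"pull-down"}, {"GST"}),
    ({"pull-down", "western blot"}, {"GST"}),
    ({"co-immunoprecipitation"}, {"Myc"}),
    ({"western blot"}, set()),   # no affix: contributes no pairs
]

co_mentions = count_co_mentions(TOY_ANNOTATIONS)
```

Here `co_mentions.most_common()` ranks the pairs by the number of sentences they share, which is one way to see which tags tend to accompany which methods.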

Construction of PPI Affix Patterns

To detect protein-protein interaction pairs using affixes, we manually constructed a collection of 799 affix-relevant interaction expression patterns. [Download Dictionary of PPI Affix Patterns]
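As an illustration of what an affix-relevant interaction pattern can look like, the Python sketch below compiles a single toy pattern of the form "A-AFFIX verb B" and uses it to extract a candidate pair. The pattern, the verb list and the function names are assumptions made for this example; none of them is one of the 799 curated patterns.

```python
import re

# Tags named in the introduction; the verb phrases are hypothetical.
TOY_AFFIXES = ("GST", "His", "Myc")
TOY_VERBS = ("interacts with", "binds to", "associates with")


def compile_toy_pattern(affixes, verbs):
    """Build one illustrative pattern: '<A>-<AFFIX> <verb> <B>'."""
    affix_alt = "|".join(re.escape(a) for a in affixes)
    verb_alt = "|".join(re.escape(v) for v in verbs)
    return re.compile(
        rf"\b(?P<a>[A-Za-z0-9]+)-(?P<affix>{affix_alt})\s+"
        rf"(?:{verb_alt})\s+(?P<b>[A-Za-z0-9]+)\b"
    )


def extract_pairs(sentence, pattern):
    """Return (interactor A, affix, interactor B) for each match."""
    return [
        (m.group("a"), m.group("affix"), m.group("b"))
        for m in pattern.finditer(sentence)
    ]


toy_pattern = compile_toy_pattern(TOY_AFFIXES, TOY_VERBS)
pairs = extract_pairs("Raf1-GST binds to Ras in vitro.", toy_pattern)
no_pairs = extract_pairs("Ras was detected by western blot.", toy_pattern)
```

A real pattern collection would need many such templates to cover word order, passive constructions and multi-word protein names, which this single regular expression does not attempt.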

Tagging Affixes

The affixes were tagged against the PPIAD using a simple Perl script. Out of 6300 interaction evidence sentences, 1946 (31%) were tagged with at least one interaction-relevant prefix.
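The tagging step can be sketched as a dictionary lookup over hyphenated tokens. The original script was written in Perl and is not reproduced here. The Python version below is a rough approximation that assumes affixes are joined to protein names with a hyphen, and it uses a three-entry toy dictionary in place of the PPIAD. All function names and example sentences are hypothetical.

```python
import re

# Toy stand-in for the PPIAD: the three tags named in the introduction.
TOY_AFFIXES = {"GST", "His", "Myc"}


def build_affix_regex(affixes):
    """Match 'AFFIX-Name' (prefix use) or 'Name-AFFIX' (suffix use)."""
    # Longest first, so a longer affix wins over one that is its prefix.
    alt = "|".join(sorted((re.escape(a) for a in affixes), key=len, reverse=True))
    return re.compile(
        rf"\b(?:(?P<prefix>{alt})-(?P<p_name>[A-Za-z0-9]+)"
        rf"|(?P<s_name>[A-Za-z0-9]+)-(?P<suffix>{alt}))\b"
    )


def tag_sentence(sentence, regex):
    """Return (affix, kind, attached name) for every affix found."""
    tags = []
    for m in regex.finditer(sentence):
        if m.group("prefix"):
            tags.append((m.group("prefix"), "prefix", m.group("p_name")))
        else:
            tags.append((m.group("suffix"), "suffix", m.group("s_name")))
    return tags


def percent_tagged(sentences, regex):
    """Percentage of sentences carrying at least one affix tag."""
    tagged = sum(1 for s in sentences if tag_sentence(s, regex))
    return 100.0 * tagged / len(sentences)


affix_regex = build_affix_regex(TOY_AFFIXES)
TOY_SENTENCES = [
    "GST-Raf1 was pulled down with Ras.",
    "Cells expressing Cdc42-Myc were lysed.",
    "The complex was purified by gel filtration.",
]
toy_tags = [tag_sentence(s, affix_regex) for s in TOY_SENTENCES]

# The reported figure follows from the counts given in the text.
reported_percent = round(100 * 1946 / 6300)
```

This sketch would also tag strings such as "His-tagged", so a real tagger needs extra filtering before the attached token is treated as a protein name.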

Analysis and Key Results

How to Cite?

Krallinger M., Tendulkar A. V., Leitner F., Chatr-aryamontri A. and Valencia A. (2010)
The PPI affix dictionary (PPIAD) and BioMethod Lexicon: importance of affixes and tags for recognition of entity mentions and experimental protein interactions, BMC Bioinformatics, 11(Suppl 5):O1.
