Most proteins interact with small-molecule ligands such as metabolites or drug compounds. Over the past several decades, many of these interactions have been captured in high-resolution atomic structures. From a geometric point of view, most of the sites that bind these ligands, as revealed in these structures, form concave shapes, or “pockets”, on the protein surface. An efficient method for comparing these pockets could greatly assist the classification of ligand-binding sites, the prediction of protein molecular function, and the design of novel drug compounds. We introduce a computational method, APoc (Alignment of Pockets), for large-scale structural comparison of protein pockets. A scoring function, the Pocket Similarity Score (PS-score), is derived to measure the level of similarity between pockets. Statistical models are employed to estimate the significance of a PS-score, based on millions of comparisons of random pockets. APoc is a general, robust method that may be applied to pockets identified by various approaches, such as ligand-binding sites observed in experimental complex structures or pockets predicted by a pocket detection method. We curate large benchmark data sets to evaluate the performance of APoc and present examples that demonstrate the usefulness of the method.
Details of APoc can be found in the following references:
- Mu Gao and Jeffrey Skolnick, 2013, APoc: large-scale identification of similar protein pockets. Bioinformatics, 29(5):597-604.
- Jeffrey Skolnick and Mu Gao, 2013, Interplay of physics and evolution in the likely origin of protein biochemical function. Proc Natl Acad Sci USA, 110:9344-9349.
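The idea of judging a PS-score against comparisons of random pockets can be illustrated with a simple empirical P-value. This is only a minimal sketch and not APoc's actual statistical model, which is described in the references above. The function name `empirical_p_value` and the toy background scores are assumptions made up for this illustration.

```python
from bisect import bisect_left


def empirical_p_value(score, background_scores):
    """Estimate how often a random pocket pair scores at least `score`.

    Hypothetical helper for illustration only; APoc itself fits
    statistical models rather than counting background scores directly.
    """
    ordered = sorted(background_scores)
    n = len(ordered)
    # Number of background scores greater than or equal to the query score.
    n_ge = n - bisect_left(ordered, score)
    # Add-one correction so the estimate is never exactly zero.
    return (n_ge + 1) / (n + 1)


# Toy background scores (invented values, not taken from the APoc data sets).
background = [0.20, 0.25, 0.30, 0.35, 0.40]
p = empirical_p_value(0.38, background)
```

A smaller value suggests that the observed similarity is less likely to arise between random pockets. With a real background set of many pairs, the same counting scheme applies unchanged.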
This software is freely available to ALL users.
- Random Background RS1 (5,371 single-chain proteins)
- Random Background RS2 (20,000 pocket pairs, gzip)
- Benchmark Subject set (38,066 pocket pairs)
- Benchmark Control set (38,066 pocket pairs)
- Benchmark set structures (PDB format, 480 MB, gzip)
This page is maintained by Mu Gao.