Volume 25 Issue 1 - October 11, 2013
AutoBind: automatic extraction of protein–ligand-binding affinity data from biological literature
Darby Tien-Hao Chang1, Chao-Hsuan Ke2, Jung-Hsin Lin3,4 and Jung-Hsien Chiang2,*
1 Department of Electrical Engineering, 2 Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 70101, 3 School of Pharmacy, National Taiwan University, Taipei 10051 and 4 Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan
Determining the binding affinity of a protein-ligand complex is important for quantifying how strongly a particular small molecule binds to its target protein. Over the past decades, several databases of protein-ligand binding affinities have been built by reading the literature and extracting the values by hand. However, this approach is time-consuming, and most of these databases are updated only a few times per year. Hence, there is a pressing need for an automatic, high-precision extraction method for collecting binding affinities.

We recently created AutoBind, a new database of protein-ligand binding affinity data built by automatic information retrieval. We first compiled a collection of 1586 articles in which the binding affinities had been marked manually. Based on this annotated collection, we designed four sentence patterns for scanning full-text articles, together with a scoring function that ranks the sentences matching those patterns. The proposed sentence patterns can effectively identify the binding affinities in full-text articles. Our assessment shows that AutoBind achieves 84.22% precision and 79.07% recall on the testing corpus. Currently, AutoBind holds 13616 protein-ligand complexes and their corresponding binding affinities, extracted from 17221 articles.
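The pattern-and-score idea described above can be illustrated with a minimal Python sketch. The two regular expressions, the context-word list, and the function names below are hypothetical stand-ins chosen for illustration; they are not the four sentence patterns or the scoring function actually used by AutoBind, which are not given here. The sketch scans each sentence for an affinity type, value and unit, ranks the matching sentences by a simple score, and shows how precision and recall could be computed against a manually annotated set.

```python
import re

# Hypothetical building blocks (assumptions, not AutoBind's real patterns).
AFFINITY_TYPE = r"(?P<type>IC50|EC50|Ki|Kd)"
VALUE = r"(?P<value>\d+(?:\.\d+)?)"
UNIT = r"(?P<unit>pM|nM|uM|µM|mM)"

PATTERNS = [
    # e.g. "Ki = 3.2 nM", "IC50 was 45 uM"
    re.compile(rf"\b{AFFINITY_TYPE}\s*(?:=|of|was|is)\s*{VALUE}\s*{UNIT}\b"),
    # e.g. "Kd value of 15 nM"
    re.compile(rf"\b{AFFINITY_TYPE}\s+values?\s+(?:of|was|is)\s+{VALUE}\s*{UNIT}\b"),
]

# Hypothetical context words that make a sentence more likely to report binding.
CONTEXT_WORDS = ("bind", "inhibit", "affinity", "ligand")


def score_sentence(sentence, n_matches):
    """Toy score: number of affinity matches plus number of context words present."""
    lowered = sentence.lower()
    return n_matches + sum(word in lowered for word in CONTEXT_WORDS)


def extract_affinities(text):
    """Return (score, sentence, matches) tuples, highest score first."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    hits = []
    for sentence in sentences:
        found = set()
        for pattern in PATTERNS:
            for m in pattern.finditer(sentence):
                found.add((m.group("type"), float(m.group("value")), m.group("unit")))
        if found:
            matches = sorted(found)
            hits.append((score_sentence(sentence, len(matches)), sentence, matches))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a manually annotated set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = (
        "Compound 7 binds to thrombin with a Ki of 3.2 nM. "
        "The crystals were grown at room temperature. "
        "The IC50 was 45 uM."
    )
    for score, sentence, matches in extract_affinities(sample):
        print(score, matches, sentence)
```

In this toy version, a sentence that both matches a pattern and mentions binding ranks above one that only matches a pattern. A real system would also need to link each value to the correct protein and ligand names, which this sketch does not attempt.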

Table 1. Comparison of AutoBind to other databases
Database     | Protein complexes | Number of entries (affinity data) | Last updated       | First published year
AutoBind     | 37929             | 13616                             | February 13, 2013  | This work
PDBBind      | 7986              | 7986                              | September 22, 2011 | 2004
Binding MOAD | 16955             | 5630                              | 2010               | 2005
BindingDB    | 3056              | 1817                              | March, 2010        | 2001

Copyright National Cheng Kung University