For each protein with multiple ligands, we superimposed the protein backbones by RMS fit. In every case except DHFR (see details below), the protein sequences are almost identical, so the pairing of residues is straightforward. After the backbones are aligned, the ligand coordinates are taken from the aligned complex (aligned coordinates). For each ligand we provide the following sets of coordinates, distinguished by filename postfix:
| Postfix | Coordinates |
|---|---|
| _kpl | aligned coordinates of the ligand |
| _kpl_h | aligned coordinates of the ligand, hydrogens added |
| _min_h | arbitrary coordinates of the ligand, energy minimized |
The prefix is in every case the PDB identifier or, for CASP targets, the CASP identifier of the respective protein-ligand complex. The structures were prepared and minimized with the SYBYL molecular modelling software [1] and are stored in SYBYL mol2 file format. Partial charges were calculated using the Gasteiger method [2].
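The naming scheme and file format described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the data set: the helper names `ligand_filename` and `read_mol2_atoms` are hypothetical, the `.mol2` file extension is an assumption, and the `SAMPLE` record with its charges is made up for demonstration. The parser relies only on the standard layout of the `@<TRIPOS>ATOM` section (id, name, x, y, z, atom type, optional substructure id and name, optional charge).

```python
POSTFIXES = ("_kpl", "_kpl_h", "_min_h")


def ligand_filename(prefix, postfix, ext=".mol2"):
    """Build a file name from a PDB/CASP identifier and a postfix.

    The ".mol2" extension is an assumption; adjust it to the actual files.
    """
    if postfix not in POSTFIXES:
        raise ValueError("unknown postfix: %r" % postfix)
    return prefix + postfix + ext


def read_mol2_atoms(text):
    """Return the atoms of the @<TRIPOS>ATOM section as a list of dicts."""
    atoms = []
    in_atoms = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # A new record type starts; only the ATOM section is of interest.
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if not in_atoms or not stripped:
            continue
        fields = stripped.split()
        if len(fields) < 6:
            continue  # not a complete atom line
        atoms.append({
            "name": fields[1],
            "xyz": (float(fields[2]), float(fields[3]), float(fields[4])),
            "type": fields[5],
            # The charge is the 9th column when present.
            "charge": float(fields[8]) if len(fields) > 8 else 0.0,
        })
    return atoms


# Made-up carboxylate fragment, for illustration only.
SAMPLE = """\
@<TRIPOS>MOLECULE
demo
3 2
SMALL
GASTEIGER

@<TRIPOS>ATOM
1 C1 0.000 0.000 0.000 C.2 1 LIG 0.050
2 O1 1.250 0.000 0.000 O.co2 1 LIG -0.525
3 O2 -0.600 1.100 0.000 O.co2 1 LIG -0.525
@<TRIPOS>BOND
1 1 2 ar
2 1 3 ar
"""

if __name__ == "__main__":
    print(ligand_filename("1cbx", "_kpl_h"))
    for atom in read_mol2_atoms(SAMPLE):
        print(atom["name"], atom["type"], atom["charge"])
```

Summing the `charge` values of all atoms gives the total charge assigned to the ligand, which is a quick sanity check on the protonation state of a file.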
In the following, we define the atom types, the protonation rules, and the specifics of each protein-ligand group.
Definition of atom types:
| Functional group | SYBYL atom types |
|---|---|
| Carboxylic acid | C.2 O.co2 O.co2 |
| Amide | C.2 O.2 N.pl3 H |
| Sulfonamide | S.o2 O.2 O.2 N.pl3 H |
| Protonated amine | N.4 |
| Aromatic amine | N.pl3 |
| Phosphonate linker | O.3 P.3 O.co2 O.co2 |
| Phosphinate linker | P.3 O.co2 O.co2 |
| Terminal phosphato | O.3 P.3 O.3 O.co2 O.co2 |
| Terminal phosphonato | P.3 O.3 O.co2 O.co2 |
| Hydroxamic acid | C.2 O.2 N.pl3 H O.3 |
| Guanidino / Amidino | C.cat N.pl3 |
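The atom type definitions above lend themselves to a simple lookup table. The sketch below is only an illustration under stated assumptions: the dictionary keys and the function `candidate_groups` are hypothetical names, and the check compares atom type counts only, without looking at connectivity, so it can report groups that are not actually present.

```python
from collections import Counter

# Atom type patterns transcribed from the table of atom type definitions.
# The keys are illustrative identifiers, not names used in the data set.
ATOM_TYPES = {
    "carboxylate": ("C.2", "O.co2", "O.co2"),
    "amide": ("C.2", "O.2", "N.pl3", "H"),
    "sulfonamide": ("S.o2", "O.2", "O.2", "N.pl3", "H"),
    "protonated_amine": ("N.4",),
    "aromatic_amine": ("N.pl3",),
    "phosphonate_linker": ("O.3", "P.3", "O.co2", "O.co2"),
    "phosphinate_linker": ("P.3", "O.co2", "O.co2"),
    "terminal_phosphato": ("O.3", "P.3", "O.3", "O.co2", "O.co2"),
    "terminal_phosphonato": ("P.3", "O.3", "O.co2", "O.co2"),
    "hydroxamic_acid": ("C.2", "O.2", "N.pl3", "H", "O.3"),
    "guanidino_amidino": ("C.cat", "N.pl3"),
}


def candidate_groups(ligand_types):
    """List groups whose atom types all occur in the ligand.

    This is a composition check only: a group is a candidate when the
    ligand contains at least as many atoms of each type as the pattern.
    """
    available = Counter(ligand_types)
    return [
        name
        for name, pattern in ATOM_TYPES.items()
        # Counter subtraction drops non-positive counts, so an empty
        # result means the pattern is fully covered by the ligand.
        if not (Counter(pattern) - available)
    ]


if __name__ == "__main__":
    print(candidate_groups(["C.3", "C.2", "O.co2", "O.co2"]))
```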
Definition of protonation rules:
| Functional group | Protonation state |
|---|---|
| Carboxylic acid | deprotonated as COO- |
| Aliphatic amine | protonated as NH+ |
| Aromatic amine | neutral, trigonal planar |
| Alcohol | neutral as OH |
| Phosphonate linker | deprotonated as OPO2- |
| Phosphinate linker | deprotonated as PO2- |
| Terminal phosphato | 2-fold deprotonated as OPO3(2-) |
| Terminal phosphonato | 2-fold deprotonated as PO3(2-) |
| Hydroxamic acid | deprotonated as CONHO- (Thermolysin: bound to Zn) |
| Thiol | deprotonated as S- (Thermolysin: bound to Zn) |
| Guanidino / Amidino | protonated as C.cat+ |
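Taken together, the protonation rules fix a formal charge for each functional group, so the expected net charge of a ligand follows from counting its groups. The sketch below assumes the formal charges implied by the rules above (for example -1 for a deprotonated carboxylic acid and -2 for a 2-fold deprotonated terminal phosphate); the dictionary keys and the function `net_charge` are hypothetical names chosen for illustration.

```python
# Formal charge per functional group, as implied by the protonation rules.
FORMAL_CHARGE = {
    "carboxylate": -1,           # COO-
    "aliphatic_amine": +1,       # NH+
    "aromatic_amine": 0,         # neutral, trigonal planar
    "alcohol": 0,                # OH
    "phosphonate_linker": -1,    # OPO2-
    "phosphinate_linker": -1,    # PO2-
    "terminal_phosphato": -2,    # 2-fold deprotonated
    "terminal_phosphonato": -2,  # 2-fold deprotonated
    "hydroxamic_acid": -1,       # CONHO-, Zn-bound in thermolysin
    "thiol": -1,                 # S-, Zn-bound in thermolysin
    "guanidino_amidino": +1,     # C.cat+
}


def net_charge(group_counts):
    """Sum the formal charges for a mapping of group name to count."""
    return sum(FORMAL_CHARGE[group] * count
               for group, count in group_counts.items())


if __name__ == "__main__":
    # Hypothetical ligand with two carboxylates and one aliphatic amine.
    print(net_charge({"carboxylate": 2, "aliphatic_amine": 1}))
```

Comparing this value with the sum of the partial charges stored in a mol2 file is one way to check that a structure follows the rules.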
Protein-ligand groups:
- Carboxypeptidase
5 PDB structures: 1cbx, 2ctc, 3cpa, 6cpa, 7cpa
Comment: nothing specific
- Concanavalin
1 PDB structure: 5cna
1 CASP structure: t0013
Comment: nothing specific
- Dihydrofolate reductase (DHFR)
2 PDB structures: 1dhf, 4dfr
Comment: The protein chains are not identical in this case. The alignment was generated with two independent methods, (a) Modeller [3] and (b) "Align structures using homology" within SYBYL, which gave almost identical results. 43 conserved amino acids were identified. They were paired as listed below and superimposed by RMSD fit of the corresponding Cα atoms.
| pair | 1dhf | 4dfr | distance |
|---|---|---|---|
| 1 | GLU183 | GLU157 | 1.14 |
| 2 | PHE179 | PHE153 | 0.63 |
| 3 | TYR177 | TYR151 | 1.08 |
| 4 | PHE148 | PHE125 | 0.31 |
| 5 | PRO149 | PRO126 | 0.80 |
| 6 | ASP145 | ASP122 | 2.19 |
| 7 | THR146 | THR123 | 1.38 |
| 8 | GLU143 | GLU120 | 1.38 |
| 9 | ILE138 | ILE115 | 0.64 |
| 10 | THR136 | THR113 | 0.64 |
| 11 | LYS132 | LYS109 | 0.43 |
| 12 | LEU133 | LEU110 | 0.66 |
| 13 | VAL120 | VAL99 | 0.95 |
| 14 | TYR121 | TYR100 | 0.46 |
| 15 | GLY116 | GLY95 | 0.59 |
| 16 | GLY117 | GLY96 | 0.93 |
| 17 | ALA96 | ALA81 | 1.08 |
| 18 | ASP94 | ASP79 | 1.44 |
| 19 | SER92 | SER77 | 0.95 |
| 20 | LEU75 | LEU62 | 0.30 |
| 21 | SER76 | SER63 | 0.91 |
| 22 | ASN72 | ASN59 | 0.36 |
| 23 | GLY69 | GLY56 | 0.76 |
| 24 | ARG70 | ARG57 | 0.56 |
| 25 | ARG65 | ARG52 | 0.95 |
| 26 | PRO66 | PRO53 | 0.32 |
| 27 | LEU67 | LEU54 | 0.54 |
| 28 | SER59 | SER49 | 0.91 |
| 29 | ILE60 | ILE50 | 0.92 |
| 30 | THR56 | THR46 | 0.69 |
| 31 | TRP57 | TRP47 | 0.25 |
| 32 | VAL50 | VAL40 | 0.43 |
| 33 | ILE51 | ILE41 | 0.59 |
| 34 | MET52 | MET42 | 0.28 |
| 35 | GLY53 | GLY43 | 0.66 |
| 36 | THR38 | THR35 | 0.57 |
| 37 | ARG36 | ARG33 | 1.14 |
| 38 | PHE34 | PHE31 | 0.57 |
| 39 | LEU27 | LEU24 | 0.78 |
| 40 | ILE16 | ILE14 | 0.92 |
| 41 | GLY17 | GLY15 | 1.14 |
| 42 | ALA9 | ALA7 | 0.64 |
| 43 | ILE7 | ILE5 | 0.33 |
Weighted Root Mean Square Distance: 0.8602
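The reported value can be cross-checked against the per-pair distances listed for 1dhf and 4dfr. The sketch below assumes that all 43 pairs carry equal weight, which the source does not state explicitly; under that assumption the root mean square of the listed distances should land close to the reported 0.8602, with small deviations expected because the distances are rounded to two decimals. The names `DISTANCES` and `rms` are illustrative.

```python
import math

# Per-pair distances transcribed from the 1dhf/4dfr pairing, pairs 1 to 43.
DISTANCES = [
    1.14, 0.63, 1.08, 0.31, 0.80, 2.19, 1.38, 1.38, 0.64, 0.64,
    0.43, 0.66, 0.95, 0.46, 0.59, 0.93, 1.08, 1.44, 0.95, 0.30,
    0.91, 0.36, 0.76, 0.56, 0.95, 0.32, 0.54, 0.91, 0.92, 0.69,
    0.25, 0.43, 0.59, 0.28, 0.66, 0.57, 1.14, 0.57, 0.78, 0.92,
    1.14, 0.64, 0.33,
]


def rms(values):
    """Root mean square of a sequence of numbers (equal weights assumed)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    print("pairs:", len(DISTANCES))
    print("RMS distance: %.4f" % rms(DISTANCES))
```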
- Elastase
5 PDB structures: 1ela, 1elb, 1elc, 1eld, 1ele
2 CASP structures: t0035, t0036
Comment: nothing specific
- Endothiapepsin
5 PDB structures: 2er7, 4er1, 4er2, 5er1, 5er2
Comment: nothing specific
- Fructose
1 PDB structure: 4fbp
1 CASP structure: t0039
Comment: nothing specific
- Glycogen phosphorylase
4 PDB structures: 1gpy, 3gpb, 4gpb, 5gpb
Comment: nothing specific
- HIV-Protease
10 PDB structures: 1hiv, 1hos, 1ivp, 1ivq, 2mip, 4hvp, 4phv, 5hvp, 8hvp, 9hvp
Comment: nothing specific
- Immunoglobulin
5 PDB structures: 1dbb, 1dbj, 1dbk, 1dbm, 2dbl
Comment: nothing specific
- Rhinovirus
8 PDB structures: 2r04, 2r06, 2r07, 2rm2, 2rr1, 2rs1, 2rs3, 2rs5
Comment: nothing specific
- Streptavidin
5 PDB structures: 1srf, 1srg, 1srh, 1sri, 1srj
Comment: nothing specific
- Thermolysin
8 PDB structures: 1tlp, 1tmn, 2tmn, 3tmn, 4tln, 4tmn, 5tln, 5tmn
4 structures published in [4]: cbz, ppp, rthior, thior
Comment: nothing specific
- Thrombin
3 PDB structures: 1dwc, 1dwd, 1ett
Comment: nothing specific
- Trypsin
7 PDB structures: 1pph, 1tnh, 1tni, 1tnj, 1tnk, 1tnl, 3ptb
Comment: nothing specific
References:
- [1] SYBYL molecular modeling software, Tripos Inc., 1699 South Hanley Rd, Suite 303, St. Louis, MO 63144
- [2] Gasteiger, J; Marsili, M. Tetrahedron 1980, 36, 3219-3228
- [3] Sali, A; Blundell, TL. J. Mol. Biol. 1993, 234, 779-815
- [4] Böhm, HJ; Klebe, G. Angew. Chem. Int. Ed. Engl. 1996, 35, 2588-2614
  Kester, WR; Matthews, BW. Biochemistry 1977, 16, 2506-2516
  Roderick, SL; Fournie-Zaluski, MC; Roques, BP; Matthews, BW. Biochemistry 1989, 28, 1493-1497