DOCK6.1 Tutorial
Author: John E. Kerrigan, Ph.D.
University of Medicine & Dentistry of New Jersey
Robert Wood Johnson Medical School
675 Hoes Lane
Piscataway, NJ 08854 U.S.A.
(732) 235-4473 phone
(732) 235-5252 fax
[email protected]
http://www2.umdnj.edu/~kerrigje
Organization of Workspace:

Create a project directory (use the unix mkdir command). Use this directory as your working directory.

    mkdir proj1
    cd proj1

CAUTION: Be sure that you have enough disk quota for your docking job. The dms program, sphgen and DOCK use a fair amount of disk space. To check your quota, use the following commands. If you are logged on to one of the linux workstations, first log in to the main server to check your quota.

    ssh [email protected]
    quota -v

Exit when finished: type exit, then hit the Enter key.

Docking in DOCK is divided into four stages:

STAGE 1. Ligand Preparation
STAGE 2. Site Characterization
STAGE 3. Scoring Grid Calculation
STAGE 4. Docking
Download the dock_input.tgz file from the course webpage (http://www2.umdnj.edu/~kerrigje/structbio_II.htm).

How DOCK works in a nutshell.

First, you begin with an x-ray crystal structure of a drug/receptor complex. The active site is identified or defined a priori. Points within this site, known as spheres, are used to define the volume within the active site pocket where the drug binds. The purpose of the spheres is to generate an unbiased grid of sphere centers that reflects the actual shape of the active site (i.e. the protein/macromolecule dictates the shape of the pocket, not the drug) using the grid program. [1, 2] Sphere centers are matched with ligand (drug) atoms to generate orientations of the ligand in the active site within the program DOCK. [2, 3] Each orientation of the ligand is scored using a shape-scoring function and/or an energy function (e.g. Ebind = Evdw + Eelec). The shape score is an empirical van der Waals attractive energy. As a final step, the orientation may be energy minimized using a rigid-body simplex minimization. [4]

Research Problem: Inhibition of Factor Xa

We will study the inhibition of the coagulation factor, factor Xa, because a large amount of inhibitor data is available for this enzyme. The x-ray crystal structure (1FJS) will be used as a modeling template. [5]

STAGE 1. Prepare the ligand.
Obtain the PDB file, 1FJS.pdb, from the Protein Data Bank (http://www.pdb.org/pdb/home/home.do). We recommend the UCSF Chimera software package [6] for building the ligand and protein (receptor) models from the x-ray crystal structure coordinates (PDB file). However, we need to set special ANCHOR and RIGID atom sets. For this we need to use the Sybyl (Tripos, Inc.) software package.

Open Sybyl and perform the following:

File > Read
    Read File > Files: (select 1FJS.pdb) > File Type: M1: (select with mouse) > File to read: 1FJS.pdb (select with mouse) > Click OK > Option Center the Molecule > Click CENTER_VIEW > OK

Use the middle mouse button to translate the structure and the right mouse button to rotate the structure.

Build/Edit > Extract
    Atom Expression > Substructures > Others: A/Z34500 > Click OK and OK
    Molecule Area > M2: > Click OK

Remove the enzyme from display area M1:

Build/Edit > Zap (Delete) Molecule
    Molecule Expression > Select M1: click OK

Zoom into the drug using the middle and right mouse buttons (press simultaneously) and use the middle mouse button to translate (center).

Color by atom type:

View > Color > By Atom Type

Name the drug molecule:

Build/Edit > Name Molecule > ZK807834 > OK

Set the Depth Cue to zero using the depth cue button on the left hand menu. Label atoms by type using the checkmark button on the left hand menu (your drug is in display area D2). There are several atom types we must fix before proceeding.
[Figure: Initial Atom Types]

Build/Edit > Modify > Atom > Option > ONLY_TYPE > Click OK
    Atom Expression > Using the Mouse: Select all atoms requiring atom type correction (see the final corrected atom types in the figure below).

Your atom type labels should appear as in the figure below.
Build/Edit > Add > Hydrogens
Compute > Charges > Gasteiger-Huckel
Click on No when the dialog asks you if you want to change formal charges.

Before you save the ligand, you must declare the anchor atoms as a static set and declare certain bonds in the structure as RIGID.

The ANCHOR set

Build/Edit > Define > Static Set > Option > ATOM > Click OK
In Atom Expression, select the asterisked atoms shown in the figure:

[Figure: structure of ZK807834 with the ANCHOR atoms marked by asterisks]

Name this set ANCHOR.

The RIGID set

Build/Edit > Define > Static Set > Option > BOND > Click OK
In Bond Expression, select the highlighted bond shown in the figure:

[Figure: structure of ZK807834 with the RIGID bond highlighted]

Name this set RIGID.

Using File > Save As, save the ligand as zk807834.mol2; Format: MOL2.
STAGE 2. Site Characterization.

We must remove the ligand and any solvent from the receptor (protein). We will use Chimera for preparation of the receptor. Start Chimera by typing chimera at the command line in the unix shell.

Load 1FJS.pdb:

File > Open > 1FJS.pdb

Delete the nonstandard residues (water, ligand, etc.):

Select > Residue > all nonstandard
Actions > Atoms/Bonds > delete

Use the Dock Prep tool in Structure Editing to prepare the mol2 file:

Tools > Structure Editing > Dock Prep
Add hydrogens
Edit the insertion code residue ID numbers. Insertion codes are used to keep residue numbering consistent from species to species. However, they can be a problem for molecular modeling programs. The sphgen program will choke on insertion code residue ID numbers because these numbers have a letter appended to them (e.g. 61A). Edit the fxa.pdb file in any text editor and remove the letters from these residue ID numbers (replace each letter with a space).

Run the dms program to create the molecular surface data for the sphere calculation. Type man dms to obtain more information about the dms program and its run options.

    dms fxa.pdb -a -n -w 1.4 -v -o fxa.ms

Next, we must run sphgen. To do so, we need an input file named INSPH for the sphgen program. The contents of INSPH are as follows (use only the information between the dashed lines!).

INSPH
--------------------------------------------
fxa.ms
R
X
0.0
4.0
1.4
fxa.sph
--------------------------------------------

Run sphgen simply by typing sphgen in the unix shell as follows.

    sphgen
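The insertion-code edit described above can also be scripted instead of done by hand. The following is a minimal sketch, assuming the standard PDB column layout in which the insertion code occupies column 27 of ATOM and HETATM records; the function name strip_insertion_codes is hypothetical and is not part of DOCK or Chimera.

```python
def strip_insertion_codes(pdb_text):
    """Blank the insertion-code column of ATOM/HETATM records.

    Assumes fixed-width PDB records, where column 27 (string index 26)
    holds the insertion code. All other lines are returned unchanged.
    """
    cleaned = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 26:
            # Replace the insertion-code letter with a space, as the
            # tutorial instructs, leaving every other column in place.
            line = line[:26] + " " + line[27:]
        cleaned.append(line)
    return "\n".join(cleaned)


# Example record for residue 61A (coordinates are made up for illustration).
example = (
    "ATOM  " + "  412" + " " + " CA " + " " + "GLU" + " " + "A"
    + "  61" + "A" + "   " + "  10.123  20.456  30.789"
)
print(strip_insertion_codes(example))
```

Note that blanking the letter can leave two residues with the same number (for example 61 and the former 61A), which is also what the manual edit produces, so it is worth inspecting the result before running dms.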
Next, we will use the coordinates of the ligand from the crystal structure to select the relevant spheres for the grid and docking computations. For this we will use a program called sphere_selector. The command line format is:

    sphere_selector file.sph ligand.mol2 #.#

where #.# is the distance in angstroms out from the ligand within which spheres are included.

    sphere_selector fxa.sph zk807834.mol2 6.0

The output from this operation is always a file named selected_spheres.sph.

Use the showsphere program to make a pdb file of the selected spheres. Type showsphere <enter> and answer the prompts:

    Name of the sphere cluster file: selected_spheres.sph
    Cluster number to process: 1
    Generate surfaces? N
    Name for output pdb file: sel_sph.pdb
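The selection criterion can be pictured with a short sketch. It assumes a sphere is kept when its center lies within the cutoff distance of at least one ligand atom; that is an illustration of the idea only, not the sphere_selector source code, and the function name and coordinates are hypothetical.

```python
import math


def select_spheres(sphere_centers, ligand_atoms, cutoff):
    """Keep sphere centers within `cutoff` angstroms of any ligand atom.

    Both inputs are sequences of (x, y, z) tuples in angstroms.
    """
    selected = []
    for center in sphere_centers:
        # A sphere is kept as soon as one ligand atom is close enough.
        if any(math.dist(center, atom) <= cutoff for atom in ligand_atoms):
            selected.append(center)
    return selected


# Toy coordinates, made up for illustration.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
spheres = [(3.0, 0.0, 0.0), (9.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
print(select_spheres(spheres, ligand, 6.0))
```

A larger cutoff keeps more of the pocket available to the docked ligand, at the cost of more sphere centers to match during orientation.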
Type showbox <enter> and answer the prompts:

    automatically construct box to enclose spheres [Y/N]? Y
    extra margin to also be enclosed (angstroms)? 10
    sphere file- selected_spheres.sph
    cluster number- 1
    output filename? site_box.pdb

STAGE 3. Scoring Grid Calculation

Download the grid.in file from the course web page and use it for your grid input. The contents of grid.in follow. Run grid:

    nohup grid -i grid.in -o grid.out > grid.out &

Contents of grid.in
compute_grids                  yes
grid_spacing                   0.3
output_molecule                no
contact_score                  no
energy_score                   yes
energy_cutoff_distance         9999
atom_model                     a
attractive_exponent            6
repulsive_exponent             12
distance_dielectric            yes
dielectric_factor              4
bump_filter                    yes
bump_overlap                   0.75
receptor_file                  fxa.mol2
box_file                       site_box.pdb
vdw_definition_file            /usr/local/dock6/parameters/vdw_AMBER_parm99.defn
score_grid_prefix              grid
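To make the energy-score settings in grid.in more concrete, here is a minimal sketch of a pairwise energy of the form Evdw + Eelec using a 6-12 Lennard-Jones term (attractive_exponent 6, repulsive_exponent 12) and a distance-dependent dielectric (distance_dielectric yes, dielectric_factor 4). The functional form, the conversion constant of 332.0, and the example well depth, radius and charges are assumptions chosen for illustration; they are not taken from the grid program or from the vdw_AMBER_parm99.defn file.

```python
COULOMB_CONSTANT = 332.0  # assumed conversion to kcal/mol for charges in e, distances in angstroms


def vdw_energy(r, well_depth, r_min, att_exp=6, rep_exp=12):
    """Lennard-Jones style pair energy with its minimum of -well_depth at r = r_min."""
    span = rep_exp - att_exp
    repulsive = (att_exp / span) * (r_min / r) ** rep_exp
    attractive = (rep_exp / span) * (r_min / r) ** att_exp
    return well_depth * (repulsive - attractive)


def elec_energy(r, q1, q2, dielectric_factor=4.0, distance_dielectric=True):
    """Coulomb pair energy; with a distance-dependent dielectric, D = factor * r."""
    dielectric = dielectric_factor * r if distance_dielectric else dielectric_factor
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def pair_energy(r, well_depth, r_min, q1, q2):
    """Sum of the van der Waals and electrostatic terms for one atom pair."""
    return vdw_energy(r, well_depth, r_min) + elec_energy(r, q1, q2)


# Hypothetical atom pair: well depth 0.1 kcal/mol, optimal separation 3.8 angstroms,
# partial charges +0.3 and -0.4, evaluated at 4.0 angstroms.
print(pair_energy(4.0, 0.1, 3.8, 0.3, -0.4))
```

The grid program precomputes receptor contributions of this kind on a lattice with the chosen grid_spacing so that DOCK can score each ligand pose by lookup rather than by summing over every receptor atom.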
STAGE 4. Docking

Now run DOCK. Download the dock.in file from the course web page and use it for your dock input. The contents of dock.in follow.

Run DOCK using the following command:

    nohup dock6 -i dock.in -o dock.out > dock.out &

Contents of dock.in
ligand_atom_file                              zk807834.mol2
limit_max_ligands                             no
skip_molecule                                 no
read_mol_solvation                            no
calculate_rmsd                                no
orient_ligand                                 yes
automated_matching                            yes
receptor_site_file                            selected_spheres.sph
max_orientations                              500
critical_points                               no
chemical_matching                             no
use_ligand_spheres                            no
flexible_ligand                               yes
min_anchor_size                               40
pruning_use_clustering                        yes
pruning_max_orients                           100
pruning_clustering_cutoff                     100
use_internal_energy                           yes
internal_energy_att_exp                       6
internal_energy_rep_exp                       12
internal_energy_dielectric                    4.0
use_clash_overlap                             no
bump_filter                                   no
score_molecules                               yes
contact_score_primary                         no
contact_score_secondary                       no
grid_score_primary                            yes
grid_score_secondary                          yes
grid_score_rep_rad_scale                      1
grid_score_vdw_scale                          1
grid_score_es_scale                           1
grid_score_grid_prefix                        grid
minimize_ligand                               yes
minimize_anchor                               yes
minimize_flexible_growth                      yes
use_advanced_simplex_parameters               no
simplex_max_cycles                            1
simplex_score_converge                        0.1
simplex_cycle_converge                        1.0
simplex_trans_step                            1.0
simplex_rot_step                              0.1
simplex_tors_step                             10.0
simplex_anchor_max_iterations                 500
simplex_grow_max_iterations                   500
simplex_final_min                             no
simplex_secondary_minimize_pose               no
simplex_random_seed                           0
atom_model                                    all
vdw_defn_file                                 /usr/local/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                /usr/local/dock6/parameters/flex.defn
flex_drive_file                               /usr/local/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                         zk807_out
write_orientations                            no
num_primary_scored_conformers_rescored        1
num_secondary_scored_conformers_written       1
rank_primary_ligands                          no
rank_secondary_ligands                        no
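Because a DOCK input file is plain text with one parameter name and its value per line, it is easy to check before launching a long job. The sketch below reads such text into a dictionary and lists the values of parameters whose names end in _file, so that missing inputs can be caught early. The helper names read_dock_input and input_files are hypothetical and are not part of the DOCK distribution.

```python
def read_dock_input(text):
    """Parse 'name value' lines into a dictionary (values kept as strings)."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # First token is the parameter name; the rest is its value.
            params[fields[0]] = " ".join(fields[1:])
    return params


def input_files(params):
    """Return the values of all parameters whose names end in '_file'."""
    return [value for name, value in params.items() if name.endswith("_file")]


# A few lines taken from the dock.in parameters used in this tutorial.
sample = """\
ligand_atom_file       zk807834.mol2
orient_ligand          yes
receptor_site_file     selected_spheres.sph
max_orientations       500
"""
settings = read_dock_input(sample)
print(input_files(settings))
```

In practice one might pass each returned name to os.path.exists before starting the run; note that grid_score_grid_prefix is a file prefix rather than a file name, so this simple filter does not cover the grid files.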
Here are our results, from the tail of dock.out (tail -25 dock.out):

----------------------------------------------------------------
Molecule: ZK807834
Secondary Score
    Grid Score:    -72.438095
    vdw:           -54.504940
    es:            -17.933155
The structure in gray is from the x-ray crystal structure and the structure in green is the docked structure. From the figure above we see that by and large DOCK has returned the same orientation as from the crystal structure. There are a few differences in conformation, however. Next, we will dock a small database of compounds (we used the zk807834 compound from the crystal structure as a template to build these compounds in Sybyl 7.0; hence, zk807834 also appears in this database). The database is in mol2 format. Download pyridins.mol2 from the course webpage.
Contents of dock_db.in
ligand_atom_file                              pyridins.mol2
limit_max_ligands                             no
skip_molecule                                 no
read_mol_solvation                            no
calculate_rmsd                                no
orient_ligand                                 yes
automated_matching                            yes
receptor_site_file                            selected_spheres.sph
max_orientations                              500
critical_points                               no
chemical_matching                             no
use_ligand_spheres                            no
flexible_ligand                               yes
min_anchor_size                               40
pruning_use_clustering                        yes
pruning_max_orients                           100
pruning_clustering_cutoff                     100
use_internal_energy                           yes
internal_energy_att_exp                       6
internal_energy_rep_exp                       12
internal_energy_dielectric                    4.0
use_clash_overlap                             no
bump_filter                                   no
score_molecules                               yes
contact_score_primary                         no
contact_score_secondary                       no
grid_score_primary                            yes
grid_score_secondary                          yes
grid_score_rep_rad_scale                      1
grid_score_vdw_scale                          1
grid_score_es_scale                           1
grid_score_grid_prefix                        grid
minimize_ligand                               yes
minimize_anchor                               yes
minimize_flexible_growth                      yes
use_advanced_simplex_parameters               no
simplex_max_cycles                            1
simplex_score_converge                        0.1
simplex_cycle_converge                        1.0
simplex_trans_step                            1.0
simplex_rot_step                              0.1
simplex_tors_step                             10.0
simplex_anchor_max_iterations                 500
simplex_grow_max_iterations                   500
simplex_final_min                             no
simplex_secondary_minimize_pose               no
simplex_random_seed                           0
atom_model                                    all
vdw_defn_file                                 /usr/local/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                /usr/local/dock6/parameters/flex.defn
Run DOCK on the database:

    nohup dock6 -i dock_db.in -o dock.out > dock_db.out &

The database docking gives slightly different results for the zk807834 compound. This run was performed on a 2.2 GHz Linux workstation.

Here are the structures of the compounds in the database:
[Figure: structures of the database compounds, labeled ZK807834, COMPD V and COMPD IV]
Ranked Scores:

-----------------------------------
Molecule: COMPD_V
Secondary Score
    Grid Score:    -75.869560
    vdw:           -57.236599
    es:            -18.632963
-----------------------------------
Molecule: ZK-807834
Secondary Score
    Grid Score:    -71.737541
    vdw:           -52.500782
    es:            -19.236761
-----------------------------------
Molecule: COMPD_IV
Secondary Score
    Grid Score:    -68.862778
    vdw:           -52.818031
    es:            -16.044750
-----------------------------------
Molecule: COMPD002
Secondary Score
    Grid Score:    -59.259888
    vdw:           -49.601242
    es:             -9.658647
-----------------------------------
Molecule: COMPD001
Secondary Score
    Grid Score:    -55.440678
    vdw:           -44.884964
    es:            -10.555713
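As a quick consistency check on the ranked scores above, the following sketch confirms that each reported Grid Score is the sum of its vdw and es components (to within rounding) and reproduces the ranking from most to least favorable. The numbers are copied from the output above; the variable names are only illustrative.

```python
# (reported grid score, vdw, es) in kcal/mol, copied from the ranked output.
reported = {
    "COMPD_V": (-75.869560, -57.236599, -18.632963),
    "ZK-807834": (-71.737541, -52.500782, -19.236761),
    "COMPD_IV": (-68.862778, -52.818031, -16.044750),
    "COMPD002": (-59.259888, -49.601242, -9.658647),
    "COMPD001": (-55.440678, -44.884964, -10.555713),
}

# Difference between the reported total and the sum of its components.
discrepancies = {
    name: abs(total - (vdw + es)) for name, (total, vdw, es) in reported.items()
}

# Most negative (most favorable) score first.
ranking = sorted(reported, key=lambda name: reported[name][0])

for name in ranking:
    total, vdw, es = reported[name]
    print(f"{name:10s} {total:12.6f}  (vdw + es = {vdw + es:.6f})")
```

The small differences in the last decimal place come from rounding in the printed output, not from an additional energy term.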
The pyridins_out_scored.mol2 database can be viewed using chimera (see http://www.cgl.ucsf.edu/chimera/) or Sybyl.
Compound     log Ki    Eint (kcal/mol)
CMPD001       6.55     -55.4
CMPD002       5.28     -59.3
CMPD_IV       1.93     -68.9
CMPD_V       -0.96     -75.9
ZK807834     -0.96     -71.7
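The relationship between log Ki and Eint in the values listed above can be quantified with a simple correlation. The sketch below computes the Pearson correlation coefficient with and without ZK807834, and the deviation of ZK807834 from a least-squares line fitted to the other four compounds. It is a plain-Python illustration, and five points are far too few to draw firm conclusions.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# (compound, log Ki, Eint in kcal/mol) as listed in the tutorial.
data = [
    ("CMPD001", 6.55, -55.4),
    ("CMPD002", 5.28, -59.3),
    ("CMPD_IV", 1.93, -68.9),
    ("CMPD_V", -0.96, -75.9),
    ("ZK807834", -0.96, -71.7),
]

r_all = pearson_r([d[1] for d in data], [d[2] for d in data])

others = [d for d in data if d[0] != "ZK807834"]
r_without_zk = pearson_r([d[1] for d in others], [d[2] for d in others])

# How far ZK807834 sits from the line defined by the other four compounds.
slope, intercept = fit_line([d[1] for d in others], [d[2] for d in others])
zk_residual = -71.7 - (slope * -0.96 + intercept)

print(f"r (all five)        = {r_all:.3f}")
print(f"r (without ZK807834) = {r_without_zk:.3f}")
print(f"ZK807834 residual    = {zk_residual:.2f} kcal/mol")
```

A positive residual means the docked score for ZK807834 is less favorable than the trend of the other compounds would suggest for its measured Ki.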
[Figure: plot of Eint (kcal/mol) versus log Ki for the five compounds]
The Ki data for compounds 001 and 002 were obtained from a review paper. [7] All other Ki data were obtained from the structure paper. [5] Our principal outlier is compound ZK807834. The other compounds in the series fall into line in this plot very well. Normally, we would sample more than just five compounds to establish a trend. This example is just illustrative of the application of the DOCK method and cautions one about the use of the simple scoring function (Eint = Evdw + Eelec) used in DOCK.

References:

1. Meng, E.C., B.K. Shoichet, and I.D. Kuntz, Automated docking with grid-based energy evaluation. J. Comp. Chem., 1992. 13: p. 505-524.
2. Shoichet, B.K., D.L. Bodian, and I.D. Kuntz, Molecular docking using shape descriptors. J. Comp. Chem., 1992. 13(3): p. 380-397.
3. Kuntz, I.D., et al., A geometric approach to macromolecule-ligand interactions. J. Mol. Biol., 1982. 161: p. 269-288.
4. Meng, E.C., et al., Orientational sampling and rigid-body minimization in molecular docking. Proteins, 1993. 17(3): p. 266-278.
5. Adler, M., et al., Preparation, Characterization and the Crystal Structure of the Inhibitor Zk807834 (Ci-1031) Complexed with Factor Xa. Biochemistry, 2000. 39(41): p. 12534-12542.
6. Pettersen, E., et al., UCSF Chimera - A Visualization System for Exploratory Research and Analysis. J. Comput. Chem., 2004. 25: p. 1605-1612.
7. Kontogiorgis, C. and D. Hadjipavlou-Litina, Current Trends in Quantitative Structure Activity Relationships on FXa inhibitors: Evaluation and Comparative Analysis. Med. Res. Rev., 2004. 24(6): p. 687-747.