.. -*- coding: utf-8 -*-

.. _multiseq:

===========================================
Structural Bioinformatics with VMD MultiSeq
===========================================

Open :program:`VMD`. With the `MultiSeq plugin`_ it provides a
convenient interface to do structural bioinformatics.


Loading structures of AdK from different organisms
==================================================

We will manually select one PDB code from each organism. (Normally,
one would do a more careful selection, e.g., taking resolution into
account.)

We want to analyze AdK structures with PDB codes **4jzk 1ak2 2c9y 1aky
2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx**.


Manual download of pre-packaged structures
------------------------------------------

1. For this practical, download a zip file with all PDB files:
   :download:`adk_pdbs.zip <_downloads/adk_pdbs.zip>`.
2. Uncompress the file (try double clicking).
3. Load each structure into VMD by using :menuselection:`File --> Load
   New Molecule`.

-----

You might be able to automate loading of the PDB files with Tcl code
(this is advanced usage). Open :menuselection:`Extensions --> Tk
Console` and type the following:

.. code-block:: tcl

   # "path/to/adk_pdbs" must be the file system path to the unzipped
   # directory
   # Ask your instructor for help.
   cd path/to/adk_pdbs

Now you can load all the files from the command line.

.. code-block:: tcl

   set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
   foreach pdb $pdbcodes {
       set pdbfile ${pdb}.pdb
       puts "Loading $pdbfile..."
       mol new $pdbfile
   }


Manual download from the Protein Databank
-----------------------------------------

Download the structures with PDB codes **4jzk 1ak2 2c9y 1aky 2ak3 1zd8
2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx**.

1. Open the *Download* dialog of the Protein Databank.
2. Copy and paste the PDB codes into the box "Download: Coordinates &
   Experimental Data"::

      4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx

3. Select the checkmark

   - for **PDB** only.
   - for **uncompressed**

4. *Launch Download* (and save the files to a directory where you do
   the work for the practical).
5. Load each structure into VMD by using :menuselection:`File --> Load
   New Molecule`.


Automatic download from the Protein Databank
--------------------------------------------

.. warning::

   As of November 2017 the following does not work with any version
   of VMD prior to VMD 1.9.4 alpha because the file locations in the
   Protein Databank were reorganized. Use the "manual" download recipe
   above.

Open :menuselection:`Extensions --> Tk Console` and type

.. code-block:: tcl

   set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
   foreach pdb $pdbcodes {
       puts "Loading $pdb..."
       mol new $pdb
   }

(4np6 excluded because it does not parse easily, and I did not have
time to check what was wrong.)


Using multiseq
--------------

In VMD, load :menuselection:`Extensions --> Analysis -->
MultiSeq`. (This can take a moment when it downloads updates.)

Manually delete all chains B, C, ... (highlight and :kbd:`delete`.)

.. Delete 4np6 (STAMP complains). Hide rep.

Perform a STAMP structural alignment: In MultiSeq choose
:menuselection:`Tools --> Stamp Structural Alignment`.

.. For changing all representations to Tube:
.. 1. go to top molecule
.. 2. change active rep to Tube (and color)
.. 3. :menuselection:`Extensions --> Visualization --> Clone Representation`:
..    From Top to All


Structural conservation
=======================

:math:`Q` is a measure of structural similarity: it accounts for the
fraction of similar native contacts between the aligned residues in
two proteins [Eastwood2001]_. :math:`Q=1` implies that the structures
are identical. A low :math:`Q` (0.1-0.3) indicates that the structures
are not aligned well, i.e., only a small fraction of the Cα atoms
superimpose. :math:`Q` per residue (``Qres``) is the contribution from
each residue to the overall :math:`Q` value of the aligned structures.

1. In the MultiSeq window, choose :menuselection:`View --> Coloring
   --> Qres`.
2. Observe the coloring in the sequence alignment and in the graphics
   window (projected on the structures).

Note that the CORE domain has high ``Qres``. This indicates that it
superimposes well in all structures.


Sequence conservation
=====================

Color by *Sequence Identity*.

Note the residues that are 100% conserved (:menuselection:`Search -->
Select Residues...`: Where Sequence Identity >= 100):

- R, K
- G, P

Switch to *1ake* and create a new rep for ``chain A and resname AP5``
(use *CPK* or *VDW* and color by *name*).

What is the role of the conserved R (Arg) and K (Lys)? (MultiSeq
:menuselection:`View --> Highlight Color --> ResType`).


Phylogenetic tree
=================

A phylogenetic tree displays evolutionary relationships. You need an
alignment to create a tree.

Create the tree with :menuselection:`Tools --> Phylogenetic Tree`:

- using Percent Identity
- label with full organism name

Note that this tree is based on the structural alignment, and the
visible conformational change obscures some of the evolutionary
relationships.


References
==========

.. [Eastwood2001] Eastwood, M.P., C. Hardin, Z. Luthey-Schulten, and
   P.G. Wolynes. “Evaluating the protein structure-prediction schemes
   using energy landscape theory.” IBM J. Res. Dev. 45: 475-497, 2001.
   URL: http://www.research.ibm.com/journal/rd/453/eastwood.pdf

.. _`MultiSeq plugin`: http://www.ks.uiuc.edu/Research/vmd/plugins/multiseq/