Structural Bioinformatics with VMD MultiSeq

Open VMD. Its MultiSeq plugin provides a convenient interface for structural bioinformatics.

Loading structures of AdK from different organisms

We will manually select one PDB code from each organism. (Normally, one would select more carefully, e.g., by taking resolution into account.)

We want to analyze AdK structures with PDB codes 4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx.

Manual download of pre-packaged structures

  1. For this practical, download a zip file with all PDB files: adk_pdbs.zip.
  2. Uncompress the file (try double clicking).
  3. Load each structure into VMD by using File ‣ Load New Molecule.

Alternatively, you may be able to automate loading of the PDB files with Tcl code (this is advanced usage):

Open Extensions ‣ Tk Console and type the following:

# "path/to/adk_pdbs" must be the file system path to the unzipped
# directory
# Ask your instructor for help.
cd path/to/adk_pdbs

Now you can load all the files from the command line.

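# PDB codes of the AdK structures; each file <code>.pdb must be in the current directory
set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
foreach pdb $pdbcodes {
    set pdbfile ${pdb}.pdb
    puts "Loading $pdbfile..."
    # load the file as a new molecule
    mol new $pdbfile
}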
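
The loop above assumes that every file is present. A slightly more defensive variant, sketched here under the assumption that the PDB files are named <code>.pdb as in the zip file, skips missing files and returns their codes so that you can see what still needs to be downloaded. The procedure name load_pdbs is made up for this example; mol new and its waitfor option are standard VMD commands.

```tcl
# Hypothetical helper (not part of VMD): load every PDB file that exists
# in directory "dir" and return the list of codes whose file was missing.
proc load_pdbs {pdbcodes {dir .}} {
    set missing {}
    foreach pdb $pdbcodes {
        set pdbfile [file join $dir ${pdb}.pdb]
        if {[file exists $pdbfile]} {
            puts "Loading $pdbfile..."
            # "waitfor all" makes VMD finish reading before continuing
            mol new $pdbfile waitfor all
        } else {
            lappend missing $pdb
        }
    }
    return $missing
}
```

In the Tk Console, after defining pdbcodes as above, typing set missing [load_pdbs $pdbcodes] loads whatever is available and leaves the codes of the missing files in the variable missing.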

Manual download from the Protein Data Bank

Download the following structures with PDB codes 4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx.

  1. Open the Download dialog.

  2. Copy and paste the PDB codes into the box “Download: Coordinates & Experimental Data”:

    4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt
    1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx

  3. Select the checkmarks

     • for PDB only
     • for uncompressed

  4. Launch Download (and save the files to a directory where you do the work for the practical).

  5. Load each structure into VMD by using File ‣ Load New Molecule.

Automatic download from the Protein Data Bank

Warning

As of November 2017, the following does not work with any VMD version prior to VMD 1.9.4 alpha because file locations in the Protein Data Bank were reorganized. Use the “manual” download recipe above instead.

Open Extensions ‣ Tk Console and type

set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake  4pzl 1zin 4k46 1s3g  3be4 3fb4 3tlx}
foreach pdb $pdbcodes {puts "Loading $pdb..."; mol new $pdb}

(4np6 was excluded because it does not parse easily, and I did not have time to check what was wrong.)

Using MultiSeq

In VMD, load Extensions ‣ Analysis ‣ MultiSeq. (This can take a moment when it downloads updates.)

Manually delete all chains other than chain A (B, C, …): highlight them and delete.
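
If deleting chains by hand becomes tedious, one scripted alternative is to write chain A of every loaded molecule to its own PDB file and to load those files into a fresh VMD session before starting MultiSeq. This is only a sketch: the procedure name write_chain_a and the _A.pdb file suffix are made up for this example, and it assumes that the chain of interest is labeled A in every structure, which you should verify. The atomselect and molinfo commands are standard VMD.

```tcl
# Hypothetical helper (not part of VMD or MultiSeq): for every loaded
# molecule, write the atoms of one chain to <name>_<chain>.pdb in "outdir".
# Returns the list of files that were written.
proc write_chain_a {{outdir .} {chain A}} {
    set written {}
    foreach molid [molinfo list] {
        # molecule name as shown in the VMD Main window, e.g. 1ake.pdb
        set name [file rootname [molinfo $molid get name]]
        set sel [atomselect $molid "chain $chain"]
        if {[$sel num] > 0} {
            set outfile [file join $outdir ${name}_${chain}.pdb]
            $sel writepdb $outfile
            lappend written $outfile
        } else {
            puts "Warning: no chain $chain in $name"
        }
        # free the selection once it is no longer needed
        $sel delete
    }
    return $written
}
```

The selection keeps everything labeled chain A, including bound ligands such as AP5 in 1ake, so later steps that refer to the ligand still work.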

Perform a STAMP structural alignment: In MultiSeq, choose Tools ‣ Stamp Structural Alignment.

Structural conservation

\(Q\) is a measure of structural similarity. It accounts for the fraction of similar native contacts between the aligned residues in two proteins [Eastwood2001]. \(Q=1\) implies that the structures are identical. A low \(Q\) (0.1-0.3) means that the structures are not aligned well, i.e., only a small fraction of the Cα atoms superimpose.

\(Q\) per residue (Qres) is the contribution of each residue to the overall \(Q\) value of the aligned structures.
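
For orientation, the contact-based similarity measure described in [Eastwood2001] has the general form sketched below. The notation is introduced here and is not part of the practical: \(r_{ij}\) and \(r_{ij}^{N}\) are the Cα-Cα distances between residues \(i\) and \(j\) in the two structures being compared, \(\sigma_{ij}\) is a tolerance that grows with the sequence separation \(|i-j|\), and \(\mathcal{N}\) is a normalization constant chosen so that identical structures give \(Q=1\). The sum runs over pairs of aligned residues (nearest neighbors in sequence are typically left out). MultiSeq's own implementation may differ in detail, for example in how gaps in the alignment are treated.

\[
Q = \mathcal{N} \sum_{i<j} \exp\left[ -\frac{\left(r_{ij} - r_{ij}^{N}\right)^{2}}{2\sigma_{ij}^{2}} \right]
\]

Each term lies between 0 and 1: it is close to 1 when a pair distance is nearly the same in both structures and falls toward 0 as the two distances diverge.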

  1. In the MultiSeq window, choose View ‣ Coloring ‣ Qres.
  2. Observe the coloring in the sequence alignment and in the graphics window (projected onto the structures).

Note that the CORE domain has high Qres. This indicates that it superimposes well in all structures.

Sequence conservation

Color by Sequence Identity (View ‣ Coloring ‣ Sequence Identity).

Note the residues that are 100% conserved (Search ‣ Select Residues…: Where Sequence Identity >= 100):

  • R, K
  • G, P
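
The selection above picks the alignment columns in which every sequence carries the same residue. As a rough illustration of that criterion outside MultiSeq, the following plain Tcl sketch scans a list of aligned, equal-length sequences and returns the fully conserved columns. The procedure name conserved_columns and the three toy sequences are invented for this example; they are not AdK sequences, and MultiSeq's own handling of gaps may differ.

```tcl
# Hypothetical helper: return {index residue} pairs for every alignment
# column in which all sequences share the same residue. Columns in which
# the first sequence has a gap ("-") are never counted as conserved.
proc conserved_columns {seqs} {
    set first [lindex $seqs 0]
    set result {}
    for {set i 0} {$i < [string length $first]} {incr i} {
        set res [string index $first $i]
        if {$res eq "-"} continue
        set conserved 1
        foreach seq $seqs {
            if {[string index $seq $i] ne $res} {
                set conserved 0
                break
            }
        }
        if {$conserved} {
            lappend result [list $i $res]
        }
    }
    return $result
}

# Toy alignment (invented sequences, 0-based column indices)
set toy {GAGKGT-R GPGKGTQR GSGKGS-R}
puts [conserved_columns $toy]
# prints: {0 G} {2 G} {3 K} {4 G} {7 R}
```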

Switch to 1ake and create a new representation for chain A and resname AP5 (use CPK or VDW and color by name).
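
The same representation can also be created from the Tk Console. The sketch below assumes that the 1ake molecule was loaded under a name that starts with 1ake (for example 1ake.pdb); the procedure name add_ap5_rep is made up for this example, while the mol and molinfo subcommands are standard VMD.

```tcl
# Hypothetical helper: add a VDW representation, colored by atom name,
# for the AP5 ligand bound to chain A of the molecule named 1ake*.
# Returns the molecule ID that received the new representation.
proc add_ap5_rep {{pattern 1ake*}} {
    foreach molid [molinfo list] {
        if {[string match -nocase $pattern [molinfo $molid get name]]} {
            # make this the top molecule ("switch to" it)
            mol top $molid
            # settings used by the next "mol addrep"
            mol representation VDW
            mol color Name
            mol selection "chain A and resname AP5"
            mol addrep $molid
            return $molid
        }
    }
    error "no loaded molecule matches $pattern"
}
```

Calling add_ap5_rep without arguments looks for 1ake; replacing VDW with CPK in the procedure gives the other drawing style mentioned above.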

What is the role of the conserved R (Arg) and K (Lys)? (MultiSeq View ‣ Highlight Color ‣ ResType).

Phylogenetic tree

A phylogenetic tree displays evolutionary relationships. You need an alignment to create one.

In MultiSeq, choose Tools ‣ Phylogenetic Tree and create the tree

  • using Percent Identity
  • labeled with the full organism name
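
A tree built from Percent Identity starts from the pairwise identity between the aligned sequences. The plain Tcl sketch below shows one common way to compute it: count identical residues over the columns in which neither sequence has a gap. This convention, the procedure name percent_identity, and the toy sequences are assumptions for illustration; MultiSeq may normalize differently.

```tcl
# Hypothetical helper: percent identity of two aligned sequences,
# counted over columns in which neither sequence has a gap ("-").
proc percent_identity {a b} {
    set same 0
    set compared 0
    foreach x [split $a ""] y [split $b ""] {
        # skip gap columns and any overhang if the lengths differ
        if {$x eq "-" || $y eq "-" || $x eq "" || $y eq ""} continue
        incr compared
        if {$x eq $y} {incr same}
    }
    if {$compared == 0} {return 0.0}
    return [expr {100.0 * $same / $compared}]
}

# Toy alignment (invented names and sequences, not AdK data)
set aln {seqA GAGKGT-R seqB GPGKGTQR seqC GSGKGS-R}
foreach {n1 s1} $aln {
    foreach {n2 s2} $aln {
        # print each unordered pair once
        if {[string compare $n1 $n2] < 0} {
            puts [format "%s %s %.1f" $n1 $n2 [percent_identity $s1 $s2]]
        }
    }
}
# prints:
# seqA seqB 85.7
# seqA seqC 71.4
# seqB seqC 71.4
```

Pairs with a high percent identity end up close together in a tree built from these values.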

Note that this tree is based on the structural alignment, so the visible conformational change obscures some of the evolutionary relationships.

References

[Eastwood2001] Eastwood, M.P., C. Hardin, Z. Luthey-Schulten, and P.G. Wolynes. “Evaluating the protein structure-prediction schemes using energy landscape theory.” IBM J. Res. Dev. 45: 475-497, 2001. URL: http://www.research.ibm.com/journal/rd/453/eastwood.pdf