Open VMD. Together with the MultiSeq plugin, it provides a convenient interface for structural bioinformatics.
We will manually select one PDB code from each organism. (Normally, one would do a more careful selection, e.g., taking resolution into account.)
We want to analyze AdK structures with PDB codes 4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx.
adk_pdbs.zip
You might be able to automate loading of the PDB files with Tcl code (this is advanced usage):
Open
and type the following:
# "path/to/adk_pdbs" must be the file system path to the unzipped
# directory
# Ask your instructor for help.
cd path/to/adk_pdbs
Now you can load all the files from the command line.
set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
foreach pdb $pdbcodes {
    set pdbfile ${pdb}.pdb
    puts "Loading $pdbfile..."
    mol new $pdbfile
}
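Before loading, it can help to confirm that every file is actually present in the current directory. The following is a minimal plain-Tcl sketch of such a check, not part of the original recipe: the procedure name missing_pdb_files is hypothetical, and the mol command is only available inside VMD, so the loading step is skipped when the script runs in an ordinary tclsh.

```tcl
# Hypothetical helper (not part of VMD): return the PDB codes whose
# <code>.pdb file is absent from directory dir.
proc missing_pdb_files {codes {dir .}} {
    set missing {}
    foreach code $codes {
        if {![file exists [file join $dir ${code}.pdb]]} {
            lappend missing $code
        }
    }
    return $missing
}

set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
set missing [missing_pdb_files $pdbcodes]
if {[llength $missing] > 0} {
    puts "Missing files for: $missing"
} elseif {[llength [info commands mol]] > 0} {
    # "mol" exists only inside VMD; this branch is skipped in a plain tclsh
    foreach pdb $pdbcodes {
        mol new ${pdb}.pdb
    }
}
```

If the check reports missing codes, verify that cd pointed to the unzipped directory before retrying.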
Download the following structures with PDB codes 4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx.
1. Open the Download dialog.
2. Copy and paste the PDB codes into the box “Download: Coordinates & Experimental Data”:
   4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt
   1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx
3. Select the checkmarks for PDB only and for uncompressed.
4. Launch Download (and save the files to a directory where you do the work for the practical).
Load each structure into VMD by using
Warning
As of November 2017, the following does not work with any version of VMD prior to VMD 1.9.4 alpha because file locations in the Protein Data Bank were reorganized. Use the “manual” download recipe above.
Open
and type
set pdbcodes {4jzk 1ak2 2c9y 1aky 2ak3 1zd8 2ar7 1p3j 3gmt 1ake 4pzl 1zin 4k46 1s3g 3be4 3fb4 3tlx}
foreach pdb $pdbcodes {puts "Loading $pdb..."; mol new $pdb}
(4np6 excluded because it does not parse easily, and I did not have time to check what was wrong.)
In VMD, load the MultiSeq plugin. (This can take a moment when it downloads updates.)
Manually delete all chains B, C, … (highlight and delete
.)
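As an alternative to deleting chains by hand, the downloaded files could be reduced to chain A before they are loaded. The following plain-Tcl sketch is an illustration under stated assumptions, not part of the practical: the procedure name keep_chain and the output file name are hypothetical, and the code relies on the fixed-column PDB format, in which the chain identifier of ATOM, HETATM, ANISOU, and TER records sits in column 22.

```tcl
# Hypothetical helper: keep only the coordinate records of one chain.
# Records without a chain column (HEADER, CRYST1, END, ...) pass through.
proc keep_chain {pdbText {chain A}} {
    set kept {}
    foreach line [split $pdbText "\n"] {
        set record [string trim [string range $line 0 5]]
        if {$record in {ATOM HETATM ANISOU TER}} {
            # column 22 of the PDB format (string index 21) is the chain ID
            if {[string index $line 21] ne $chain} {
                continue
            }
        }
        lappend kept $line
    }
    return [join $kept "\n"]
}

# Usage sketch: write a chain-A-only copy of one file (names are examples)
if {[file exists 1ake.pdb]} {
    set in [open 1ake.pdb r]
    set text [read $in]
    close $in
    set out [open 1ake_A.pdb w]
    puts -nonewline $out [keep_chain $text A]
    close $out
}
```

HETATM records are kept on purpose so that ligands bound to chain A, such as AP5 in 1ake, stay in the file.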
Perform a STAMP structural alignment: In MultiSeq choose
.
\(Q\) (Qres) is a measure of structural similarity.
\(Q\) is a parameter that indicates structural identity. \(Q\) accounts for the fraction of similar native contacts between the aligned residues in two proteins [Eastwood2001]. \(Q=1\) implies that the structures are identical. When \(Q\) is low (0.1 to 0.3), the structures are not aligned well, i.e., only a small fraction of the Cα atoms superimpose. \(Q\) per residue (Qres) is the contribution of each residue to the overall \(Q\) value of the aligned structures.
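To make the idea behind \(Q\) concrete, the following plain-Tcl sketch computes a Q-like score for two equally long, already aligned lists of Cα coordinates. It assumes the Gaussian pair form often quoted for this measure, Q = 2/((N-1)(N-2)) Σ exp(-(r_ij - r'_ij)^2 / (2 σ_ij^2)) over pairs with j > i+1 and σ_ij = |i-j|^0.15. The procedure names, the exact width σ_ij, and the absence of any gap handling are assumptions of this sketch; the implementation in MultiSeq may differ in detail.

```tcl
# Euclidean distance between two points given as {x y z} lists
proc ca_distance {p q} {
    lassign $p x1 y1 z1
    lassign $q x2 y2 z2
    return [expr {sqrt(($x1-$x2)**2 + ($y1-$y2)**2 + ($z1-$z2)**2)}]
}

# Q-like score (assumed Gaussian form, see text) for two aligned CA traces
proc q_score {coordsA coordsB} {
    set n [llength $coordsA]
    if {$n != [llength $coordsB] || $n < 3} {
        error "need two equally long coordinate lists with at least 3 residues"
    }
    set sum 0.0
    for {set i 0} {$i < $n} {incr i} {
        # skip sequence neighbours: only pairs with j > i+1 contribute
        for {set j [expr {$i + 2}]} {$j < $n} {incr j} {
            set ra [ca_distance [lindex $coordsA $i] [lindex $coordsA $j]]
            set rb [ca_distance [lindex $coordsB $i] [lindex $coordsB $j]]
            set d [expr {$ra - $rb}]
            set sigma [expr {pow($j - $i, 0.15)}]  ;# assumed width
            set sum [expr {$sum + exp(-1.0 * $d * $d / (2.0 * $sigma * $sigma))}]
        }
    }
    # normalise by the number of contributing pairs, (n-1)(n-2)/2
    return [expr {2.0 * $sum / (($n - 1) * ($n - 2))}]
}

# Toy example: a straight four-residue trace against a bent one
set straight {{0 0 0} {3.8 0 0} {7.6 0 0} {11.4 0 0}}
set bent     {{0 0 0} {3.8 0 0} {3.8 3.8 0} {0 3.8 0}}
puts "Q(straight, straight) = [q_score $straight $straight]"
puts "Q(straight, bent)     = [q_score $straight $bent]"
```

Comparing a structure with itself gives a score of 1, because every pair distance matches; any distortion of the pair distances lowers the score.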
Note that the CORE domain has high Qres. This indicates that it superimposes well in all structures.
Color by Sequence Identity.
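What a per-column identity score measures can be illustrated outside MultiSeq with a small plain-Tcl sketch. It assumes one simple definition, namely the percentage of sequences that carry the most common residue of a column, with gaps never counting as identical; MultiSeq may define its coloring metric differently. The procedure name column_identity and the toy alignment are hypothetical.

```tcl
# Hypothetical helper: percent identity of each alignment column, taken here
# as the share of sequences that carry the column's most common residue.
proc column_identity {alignedSeqs} {
    set nseq [llength $alignedSeqs]
    set len [string length [lindex $alignedSeqs 0]]
    set result {}
    for {set col 0} {$col < $len} {incr col} {
        array unset counts
        set best 0
        foreach seq $alignedSeqs {
            set res [string index $seq $col]
            if {$res eq "-"} {
                continue   ;# gaps never count as identical
            }
            if {![info exists counts($res)]} {
                set counts($res) 0
            }
            set c [incr counts($res)]
            if {$c > $best} {
                set best $c
            }
        }
        lappend result [expr {100.0 * $best / $nseq}]
    }
    return $result
}

# Toy alignment of three short sequences (made up for illustration)
set aln {GRKG- GRAGT GRKGT}
set col 0
foreach pct [column_identity $aln] {
    if {$pct >= 100.0} {
        puts "column $col is 100% conserved"
    }
    incr col
}
```

Columns that reach 100 in this sketch correspond to the fully conserved positions that the identity coloring is meant to highlight.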
Note the residues that are 100% conserved (
: Where Sequence Identity >= 100):
Switch to 1ake and create a new representation for chain A and resname AP5 (use CPK or VDW and color by name).
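The same representation can also be created from VMD's Tcl interface instead of the graphical dialog. The following is a minimal sketch: it uses the VMD commands mol and molinfo, so it only does something inside VMD, and it assumes that the loaded molecule's name contains 1ake (for example because it was loaded from 1ake.pdb). The procedure name add_ap5_rep is hypothetical.

```tcl
# Hypothetical helper: add a VDW representation, colored by atom name,
# for the AP5 ligand in chain A of the given molecule.
proc add_ap5_rep {molid} {
    mol selection "chain A and resname AP5"
    mol representation VDW
    mol color Name
    mol addrep $molid
}

# "molinfo" exists only inside VMD; in a plain tclsh nothing happens here
if {[llength [info commands molinfo]] > 0} {
    foreach molid [molinfo list] {
        # assumption: the molecule name contains the PDB code 1ake
        if {[string match -nocase "*1ake*" [molinfo $molid get name]]} {
            add_ap5_rep $molid
        }
    }
}
```

Replacing VDW with CPK in the helper gives the ball-and-stick style mentioned above.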
What is the role of the conserved R (Arg) and K (Lys)? (MultiSeq
).
You need an alignment to create a tree. A phylogenetic tree displays evolutionary relationships.
.
Note that this tree is based on the structural alignment, and the visible conformational change obscures some of the evolutionary relationships.
[Eastwood2001] Eastwood, M.P., C. Hardin, Z. Luthey-Schulten, and P.G. Wolynes. “Evaluating the protein structure-prediction schemes using energy landscape theory.” IBM J. Res. Dev. 45: 475-497, 2001. URL: http://www.research.ibm.com/journal/rd/453/eastwood.pdf