# Trajectory analysis¶

The Universe binds together the static topology (which atoms, how are they connected, what un-changing properties do the atoms possess (such as partial charge), ...) and the changing coordinate information, which is stored in the trajectory.

The length of a trajectory (number of frames) is

len(u.trajectory)


The standard way to assess each time step (or frame) in a trajectory is to iterate over the Universe.trajectory attribute (which is an instance of Reader class):

for ts in u.trajectory:
print("Frame: %5d, Time: %8.3f ps" % (ts.frame, u.trajectory.time))
print("Rgyr: %g A" % (u.atoms.radiusOfGyration(), ))


The time attribute contains the current time step. The Reader only contains information about one time step: imagine a cursor or pointer moving along the trajectory file. Where the cursor points, there’s you current coordinates, frame number, and time.

Normally you will collect the data in a list or array, e.g.

Rgyr = []
protein = u.selectAtoms("protein")
for ts in u.trajectory:
Rgyr = np.array(Rgyr)


Note

It is important to note that the coordinates and related properties calculated from the coordinates such as the radius of gyration change while selections such as protein in the example do not change when moving through a trajectory: You can define the selection once and the recalculate the property of interest for each frame of the trajectory.

The data can be plotted to give the graph below:

# quick plot
from pylab import *
plot(Rgyr[:,0], Rgyr[:,1], 'r--', lw=2, label=r"$R_G$")
xlabel("time (ps)")
ylabel(r"radius of gyration $R_G$ ($\AA$)")


What does the shape of the $$R_G(t)$$ time series indicate?

## Exercises 4¶

1. Take the functions to calculate $$\theta_\text{NMP}$$ and $$\theta_\text{LID}$$

import numpy as np
from numpy.linalg import norm

def theta_NMP(u):
"""Calculate the NMP-CORE angle for E. coli AdK in degrees"""
C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry()
A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry()
BA = A - B
BC = C - B
theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))

def theta_LID(u):
"""Calculate the LID-CORE angle for E. coli AdK in degrees"""
C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry()
B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry()
BA = A - B
BC = C - B
theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))


and calculate the time series $$\theta_\text{NMP}(t)$$ and $$\theta_\text{LID}(t)$$.

Plot them together in one plot.

2. Plot $$\theta_\text{NMP}(t)$$ against $$\theta_\text{LID}(t)$$.

What does the plot show?

Why could such a plot be useful?

The code to generate the figure contains theta_LID() and theta_NMP().

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 import numpy as np from numpy.linalg import norm def theta_NMP(u): """Calculate the NMP-CORE angle for E. coli AdK in degrees""" C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry() B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry() A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry() BA = A - B BC = C - B theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC))) return np.rad2deg(theta) def theta_LID(u): """Calculate the LID-CORE angle for E. coli AdK in degrees""" C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry() B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry() A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry() BA = A - B BC = C - B theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC))) return np.rad2deg(theta) if __name__ == "__main__": import MDAnalysis from MDAnalysis.tests.datafiles import PSF, DCD import matplotlib import matplotlib.pyplot as plt u = MDAnalysis.Universe(PSF, DCD) data = np.array([(u.trajectory.time, theta_NMP(u), theta_LID(u)) for ts in u.trajectory]) time, NMP, LID = data.T # plotting degreeFormatter = matplotlib.ticker.FormatStrFormatter(r"%g$^\circ$") fig = plt.figure(figsize=(6,3)) ax1 = fig.add_subplot(121) ax1.plot(time, NMP, 'b-', lw=2, label=r"$\theta_{\mathrm{NMP}}$") ax1.plot(time, LID, 'r-', lw=2, label=r"$\theta_{\mathrm{LID}}$") ax1.set_xlabel(r"time $t$ (ps)") ax1.set_ylabel(r"angle $\theta$") ax1.yaxis.set_major_formatter(degreeFormatter) ax1.legend(loc="best") ax2 = fig.add_subplot(122) ax2.plot(NMP, LID, 'k-', lw=3) ax2.set_xlabel(r"NMP-CORE angle $\theta_{\mathrm{NMP}}$") ax2.set_ylabel(r"LID-CORE angle $\theta_{\mathrm{LID}}$") ax2.xaxis.set_major_formatter(degreeFormatter) ax2.yaxis.set_major_formatter(degreeFormatter) ax2.yaxis.tick_right() ax2.yaxis.set_label_position("right") fig.subplots_adjust(left=0.12, right=0.88, bottom=0.2, wspace=0.15) for ext in ('svg', 'pdf', 'png'): fig.savefig("NMP_LID_angle_projection.{0}".format(ext)) 

Note that one would normally write the code more efficiently and generate the atom groups once and then pass them to a simple function to calculate the angle

def theta(A, B, C):
"""Calculate the angle between BA and BC for AtomGroups A, B, C"""
B_center = B.centroid()
BA = A.centroid() - B_center
BC = C.centroid() - B_center
theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))


## Bells and whistles¶

Quick data acquisition

Especially useful for interactive analysis in ipython –pylab using list comprehensions (implicit for loops):

protein = u.selectAtoms("protein")
data = np.array([(u.trajectory.time, protein.radiusOfGyration()) for ts in u.trajectory])
time, RG = data.T
plot(time, RG)


More on the trajectory iterator

>>> u.trajectory[50]
< Timestep 51 with unit cell dimensions array([  0.,   0.,   0.,  90.,  90.,  90.], dtype=float32) >
>>> ts.frame
51


You can also slice trajectories, e.g. if you want to start at the 10th frame and go to 10th before the end, and only use every 5th frame:

for ts in u.trajectory[9:-10:5]:
print(ts.frame)
...


(although doing this on Gromacs XTC and TRR trajectories is currently much slower than for DCDs.)

Note

Trajectory indexing and slicing uses 0-based indices (as in standard Python) but MDAnalysis numbers frames starting with 1 (for historical reasons and according to the practice of all MD codes).