Trajectory analysis

The Universe binds together the static topology (which atoms, how are they connected, what un-changing properties do the atoms possess (such as partial charge), ...) and the changing coordinate information, which is stored in the trajectory.

The length of a trajectory (number of frames) is

len(u.trajectory)

The standard way to assess each time step (or frame) in a trajectory is to iterate over the Universe.trajectory attribute (which is an instance of Reader class):

for ts in u.trajectory:
    print("Frame: %5d, Time: %8.3f ps" % (ts.frame, u.trajectory.time))
    print("Rgyr: %g A" % (u.atoms.radiusOfGyration(), ))

The time attribute contains the current time step. The Reader only contains information about one time step: imagine a cursor or pointer moving along the trajectory file. Where the cursor points, there’s you current coordinates, frame number, and time.

Normally you will collect the data in a list or array, e.g.

Rgyr = []
protein = u.selectAtoms("protein")
for ts in u.trajectory:
   Rgyr.append((u.trajectory.time, protein.radiusOfGyration()))
Rgyr = np.array(Rgyr)

Note

It is important to note that the coordinates and related properties calculated from the coordinates such as the radius of gyration change while selections such as protein in the example do not change when moving through a trajectory: You can define the selection once and the recalculate the property of interest for each frame of the trajectory.

The data can be plotted to give the graph below:

# quick plot
from pylab import *
plot(Rgyr[:,0], Rgyr[:,1], 'r--', lw=2, label=r"$R_G$")
xlabel("time (ps)")
ylabel(r"radius of gyration $R_G$ ($\AA$)")

What does the shape of the \(R_G(t)\) time series indicate?

Exercises 4

  1. Take the functions to calculate \(\theta_\text{NMP}\) and \(\theta_\text{LID}\)

    import numpy as np
    from numpy.linalg import norm
    
    def theta_NMP(u):
        """Calculate the NMP-CORE angle for E. coli AdK in degrees"""
        C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
        B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry()
        A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry()
        BA = A - B
        BC = C - B
        theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
        return np.rad2deg(theta)
    
    def theta_LID(u):
        """Calculate the LID-CORE angle for E. coli AdK in degrees"""
        C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry()
        B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
        A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry()
        BA = A - B
        BC = C - B
        theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
        return np.rad2deg(theta)
    

    and calculate the time series \(\theta_\text{NMP}(t)\) and \(\theta_\text{LID}(t)\).

    Plot them together in one plot.

  2. Plot \(\theta_\text{NMP}(t)\) against \(\theta_\text{LID}(t)\).

    What does the plot show?

    Why could such a plot be useful?

The code to generate the figure contains theta_LID() and theta_NMP().

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
import numpy as np
from numpy.linalg import norm

def theta_NMP(u):
    """Calculate the NMP-CORE angle for E. coli AdK in degrees"""
    C = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
    B = u.selectAtoms("resid 90:100 and (backbone or name CB)").centerOfGeometry()
    A = u.selectAtoms("resid 35:55 and (backbone or name CB)").centerOfGeometry()
    BA = A - B
    BC = C - B
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

def theta_LID(u):
    """Calculate the LID-CORE angle for E. coli AdK in degrees"""
    C = u.selectAtoms("resid 179:185 and (backbone or name CB)").centerOfGeometry()
    B = u.selectAtoms("resid 115:125 and (backbone or name CB)").centerOfGeometry()
    A = u.selectAtoms("resid 125:153 and (backbone or name CB)").centerOfGeometry()
    BA = A - B
    BC = C - B
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

if __name__ == "__main__":
    import MDAnalysis
    from MDAnalysis.tests.datafiles import PSF, DCD
    import matplotlib
    import matplotlib.pyplot as plt

    u = MDAnalysis.Universe(PSF, DCD)
    data = np.array([(u.trajectory.time, theta_NMP(u), theta_LID(u)) for ts in u.trajectory])
    time, NMP, LID = data.T


    # plotting
    degreeFormatter = matplotlib.ticker.FormatStrFormatter(r"%g$^\circ$")
    fig = plt.figure(figsize=(6,3))

    ax1 = fig.add_subplot(121)
    ax1.plot(time, NMP, 'b-', lw=2, label=r"$\theta_{\mathrm{NMP}}$")
    ax1.plot(time, LID, 'r-', lw=2, label=r"$\theta_{\mathrm{LID}}$")
    ax1.set_xlabel(r"time $t$ (ps)")
    ax1.set_ylabel(r"angle $\theta$")
    ax1.yaxis.set_major_formatter(degreeFormatter)
    ax1.legend(loc="best")

    ax2 = fig.add_subplot(122)
    ax2.plot(NMP, LID, 'k-', lw=3)
    ax2.set_xlabel(r"NMP-CORE angle $\theta_{\mathrm{NMP}}$")
    ax2.set_ylabel(r"LID-CORE angle $\theta_{\mathrm{LID}}$")
    ax2.xaxis.set_major_formatter(degreeFormatter)
    ax2.yaxis.set_major_formatter(degreeFormatter)
    ax2.yaxis.tick_right()
    ax2.yaxis.set_label_position("right")

    fig.subplots_adjust(left=0.12, right=0.88, bottom=0.2, wspace=0.15)

    for ext in ('svg', 'pdf', 'png'):
        fig.savefig("NMP_LID_angle_projection.{0}".format(ext))

Note that one would normally write the code more efficiently and generate the atom groups once and then pass them to a simple function to calculate the angle

def theta(A, B, C):
    """Calculate the angle between BA and BC for AtomGroups A, B, C"""
    B_center = B.centroid()
    BA = A.centroid() - B_center
    BC = C.centroid() - B_center
    theta = np.arccos(np.dot(BA, BC)/(norm(BA)*norm(BC)))
    return np.rad2deg(theta)

Bells and whistles

Quick data acquisition

Especially useful for interactive analysis in ipython –pylab using list comprehensions (implicit for loops):

protein = u.selectAtoms("protein")
data = np.array([(u.trajectory.time, protein.radiusOfGyration()) for ts in u.trajectory])
time, RG = data.T
plot(time, RG)

More on the trajectory iterator

One can directly jump to a frame by using “indexing syntax”:

>>> u.trajectory[50]
< Timestep 51 with unit cell dimensions array([  0.,   0.,   0.,  90.,  90.,  90.], dtype=float32) >
>>> ts.frame
51

You can also slice trajectories, e.g. if you want to start at the 10th frame and go to 10th before the end, and only use every 5th frame:

for ts in u.trajectory[9:-10:5]:
    print(ts.frame)
    ...

(although doing this on Gromacs XTC and TRR trajectories is currently much slower than for DCDs.)

Note

Trajectory indexing and slicing uses 0-based indices (as in standard Python) but MDAnalysis numbers frames starting with 1 (for historical reasons and according to the practice of all MD codes).

Table Of Contents

Previous topic

Working with AtomGroups

Next topic

Intermediate Level MDAnalysis hacks

This Page