Practical Session 02

These files will be available for the duration of the course.

See also Introduction to the commandline and the Introduction to Unix (the IntroductiontoUNIX.pdf can be downloaded from this directory).

Name               Last modified      Size
skeleton.sh        2013-01-17 10:01   1.3K
assignment_04.txt  2013-01-17 18:24   13K
P02a.pdf           2013-01-17 10:45   549K
P02.pdf            2013-01-17 18:44   561K

Unix commands

Linux and Mac OS X are Unix-like operating systems. You will get the most out of a computer running Unix by interacting with it via the command line. The command line is a text console where you type commands and the operating system displays text messages. You can run such a "terminal session" via a terminal program (such as xterm, kterm, iTerm) or remotely via ssh. The command line is provided by a "shell", a program that interfaces between the user and the "kernel", the code that manipulates the hardware.

Our preferred shell is the bash shell.

Any Unix comes with a set of programs that fulfill a wide range of purposes. The Unix philosophy is for each tool to do one thing but to do it well. Therefore, it is very useful to know a range of tools and what they can do. Unix systems come in many flavors and they often ship with slightly different tools, or with tools that have the same name but different usage. This can be annoying, and the best you can do is to learn about the systems that you work with most of the time. (It might help to know that Mac OS X is a "BSD Unix", which is different from Linux, just in case you don't find some Linux commands in Mac OS X.)
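As a small illustration of the "one tool, one job" idea, the sketch below chains a few standard tools with pipes to find the most frequent word in a list. The word list is made up for this example; uname -s is included only to show which flavor of Unix you are on (typically "Linux" or "Darwin" for Mac OS X).

```shell
# Which Unix flavor is this? (prints e.g. Linux or Darwin)
uname -s

# A made-up word list, one word per line.
words=$(printf '%s\n' apple banana apple cherry apple banana)

# sort groups identical lines, uniq -c counts each group,
# sort -rn puts the largest count first, head keeps the top line,
# and awk prints the second field (the word itself).
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "most frequent word: $top"
```

None of the tools knows about the others; the shell connects them, which is why knowing a range of small tools pays off.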

There are plenty of command references and "cheat sheets" available on the internet.

Regular expressions

The grep -E command uses extended regular expressions. (sed and the search/replace function in vim use something very similar called "basic regular expressions"; see the end of re_format for details.) Simple REGEX (as used by grep -E or egrep):
  word          matches "word" literally anywhere
  ^word         matches "word" at beginning of line
  word$         matches "word" at end of line
  a *b          matches ab, a b, a   b, i.e. ' *' is zero or more
                spaces (generally, 'X*' matches zero or more X)
  a +b          matches a b, a  b, ..., i.e. ' +' is one or more
                spaces (NOTE: in "basic regular expressions"
                as used in plain grep this is '\+' or '\{1,\}',
                i.e. 'a \+b')
                (generally, 'X+' matches one or more X)
  a[A-Z]b       matches aAb, aBb, ..., aZb  (range expression)
  a[0-9][0-9]b  matches a00b, a01b, a02b, ..., a99b
  a[A-Z]*b      matches ab, aAb, aABb, ... (zero or more capitals)
  a[A-Za-z]b    matches aAb, ..., aZb, aab, ..., azb
  a[^A-Z]b      matches aab, axb, a1b, a+b, ... but not ab or aAb
                ([^...] is a negation)
  a.b           matches aXb a3b a_b a b  but not ab: '.' stands for
                a single character
  a...b         a123b aXYZb etc: ... are three characters
  a.*b          ab a1b a12b a123b etc: .* is zero or more characters
                (this is used very often)
  a.+b          a1b a12b but not ab: .+ is one or more characters
                ('.\+' in basic regular expressions)

  a|b           matches a or b (there's no equivalent basic regex)

  (abc)|(xyz)   matches either the pattern abc or xyz: parentheses
                can be used to group patterns
(Regular expressions are amazingly useful but it takes some time to learn them. See 'man re_format' for the bare bones and various tutorials on the internet. The above barely scratches the surface.) Note that the so-called "extended regular expressions" of egrep or grep -E are a bit nicer to use (see egrep(1)) but they won't work in vi, or in sed unless it is given the -E option (available in current GNU and BSD sed).
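A few of the patterns from the table above can be tried directly in the shell. The following is a minimal sketch; the sample lines are invented for illustration, and LC_ALL=C is set as a precaution because in some locales a range such as [A-Z] may also match lowercase letters.

```shell
# Use the plain C locale so that [A-Z] means exactly the 26 capitals.
export LC_ALL=C

# Sample lines to search (made up for illustration).
sample=$(printf '%s\n' 'ab' 'a b' 'a   b' 'aXb' 'a12b' 'axb' 'xyz')

# ' *' allows zero or more spaces between a and b.
printf '%s\n' "$sample" | grep -E '^a *b$'

# '[A-Z]' requires exactly one capital letter between a and b.
printf '%s\n' "$sample" | grep -E '^a[A-Z]b$'

# Alternation with grouping: the whole line is either ab or xyz.
printf '%s\n' "$sample" | grep -E '^((ab)|(xyz))$'

# Count the lines with at least one character between a and b.
n=$(printf '%s\n' "$sample" | grep -c -E '^a.+b$')
echo "lines matching a.+b: $n"
```

The ^ and $ anchors are used here so that each pattern has to describe the whole line; without them grep reports any line that contains a match somewhere.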

Protein databank

Instead of using the web browser you can also download any PDB file from the command line. Files can be found at For instance, in order to get 1AKE you can use

or if wget is not installed, try curl
curl -O

Uncompress (unzip) the file with

gunzip 1ake.pdb.gz
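To see what gzip and gunzip do without downloading anything, the sketch below compresses and uncompresses a tiny stand-in file and then counts its ATOM records with grep. The file name sample.pdb and its contents are made up for this example and are not a real PDB entry.

```shell
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# A tiny stand-in for a PDB file (contents are illustrative only).
printf '%s\n' \
  'HEADER    EXAMPLE' \
  'ATOM      1  N   MET A   1' \
  'ATOM      2  CA  MET A   1' \
  'HETATM    3  O   HOH A 301' \
  'END' > sample.pdb

# gzip replaces sample.pdb with sample.pdb.gz.
gzip sample.pdb

# gunzip -c writes the uncompressed text to the screen
# and leaves the .gz file in place.
gunzip -c sample.pdb.gz | head -n 1

# Plain gunzip replaces sample.pdb.gz with sample.pdb again.
gunzip sample.pdb.gz

# Count ATOM records: lines that begin with ATOM (HETATM does not).
natoms=$(grep -c '^ATOM' sample.pdb)
echo "ATOM records: $natoms"
```

The same grep command should work on a real file such as 1ake.pdb once it has been uncompressed.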