Python installations at NSC

NSC's clusters always come with the standard CentOS Python. This is the version you get by default when you log in to one of our clusters:

[pla@triolith1 ~]$ which python
/usr/bin/python

On CentOS 6 this is Python 2.6, and on CentOS 5 it is Python 2.4. Typically, you will want a more recent version of Python together with the usual scientific libraries such as NumPy, SciPy and Matplotlib, so we recommend that you load one of our Python modules instead. For example, on Triolith we currently have:

[pla@triolith1 ~]$ module avail | grep python
python/2.7.3-smhi-1                                  2013/03/25 15:44:04
python/2.7.4-snic-1                                  2013/04/23 12:44:03
python/2.7.6                                         2013/11/25 11:34:36
python/recommendation                    default     2013/11/14  9:17:58

After loading the module, you will have a new Python installation in your PATH, in which packages such as NumPy are available:

[pla@triolith1 ~]$ module load python/2.7.6
[pla@triolith1 ~]$ python
Python 2.7.6 (default, Nov 24 2013, 16:51:51) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.linspace( 0, 2, 9 )
array([ 0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0])

What if I need other Python packages?

The NSC modules typically provide a core set of Python packages that might be difficult for users to install themselves, such as optimized versions of NumPy and SciPy. For technical reasons, we cannot install all the packages that everyone needs in the same installation. Instead, we recommend that you install extra packages in your own home directory. The best way to do that is to create a so-called virtualenv (virtual Python environment). With a virtualenv, you can make a copy of one of our installations and install extra packages on top of it.

Suppose you need a special Python package, e.g. "Flask" (a micro web framework):

[pla@triolith1 ~]$ module load python/2.7.6
[pla@triolith1 ~]$ virtualenv python
[pla@triolith1 ~]$ cd python
[pla@triolith1 python]$ . bin/activate
(python)[pla@triolith1 python]$ pip install flask
...(lots of output)...
Successfully installed flask Werkzeug Jinja2 itsdangerous markupsafe
Cleaning up...
(python)[pla@triolith1 python]$

This creates a new Python installation in ~/python in your home directory. You can install any package you want there with the pip install command. To make this Python your default version, either add ~/python/bin to your $PATH or run the activate command each time before you start using it:

source ~/python/bin/activate 
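
To confirm that the activated environment is really the one in use, a small check like the following can help. This is only a sketch: the function name in_virtualenv is made up for illustration, and it relies on the sys.real_prefix attribute set by the classic virtualenv tool and the sys.base_prefix attribute used by newer environment tools.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # The classic virtualenv tool sets sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer tools instead make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("prefix:        %s" % sys.prefix)
print("in virtualenv: %s" % in_virtualenv())
```

After activating ~/python, the printed prefix should point into ~/python rather than into the module's installation directory.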

How do I control which version of Python my scripts use?

In your Python script, change the first line from:

#!/usr/bin/python

To this:

#!/usr/bin/env python

That will make the script pick up the Python from the currently loaded module.
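
As an illustration, a minimal script with this first line can report which interpreter it ended up running under. The function name interpreter_summary and the file name mentioned afterwards are hypothetical, chosen only for this sketch.

```python
#!/usr/bin/env python
"""Report which Python interpreter is running this script."""
import sys


def interpreter_summary():
    """Return (path, version) for the interpreter executing this code."""
    version = "%d.%d.%d" % tuple(sys.version_info[:3])
    return sys.executable, version


if __name__ == "__main__":
    path, version = interpreter_summary()
    print("Running %s (Python %s)" % (path, version))
```

Saved as, say, whichpython.py and made executable with chmod +x, it should report /usr/bin/python when no module is loaded, and the module's interpreter after a module load.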

Why doesn't my Python program write to slurm.out?

To see the output from a Python script in a running job in real time, you have to instruct Python not to "buffer" its output. Otherwise, the output is held in a buffer and may not appear in the slurm.out file until the buffer fills or the job finishes. To get the expected behavior, add the -u command line flag when you start the script:

python -u myscript.py

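For an executable script, note that a first line such as #!/usr/bin/env python -u does not work on Linux, because everything after /usr/bin/env is passed to env as a single argument. Instead, set the PYTHONUNBUFFERED environment variable in your job script before the script is started:

export PYTHONUNBUFFERED=1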
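
Another option is to have the script flush its own output at the points where progress should become visible. The sketch below assumes a hypothetical helper called report and uses a short sleep as a stand-in for real work. It only uses the standard write and flush methods of file objects, so it should behave the same on Python 2.7 and Python 3.

```python
import sys
import time


def report(message, stream=sys.stdout):
    """Write one line and flush it so it reaches the output file right away."""
    stream.write(message + "\n")
    stream.flush()


# Hypothetical long-running loop; the sleep stands in for real work.
for step in range(3):
    report("finished step %d" % step)
    time.sleep(0.1)
```

Flushing explicitly keeps buffering enabled for the rest of the output, which can matter for scripts that print a lot.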