IT-Service: Using Python with Linux
The programming language Python is installed on all Linux machines at GSI/FAIR. It can be used by anyone who has access to such a machine, which only requires a Linux account.
Working with Python often requires installing third-party code called modules. GSI/FAIR takes IT security very seriously and therefore does not install every module on its farm machines, but this page describes methods to satisfy the needs of your program.
If you require Python modules that are not yet available on the farm machines, there are four possible ways to get them. See the sections below for each method.
- Check if the module has been packaged by Debian and if it is available for the current release.
- Install it in your home directory and use the current interpreter from Debian.
- Use virtual environments to install your modules independently from the ones already available on your Linux installation.
- Use a complete Python distribution to manage the used interpreter together with all modules.
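Before choosing one of these methods, it can be useful to check whether the module is already importable by the interpreter you are using. The following is a minimal sketch, assuming a Python 3 interpreter; the helper name module_available is hypothetical and not part of any GSI tooling.

```python
import importlib.util


def module_available(name):
    """Return True if the current interpreter can locate the top-level module `name`."""
    # find_spec searches the load path (sys.path) without importing the module.
    return importlib.util.find_spec(name) is not None


# "json" ships with Python; "sphinx" is only found if it has been installed.
for candidate in ("json", "sphinx"):
    state = "available" if module_available(candidate) else "not available"
    print(candidate, "is", state)
```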
Python module packages in Debian are named python-MODULENAME, e.g. python-sphinx. As a general rule the IT department can only provide modules and versions that are available in the official Debian package repositories. If the module you want is available as a Debian package you can check the available version by running apt-cache policy PACKAGENAME.
$ apt-cache policy python-sphinx
python-sphinx:
  Installed: 1.2.3+dfsg-1
  Candidate: 1.2.3+dfsg-1
  Version table:
 *** 1.2.3+dfsg-1 0
        500 http://mirror.gsi.de/distrib/debian/ jessie/main amd64 Packages
        100 /var/lib/dpkg/status
The output of the command states that Debian package version 1.2.3+dfsg-1 of python-sphinx is currently installed which corresponds to upstream version 1.2.3. In the following case Debian package version 0.11-1 is available but not installed; 0.11-1 corresponds to upstream 0.11.
$ apt-cache policy python-sphinx-issuetracker
python-sphinx-issuetracker:
  Installed: (none)
  Candidate: 0.11-1
  Version table:
     0.11-1 0
        500 http://mirror.gsi.de/distrib/debian/ jessie/main amd64 Packages
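The two fields that matter in this output are Installed and Candidate. As an illustration, the sketch below extracts them from text printed by apt-cache policy. The function name parse_policy is hypothetical, and the sample text is copied from the python-sphinx-issuetracker example above.

```python
SAMPLE_OUTPUT = """\
python-sphinx-issuetracker:
  Installed: (none)
  Candidate: 0.11-1
  Version table:
     0.11-1 0
        500 http://mirror.gsi.de/distrib/debian/ jessie/main amd64 Packages
"""


def parse_policy(text):
    """Extract the 'Installed' and 'Candidate' fields from apt-cache policy output."""
    versions = {}
    for line in text.splitlines():
        # Split at the first colon only, so versions with an epoch stay intact.
        key, separator, value = line.strip().partition(":")
        if separator and key in ("Installed", "Candidate"):
            versions[key] = value.strip()
    return versions


versions = parse_policy(SAMPLE_OUTPUT)
if versions.get("Installed") == "(none)":
    print("available but not installed:", versions.get("Candidate"))
```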
If you require a Python library that is available but not installed, send an e-mail to linux-service @ gsi.de and include the name of the package and the name of your local machine.
If you decide to install Python libraries to your home directory you will probably need to extend your Python load path. This can be done via the environment variable PYTHONPATH, which is not set by default. As long as it is not set, your Python interpreter will usually only look for modules in certain system directories. You can find out what these are by running the following command:
$ python -c "import sys; print sys.path"
['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', 'HOMEDIR/.local/lib/python2.7/site-packages', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7']
If you install modules to e.g. ~/mypythonmodules and want them to be included in your load path you can do it like this:
$ PYTHONPATH=~/mypythonmodules python -c "import sys; print sys.path"
['', 'HOMEDIR/mypythonmodules', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', 'HOMEDIR/.local/lib/python2.7/site-packages', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PIL', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.7']
To avoid setting PYTHONPATH before every command you can set it in your current shell by running export PYTHONPATH=~/mypythonmodules, or set it for every new shell by adding that line to your shell startup file.
If you install modules in your home directory and forget to set PYTHONPATH you might experience errors like ImportError: No module named mymodule.
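The effect of PYTHONPATH can be reproduced in a small experiment. The sketch below, assuming Python 3.5 or newer, writes a throwaway module into a temporary directory and starts a fresh interpreter twice: once without PYTHONPATH and once with it pointing at that directory. The module name mymodule_demo and the helper can_import are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def can_import(module_name, extra_path=None):
    """Return True if a fresh interpreter can import module_name."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from the default: PYTHONPATH unset
    if extra_path is not None:
        env["PYTHONPATH"] = extra_path
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as module_dir:
    # A minimal module that only exists in the temporary directory.
    with open(os.path.join(module_dir, "mymodule_demo.py"), "w") as handle:
        handle.write("VALUE = 42\n")
    without_path = can_import("mymodule_demo")
    with_path = can_import("mymodule_demo", extra_path=module_dir)

print("import without PYTHONPATH succeeded:", without_path)
print("import with PYTHONPATH succeeded:", with_path)
```

The first attempt fails with the ImportError described above (reported as ModuleNotFoundError on current Python 3 versions), the second one succeeds.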
Install via pip
The default way to run pip would be to execute pip install MODULENAME; however, this will probably fail because, without further options, pip will try to install into system directories. With the switch --user, pip will instead install to ~/.local/lib/pythonX.X/site-packages, which is already included in the default load path (see the sys.path output above). The directory can be changed with the option --install-option=--prefix=DIRECTORY, or just --prefix DIRECTORY on newer versions.
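To see where pip install --user places modules for the interpreter you are running, you can ask the standard site module. This is a small sketch using only the standard library; the printed directory depends on your account and Python version.

```python
import site
import sys

# Per-user site-packages directory, the target of "pip install --user".
user_site = site.getusersitepackages()
print("user site-packages:", user_site)

# Tells whether the interpreter honours the user directory at all.
print("user site enabled:", site.ENABLE_USER_SITE)

# The directory is only added to sys.path if it exists at interpreter start.
print("currently on sys.path:", user_site in sys.path)
```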
$ pip install --prefix=~/python sphinxcontrib-bibtex
Downloading/unpacking sphinxcontrib-bibtex
  Downloading sphinxcontrib-bibtex-0.3.4.tar.gz (50Kb): 50Kb downloaded
  Running setup.py egg_info for package sphinxcontrib-bibtex
...skipped output ...
Requirement already satisfied (use --upgrade to upgrade): PyYAML >=3.01 in /usr/lib/python2.7/dist-packages (from pybtex>=0.17->sphinxcontrib-bibtex)
...skipped output ...
    Skipping installation of HOMEDIR/python/lib/python2.7/site-packages/sphinxcontrib/__init__.py (namespace package)
...skipped output ...
Successfully installed sphinxcontrib-bibtex latexcodec pybtex pybtex-docutils six oset
Cleaning up...
This example demonstrates a few things:
- pip will start downloading the package and then check its dependencies
- If pip recognizes that some dependencies are already installed (in the example PyYAML), it will skip their installation
- pip will not install __init__.py files for so-called namespace packages. This will cause import sphinxcontrib.bibtex to fail, since sphinxcontrib does not have an __init__.py file and will therefore not be recognized as a module
- ~/python/lib/python2.7/site-packages is the directory you will need to add to your PYTHONPATH
After installation you will need to do the following steps:
- Set PYTHONPATH
- Add an empty __init__.py for all namespace packages
- Optional: If the modules also install scripts to ~/python/bin you might want to extend your PATH
$ export PYTHONPATH=~/python/lib/python2.7/site-packages
$ touch $PYTHONPATH/sphinxcontrib/__init__.py
$ # Optional: set PYTHONPATH in .bashrc
$ echo 'export PYTHONPATH=~/python/lib/python2.7/site-packages' >>~/.bashrc
$ # Optional: set PATH
$ export PATH=~/python/bin:$PATH
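The touch step can also be done from Python, for example when several namespace packages need the same treatment. The helper below is a hypothetical sketch, not part of pip. It mainly concerns the Python 2.7 interpreter used in the examples, because Python 3.3 and newer can import directories without an __init__.py as namespace packages.

```python
import os


def ensure_init_file(package_dir):
    """Create an empty __init__.py in package_dir unless one already exists.

    Returns True if a file was created, False if it was already there.
    """
    init_path = os.path.join(package_dir, "__init__.py")
    if os.path.exists(init_path):
        return False
    # Append mode creates the file and never truncates existing content.
    with open(init_path, "a"):
        pass
    return True


# Example call, using the directory from the pip output above:
# ensure_init_file(os.path.expanduser(
#     "~/python/lib/python2.7/site-packages/sphinxcontrib"))
```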
The main purpose of Python virtual environments is to create isolated environments for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has.
On GSI Linux desktops the virtualenv command is available for Python versions 2 and 3. The following three packages should be installed on your machine:
You can use the following commands to install a module (NAME) into a virtual environment.
- Create environment directory:
- Create environment:
virtualenv --system-site-packages /data.local1/NAME
- Activate environment:
source /data.local1/NAME/bin/activate (for Bash or Zsh)
The command has to be called every time you want to use the environment. It should have modified your shell prompt into something like
- Install module:
pip install --upgrade NAME
This can take a while because pip has to download all the dependencies and compile everything.
- Leave environment:
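If you are unsure whether a script is running inside an activated virtual environment, the interpreter itself can tell you. The following sketch uses only the standard sys module; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix. The venv module and
    # newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

Inside an activated environment the prefix should point to the environment directory, e.g. /data.local1/NAME.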
When it is not possible to install new versions of modules because the interpreter is too old, the only remaining possibility is to use a Python distribution like Anaconda. It provides an archive distribution comprising the latest version of Python (both 3.x and 2.x), multiple modules and a package management system called Conda. We highly recommend using a reduced version of Anaconda, called Miniconda, which just includes the Python interpreter and the Conda package manager.
Follow these steps to install Miniconda with Python 3.7 and create your virtual environments with it.
- Download the Miniconda installer (for Linux) from here.
- Start the installer in your terminal (bash Miniconda3-latest-Linux-x86_64.sh) and accept the license agreement
- By default the installer will install Miniconda under $HOME/miniconda3, but you are free to choose another path, for example /data.local if available on your desktop.
If possible, avoid installing Conda and related modules under your Lustre directory. Depending on the modules needed, the installation will consist of hundreds, sometimes thousands, of small files, and this will greatly hamper the performance of Lustre for all users.
- The installer will then proceed to install additional packages. It will then ask if you want to initialize your environment. If you choose to do so, the installer will add the following lines at the very end of your .bashrc file:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/myuser/miniconda3/bin/conda' 'shell.bash' 'hook' 2>/dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/myuser/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/home/myuser/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/myuser/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
- For the changes to take effect, it is better if you close your terminal window and open a new one. A simple test would be to run the Conda package manager on the command line in the new terminal session (the version may be different for you):
:~$ conda --version
conda 4.7.12
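Regarding the earlier note on Lustre: to get a feeling for how many files a Conda installation consists of, you can count them. The sketch below is only an illustration; the function name count_files is hypothetical and the path assumes the default installation directory $HOME/miniconda3.

```python
import os


def count_files(root):
    """Return the number of files below the directory `root`."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


# Default installation directory; adjust it if you chose another path.
install_dir = os.path.expanduser("~/miniconda3")
if os.path.isdir(install_dir):
    print(install_dir, "contains", count_files(install_dir), "files")
```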
Install Python modules
- List all installed packages in current virtual environment:
- Install or uninstall modules:
conda install MODULENAME,
conda uninstall MODULENAME
- Update and search for a specific module:
conda update MODULENAME,
conda search MODULENAME
Use environments with Conda
By default, when Miniconda is installed, it will automatically define a Python environment called base, which you should also see added to your terminal prompt in the console window, unless you have customized your prompt differently. To avoid polluting this base environment, you should create a separate one with the following commands. Instead of mytest you are of course free to use whichever name you see fit. Again, the versions reported below may change depending on the version of Conda you are using.
:~$ conda create --name mytest python=3.7
[...]
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/myuser/miniconda3/envs/mytest

  added / updated specs:
    - python=3.7

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.11.27 |                0         124 KB
    certifi-2019.11.28         |           py37_0         153 KB
    ld_impl_linux-64-2.33.1    |       h53a641e_7         568 KB
    openssl-1.1.1d             |       h7b6447c_3         2.5 MB
    pip-19.3.1                 |           py37_0         1.6 MB
    python-3.7.6               |       h0371630_2        44.9 MB
    setuptools-44.0.0          |           py37_0         520 KB
    sqlite-3.30.1              |       h7b6447c_0         1.1 MB
    wheel-0.33.6               |           py37_0          42 KB
    ------------------------------------------------------------
                                           Total:        51.4 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  ca-certificates    pkgs/main/linux-64::ca-certificates-2019.11.27-0
  certifi            pkgs/main/linux-64::certifi-2019.11.28-py37_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.33.1-h53a641e_7
  libedit            pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
  libffi             pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
  ncurses            pkgs/main/linux-64::ncurses-6.1-he6710b0_1
  openssl            pkgs/main/linux-64::openssl-1.1.1d-h7b6447c_3
  pip                pkgs/main/linux-64::pip-19.3.1-py37_0
  python             pkgs/main/linux-64::python-3.7.6-h0371630_2
  readline           pkgs/main/linux-64::readline-7.0-h7b6447c_5
  setuptools         pkgs/main/linux-64::setuptools-44.0.0-py37_0
  sqlite             pkgs/main/linux-64::sqlite-3.30.1-h7b6447c_0
  tk                 pkgs/main/linux-64::tk-8.6.8-hbc83047_0
  wheel              pkgs/main/linux-64::wheel-0.33.6-py37_0
  xz                 pkgs/main/linux-64::xz-5.2.4-h14c3975_4
  zlib               pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3

Proceed ([y]/n)? y

Downloading and Extracting Packages
[...]
The new environment is now installed but it is not activated yet. The following commands are available:
- Activate an environment:
conda activate mytest.
- Deactivation and returning to the base environment:
- Remove an environment and all its installed modules:
conda remove --name mytest --all
- Installing and removing modules works with the same Conda commands explained above but they will be confined to the environment.
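A script can check which Conda environment is currently active by reading the environment variables CONDA_DEFAULT_ENV and CONDA_PREFIX, which conda activate sets in your shell. The following is a minimal sketch; the function name active_conda_env is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active Conda environment, or None."""
    environ = os.environ if environ is None else environ
    name = environ.get("CONDA_DEFAULT_ENV")
    prefix = environ.get("CONDA_PREFIX")
    if not name or not prefix:
        return None
    return name, prefix


active = active_conda_env()
if active is None:
    print("no Conda environment is active")
else:
    print("active Conda environment: %s (%s)" % active)
```

After conda activate mytest this should report mytest together with the environment location shown in the output of conda create.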
Availability and support
- Support email: linux-service @ gsi.de