Packages and installation

Last modified: 2023-10-17

The term ‘package’ is ambiguous. It can refer to an operating system package, e.g. a .deb file installed by dpkg, a Python distribution package installed by pip or conda, or a container of modules with respect to the Python import system. In this document, the unqualified term ‘package’ refers to a Python distribution package, and the two other uses are referred to by ‘OS package’ and ‘import package’.

Prefixes

A prefix is a set of directories that contains everything needed to run Python code in a specific context, for example a virtual environment or a Conda environment. The prefix of the running interpreter is available as sys.prefix.
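As a minimal sketch, the prefix of the running interpreter can be inspected as follows. The helper name describe_prefix is hypothetical. Comparing sys.prefix with sys.base_prefix detects venv/virtualenv environments only; a Conda environment ships its own interpreter, so both values are normally identical there.

```python
import sys


def describe_prefix():
    """Return the current prefix and whether a venv-style environment is active."""
    return {
        "prefix": sys.prefix,            # prefix of the running interpreter
        "base_prefix": sys.base_prefix,  # prefix of the underlying installation
        # Differs only inside venv/virtualenv environments, not Conda ones.
        "in_venv": sys.prefix != sys.base_prefix,
    }


print(describe_prefix())
```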

OS prefix

Without an activated Python environment, Python code is run by an OS Python interpreter, typically /usr/bin/python3, which has access to the packages in the OS prefix /usr/.

In particular, the Python standard library is in /usr/lib/python<version>/, other Python packages in /usr/local/lib/python<version>/dist-packages, and Python applications and related executables are in /usr/local/bin/.

For Python code run by a user, this is augmented by the quasi-prefix ~/.local/, where additional Python packages can be installed below ~/.local/lib/python<version>/ and Python applications in ~/.local/bin/.
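The per-user locations can be queried through the standard site module. This is a small sketch; the values shown in the comments are what one would typically expect on Linux and may differ on other platforms or when PYTHONUSERBASE is set.

```python
import site

# Base directory of the per-user quasi-prefix, typically ~/.local on Linux.
user_base = site.getuserbase()

# Per-user package directory, typically below ~/.local/lib/python<version>/.
user_site = site.getusersitepackages()

print("user base:", user_base)
print("user site:", user_site)
# False or None if the per-user directory is disabled, e.g. inside most venvs.
print("user site enabled:", site.ENABLE_USER_SITE)
```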

Following PEP 668, the OS prefix (including its augmentation by ~/.local/) is considered ‘externally managed’ by the OS package manager. The user should not modify it using pip, which displays an error message when this is attempted. Though discouraged, it is still possible to modify /usr/ with sudo pip install --break-system-packages, or ~/.local/ with pip install --user --break-system-packages.
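PEP 668 signals an externally managed installation through a marker file named EXTERNALLY-MANAGED in the stdlib directory of the default install scheme. The following sketch checks for that file; the function name externally_managed_marker is hypothetical. Installers such as pip additionally skip this check inside virtual environments, which is not modelled here.

```python
import os
import sysconfig


def externally_managed_marker():
    """Return the path of the PEP 668 marker file if present, else None."""
    stdlib_dir = sysconfig.get_path("stdlib")
    marker = os.path.join(stdlib_dir, "EXTERNALLY-MANAGED")
    return marker if os.path.isfile(marker) else None


marker = externally_managed_marker()
if marker is None:
    print("no EXTERNALLY-MANAGED marker found")
else:
    print("externally managed, marker file:", marker)
```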

Instead, Python packages should be installed using the OS package manager if possible, and using pipx otherwise. pipx transparently creates a separate environment for each application in ~/.local/pipx/venvs, installs it and its dependencies there, and creates scripts in ~/.local/bin/.

A Python script written by the user should either use only Python packages installed via the OS package manager, or be packaged as an application and installed with its dependencies via pipx install ., possibly with the --editable option for development.

Environment prefixes

With an activated Python environment, Python code is run by a Python interpreter installed in the environment, typically <prefix>/bin/python, which has access to the environment prefix <envs directory>/<env name>/, for example:

~/anaconda3/envs/<env name>/
~/.conda/envs/<env name>/
~/.local/pipx/venvs/<env name>/
~/.virtualenvs/<env name>/

Activating an environment, among other things, prepends <prefix>/bin to PATH, so that Python interpreters from the environment are found first. The same directory contains Python applications installed in that environment; during installation, the application script’s shebang is modified to #!<prefix>/bin/python.
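The effect of prepending a directory to PATH can be illustrated with a sketch that uses temporary directories instead of a real environment. The executable name demo-python and the directory layout are hypothetical, and a POSIX system is assumed (on Windows, executable lookup also depends on PATHEXT).

```python
import os
import shutil
import stat
import tempfile


def make_fake_executable(directory, name):
    """Create an executable placeholder file and return its path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


with tempfile.TemporaryDirectory() as tmp:
    os_bin = os.path.join(tmp, "usr", "bin")   # stands in for /usr/bin
    env_bin = os.path.join(tmp, "env", "bin")  # stands in for <prefix>/bin
    os_exe = make_fake_executable(os_bin, "demo-python")
    env_exe = make_fake_executable(env_bin, "demo-python")

    path_before = os_bin
    # Activation prepends the environment's bin directory.
    path_after = os.pathsep.join([env_bin, path_before])

    found_before = shutil.which("demo-python", path=path_before)
    found_after = shutil.which("demo-python", path=path_after)

    print("before activation:", found_before)
    print("after activation: ", found_after)
```

Because lookup walks PATH from left to right, the copy in the environment's bin directory shadows the OS one without the OS installation being changed.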

Prefix structure

The structure of a prefix is defined by an ‘install scheme’, a mapping from a set of eight identifiers to directories with different purposes:

identifier    contents of directory
stdlib        Python standard library
platstdlib    Python standard library (platform-specific)
purelib       additionally installed Python packages
platlib       additionally installed Python packages (platform-specific)
include       header files for the Python C-API
platinclude   header files for the Python C-API (platform-specific)
scripts       Python application script files and other executables
data          data files

The current install scheme can be obtained by sysconfig.get_paths(). Examples:

without an activated environment
{
    'stdlib':       '/usr/lib/python<version>',
    'platstdlib':   '/usr/lib/python<version>',
    'purelib':      '/usr/local/lib/python<version>/dist-packages',
    'platlib':      '/usr/local/lib/python<version>/dist-packages',
    'include':      '/usr/include/python<version>',
    'platinclude':  '/usr/include/python<version>',
    'scripts':      '/usr/local/bin',
    'data':         '/usr/local'
}
with an activated Conda environment
{
    'stdlib':       '<prefix>/lib/python<version>',
    'platstdlib':   '<prefix>/lib/python<version>',
    'purelib':      '<prefix>/lib/python<version>/site-packages',
    'platlib':      '<prefix>/lib/python<version>/site-packages',
    'include':      '<prefix>/include/python<version>',
    'platinclude':  '<prefix>/include/python<version>',
    'scripts':      '<prefix>/bin',
    'data':         '<prefix>'
}
with an activated venv environment (created by pipx)
{
    'stdlib':       '/usr/lib/python<version>',
    'platstdlib':   '<prefix>/lib/python<version>',
    'purelib':      '<prefix>/lib/python<version>/site-packages',
    'platlib':      '<prefix>/lib/python<version>/site-packages',
    'include':      '/usr/include/python<version>',
    'platinclude':  '/usr/include/python<version>',
    'scripts':      '<prefix>/bin',
    'data':         '<prefix>'
}

Install schemes often do not distinguish between platform-specific and non-platform-specific files, and additionally installed Python packages are typically in a subdirectory dist-packages or site-packages of the standard library directory. An install scheme can also combine OS directories with environment directories, as in the venv example above, where stdlib and include point into /usr/.
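Mappings like the ones above can be printed for the running interpreter with a short sketch. The exact set of identifiers and the resulting directories depend on the Python version, the platform, and the active environment.

```python
import sysconfig

# Install scheme of the running interpreter: identifier -> directory.
paths = sysconfig.get_paths()

for identifier, directory in paths.items():
    print(f"{identifier:12} {directory}")

# Many schemes use one directory for pure and platform-specific packages.
same_lib = paths["purelib"] == paths["platlib"]
print("purelib and platlib coincide:", same_lib)
```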

Binary distribution format ‘wheel’ and its installation

A wheel is a zip file with a name of the form:
{distribution}-{version}-{python}-{abi}-{platform}.whl.
Optionally, a build number can be included between version and python.

distribution  name of the packaged software
version       version of the packaged software
python        Python language implementation and version, e.g. py3
abi           application binary interface, none for pure Python
platform      platform the package is built for, any for pure Python
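The naming scheme can be made concrete with a minimal parser. This is a sketch only: the function parse_wheel_filename below and the file names in the usage lines are made up for illustration, and the function does no validation beyond counting components. For real use, the packaging library offers packaging.utils.parse_wheel_filename, which is more thorough.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its components (illustrative only)."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        # The optional build number sits between version and python.
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return {
        "distribution": distribution,
        "version": version,
        "build": build,
        "python": python,
        "abi": abi,
        "platform": platform,
    }


# Hypothetical file names, used only to exercise the parser.
print(parse_wheel_filename("example_pkg-1.0.0-py3-none-any.whl"))
print(parse_wheel_filename("example_pkg-1.0.0-1-cp312-cp312-linux_x86_64.whl"))
```

Splitting on the hyphen works because hyphens inside the distribution name are replaced by underscores in wheel file names.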

The zip file contains the implementation files, metadata in a subdirectory {distribution}-{version}.dist-info/, and it can contain data files in a subdirectory {distribution}-{version}.data/.

During installation the zip file is unpacked into either the prefix’s purelib or platlib directory (depending on whether it is a pure Python package or not), i.e. typically <prefix>/lib/python<version>/site-packages. Implementation files and {distribution}-{version}.dist-info/ remain in this directory. Python files in the zip file’s root thereby become accessible as import modules, and directories as import packages.

If a subdirectory {distribution}-{version}.data/ exists, the contents of its subdirectories named after the install scheme identifiers are moved into the corresponding directories specified in the install scheme, and the empty subdirectory {distribution}-{version}.data/ is removed.

This can be used e.g. to install Python application script files and other executables via a subdirectory {distribution}-{version}.data/scripts/. Since in an environment the identifier data typically points to the environment directory itself, arbitrary locations within it can be targeted, e.g. to install a Jupyter kernel specification with {distribution}-{version}.data/data/share/jupyter/kernels or a JupyterLab extension with {distribution}-{version}.data/data/share/jupyter/labextensions.
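The mapping from wheel members to installed locations described above can be sketched as a pure function. Everything here is illustrative: the function install_destination, the scheme with the prefix /env, and the member names are hypothetical, and a real installer does more (for example rewriting shebangs and recording installed files).

```python
import posixpath


def install_destination(member, data_dir, scheme, pure=True):
    """Return the path where a wheel member would end up (simplified model)."""
    data_prefix = data_dir + "/"
    if member.startswith(data_prefix):
        # First path component below .data/ is an install scheme identifier.
        identifier, _, rest = member[len(data_prefix):].partition("/")
        return posixpath.join(scheme[identifier], rest)
    # Everything else stays where the zip file was unpacked.
    root = scheme["purelib" if pure else "platlib"]
    return posixpath.join(root, member)


# Hypothetical install scheme of an environment with prefix /env.
scheme = {
    "purelib": "/env/lib/python3.12/site-packages",
    "platlib": "/env/lib/python3.12/site-packages",
    "scripts": "/env/bin",
    "data": "/env",
}
data_dir = "example_pkg-1.0.data"

for member in [
    "example_pkg/__init__.py",
    "example_pkg-1.0.data/scripts/example-cli",
    "example_pkg-1.0.data/data/share/jupyter/kernels/example/kernel.json",
]:
    print(member, "->", install_destination(member, data_dir, scheme))
```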

Binary distribution format ‘conda’ and its installation

TODO