What is a Package?

Another word with too many meanings!

The word “package”

Ugh, another heavily overloaded word! Like einsetzen[1], “package” has a lot of different interpretations. Some of us might already have an idea:

A Python package is something you can import, e.g. import numpy

An imaginary friend

Any other ideas? Enter my other imaginary friend

A Python package is something you can install, e.g. pip install numpy

My other imaginary friend

I really can’t afford for these friends to fall out[2]. How do we reconcile these different definitions? We are going to start with the formal definitions of the word “package” given by the authority on packaging in Python: the PyPA (Python Packaging Authority). To the PyPA, the term package means one of:

Import package
A Python module that you can import using an import statement.
Distribution package
An archive (e.g. ZIP, or tar) containing a collection of Import packages and additional metadata (such as which other distribution packages are required by this distribution).

Exploring import packages

One of the earliest things you will have learned on your Python journey is how to import other people’s code.

Solution to Exercise 1

Code that you wrote, late at night, without documentation because the code is the documentation

Is this you?

With this self-deprecating line of thinking, we should clearly prefer using other people’s code. Let’s try importing NumPy[3]!

import numpy

💥 Nice. When we run import, where is the code coming from? Python gives us some handy attributes that we can look at:

numpy.__path__
['/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/numpy']

That looks like a directory to me! Let’s take a closer look:

path = numpy.__path__[0]
!ls {path}
__config__.py		_distributor_init.pyi	ctypeslib	matrixlib
__config__.pyi		_expired_attrs_2_0.py	doc		polynomial
__init__.cython-30.pxd	_expired_attrs_2_0.pyi	dtypes.py	py.typed
__init__.pxd		_globals.py		dtypes.pyi	random
__init__.py		_globals.pyi		exceptions.py	rec
__init__.pyi		_pyinstaller		exceptions.pyi	strings
__pycache__		_pytesttester.py	f2py		testing
_array_api_info.py	_pytesttester.pyi	fft		tests
_array_api_info.pyi	_typing			lib		typing
_configtool.py		_utils			linalg		version.py
_configtool.pyi		char			ma		version.pyi
_core			conftest.py		matlib.py
_distributor_init.py	core			matlib.pyi

OK, so it’s not magic. 🧙 Phew. Crisis averted, we can all go home.
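
If you are not in a notebook, the !ls trick is not available. The same peek can be sketched in plain Python with importlib and pathlib. The helper name package_contents is hypothetical, and the example uses the standard-library package email as a stand-in so that it runs even where NumPy is not installed.

```python
import importlib
from pathlib import Path


def package_contents(import_name: str) -> list[str]:
    """List the entries in the first directory on an import package's __path__."""
    module = importlib.import_module(import_name)
    # Single-file modules have no __path__; packages backed by directories do.
    search_path = getattr(module, "__path__", None)
    if not search_path:
        raise ValueError(f"{import_name!r} has no __path__ to list")
    directory = Path(list(search_path)[0])
    return sorted(entry.name for entry in directory.iterdir())


# Swap "email" for "numpy" to repeat the listing above, assuming NumPy is installed.
print(package_contents("email")[:5])
```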

But how does the code get there?

The next question is obvious: how does the code get there? From our previous discussion we can infer that someone installed NumPy, for example with pip.

%pip show numpy
Name: numpy
Version: 2.3.1
Summary: Fundamental package for array computing in Python
Home-page: https://numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
Note: you may need to restart the kernel to use updated packages.

That confirms it — NumPy is importable and pip knows about it. But, how does pip install the package? Where does it get it from? How do you put your own “package” there? What is the meaning of life?

These are all great questions.
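
pip is not the only thing that can read this information. The standard library's importlib.metadata module can look up installed distributions too. A rough sketch of a pip show look-alike follows. The function name show and the choice of fields are assumptions for illustration, not a description of how pip does it.

```python
from importlib.metadata import PackageNotFoundError, metadata
from typing import Optional


def show(distribution_name: str) -> Optional[dict]:
    """Return a few metadata fields for an installed distribution, or None."""
    try:
        meta = metadata(distribution_name)
    except PackageNotFoundError:
        # Nothing with this distribution name is installed in this environment.
        return None
    # .get() avoids an error when an optional field is absent.
    return {
        "Name": meta.get("Name", ""),
        "Version": meta.get("Version", ""),
        "Summary": meta.get("Summary", ""),
    }


# Prints a dict of fields if NumPy is installed here, otherwise None.
print(show("numpy"))
```

Note that the lookup is by distribution name (what you pip install), not by import name.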

Our first look at a distribution package

By now, we’ve covered the fact that if you can import a Python module, it’s an Import package. We also had an alternative definition of a package:

A Python package is something you can install, e.g. pip install numpy

My other imaginary friend

We learned that the PyPA call this a Distribution package. You can easily find out what a distribution package looks like! Let’s take NumPy for example! The pip program has a download option:

Figure 1: A very important disclaimer from the 2000s, about packaging.

%pip download numpy
Collecting numpy
  Using cached numpy-2.3.1-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (62 kB)
Using cached numpy-2.3.1-cp312-cp312-manylinux_2_28_x86_64.whl (16.6 MB)
Saved ./numpy-2.3.1-cp312-cp312-manylinux_2_28_x86_64.whl
Successfully downloaded numpy
Note: you may need to restart the kernel to use updated packages.

We can see that some file was downloaded to the working directory:

!ls numpy*
numpy-2.3.1-cp312-cp312-manylinux_2_28_x86_64.whl

What is this? What’s a .whl extension used for? What are all those tags like -cp312? All will be revealed. Let’s first ask what kind of file this is.

!file numpy*.whl
numpy-2.3.1-cp312-cp312-manylinux_2_28_x86_64.whl: Zip archive data, at least v2.0 to extract, compression method=store

It’s a ZIP file! That means we can extract it, and look inside.

!unzip -q numpy*.whl
!ls -d numpy*/
numpy-2.3.1.dist-info/	numpy.libs/  numpy/

There are three folders! Let’s peek at the numpy/ directory:

!ls numpy/
__config__.py		_expired_attrs_2_0.py	doc		polynomial
__config__.pyi		_expired_attrs_2_0.pyi	dtypes.py	py.typed
__init__.cython-30.pxd	_globals.py		dtypes.pyi	random
__init__.pxd		_globals.pyi		exceptions.py	rec
__init__.py		_pyinstaller		exceptions.pyi	strings
__init__.pyi		_pytesttester.py	f2py		testing
_array_api_info.py	_pytesttester.pyi	fft		tests
_array_api_info.pyi	_typing			lib		typing
_configtool.py		_utils			linalg		version.py
_configtool.pyi		char			ma		version.pyi
_core			conftest.py		matlib.py
_distributor_init.py	core			matlib.pyi
_distributor_init.pyi	ctypeslib		matrixlib

No surprises there: those look like the same files we saw earlier when we listed the directory in numpy.__path__. What about the directory with the .dist-info suffix?

!ls numpy*.dist-info/
LICENSE.txt  METADATA  RECORD  WHEEL  entry_points.txt

This is not a Python Import package. We’ve encountered something new — these files are Python distribution data. More on that later!
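
Since a wheel is a ZIP archive, the same exploration can be sketched with the standard-library zipfile module instead of shell commands. To keep the example self-contained it builds a tiny stand-in archive. The name demo, its contents, and the helper names are hypothetical, and a real wheel contains more files (such as the WHEEL and RECORD files listed above).

```python
import tempfile
import zipfile
from pathlib import Path


def top_level_entries(wheel_path: Path) -> list[str]:
    """Return the distinct top-level names inside a wheel (a ZIP archive)."""
    with zipfile.ZipFile(wheel_path) as archive:
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


def read_metadata(wheel_path: Path) -> str:
    """Return the text of the METADATA file in the wheel's .dist-info directory."""
    with zipfile.ZipFile(wheel_path) as archive:
        for name in archive.namelist():
            if name.endswith(".dist-info/METADATA"):
                return archive.read(name).decode("utf-8")
    raise FileNotFoundError("no .dist-info/METADATA found in archive")


# Build a hypothetical, incomplete stand-in for a wheel so nothing is downloaded.
with tempfile.TemporaryDirectory() as tmp:
    wheel = Path(tmp) / "demo-0.1-py3-none-any.whl"
    with zipfile.ZipFile(wheel, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr("demo-0.1.dist-info/METADATA", "Name: demo\nVersion: 0.1\n")

    print(zipfile.is_zipfile(wheel))   # True
    print(top_level_entries(wheel))    # ['demo', 'demo-0.1.dist-info']
    print(read_metadata(wheel))
```

Pointing top_level_entries at the NumPy wheel downloaded earlier should list the same three folders that unzip produced.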

Footnotes
  1. Better German speakers than I have reliably informed me that einsetzen is a very overloaded term in German!

  2. What, did you think imaginary friends grow on trees?

  3. This tutorial takes place at a particular moment in space-and-time (one where I’ve installed NumPy already...)