The result? Our runtime image just got 6x smaller! Six times! From > 1.1 GB to 170 MB.
See (above this annotation) the most optimized & CI friendly Python Docker build with Poetry (until this issue gets resolved)
setuptools is the most popular (at 50k packages), Poetry is second at 41k, and Hatchling is third at 8.1k. Other tools above 500 packages include Flit (4.4k), PDM (1.3k), and Maturin (1.3k, a build backend for Rust-based packages).
Popularity of Python package managers in 2024
cat requirements.txt | grep -E '^[^# ]' | cut -d= -f1 | xargs -n 1 poetry add
Use poetry init to create a sample pyproject.toml, then run the line above to import the dependencies listed in requirements.txt into pyproject.toml
the fact that the Poetry developers intentionally introduced failures to people’s CI/CD pipelines to motivate them to move away from Poetry’s legacy installer… Though we didn’t rely on the installer in our pipelines, this was the death knell for the continued use of Poetry.
Video on this topic: https://youtu.be/Gr9o8MW_pb0
So which should you use, pip or Conda? For general Python computing, pip and PyPI are usually fine, and the surrounding tooling tends to be better. For data science or scientific computing, however, Conda’s ability to package third-party libraries, and the centralized infrastructure provided by Conda-Forge, means setup of complex packages will often be easier.
From my experience, I would use Mambaforge or pyenv and Poetry.
Without accounting for what we install or add inside, the base python:3.8.6-buster image weighs 882 MB vs 113 MB for the slim version. Of course, this comes at the expense of many tools such as build toolchains, but you probably don't need them in your production image. Your ops teams should be happier with these lighter images: less attack surface, less code that can break, less transfer time, less disk space used... And our Dockerfile is still readable, so it should be easy to maintain.
See sample Dockerfile above this annotation (below there is a version tweaked even further)
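The original Dockerfile is not reproduced in these notes, but the idea can be sketched as a multi-stage build: build wheels in the full image, then install them into the slim runtime. Image tags, paths, and the application entry point below are illustrative, not the article's exact file.

```dockerfile
# Stage 1: build wheels using the full image, which has compilers available
FROM python:3.8.6-buster AS builder
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Stage 2: the slim runtime image only receives the prebuilt wheels
FROM python:3.8.6-slim-buster
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
WORKDIR /app
COPY . .
CMD ["python", "-m", "app"]  # "app" is a placeholder module name
```

The build toolchain stays in the discarded builder stage, which is what makes the final image small.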
But the problem with Poetry is arguably down to the way Docker’s build works: Dockerfiles are essentially glorified shell scripts, and the build system’s semantic units are files and complete command runs. There is no way in a normal Docker build to access the actually relevant semantic information: in a better build system, you’d only re-install the changed dependencies, not reinstall all dependencies any time the list changed. Hopefully a better build system will someday replace the Docker default. Until then, it’s square pegs into round holes.
Problem with Poetry/Docker
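The usual mitigation is to copy only the dependency manifests before copying the source tree, so the expensive install layer stays cached until pyproject.toml or poetry.lock change. A hedged sketch (flag names per recent Poetry 1.x; check your version):

```dockerfile
FROM python:3.12-slim
RUN pip install --no-cache-dir poetry

WORKDIR /app
# Copy only the manifests first: this layer is cached until they change
COPY pyproject.toml poetry.lock ./
RUN poetry install --no-root --no-interaction

# Now copy the source and install just the project itself
COPY . .
RUN poetry install --only-root --no-interaction
```

Note this only works at file granularity: changing any single dependency still reinstalls all of them, which is exactly the limitation described in the quote above.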
Third, you can use poetry-dynamic-versioning, a plug-in for Poetry that uses Git tags instead of pyproject.toml to set your application’s version. That way you won’t have to edit pyproject.toml to update the version. This seems appealing until you realize you now need to copy .git into your Docker build, which has its own downsides, like larger images unless you’re using multi-stage builds.
Approach of using poetry-dynamic-versioning plugin
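For reference, enabling the plugin looks roughly like the fragment below; the table and key names follow the plugin's documented pattern, but verify them against the poetry-dynamic-versioning README for the version you install.

```toml
[tool.poetry]
name = "myapp"      # illustrative project name
version = "0.0.0"   # placeholder; replaced from Git tags at build time

[tool.poetry-dynamic-versioning]
enable = true
vcs = "git"

[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning>=1.0.0"]
build-backend = "poetry_dynamic_versioning.backend"
```

As the quote notes, the catch is that the Git metadata must be present at build time, so .git has to be copied into the Docker build context.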
But if you’re doing some sort of continuous deployment process where you’re continuously updating the version field, your Docker builds are going to be slow.
Be careful when updating the version field of pyproject.toml around Docker
PyPA still advertises pipenv all over the place and only mentions poetry a couple of times, although poetry seems to be the more mature product.
Sounds like PyPA does not like poetry as much for political/business reasons
The goal of this tutorial is to describe the Python development ecosystem.
tl;dr:
INSTALLATION:
TESTING:
REFACTORING:
Some options (you will have to use your own judgment, based on your use case)
4 different options to install Poetry through a Dockerfile
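One common option, sketched below, is to pip-install a pinned Poetry version directly in the image; the version number and environment variables are illustrative, and the other options (the official install script, pipx, or copying from a prebuilt image) trade off differently on isolation and image size.

```dockerfile
FROM python:3.12-slim

# Pin Poetry so builds are reproducible; 1.8.3 is an illustrative version
ENV POETRY_VERSION=1.8.3 \
    # Skip venv creation: the container itself is the isolation boundary
    POETRY_VIRTUALENVS_CREATE=false

RUN pip install --no-cache-dir "poetry==$POETRY_VERSION"
```

Installing with pip is the simplest option but mixes Poetry's own dependencies into the image's site-packages, which is why the script- and pipx-based options exist.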
The majority of Python packaging tools also act as virtualenv managers to gain the ability to isolate project environments. But things get tricky when it comes to nested venvs: one installs the virtualenv manager using a venv-encapsulated Python, then creates more venvs using a tool that itself runs on an encapsulated Python. One day a minor release of Python comes out, and one has to check all those venvs and upgrade them if required. PEP 582, on the other hand, introduces a way to decouple the Python interpreter from project environments. It is a relatively new proposal and not many tools support it. One that does is pyflow, but it is written in Rust and thus can't get much help from the big Python community; for the same reason it can't act as a PEP 517 backend.
The reason why PDM (Python Development Master) may replace poetry or pipenv
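PEP 582's decoupling can be illustrated with a short sketch: a PEP 582-aware interpreter (or a tool such as pyflow or PDM) puts the project-local __pypackages__/X.Y/lib directory on sys.path instead of relying on a virtualenv. The helper below (add_pypackages is a hypothetical name, not part of any tool's API) emulates that lookup manually:

```python
import os
import sys

def add_pypackages(base="."):
    """Prepend the PEP 582 __pypackages__ lib directory to sys.path,
    mimicking what a PEP 582-aware interpreter would do automatically.
    Illustrative sketch only; real support lives in the interpreter
    or in tools such as pyflow/PDM."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    lib = os.path.join(base, "__pypackages__", version, "lib")
    if os.path.isdir(lib) and lib not in sys.path:
        sys.path.insert(0, lib)
    return lib

# Example: after `pip install --target __pypackages__/3.11/lib requests`
# (adjust 3.11 to your interpreter), calling add_pypackages() makes
# `import requests` resolve from the project directory, no venv needed.
```

Because the packages live under the project directory keyed by Python version, upgrading the interpreter does not invalidate a chain of nested venvs, which is the maintenance problem described above.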