Comparing the best Python project managers

A comprehensive guide to modern tools

Roy Klip
Data Engineer
8 min
09 Aug 2024

In the ever-changing world of Python, managing packages, environments and versions efficiently is important. Traditional tools like pip and conda have served us well, but as projects become more complex, so do our requirements. This guide looks at modern alternatives - Poetry, PDM, Hatch and Rye - each of which offers unique capabilities to streamline Python project management.

Comparison framework

To compare Python project managers, we came up with four categories:

  • Package management: This revolves around installing, updating and uninstalling (external) Python packages. Using packages avoids reinventing the wheel and gives you access to an arsenal of utility functions that have been used and tested by other developers.
    Example tools: pip, conda
  • Environment management: Virtual environments are isolated sets of packages and have become an essential part of Python programming. They avoid system pollution, sidestep dependency conflicts and increase reproducibility.
    Example tools: venv, virtualenv
  • Package development: This includes building and publishing packages to an index. This makes them available for other developers to use, either privately within a company or as an open source package for the Python community.
    Example tools: setuptools, twine
  • Python version management: Just as with package management, managing global and per-project Python versions ensures compatibility and increases reproducibility.
    Example tools: pyenv
[Figure: Comparison overview of four managers]

Why not pip?

Most Python developers use pip for their package management, and quite logically so: pip is easy to use and comes preinstalled with Python 3.4 and later. However, pip is quite limited. It lets you install packages into your environment but does not manage the environment itself. It also lacks tools for building and developing your own packages, and it cannot manage Python installations. For those reasons we wanted to try something else.

Why not Conda?

Conda is a powerful tool that has a strong foundation within the data science community. It supports multiple languages besides Python and can therefore be used in multilingual projects. Furthermore, it not only manages packages, but also virtual environments and even Python installations.

However, there are reasons why we didn't look at Conda:

  1. Package source: Conda installs packages from Anaconda's package repositories (channels), while pip and the other tools install them from PyPI. While the Anaconda distribution has some advantages for data science packages, such as the MKL-accelerated version of numpy, the number of available packages is significantly lower than on PyPI.
  2. Compatibility issues: Closely related to this, Conda packages are not compatible with PyPI packages, meaning that pip and the other package installers can't install them. Therefore, if Conda were chosen, it would be harder to switch to something else later on.
  3. Commercial use: Lastly, there is uncertainty around using Conda commercially, since the Anaconda Distribution is not free to use for companies with more than 200 employees.

Python project managers

For this analysis, we compared four project managers: Poetry, PDM, Hatch and Rye, which are scored based on the comparison framework. We scored the tools both at a high level, indicating whether a tool has enough functionality in a category, and at a per-feature level, which is a deep dive into the features in every category.

Overlapping features

Before we dive into the specific characteristics of each tool, there are some overlapping features.

Package management

  • Usage of a pyproject.toml file for configuration. The pyproject.toml has become the standard metadata file since PEP 621, as an alternative to setup.cfg and setup.py.
  • Dependency version specification according to PEP 440. Even though these specifiers can be used in a requirements.txt as well, the tools discussed here put a stronger emphasis on well-defined version requirements.
  • Dependency groups, either as optional groups or tool-specific dev groups, which are separated sets of dependencies that can be installed on top of the required dependencies.

Environment management

  • Virtual environments located in the project-root. This is the default location for virtual environments in general and will be referred to as the classical way.
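A minimal sketch of the classical layout, using only the standard-library venv module. The directory name .venv is a common convention and an assumption here, and pip is left out of the environment to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_project_venv(project_root: Path) -> Path:
    """Create a virtual environment inside the project root."""
    env_dir = project_root / ".venv"  # assumed name; a common convention
    venv.create(env_dir, with_pip=False)
    return env_dir


# A temporary directory stands in for a real project folder.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_venv(Path(tmp))
    # Every virtual environment carries a pyvenv.cfg marker file.
    has_marker = (env_dir / "pyvenv.cfg").is_file()
    print(has_marker)
```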

Package development

  • Being able to use a PEP 517 build backend.
  • CLI commands for building and publishing packages.
[Figure: pyproject.toml]
[Figure: Dependency version specifiers]
[Figure: Optional dependency group]

The features that will be mentioned per project manager are additions to the ones mentioned above.

Poetry

Poetry is currently the most popular alternative for package management and provides quite a few advantages over pip.

Package management

  • CLI commands for managing dependencies
  • Lock files are used to pin dependency versions and are stricter than the requirements defined in the pyproject.toml 
  • Dependency groups, as an alternative to standardised optional groups, can be used as predefined sets of dependencies
  • An extensive dependency resolver, capable of detecting version conflicts between third-party dependencies

Side note: Poetry was already using the pyproject.toml before PEP 621 was approved and therefore has its own representation, which is not compatible with the standard one.

Environment management

  • Environments can be located outside of the project root, in the Poetry cache location

The preference for where environments live differs per developer. Arguments for keeping them outside the project root often include having a smaller and less cluttered project folder.

Package development

Python version management

Poetry doesn't support Python version management.

Extra

  • Plugin support
[Figure: Third party dependency conflict]
[Figure: Classical virtual env vs Poetry cache]

PDM

PDM feels a bit similar to Poetry, but it is critically different in two ways:

  1. PDM follows the PEPs more closely and doesn't want to deviate from them
  2. PDM implemented the idea of venv-less environments

Package management

  • CLI commands for managing dependencies
  • Lock files for pinning dependency versions
  • Fast dependency resolver, which is not as extensive as the one from Poetry, but it is significantly faster

Environment management

  • venv-less: the idea came from PEP 582, which was rejected after PDM had already implemented it as an option. Since this rejection, PDM advises using the classical way of project-root virtual environments, but it still supports the venv-less solution.
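The venv-less idea can be sketched as follows: PEP 582 proposed a `__pypackages__/<major>.<minor>/lib` directory in the project that is preferred over site-packages when it exists. The function names below are hypothetical, and placing the directory at the very front of the search path is a simplification, not PDM's actual implementation.

```python
import sys
import tempfile
from pathlib import Path


def pypackages_lib(project_root: Path) -> Path:
    """Directory layout proposed by PEP 582 for the running interpreter."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return project_root / "__pypackages__" / version / "lib"


def venvless_search_path(project_root: Path, search_path: list[str]) -> list[str]:
    """Prefer the project's __pypackages__ directory when it exists."""
    lib = pypackages_lib(project_root)
    if lib.is_dir():
        return [str(lib), *search_path]
    return list(search_path)


# A temporary directory stands in for a real project folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = venvless_search_path(root, ["site-packages"])
    pypackages_lib(root).mkdir(parents=True)
    after = venvless_search_path(root, ["site-packages"])
    expected_first = str(pypackages_lib(root))

print(before)
print(len(after))
```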

Package development

Python version management

PDM doesn't support Python version management.

Extra

  • Plugin support
[Figure: Lock file]
[Figure: Classic virtual env vs venv-less solution]

Hatch

Hatch focuses on package development and has a different approach to environment management.

Package management

Unlike PDM, Hatch does not support dev dependencies. It also has no lock files and no CLI support for package management, which means that you need to hand-curate the dependencies. However, the Hatch development team has announced that they will be working on lock files.

Environment management

  • Environment-specific configurations, like dependencies, environment variables and scripts. It is also possible to use one environment as a template for other environments.

Package development

Python version management

  • Cross-platform (usable for Windows, Mac and Linux users), no build dependencies and uses CPython distributions. For more information, see the explained benefits over pyenv.

Extra

  • Plugin support
  • Integrated testing capabilities and a CLI test command
  • Integrated linter and formatter (ruff) and supporting CLI commands
  • uv support
[Figure: Environment configurations and inheritance]
[Figure: Test matrix configuration]

Rye

Rye is a self-proclaimed "Hassle-Free Python Experience".

Package management

  • CLI commands for managing dependencies
  • Lock files for pinning dependency versions
  • Fast dependency resolver, which is not as extensive as the one from Poetry, but it is significantly faster

Environment management

  • Workspaces, which allow working with multiple packages that depend on each other. All projects in a workspace share a single environment and are themselves installed in editable mode in this environment, which is great for monorepos.

Package development

  • All PEP 517 backends are available

Python version management

  • Cross-platform (usable for Windows, Mac and Linux users), no build dependencies and uses CPython distributions

Extra

  • Integrated linter and formatter (ruff) and supporting CLI commands
  • uv support
  • Support for developing Rust Python extension modules
[Figure: Workspace configuration]
[Figure: Rust module directory structure]

Conclusion

For a long time, pip has been the dominant package installer in the Python world, but modern solutions like Poetry, PDM, Hatch, and Rye offer much more. They provide advanced features in package management, environment management, package development and Python version management.

Each of these tools has its own strengths:

  • Poetry is widely used, reasonably mature and has a lot of community support.
  • PDM closely follows the PEP standards and has the option for a venv-less solution.
  • Hatch is great for package development with integrated testing capabilities and flexible environment management.
  • Rye feels like the all-in-one tool with features in every category, plus the workspace feature, which is excellent for monorepos. Also, Rye has been taken over by the Astral team, the developers behind ruff and uv, who have been looking at unifying uv and Rye.
[Figure: Detailed visual overview comparing four managers]

This is an article by Roy Klip, Data Engineer at Digital Power

Roy is a Data Engineer at Digital Power with a strong background in Software Engineering and Data Science. He enjoys blending these skills in his role, where he focuses on designing, building, and maintaining data pipelines and platforms.

Roy Klip

Data Engineer, roy.klip@digital-power.com
