With the advent of uv, I'm finally feeling like Python packaging is solved. As mentioned in the article, being able to have inline dependencies in a single-file Python script and running it naturally is just beautiful.
#!/usr/bin/env -S uv run
# /// script
# dependencies = ['requests', 'beautifulsoup4']
# ///
import requests
from bs4 import BeautifulSoup
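# (a hypothetical continuation of the script body, just to show that the rest
#  is ordinary Python using the two dependencies declared above)
resp = requests.get("https://example.com")
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")
print(soup.title.string if soup.title else "(no title)")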
After getting used to this workflow, I have been thinking that a dedicated syntax for inline dependencies would be great, similar to JavaScript's `import ObjectName from 'module-name';` syntax. Python promoted type hints from comment-based to syntax-based, so a similar approach seems feasible.
> It used to be that either you avoided dependencies in small Python scripts, or you had some cumbersome workaround to make them work for you. Personally, I used to manage a gigantic venv just for my local scripts, which I had to kill and clean every year.
I had the same fear for adding dependencies, and did exactly the same thing.
> This is the kind of thing that changes completely how you work. I used to have one big test venv that I destroyed regularly. I used to avoid testing some stuff because it would be too cumbersome. I used to avoid some tooling or pay the price for using them because they were so big or not useful enough to justify the setup. And so on, and so on.
I 100% sympathize with this.
# /// script
# dependencies = [
# "requests",
# ]
# [tool.uv]
# exclude-newer = "2023-10-16T00:00:00Z"
# ///
https://docs.astral.sh/uv/guides/scripts/#improving-reproduc...
This has also let me easily reconstruct some older environments in less than a minute, when I've been version hunting for 30-60 minutes in the past. The speed of uv environment building helps a ton too.
To be clear, a lock file is strictly the better option—but for single file scripts it's a bit overkill.
Use a lock file if you want transitive dependencies pinned.
I can't think of any other language where "I want my script to use dependencies from the Internet, pinned to precise versions" is a thing.
The use case described is for a small one off script for use in CI, or a single file script you send off to a colleague over Slack. Very, very common scenario for many of us. If your script depends on
a => c
b => c
You can pin versions of those direct dependencies like "a" and "b" easily enough, but 2 years later you may not get the same version of "c", unless the authors of "a" and "b" handle their dependency constraints perfectly. In practice that's really hard and never happens.
The timestamp approach described above isn't perfect, but it would result in the same dep graph, and the same results, 99% of the time.
One file is better for sharing than N, you can post it in a messenger program like Slack and easily copy-and-paste (while this becomes annoying with more than one file), or upload this somewhere without needing to compress, etc.
> I can't think of any other language where "I want my script to use dependencies from the Internet, pinned to precise versions" is a thing.
This is the same issue you would have in any other programming language. If it is fine for possibly having breakage in the future you don't need to do it, but I can understand the use case for it.
Because this is for scripts in ~/bin, not projects.
They need to be self-contained.
Documentation is hard enough, and that's often right there at exactly the same location.
Sometimes, the lock files can be larger than the scripts themselves...
This looks like a good strategy, but I wouldn't want it by default, since it would be very weird to suddenly have a script pull dependencies from 1999 without any explanation why.
I've just gotten into the habit of using only the dependencies I really must, because python culture around compatibility is so awful
For example, here is a post saying it was previously recommended to not save it for libraries: https://blog.rust-lang.org/2023/08/29/committing-lockfiles.h...
I prefer this myself, as almost all lock files are in practice “the version of packages at this time and date”, so why not be explicit about that?
Pipenv, when you create a lockfile, will only specify the architecture specific lib that your machine runs on.
So if you're developing on an ARM Macbook, but deploying on an Ubuntu x86-64 box, the Pipenv lockfile will break.
Whereas a Poetry lockfile will work fine.
And I've not found any documentation about how uv handles this, is it the Pipenv way or the Poetry way?
The point of the PEP 723 comment style in the OP is that it's human-writable with relatively little thought. Cases like yours are always going to require actually doing the package resolution ahead of time, which isn't feasible by hand. So a separate lock file is necessary if you want resolved dependencies.
If you use this kind of inline script metadata and just specify the Python dependency version, the resolution process is deferred. So you won't have the same kind of control as the script author, but instead the user's tooling can automatically do what's needed for the user's machine. There's inherently a trade-off there.
- Specifying a subset of platforms to resolve for
- Requiring wheel coverage for specific platforms
- Conflicting optional dependencies
https://docs.astral.sh/uv/concepts/resolution/#universal-res...
https://docs.astral.sh/uv/concepts/projects/config/#conflict...
Nix, for all its benefits here, can be quite slow and otherwise pretty annoying to use as a shebang in my experience, versus just writing a package/derivation to add to your shell environment (i.e. it's already fully "built" and wrapped, but also requires a lot more ceremony plus "switching" either the OS or HM configs).
Flakes has caching but support for `nix shell` as shebang is relatively new (nix 2.19) and not widespread.
Before uv I avoided writing any scripts that depended on ML altogether, which is now unlocked.
(Note that hashes themselves don't make "random scripts" not a security risk, since asserting the hash of malware doesn't make it not-malware. You still need to establish a trust relationship with the hash itself, which decomposes to the basic problem of trust and identity distribution.)
Transitive dependencies are still a problem though. You kind of fall back to needing a lock file or specifying everything explicitly.
https://packaging.python.org/en/latest/discussions/distribut...
https://zahlman.github.io/posts/2024/12/24/python-packaging-...
While you could of course put an actual Python code file at a URL, that wouldn't solve the problem for anything involving compiled extensions in C, Fortran etc. You can't feasibly support NumPy this way, for example.
That said, there are sufficient hooks in Python's `import` machinery that you can make `import foo` programmatically compute a URL (assuming that the name `foo` is enough information to determine the URL), download the code, and create and import the necessary `module` object; and you can add this with appropriate priority to the standard set of strategies Python uses for importing modules. A full description of this process is out of scope for a HN comment, but relevant documentation:
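To make the hook mechanism slightly more concrete, here is a minimal sketch (not production code; `BASE_URL` and the name-to-URL mapping are assumptions) of a meta path finder that fetches pure-Python source over HTTP:
import importlib.abc
import importlib.util
import sys
import urllib.request

BASE_URL = "https://example.com/pymodules"  # hypothetical location of .py files

class URLFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path=None, target=None):
        if path is not None:  # only handle top-level modules in this sketch
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # default module creation is fine

    def exec_module(self, module):
        url = f"{BASE_URL}/{module.__name__}.py"
        source = urllib.request.urlopen(url).read()
        exec(compile(source, url, "exec"), module.__dict__)

# Appending gives it the lowest priority: it is only consulted
# after the normal import strategies have failed.
sys.meta_path.append(URLFinder())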
# dependencies = ['requests', 'beautifulsoup4']
And likewise, Deno can import by URL. Neither includes an integrity hash. For JS, I'd suggest something like:
import * as goodlib from 'https://verysecure.com/notmalicious.mjs' with { integrity: "sha384-xxx" }
which mirrors https://developer.mozilla.org/en-US/docs/Web/Security/Subres... and https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
The Python/UV thing will have to come up with some syntax, I don't know what. Not sure if there's a precedent for attributes.
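Purely as a thought experiment (nothing like this exists today, and the table name is made up), the existing `[tool.*]` escape hatch in the inline metadata could in principle carry per-dependency hashes:
# /// script
# dependencies = [
#     "requests",
# ]
# [tool.hypothetical-lock]          # made-up table name, not a real tool
# hashes = { requests = "sha256-..." }
# ///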
tl;dr use `openssl` on command-line to compute the hash.
Ideally, any package repositories ought to publish the hash for your convenience.
This of course does nothing to prove that the package is safe to use, just that it won't change out from under your nose.
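If you'd rather stay in Python than shell out to openssl, the stdlib can produce the same kind of digest (a small sketch; the file name is just an example borrowed from above):
import base64
import hashlib

def sri_hash(path, algo="sha384"):
    """Return a Subresource-Integrity-style string like 'sha384-<base64 digest>'."""
    digest = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return f"{algo}-{base64.b64encode(digest.digest()).decode()}"

print(sri_hash("notmalicious.mjs"))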
Or is it a skill issue?
Not really: https://github.com/astral-sh/uv/issues/5190
What does "production" look like in your environment, and why would this be terrible for it?
I find this feature amazing for one-off scripts. It’s removing a cognitive burden I was unconsciously ignoring.
The syntax for this (https://peps.python.org/pep-0723/) isn't uv's work, nor are they first to implement it (https://iscinumpy.dev/post/pep723/). A shebang line like this requires the tool to be installed first, of course; I've repeatedly heard about how people want tooling to be able to bootstrap the Python version, but somehow it's not any more of a problem for users to bootstrap the tooling themselves.
And some pessimism: packaging is still not seen as the core team's responsibility, and uv realistically won't enjoy even the level of special support that Pip has any time soon. As such, tutorials will continue to recommend Pip (along with inferior use patterns for it) for quite some time.
> I have been thinking that a dedicated syntax for inline dependencies would be great, similar to JavaScript's `import ObjectName from 'module-name';` syntax. Python promoted type hints from comment-based to syntax-based, so a similar approach seems feasible.
First off, Python did no such thing. Type annotations are one possible use for an annotation system that was added all the way back in 3.0 (https://peps.python.org/pep-3107/); the original design explicitly contemplated other uses for annotations besides type-checking. When it worked out that people were really only using them for type-checking, standard library support was added (https://peps.python.org/pep-0484/) and expanded upon (https://peps.python.org/pep-0526/ etc.); but this had nothing to do with any specific prior comment-based syntax (which individual tools had up until then had to devise for themselves).
Python doesn't have existing syntax to annotate import statements; it would have to be designed specifically for the purpose. It's not possible in general (as your example shows) to infer a PyPI name from the `import` name; but not only that, dependency names don't map one-to-one to imports: anything that you install from PyPI may validly define zero or more importable top-level names, and of course the code might directly use a sub-package or an attribute of some module (which doesn't even have to be a class). So there wouldn't be a clear place to put such names except in a separate block by themselves, which the existing comment syntax already does.
Finally, promoting the syntax to an actual part of the language doesn't seem to solve a problem. Using annotations instead of comments for types allows the type information to be discovered at runtime (e.g. through the `__annotations__` attribute of functions). What problem would it solve for packaging? It's already possible for tools to use a PEP 723 comment, and it's also possible (through the standard library - https://docs.python.org/3/library/importlib.metadata.html) to introspect the metadata of installed packages at runtime.
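For reference, the runtime introspection mentioned there is already a couple of lines of stdlib code (the output comments are illustrative; `packages_distributions` needs Python 3.10+):
from importlib import metadata

print(metadata.version("requests"))                  # e.g. "2.32.3"
print(metadata.requires("requests"))                 # declared dependencies, or None
print(metadata.packages_distributions().get("bs4"))  # import name -> ["beautifulsoup4"]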
And the author is?
Actually, the order of the import statement is one of the things that Python does better than JS. It makes completions much less costly to calculate when you type the code. An IDE or other tool only has to check one module or package for its contents, rather than checking whether any module has a binding of the name so-and-so. If I understand correctly, you are talking about an additional syntax though.
When mentioning a gigantic venv ... Why did they do that? Why not have smaller venvs for separate projects? It is really not that hard to do and avoids dependency conflicts between projects, which have nothing to do with each other. Using one giant venv is basically telling me that they either did not understand dependency conflicts, or did not care enough about their dependencies, so that one script can run with one set of dependencies one day, and another set of deps the other day, because a new project's deps have been added to the mix in the meantime.
Avoiding deps for small scripts is a good thing! If possible.
To me it just reads like a user now having a new tool allowing them to continue the lazy ways of not properly managing dependencies. I mean all deps in one huge venv? Who does that?? No wonder they had issues with that. Can't even keep deps separated, let alone properly having a lock file with checksums. Yeah no surprise they'll run into issues with that workflow.
And while we are relating to the JS world: one may complain in many ways about how NPM works, but it has had an automatic lock file for aaages, being the default tool in the ecosystem. And its competitors had it too. At least that part they got right for a long time, compared to pip, which does nothing of the sort without extra effort.
What's a 'project'? If you count every throw away data processing script and one off exploratory Jupyter notebook, that can easily be 100 projects. Certainly before uv, having one huge venv or conda environment with 'everything' installed made it much faster and easier to get that sort of work done.
I can understand it for exploratory Jupyter Notebook. But only in the truly exploratory stage. Say for example you are writing a paper. Reproducibility crisis. Exploring is fine, but when it gets to actually writing the paper, one needs to make ones setup reproducible, or lose credibility right away. Most academics are not aware of, or don't know how to, or don't care to, make things reproducible, leading to non-reproducible research.
I would be lying if I claimed that I personally always set up a lock file with hashsums for every script. Of course there can be scripts and things we care so little about that we don't make them reproducible.
In other words, of course, in most long-term cases, it’s better to create a real project - this is the main uv flow for a reason. But there’s value in being able to easily specify requirements for quick one-off scripts.
Because they are annoying and unnecessary additional work. If I write something, I won't know the dependencies in the beginning. And if it's a personal tool/script or even a throwaway one-shot, then why bother with managing unnecessary parts? I just manage my personal stack of dependencies for my own tools in a giant env, and pull imports from it or not, depending on the moment. This allows me to move fast. Of course it is a liability, but not one which usually bites me. Every few years, some dependency goes wrong, and I either fix it or remove it, but in the end the benefit I save in time far outweighs the time I would lose from micromanaging small separate envs.
Managing dependencies is for production and important things. Big messy envs are good enough for everything else. I have hundreds of scripts and tools; micromanaging them on that level has no benefit. And it seems uv now offers some options for making small envs effortless without costing much time, so it's a net benefit in that area, but it's not something world-shattering that will turn my world upside down.
If you do have a project that needs to manage its dependencies well and you still don't store hashsums and use them to install your dependencies, then you basically forfeit any credibility when complaining about bugs or changed behavior appearing without the code itself changing, and similar things.
This can be all fine, if it is just your personal project, that gets shit done. I am not saying you must properly manage dependencies for such a personal project. Just not something ready for production.
I for one find it quite easy to make venvs per project. I have my Makefiles, which I slightly adapt to the needs of the project and then I run 1 single command, and get all set up with dependencies in a project specific venv, hashsums, reproducibility. Not that much to it really, and not at all annoying to me. Also can be sourced from any other script when that script uses another project. Could also use any other task runner thingy, doesn't have to be GNU Make, if one doesn't like it.
It's not designed nor intended for such. There are tons of Python users out there who have no concept of what you would call "production"; they wrote something that requires NumPy to be installed and they want to communicate this as cleanly and simply (and machine-readably) as possible, so that they can give a single Python file to associates and have them be able to use it in an appropriate environment. It's explicitly designed for users who are not planning to package the code properly in a wheel and put it up on PyPI (or a private index) or anything like that.
>and all but beautiful
De gustibus non est disputandum. The point is to have something simple, human-writable and machine-readable, for those to whom it applies. If you need to make a wheel, make one. If you need a proper lock file, use one. Standardization for lock files is finally on the horizon in the ecosystem (https://peps.python.org/pep-0751/).
>Actually the order of import statement is one of the things, that Python does better than JS.
Import statements are effectively unrelated to package management. Each installed package ("distribution package") may validly define zero or more top-level names (of "import packages") which don't necessarily bear any relationship to each other, and the `import` syntax can validly import one or more sub-packages and/or attributes of a package or module (a false distinction, anyway; packages are modules), and rename them.
>An IDE or other tool only has to check one module or package for its contents
The `import` syntax serves these tools by telling them about names defined in installed code, yes. The PEP 723 syntax is completely unrelated: it tells different tools (package managers, environment managers and package installers) about names used for installing code.
>Why not have smaller venvs for separate projects? It is really not that hard to do
It isn't, but it introduces book-keeping (Which venv am I supposed to use for this project? Where is it? Did I put the right things in it already? Should I perhaps remove some stuff from it that I'm no longer using? What will other people need in a venv after I share my code with them?) that some people would prefer to delegate to other tooling.
Historically, creating venvs has been really slow. People have noticed that `uv` solves this problem, and come up with a variety of explanations, most of which are missing the mark. The biggest problem, at least on Linux, is the default expectation of bootstrapping Pip into the new venv; of course uv doesn't do this by default, because it's already there to install packages for you. (This workflow is equally possible with modern versions of Pip, but you have to know some tricks; I describe some of this in https://zahlman.github.io/posts/2025/01/07/python-packaging-... . And it doesn't solve other problems with Pip, of course.) Anyway, the point is that people will make single "sandbox" venvs because it's faster and easier to think about - until the first actual conflict occurs, or the first attempt to package a project and accurately convey its dependencies.
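For the curious, the "skip the Pip bootstrap" part is a one-liner with the stdlib (a minimal sketch; the directory name is arbitrary):
import venv

# with_pip=False skips the slow ensurepip bootstrap; something external
# (uv, or a separately-managed pip invoked with the right options) then
# installs packages into the environment.
venv.create(".venv", with_pip=False)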
> Avoiding deps for small scripts is a good thing! If possible.
I'd like to agree, but that just isn't going to accommodate the entire existing communities of people writing random 100-line analysis scripts with Pandas.
>One may complain in many ways about how NPM works, but it has had automatic lock file for aaages.
Cool, but the issues with Python's packaging system are really not comparable to those of other modern languages. NPM isn't really retrofitted to JavaScript; it's retrofitted to the Node.JS environment, which existed for only months before NPM was introduced. Pip has to support all Python users, and Python is about 18 years older than Pip (19 years older than NPM). NPM was able to do this because Node was a new project that was being specifically designed to enable JavaScript development in a new environment (i.e., places that aren't the user's browser sandbox). By contrast, every time any incremental improvement has been introduced for Python packaging, there have been massive backwards-compatibility concerns. PyPI didn't stop accepting "egg" uploads until August 1 2023 (https://blog.pypi.org/posts/2023-06-26-deprecate-egg-uploads...), for example.
But more importantly, npm doesn't have to worry about extensions to JavaScript code written in arbitrary other languages (for Python, C is common, but by no means exclusive; NumPy is heavily dependent on Fortran, for example) which are expected to be compiled on the user's machine (through a process automatically orchestrated by the installer) with users complaining to anyone they can get to listen (with no attempt at debugging, nor at understanding whose fault the failure was this time) when it doesn't work.
There are many things wrong with the process, and I'm happy to criticize them (and explain them at length). But "everyone else can get this right" is usually a very short-sighted line of argument, even if it's true.
Thus my warning about its use. And we as a part of the population need to learn and be educated about dependency management, so that we do not keep running into the same issues over and over again, that come through non-reproducible software.
> Import statements are effectively unrelated to package management. Each installed package ("distribution package") may validly define zero or more top-level names (of "import packages") which don't necessarily bear any relationship to each other, and the `import` syntax can validly import one or more sub-packages and/or attributes of a package or module (a false distinction, anyway; packages are modules), and rename them.
I did not claim them to be related to package management, and I agree. I was making an assertion, trying to guess the meaning of what the other poster wrote about some "import bla from blub" statement.
> The `import` syntax serves these tools by telling them about names defined in installed code, yes. The PEP 723 syntax is completely unrelated: it tells different tools (package managers, environment managers and package installers) about names used for installing code.
If you had read my comment a bit more closely, you would have seen that this is the assertion I made one phrase later.
> It isn't, but it introduces book-keeping (Which venv am I supposed to use for this project? Where is it? Did I put the right things in it already? Should I perhaps remove some stuff from it that I'm no longer using? What will other people need in a venv after I share my code with them?) that some people would prefer to delegate to other tooling.
I understand that. The issue is that people keep complaining about things that can be solved in rather simple ways. For example:
> Which venv am I supposed to use for this project?
Well, the one in the directory of the project, of course.
> Where is it?
In the project directory of course.
> Did I put the right things in it already?
If it exists, it should have the dependencies installed. If you change the dependencies, then update the venv right away. You are always in a valid state this way. Simple.
> Should I perhaps remove some stuff from it that I'm no longer using?
That is done in the "update the venv" step mentioned above. Whether you delete the venv and re-create it, or have a dependency-managing tool that removes unused dependencies, I don't care, but you will know it when you use such a tool. If you don't use such a tool, just recreate the venv. Nothing complicated so far.
> What will other people need in a venv after I share my code with them?
One does not share a venv itself, one shares the reproducible way to recreate it on another machine. Thus others will have just what you have, once they create the same venv. Reproducibility is key, if you want your code to run elsewhere reliably.
All of those have rather simple answers. I grant, some of these answers one learns over time, when dealing with these questions many times. However, none of it must be made difficult.
> I'd like to agree, but that just isn't going to accommodate the entire existing communities of people writing random 100-line analysis scripts with Pandas.
True, but those apparently need Pandas, so installing dependencies cannot be avoided. Then it depends on whether their stuff is one-off stuff that no one will ever need to run again later, or part of some pipeline that needs to be reliable. The use case changes the requirements with regard to reproducibility.
---
About the NPM - PIP comparison. Sure there may be differences. None of those however justify not having hashsums of dependencies where they can be had. And if there is a C thing? Well, you will still download that in some tarball or archive when you install it as a dependency. Easy to get a checksum of that. Store the checksum.
I was merely pointing out a basic facility of NPM, that is there for as long as I remember using NPM, that is still not existent with PIP, except for using some additional packages to facilitate it (I think hashtools or something like that was required). I am not holding up NPM as the shining star, that we all should follow. It has its own ugly corners. I was pointing out that specific aspect of dependency management. Any artifact downloaded from anywhere one can calculate the hashes of. There are no excuses for not having the hashes of artifacts.
That Pip is 19 years older than NPM doesn't have to be a negative. Those are 19 years more time to have worked on the issues as well. In those 19 years no one had issues with non-reproducible builds? I find that hard to believe. If anything the many people complaining about not being able to install some dependency in some scenario tell us, that reproducible builds are key, to avoid these issues.
Sure, but TFA is about installation, and I wanted to make sure we're all on the same page.
>I understand that. The issue is, that people keep complaining about things that can be solved in rather simple ways.
Can be. But there are many comparably simple ways, none of which is obvious. For example, using the most basic level of tooling, I put my venvs within a `.local` directory which contains other things I don't want to put in my repo nor mention in .gitignore. Other workflow managers put them in an entirely separate directory and maintain their own mapping.
>Whether you delete the venv and re-create it, or have a dependency managing tool, that removes unused dependencies, I don't care, but you will know it, when you use such a tool.
Well, yes. That's the entire point. When people are accustomed to using a single venv, it's because they haven't previously seen the point of separating things out. When they realize the error of their ways, they may "prefer to delegate to other tooling", as I said. Because it represents a pretty radical change to their workflow.
> That Pip is 19 years older than NPM doesn't have to be a negative. Those are 19 years more time to have worked on the issues as well.
In those 19 years people worked out ways to use Python and share code that bear no resemblance to anything that people mean today when they use the term "ecosystem". And they will be very upset if they're forced to adapt. Reading the Packaging section of the Python Discourse forum (https://discuss.python.org/c/packaging/14) is enlightening in this regard.
> In those 19 years no one had issues with non-reproducible builds?
Of course they have. That's one of the reasons why uv is the N+1th competitor in its niche; why Conda exists; why meson-python (https://mesonbuild.com/meson-python/index.html) exists; why https://pypackaging-native.github.io/ exists; etc. Pip isn't in a position to solve these kinds of problems because of a) the core Python team's attitude towards packaging; b) Pip's intended and declared scope; and c) the sheer range of needs of the entire Python community. (Pip doesn't actually even do builds; it delegates to whichever build backend is declared in the project metadata, defaulting to Setuptools.)
But it sounds more like you're talking about lockfiles with hashes. In which case, please just see https://peps.python.org/pep-0751/ and the corresponding discussion ("Post-History" links there).
But... the 86GB python dependency download cache on my primary SSD, most of which can be attributed to the 50 different versions of torch, is testament to the fact that even uv cannot salvage the mess that is pip.
Never felt this much rage at the state of a language/build system in the 25 years that I have been programming. And I had to deal with Scala's SBT ("Simple Build Tool") in another life.
I just started a fresh virtual environment with "python -m venv venv" - running "du -h" showed it to be 21MB. After running "venv/bin/pip install torch" it's now 431MB.
The largest file in there is this one:
178M ./lib/python3.10/site-packages/torch/lib/libtorch_cpu.dylib
There's a whole section of the uv manual dedicated just to PyTorch: https://docs.astral.sh/uv/guides/integration/pytorch/
(I just used find to locate as many libtorch_cpu.dylib files as possible on my laptop and deleted 5.5GB of them)
Compare this to the way something like maven/gradle handles this and you have to wonder WTF is going on here.
Maybe your various LLM libraries are pinning different versions of Torch?
Different Python versions each need their own separate Torch binaries as well.
At least with uv you don't end up with separate duplicate copies of PyTorch in each of the virtual environments for each of your different projects!
Found this the hard way. Something to do with breakage in ABI perhaps. Was looking at the way python implements extensions the other day. Very weird.
There is a "stable ABI" which is a subset of the full ABI, but no requirement to stick to it. The ABI effectively changes with every minor Python version - because they're constantly trying to improve the Python VM, which often involves re-working the internal representations of built-in types, etc. (Consider for example the improvements made to dictionaries in Python 3.6 - https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-compa... .) Of course they try to make proper abstracted interfaces for those C structs, but this is a 34 year old project and design decisions get re-thought all the time and there are a huge variety of tiny details which could change and countless people with legacy code using deprecated interfaces.
The bytecode also changes with every minor Python version (and several times during the development of each). The bytecode file format is versioned for this reason, and .pyc caches need to be regenerated. (And every now and then you'll hit a speed bump, like old code using `async` as an identifier which subsequently becomes a keyword. That hit TensorFlow once: https://stackoverflow.com/questions/51337939 .)
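A quick way to see the "bytecode changes with every minor version" point for yourself (illustrative only):
import importlib.util
import sys

# The magic number is embedded in every .pyc header; it differs between
# CPython 3.10, 3.11, 3.12, ..., which is why caches must be regenerated.
print(sys.version_info[:3])
print(importlib.util.MAGIC_NUMBER.hex())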
Was some kind of FFI using dlopen and sharing memory across the vm boundary ever considered in the past, instead of having to compile extensions alongside a particular version of python?
I remember seeing some ffi library, probably on pypi. But I don't think it is part of standard python.
To my understanding, though, it's less performant. And you still need a stable ABI layer to call into. FFI can't save you if the C code decides in version N+1 that it expects the "memory shared across the vm boundary" to have a different layout.
Yes, it's essentially that: CPython doesn't guarantee exact ABI stability between versions unless the extension (and its enclosing package) intentionally build against the stable ABI[1].
The courteous thing to do in the Python packaging ecosystem is to build "abi3" wheels that are stable and therefore don't need to be duplicated as many times (either on the index or on the installing client). Torch doesn't build these wheels for whatever reason, so you end up with multiple slightly different but functionally identical builds for each version of Python you're using.
TL;DR: This happens because of an interaction between two patterns that Python makes very easy: using multiple Python versions, and building/installing binary extensions. In a sense, it's a symptom of Python's success: other ecosystems don't have these problems because they have far fewer people running multiple configurations simultaneously.
I am planning to shift some of my stuff to pypy (so a "fast" python exists, kind of). But some dependencies can be problematic, I have heard.
(Recent positive developments in Python’s interpreted performance have subverted this informal tendency.)
You already had billions of lines of Java and JS code that HAD to be sped up. So they had no alternative. If python had gone down the same route, speeding it up without caveats would have been that much easier.
In other languages that didn't happen and you don't have anywhere near as good scientific/ML packages as a result.
Improving Python performance has been a topic as far back as 2008 when I attended my first PyCon. A quick detour on Python 3 because there is some historical revisionism because many online people weren't around in the earlier days.
Back then the big migration to Python 3 was in front of the community. The timeline concerns that popped up when Python really got steam in the industry between 2012 and 2015 weren't as huge a concern. You can refer to Guido's talks from PyCon 2008 and 2009 if they are available somewhere to get the vibe on the urgency. Python 3 is impactful because it changes the language and platform while requiring a massive amount of effort.
Back to perf. Around 2008, there was a feeling that an alternative to CPython might be the future. Candidates included IronPython, Jython, and PyPy. Others like Unladen Swallow wanted to make major changes to CPython (https://peps.python.org/pep-3146/).
Removing the GIL was another direction people wanted to take because it seemed simpler in a way. This is a well researched area with David Beazley having many talks like this oldie (https://www.youtube.com/watch?v=ph374fJqFPE). The idea is much older (https://dabeaz.blogspot.com/2011/08/inside-look-at-gil-remov...).
All of these alternative implementations of Python from this time period have basically failed at the goal of replacing CPython. IronPython was a Python 2 implementation, and updating to Python 3 while trying to grow to challenge CPython was impossible. Eventually, Microsoft lost interest and that was that. Similar things happened for the others.
GIL removal was a constant topic from 2008 until recently. Compatibility of extensions was a major concern causing inertia and the popularity meant even more C/C++/Rust code relying on a GIL. The option to disable (https://peps.python.org/pep-0703/) only happened because the groundwork was eventually done properly to help the community move.
The JVM has very clearly defined interfaces and specs similar to the CLR which make optimization viable. JS doesn't have the compatibility concerns.
That was just a rough overview but many of the stories of Python woes miss a lot of this context. Many discussions about perf over the years have descended into a GIL discussion without any data to show the GIL would change performance. People love to talk about it but turn out to be IO-bound when you profile code.
It was expected that extension maintainers would respond negatively to this. In many cases it presents a decision: do I port this to the new platform, or move away from Python completely? You have to remember, the impactful decisions leading us down this path were closer to 2008 than today when dropping Python or making it the second option to help people migrate, would have been viable for a lot of these extensions. There was also a lot of potential for people to follow a fork of the traditional CPython interpreter.
There were no great options because there are many variables to consider. Perf is only one of them. Pushing ahead only on perf is hard when it's unclear if it'll actually impact people in the way they think it will when they can't characterize their actual perf problem beyond "GIL bad".
For pypy it's in a weird spot as the things it does fast are the ones you'd usually just offload to a module implemented in C
(TIOBE's methodology is a bit questionable though, as far as I can tell it's almost entirely based on how many search engine hits they get for "X programming". https://www.tiobe.com/tiobe-index/programminglanguages_defin...)
Well, the extensions are going to complicate this a lot.
A fast JIT interpreter cannot reach into a library and do its magic there the way HotSpot/V8 can with native Java/JS code.
(I don't know if this is the reason in Torch's case or not, but I know from experience that it's the reason for many other popular Python packages.)
If a package manager stores more than it needs to, it is a package manager problem.
https://docs.astral.sh/uv/reference/settings/#link-mode
It's even the default. Here's where it's implemented if you're curious https://github.com/astral-sh/uv/blob/f394f7245377b6368b9412d...
`nix-store --optimize` is a little different because it looks for duplicates and hardlinks those files across all of the packages. I don’t know how much additional savings that would yield with a typical uv site-packages, but it guarantees no duplicate files.
> I just used find to locate as many libtorch_cpu.dylib files as possible on my laptop and deleted 5.5GB of them
but maybe it wasn’t actually 5.5 GB!
And since it's a compiled dependency, even if uv were to attempt the more complicated method of symlinking to use a single version of identical files, it wouldn't help much. You'd probably need to store binary diffs or chunks of files that are binary identical; at that point your code would probably start to resemble a filesystem in user space, and the time to switch to a particular version of the files (i.e. create them as actual files in the filesystem) would be much higher.
Also, I believe uv's cache is separate from the pip cache, so you could have different copies in both.
I think there's a uv cache prune command. Arguably it should offer to install a cron job to do it periodically.
[[tool.uv.index]]
name = "pytorch-cu124"
url = "https://download.pytorch.org/whl/cu124"
explicit = true
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
[tool.uv.sources]
torch = [
{ index = "pytorch-cu124", marker = "platform_system != 'Darwin'" },
{ index = "pytorch-cpu", marker = "platform_system == 'Darwin'" },
]
$ uv cache prune
$ uv cache clean
Take your pick and schedule it to run weekly/monthly.
Can this be solved? Maybe, but not by a new tool (on its own). It would require a lot of devs who may not see much improvement to their workflow change their workflow for others (to newer ones which remove the assumptions built in to the current workflows), plus a bunch of work by key stakeholders (and maybe even the open-sourcing of some crown jewels), and I don't see that happening.
This is not true in my case. The regular pytorch does not work on my system. I had to download a version specific to my system from the pytorch website using --index-url.
> packages bundle all possible options into a single artifact
Cross-platform Java apps do it too. For e.g., see https://github.com/xerial/sqlite-jdbc. But it does not become a clusterfuck like it does with python. After downloading gigabytes and gigabytes of dependencies repeatedly, the python tool you are trying to run will refuse to do so for random reasons.
You cannot serve end-users a shit-sandwich of this kind.
The python ecosystem is a big mess and, outside of a few projects like uv, I don't see anyone trying to build a sensible solution that tries to improve both speed/performance and packaging/distribution.
Cross-OS (especially with a VM like Java or JS) is relatively easy compared to needing specific versions for every single sub-architecture of a CPU and GPU system (and that's ignoring all the other bespoke hardware that's out there).
The SQLite project I linked to is a JDBC driver that makes use of the C version of the library appropriate to each OS. LWJGL (https://repo1.maven.org/maven2/org/lwjgl/lwjgl/3.3.6/) is another project which heavily relies on native code. But distributing these, or using these as dependencies, does not result in hair-pulling like it does with python.
Hardlink is somewhat better because both point to the same inode, but will also not work if the file needs different permissions or needs to be independently mutable from different locations.
Reflink hits the sweetspot where it can have different permissions, updates trigger CoW preventing confusing mutations, and all while still reducing total disk usage.
/blobs/<sha256_sum>/filename.zip
and then symlinking/reflinking filename.zip to wherever it needs to be in the source tree...
It's more portable than hardlinks, solves your "source of truth" problem and has pretty wide platform support.
Platforms that don't support symlinks/reflinks could copy the files to where they need to be then delete the blob store at the end and be no worse off than they are now.
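A rough sketch of that blob-store idea in Python (the paths and layout are hypothetical; hardlink/reflink selection and error handling omitted):
import hashlib
import os
import shutil

STORE = os.path.expanduser("~/.cache/blobstore")  # hypothetical location

def store_and_link(src_path, dest_path):
    """Copy src_path into <STORE>/<sha256>/<name> once, then symlink dest_path to it."""
    with open(src_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    blob = os.path.join(STORE, digest, os.path.basename(src_path))
    if not os.path.exists(blob):
        os.makedirs(os.path.dirname(blob), exist_ok=True)
        shutil.copy2(src_path, blob)
    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    os.symlink(blob, dest_path)  # or os.link(blob, dest_path) for a hardlink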
Anyway, I'm just a netizen making a drive-by comment.
The node ecosystem and the rust one seem to have their own issues. I have zero interest in either of them so I haven't looked into them in detail.
However, I have to deal with node on occasion because a lot of JS/CSS tooling is written using it. It has a HUGE transitive dependency problem.
Ah found the issue.
I feel you.
https://scala-cli.virtuslab.org
Or for larger projects, the thing the author of the linked article is plugging (mill).
But Mill is very good. [1]
> Being independent from Python bootstrapping
Yep, conda.
> Being capable of installing and running Python in one unified congruent way across all situations and platforms.
Yep, conda.
> Having a very strong dependency resolver.
Yep, conda (or mamba).
The main thing conda doesn't seem to have which uv has is all the "project management" stuff. Which is fine, it's clear people want that. But it's weird to me to see these articles that are so excited about being able to install Python easily when that's been doable with conda for ages. (And conda has additional features not present in uv or other tools.)
The pro and con of tools like uv is that they layer over the base-level tools like pip. The pro of that is that they interoperate well with pip. The con is that they inherit the limitations of that packaging model (notably the inability to distribute non-Python dependencies separately).
That's not to say uv is bad. It seems like a cool tool and I'm intrigued to see where it goes.
1) I can't solve for the tools I need and I don't know what to do. I try another tool, it works, I can move forward and don't go back to conda
2) it takes 20-60 minutes to solve, if it ever does. I quit and don't come back. I hear this doesn't happen anymore, but to this day I shudder before I hit enter on a conda install command
3) I spoil my base environment with an accidental install of something, and get annoyed and switch away.
On top of that the commands are opaque, unintuitive, and mysterious. Do I do conda env command or just conda command? Do I need a -n? The basics are difficult and at this point I'm too ashamed to ask which of the many many docs explain it, and I know I will forget within two months.
I have had zero of these problems with uv. If I screw up or it doesn't work it tells me right away. I don't need to wait for a couple minutes before pressing y to continue, I just get what I need in at most seconds, if my connection is slow.
If you're in a controlled environment and need audited packages, I would definitely put up with conda. But for open source, personal throwaway projects, and anything that doesn't need a security clearance, I'm not going to deal with that beast.
uv hardly occupies the same problem space. It elevates DX with disciplined projects to new heights, but still falls short with undisciplined projects with tons of undeclared/poorly declared external dependencies, often transitive — commonly seen in ML (now AI) and scientific computing. Not its fault of course. I was pulling my hair out with one such project the other day, and uv didn’t help that much beyond being a turbo-charged pip and pyenv.
One illustration is the CUDA toolkit with torch install on conda. If you need a basic setup, it would work (and takes age). But if you need some other specific tools in the suite, or need it to be more lightweight for whatever reason then good luck.
btw, I do not see much interest in uv. pyenv/pip/venv/hatch are simple enough to me. No need for another layer of abstraction between my machine and my env. I will still keep an eye on uv.
It's very fast, comes with lockfiles and a project-based approach. Also comes with a `global` mode where you can install tools into sandboxed environments.
When I go to https://prefix.dev, the "Get Started Quickly" section has what looks like a terminal window, but the text inside is inscrutable. What do they various lines mean? There's directories, maybe commands, check boxes... I don't get it. It doesn't look like a shell despite the Terminal wrapping box.
Below that I see that there's a pixi.toml, but I don't really want a new toml or yml file, there's enough repository lice to confuse new people on projects already.
Any time spent educating on packaging is time not spent on discovery, and is an impediment to onboarding.
Discord is predominantly blocked on corporate networks. Artifactory (& Nexus) are very common in corporate environments. Corporate proxies are even more common. This is why I'd hesitate. These are common use cases (albeit corporate) that may not be readily available in the docs.
uv "just works". Which is a feature in itself.
However, after using conda for over three years I can confidently say I don't like using it. I find it to be slow and annoying, often creating more problems than it solves. Mamba is markedly better but still manages to confuse itself.
uv just works, if your desktop environment is relatively modern. that's its biggest selling point, and why I'm hooked on it.
>Like so many other articles that make some offhand remarks about conda, this article raves about a bunch of "new" features that conda has had for years.
Agreed. (I'm also tired of seeing advances like PEP 723 attributed to uv, or uv's benefits being attributed to it being written in Rust, or at least to it not being written in Python, in cases where that doesn't really hold up to scrutiny.)
> The pro and con of tools like uv is that they layer over the base-level tools like pip. The pro of that is that they interoperate well with pip.
It's a pretty big pro ;) But I would say it's at least as much about "layering over the base-level tools" like venv.
> The con is that they inherit the limitations of that packaging model (notably the inability to distribute non-Python dependencies separately).
I still haven't found anything that requires packages to contain any Python code (aside from any build system configuration). In principle you can make a wheel today that just dumps a platform-appropriate shared library file for, e.g. OpenBLAS into the user's `site-packages`; and others could make wheels declaring yours as a dependency. The only reason they wouldn't connect up - that I can think of, anyway - is because their own Python wrappers currently don't hard-code the right relative path, and current build systems wouldn't make it easy to fix that. (Although, I guess SWIG-style wrappers would have to somehow link against the installed dependency at their own install time, and this would be a problem when using build isolation.)
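As a sketch of what "connecting up" could look like from the wrapper side (the package name `openblas_libs` is made up, and this assumes a normal on-disk install of the binary-only wheel):
import ctypes
from importlib import resources

# Locate the shared library shipped by the hypothetical binary-only wheel
# inside its own import package, then load it.
lib_path = resources.files("openblas_libs") / "libopenblas.so"
openblas = ctypes.CDLL(str(lib_path))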
It's not just that, it's that you can't specify them as dependencies in a coordinated way as you can with Python libs. You can dump a DLL somewhere but if it's the wrong version for some other library, it will break, and there's no way for packages to tell each other what versions of those shared libraries they need. With conda you can directly specify the version constraints on non-Python packages. Now, yeah, they still need to be built in a consistent manner to work, but that's what conda-forge handles.
I guess one possible workaround is to automate making a wheel for each version of the compiled library, and have the wheel version move in lockstep. Then you just specify the exact wheel versions in your dependencies, and infer the paths according to the wheel package names... it certainly doesn't sound pleasant, though. And, C being what it is, I'm sure that still overlooks something.
Ah, I forgot the best illustration of this: uv itself is available this way - and you can trivially install it with Pipx as a result. (I actually did this a while back, and forgot about it until I wanted to test venv creation for another comment...)
There's also the issue of the license for using the repos, which makes it risky to rely on conda/anaconda. See e.g. https://stackoverflow.com/a/74762864
I used conda for awhile around 2018. My environment became borked multiple times and I eventually gave up on it. After that, I never had issues with my environment becoming corrupted. I knew several other people who had the same issues and it stopped after they switched away from conda.
I've heard it's better now, but that experience burned me so I haven't kept up with it.
It follows more of a project based approach, comes with lockfiles and a lightweight task system. But we're building it up for much bigger tasks as well (`pixi build` will be a bit like Bazel for cross-platform, cross-language software building tasks).
While I agree that conda has many short-comings, the fundamental packages are alright and there is a huge community keeping the fully open source (conda-forge) distribution running nicely.
Now, I just give students a pixi.toml and pixi.lock, and a few commands in the README to get them started. It'll even prevent students from running their projects, adding packages, or installing environments when working on our cluster unless they're on a node with GPUs. My inbox used to be flooded with questions from students asking why packages weren't installing or why their code was failing with errors about CUDA, and more often than not, it was because they didn't allocate any GPUs to their HPC job.
And, as an added bonus, it lets me install tools that I use often with the global install command without needing to inundate our HPC IT group with requests.
So, once again, thank you
Pixi[1] is an alternative conda package manager (as in it still uses conda repositories; conda-forge by default) that bridges this gap. It even uses uv for PyPI packages if you can't find what you need in conda repositories.
In the same vein, I don't want Gradle or Maven to install my JVM for me.
In JVM land I use SDKMAN! (Yes, that's what the amazingly awesome (in the original sense of awe: "an emotion variously combining dread, veneration, and wonder") concretion of Bash scripts is called).
In Python land I use pyenv.
And I expect my build tool to respect the JVM/Python versions I've set (looking at you Poetry...) and fail if they can't find them (You know what you did, Poetry. But you're still so much better than Pipenv)
For what it's worth, uv does this if you tell it to. It's just not the default behaviour. Uv and pyenv can easily be used together.
Suppose conda had projects. Still, it is somewhat incredible to see uv resolve + install in 2 seconds what takes conda 10 minutes. It immediately made me want to replace conda with uv whenever possible.
(I have actively used conda for years, and don’t see myself stopping entirely because of non Python support, but I do see myself switching primarily to uv.)
Ruby-the-language is now inseparable from Rails because the venn diagram of the “Ruby” community and the “rails” community is nearly a circle. It can be hard to find help with plain Ruby because 99% of people will assume you have the rails stdlib monkeypatches.
In a similar way, conda and data science seem to be conjoined, and I don’t really see anybody using conda as a general-purpose Python environment manager.
But anaconda doesn't do inline deps, isn't a consistent experience (the typical conda project doesn't exist), is its own island incompatible with most of the Python ecosystem, is super slow, the yaml config is very quirky, and it's very badly documented while having poor ergonomics.
In short, anaconda solves many of those problems but brings other ones on the table.
> To bootstrap a conda installation, use a minimal installer such as Miniconda or Miniforge.
> Conda is also included in the Anaconda Distribution.
Bam, you’ve already lost me there. Good luck getting this approved on our locked down laptops.
No pip compatibility? No venv compatibility? Into the trash it goes, it’s not standard. The beauty of uv is that it mostly looks like glue (even though it is more) for standard tooling.
Here's just one example, nemo2riva, the first in several steps to taking a trained NeMo model and making it deployable: https://github.com/nvidia-riva/nemo2riva?tab=readme-ov-file#...
before you can install the package, you first have to install some other package whose only purpose is to break pip so it uses nvidia's package registry. This does not work with uv, even with the `uv pip` interface, because uv rightly doesn't put up with that shit.
This is of course not Astral's fault, I don't expect them to handle this, but uv has spoiled me so much it makes anything else even more painful than it was before uv.
I guess you're really talking about `nvidia-pyindex`. This works by leveraging the legacy Setuptools build system to "build from source" on the user's machine, but really just running arbitrary code. From what I can tell, it could be made to work just as well with any build system that supports actually orchestrating the build (i.e., not Flit, which is designed for pure Python projects), and with the modern `pyproject.toml` based standards. It's not that it "doesn't work with uv"; it works specifically with Pip, by trying to run the current (i.e.: target for installation) Python environment's copy of Pip, calling undocumented internal APIs (`from pip._internal.configuration import get_configuration_files`) to locate Pip's config, and then parsing and editing those files. If it doesn't work with `uv pip`, I'm assuming that's because uv is using a vendored Pip that isn't in that environment and thus can't be run that way.
Nothing prevents you, incidentally, from setting up a global Pip that's separate from all your venvs, and manually creating venvs that don't contain Pip (which makes that creation much faster): https://zahlman.github.io/posts/2025/01/07/python-packaging-... But it does, presumably, interfere with hacks like this one. Pip doesn't expose a programmatic API, and there's no reason why it should be in the environment if you haven't explicitly declared it as a dependency - people just assume it will be there, because "the user installed my code and presumably that was done using Pip, so of course it's in the environment".
https://www.bitecode.dev/p/charlie-marsh-on-astral-uv-and-th...
Makes sense, the market is wide open for it.
But most of my work, since I adopted conda 7ish years ago, involves using the same ML environment across any number of folders or even throw-away notebooks on the desktop, for instance. I’ll create the environment and sometimes add new packages, but rarely update it, unless I feel like a spring cleaning. And I like knowing that I have the same environment across all my machines, so I don’t have to think about if I’m running the same script or notebook on a different machine today.
The idea of a new environment for each of my related “projects” just doesn’t make sense to me. But, I’m open to learning a new workflow.
Addition: I don’t run other’s code, like pretrained models built with specific package requirements.
My one-off notebooks I'm going to set up to be similar to the scripts; that will require some mods.
It does take up a lot more space, but it is quite a bit faster.
However, you could use the workspace concept for this I believe, and have the dependencies for all the projects described in one root folder and then all sub-folders will use the environment.
But I mean, our use case is very different than yours; it's not necessary to use uv.
FYI, for anyone else that stumbles upon this: I decided to do a quick check on PyTorch (the most problem-prone dependency I've had), and noticed that they now specifically recommend no longer using conda, and have since last November.
In your case, I guess one thing you could do is have one git repo containing your most commonly-used dependencies and put your sub-projects as directories beneath that? Or even keep a branch for each sub-project?
One thing about `uv` is that dependency resolution is very fast, so updating your venv to switch between "projects" is probably no big deal.
First, let me try to make sense of it for you -
One of uv's big ideas is that it has a much better approach to caching downloaded packages, which lets it create those environments much more quickly. (I guess things like "written in Rust", parallelism etc. help, but as far as I can tell most of the work is stuff like hard-linking files, so it's still limited by system calls.) It also hard-links duplicates, so that you aren't wasting tons of space by having multiple environments with common dependencies.
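If you're curious, you can check the hard-linking yourself from Python - a rough sketch, with hypothetical venv paths, assuming both environments installed the same package version from uv's cache on the same filesystem:
import os

# Hypothetical: the same module file as installed into two uv-managed venvs.
a = ".venv-one/lib/python3.12/site-packages/requests/__init__.py"
b = ".venv-two/lib/python3.12/site-packages/requests/__init__.py"

sa, sb = os.stat(a), os.stat(b)
# Hard-linked files share a device and inode, and report a link count > 1,
# meaning the bytes exist only once on disk.
print(sa.st_dev == sb.st_dev and sa.st_ino == sb.st_ino)
print(sa.st_nlink, sb.st_nlink)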
A big part of the point of making separate environments is that you can track what each project is dependent on separately. In combination with Python ecosystem standards (like `pyproject.toml`, the inline script metadata described by https://peps.python.org/pep-0723/, the upcoming lock file standard in https://peps.python.org/pep-0751/, etc.) you become able to reproduce a minimal environment, automate that reproduction, and create an installable sharable package for the code (a "wheel", generally) which you can publish on PyPI - allowing others to install the code into an environment which is automatically updated to have the needed dependencies. Of course, none of this is new with `uv`, nor depends on it.
The installer and venv management tool I'm developing (https://github.com/zahlman/paper) is intended to address use cases like yours more directly. It isn't a workflow tool, but it's intended to make it easier to set up new venvs, install packages into venvs (and say which venv to install it into) and then you can just activate the venv you want normally.
(I'm thinking of having it maintain a mapping of symbolic names for the venvs it creates, and a command to look them up - so you could do things like "source `paper env-path foo`/bin/activate", or maybe put a thin wrapper around that. But I want to try very hard to avoid creating the impression of implementing any kind of integrated development tool - it's an integrated user tool, for setting up applications and libraries.)
E.g. calling that wrapper uvv, something like
1. uvv new <venv-name> --python=... ...# venvs stored in a central location
2. uvv workon <venv-name> # now you are in the virtualenv
3. deactivate # now you get out of the virtualenv
You could imagine additional features, such as keeping a log of the installed packages inside the venv so that you could revert to an arbitrary state, etc., as goodies, given how much faster uv is.
To open a notebook I run (via an alias)
uv tool run jupyter lab
and then in the first cell of each notebook I have !uv pip install my-dependencies
This takes care of all the venv management stuff and makes sure that I always have the dependencies I need for each notebook. Only been doing this for a few weeks, but so far so good.
Sadly for certain types of projects like GIS, ML, scientific computing, the dependencies tend to be mutually incompatible and I've learned the hard way to set up new projects for each separate task when using those packages. `uv init; uv add <dependencies>` is a small amount of work to avoid the headaches of Torch etc.
You don't have to love uv, and there are plenty of reasons not to.
TFA offers myriad innovative and pleasing examples. It would have been nice if you had actually commented on any of those, or otherwise explained your position.
Dozens of threads of people praising how performant and easy uv is, how it builds on standards and current tooling instead of inventing new incompatible set of crap, and every time one comment pops up with “akshually my mix of conda, pyenv, pipx, poetry can already do that in record time of 5 minutes, why do you need uv? Its going to be dead soon”.
Every packaging PEP is also hailed as the solution to everything, only to be superseded by a new and incompatible PEP within two years.
If someone doesn’t want to use it, or doesn’t like it, or is, quite reasonably, skeptical that “this time it’ll be different!” … let them be.
If it’s good, it’ll stand on its own despite the criticism.
If it can’t survive some people disliking and criticising it, it deserves to die.
Right? Live and let live. We don’t have to all agree all the time about everything.
uv is great. So use it if you want to.
And if you don’t, that’s okay too.
For companies. Which is why when random people start acting like it’s important, you have to wonder why it’s important to them.
For example, being a corporate shill. Or being so deep in the Kool-Aid that you can’t allow alternative opinions? Hm?
It’s called an echo chamber.
I don't use uv because I don't currently trust that it will be maintained on the timescales I care about. I stick with pip and venv, because I expect they will still be around 10 years from now, because they have much deeper pools of interested people to draw contributors and maintainers from, because - wait for it - they are really popular. Your theory about random people being corporate shills for anything they keep an eye on the popularity of can be explained much more parsimoniously like that.
I get it! I loved my long-lived curated conda envs.
I finally tried uv to manage an environment and it’s got me hooked. That a project’s dependencies can be so declarative and separated from the venv really sings for me! No more meticulous tracking of an env.yml or requirements.txt; just `uv add` and `uv sync` and that’s it! I just don’t think about it anymore.
I agree that uv is the N+1th competing standard for a workflow tool, and I don't like workflow tools anyway, preferring to do my own integration. But the installer it provides does solve a lot of real problems that Pip has.
In JVM land I used Sdkman to manage JVMs, and I used Maven or Gradle to manage builds.
I don't want them both tied to one tool, because that's inflexible.
The number of people who switch to R because Python is too hard to set up is crazy high.
Especially among life scientists and statisticians.
(The principle is recognized in trademark law -- some may remember Apple the record label and Apple the computer company. They eventually clashed, but I don't see either of the uv's encroaching on the other's territory.)
Google returns mixed results. You may assert it's not problematic, but this is a source of noise that projects with distinct names don't have.
Seems like a lot of people have tried their hand at various tooling, so there must be more to it than I am aware of.
The dependency file (what requirements.txt is supposed to be) just documents the things you depend on directly, and possibly known version constraints. A lock file captures the exact version of your direct and indirect dependencies at the moment in time it's generated. When you go to use the project, it will read the lock file, if it exists, and match those versions for anything listed directly or indirectly in the dependency file. It's like keeping a snapshot of the exact last-working dependency configuration. You can always tell it to update the lock file and it will try to recalculate everything from the latest that meets your dependency constraints in the dependency file, but if something doesn't work with that you'll presumably have your old lock file to fall back on _that will still work_.
It's a standard issue/pattern in all dependency managers, but it's only been getting attention for a handful of years with the focus on reproducibility for supply chain verification/security. It has the side effect of helping old projects keep working much longer though. Python has had multiple competing options for solutions, and only in the last couple of years did they pick a format winner.
If the dependency file is wrong, and describes versions that are incompatible with the project, it should be fixed. Duplicating that information elsewhere is wrong.
Lockfiles have a very obvious use case: Replicable builds across machines in CI. You want to ensure that all the builds in the farm are testing the same thing across multiple runs, and that new behaviors aren't introduced because numpy got revved in the middle of the process. When that collective testing process is over, the lockfile is discarded.
You should not use lockfiles as a "backup" to pyproject.toml. The version constraints in pyproject.toml should be correct. If you need to restrict to a single specific version, do so, "== 2.2.9" works fine.
Horses for courses.
Dependency files - whether the project's requirements (or optional requirements, or in the future, other arbitrary dependency groups) in `pyproject.toml`, or a list in a `requirements.txt` file (the filename here is actually arbitrary) - don't describe versions at all, in general. Their purpose is to describe what's needed to support the current code: its direct dependencies, with only as much restriction on versions as is required. The base assumption is that if a new version of a dependency comes out, it's still expected to work (unless a cap is set explicitly), and has a good chance of improving things in general (better UI, more performant, whatever). This is suitable for library development: when others will cite your code as a dependency, you avoid placing unnecessary restrictions on their environment.
Lockfiles are meant to describe the exact version of everything that should be in the environment to have exact reproducible behaviour (not just "working"), including transitive dependencies. The base assumption is that any change to anything in the environment introduces an unacceptable risk; this is the tested configuration. This is suitable for application development: your project is necessarily the end of the line, so you expect others to be maximally conservative in meeting your specific needs.
You could also take this as an application of Postel's Law.
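As a rough illustration (names and versions made up): a dependency declaration lists only the direct dependencies, loosely constrained,
[project]
dependencies = ["requests>=2.28", "click"]
while a lock pins everything that ends up in the environment, transitive dependencies included:
requests==2.32.3
certifi==2024.8.30
charset-normalizer==3.4.0
idna==3.10
urllib3==2.2.3
click==8.1.7
(plus, in a real lock file, hashes and other supply-chain metadata).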
>Lockfiles have a very obvious use case: Replicable builds across machines in CI.
There are others who'd like to replicate their builds: application developers who don't want to risk getting bug reports for problems that turn out to be caused by upstream updates.
> You should not use lockfiles as a "backup" to pyproject.toml. The version constraints in pyproject.toml should be correct. If you need to restrict to a single specific version, do so, "== 2.2.9" works fine.
In principle, if you need a lockfile, you aren't distributing a library package anyway. But the Python ecosystem is still geared around the idea that "applications" would be distributed the same way as libraries - as wheels on PyPI, which get set up in an environment, using the entry points specified in `pyproject.toml` to create executable wrappers. Pipx implements this (and rejects installation when no entry points are defined); but the installation will still ignore any `requirements.txt` file (again, the filename is arbitrary; but also, Pipx is delegating to Pip's ordinary library installation process, not passing `-r`).
You can pin every version in `pyproject.toml`. Your transitive dependencies still won't be pinned that way. You can explicitly pin those, if you've done the resolution. You still won't have hashes or any other supply-chain info in `pyproject.toml`, because there's nowhere to put it. (Previous suggestions of including actual lockfile data in `pyproject.toml` have been strongly rejected - IIRC, Hatch developer Ofek Lev was especially opposed to this.)
Perhaps in the post-PEP 751 future, this could change. PEP 751 specifies both a standard lockfile format (with all the sorts of metadata that various tools might want) and a standard filename (or at least filename pattern). A future version of Pipx could treat `pylock.toml` as the "compiled" version of the "source" dependencies in `pyproject.toml`, much like Pip (and other installers) treat `PKG-INFO` (in an sdist, or `METADATA` in a wheel) as the "compiled" version (dependency resolution notwithstanding!) of other metadata.
It is wrong to specify the versions for your transitive dependencies except to achieve reproducible builds, as in CI or other situations. If a dependency fails to correctly describe their requirements in their pyproject.toml it is a bug and should be fixed in the relevant upstream.
> There are others who'd like to replicate their builds: application developers who don't want to risk getting bug reports for problems that turn out to be caused by upstream updates.
If your application only works with specific versions of dependencies and has bugs in others, that should be described in pyproject.toml.
> In principle, if you need a lockfile, you aren't distributing a library package anyway.
It's irrelevant what kind of project you're describing, library, application, build-tooling, etc. Your pyproject.toml should contain the correct information for that project to run. If your project cannot run, contains bugs, or otherwise needs more specific information than is present in the pyproject.toml the answer is to fix the pyproject.toml.
Reproducible builds for the purpose of testing or producing hash-identical wheels, or similar situations where the goal is producing an exact snapshot of a build, is the only reason to be using lockfiles. None of those use cases aligns with, for example, tracking the lockfile in source control.
Yes, and this is why many people have both pyproject.toml and requirements.txt. pyproject.toml is meant to specify abstract, unresolved dependencies only.
>If your application only works with specific versions of dependencies and has bugs in others, that should be described in pyproject.toml.
That's quite literally not the design, if by "dependencies" you mean including transitive dependencies. pyproject.toml isn't there to enable reproducible builds. This is exactly why I included that speculation about the post-PEP751 future: currently we can "install applications", but with tools that aren't meant to handle exact application configurations.
> Reproducible builds for the purpose of testing or producing hash-identical wheels, or similar situations where the goal is producing an exact snapshot of a build, is the only reason to be using lockfiles.
Some application developers would say that, as far as they are concerned, the code "cannot run" except in the context of a reproducible build. If it works otherwise, that's an "upside", but they don't want to support it.
I think we're just going around in circles here.
I don't. Your transitive dependencies are not your problem, they are upstream's problem. Anything regarding version requirements of such packages belongs upstream.
> pyproject.toml isn't there to enable reproducible builds.
Agreed. The repository itself should not contain anything related to reproducible builds. Reproducible builds are a packaging concern not part of the source of the project itself. Ex, it would be appropriate to ship a lockfile alongside a wheel or a tarball that specifies it is the lockfile used to produce that particular build of the project; but both the wheel and the lockfile exist outside the context of the project itself.
> Some application developers would say that, as far as they are concerned, the code "cannot run" except in the context of a reproducible build.
Yes, they are wrong.
If your answer is "delete the venv and recreate it", what do you do when your code now has a bunch of errors it didn't have before?
If your answer is "ignore it", what do you do when you try to run the project on a new system and find half the imports are missing?
None of these problems are insurmountable of course. But they're niggling irritations. And of course they become a lot harder when you try to work with someone else's project, or come back to a project from a couple of years ago and find it doesn't work.
As someone with a similar approach (not using requirements.txt, but using all the basic tools and not using any kind of workflow tool or sophisticated package manager), I don't understand the question. I just have a workflow where this scenario doesn't really arise.
Why would the wrong venv be activated?
I activate a venv according to the project I'm currently working on. If the venv for my current code isn't active, it's because nothing is active. And I use my one global Pip through a wrapper, which (politely and tersely) bonks me if I don't have a virtual environment active. (Other users could rely on the distro bonking them, assuming Python>=3.11. But my global Pip is actually the Pipx-vendored one, so I protect myself from installing into its environment.)
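(For illustration, a minimal sketch of such a wrapper - not the actual script - that refuses to run unless a venv is active and then delegates to a single global pip via its `--python` option:)
#!/usr/bin/env python
"""A 'bonking' pip wrapper: refuse to install unless a venv is active."""
import os
import subprocess
import sys

venv = os.environ.get("VIRTUAL_ENV")
if not venv:
    sys.exit("bonk: activate a virtual environment first")

# One globally installed pip serves every venv; --python points it at the
# active environment, so the venv itself never needs its own copy of pip.
target = os.path.join(venv, "bin", "python")
sys.exit(subprocess.call(["pip", "--python", target, *sys.argv[1:]]))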
You might as well be asking Poetry or uv users: "what do you do when you 'accidentally' manually copy another project's pyproject.toml over the current one and then try to update?" I'm pretty sure they won't be able to protect you from that.
>If your answer is "delete the venv and recreate it", what do you do when your code now has a bunch of errors it didn't have before?
If it did somehow happen, that would be the approach - but the code simply wouldn't have those errors. Because that venv has its own up-to-date listing of requirements; so when I recreated the venv, it would naturally just contain what it needs to. If the listing were somehow out of date, I would have to fix that anyway, and this would be a prompt to do so. Do tools like Poetry and uv scan my source code and somehow figure out what dependencies (and versions) I need? If not, I'm not any further behind here.
>And of course they become a lot harder when you try to work with someone else's project, or come back to a project from a couple of years ago and find it doesn't work.
I spent this morning exploring ways to install Pip 0.2 in a Python 2.7 virtual environment, "cleanly" (i.e. without directly editing/moving/copying stuff) starting from scratch with system Python 3.12. (It can't be done directly, for a variety of reasons; the simplest approach is to let a specific version of `virtualenv` make the environment with an "up-to-date" 20.3.4 Pip bootstrap, and then have that Pip downgrade itself.)
I can deal with someone else's (or past me's) requirements.txt being a little wonky.
Because when you activate a venv in a given terminal window it stays active until you deliberately deactivate it, and one terminal and one venv looks much like another.
> I activate a venv according to the project I'm currently working on.
So just manual discipline? It works (most of the time), but in my experience there's a "discipline budget"; every little niggle you have to worry about manually saps your ability to think about the actual business problem.
> "what do you do when you 'accidentally' manually copy another project's pyproject.toml over the current one and then try to update?" I'm pretty sure they won't be able to protect you from that.
Copying pyproject.toml is a lot less routine than changing directories in a terminal window. But if I did that I'd just git checkout/revert to the original version.
> the code simply wouldn't have those errors. Because that venv has its own up-to-date listing of requirements; so when I recreated the venv, it would naturally just contain what it needs to.
So how do you ensure that? pip dependency resolution is nondeterministic, dependency versions aren't locked by default and even if you lock the versions of your immediate dependencies, the versions of your transitive dependencies are still unlocked.
> If the listing were somehow out of date, I would have to fix that anyway, and this would be a prompt to do so.
Flagging up outdated dependencies can be helpful, but getting forced to update while you're in the middle of working on a feature (or maybe even working on a different project) is rather less so. Especially since you don't know what you're updating - the old versions were in the venv you just clobbered and then deleted, so you don't know which dependency is causing the error and you've got no way to bisect versions to find out when a change happened.
> Do tools like Poetry and uv scan my source code and somehow figure out what dependencies (and versions) I need? If not, I'm not any further behind here.
uv has deterministic dependency resolution with a lock file that, crucially, it uses by default without you needing to do anything. So if you wiped out your cache or something (or even switched to a new computer) you get the same dependency versions you had before. There's no venv to clobber in the first place because you're not activating environments and installing dependencies - when you "uv run myproject", it resolves the dependencies you listed in pyproject.toml, and there's no intermediate non-version-controlled thing to get out of sync and cause confusion. (I mean, maybe there is a virtualenv somewhere, but if so it's transparent to me as a user.)
> I spent this morning exploring ways to install Pip 0.2 in a Python 2.7 virtual environment, "cleanly" (i.e. without directly editing/moving/copying stuff) starting from scratch with system Python 3.12. (It can't be done directly, for a variety of reasons; the simplest approach is to let a specific version of `virtualenv` make the environment with an "up-to-date" 20.3.4 Pip bootstrap, and then have that Pip downgrade itself.)
Putting pip inside Python was dumb and is another pitfall uv avoids/fixes.
>...but getting forced to update while you're in the middle of working on a feature...
I feel like trying to work on more than one project in the same session would require more such discipline.
>So how do you ensure that? pip dependency resolution is nondeterministic, dependency versions aren't locked by default and even if you lock the versions of your immediate dependencies, the versions of your transitive dependencies are still unlocked.
Ah, so this is really about lock files. I primarily develop libraries; if something breaks this way, I want to find out about it as soon as possible, so that I can advertise correct dependency ranges to my downstream.
The requirements.txt approach does, of course, allow you to list transitive dependencies explicitly, and pin everything. It's not a proper lock file (in the sense that it says nothing about supply chains, hashes etc.) but it does mean you get predictable versions of everything from PyPI (assuming your platform doesn't somehow change).
If I needed proper lock files, then I would take an approach that involves them, yes. Fortunately, it looks like I'd be able to take advantage of the PEP 751 standard if and when I need that.
>Putting pip inside Python was dumb and is another pitfall uv avoids/fixes.
Agreed completely! (Of course I was only using a venv so that I could have a separate, parallel version of Pip for testing.) Rather, the Pip bootstrapping system (which you can completely skip now, thanks to the `--python` hack) is dumb, along with all the other nonsense it's enabled (such as other programs trying to use Pip programmatically without a proper API, and without declaring it as a dependency; and such as empowering the Pip team to go so long without even as functional of a solution as `--python`; and such as making lots of people think that Python venv creation has to be much slower than it really does).
I'll be fixing this with Paper, too, of course.
We all know that multitasking reduces productivity. But business often demands it (hopefully while being conscious of what it's costing).
You also don't have to be working in the "same session" to trip yourself up this way - "this terminal tab still has the venv from what I was working on yesterday/last week" is a way I've had it happen.
> I primarily develop libraries; if something breaks this way, I want to find out about it as soon as possible, so that I can advertise correct dependency ranges to my downstream.
If you want to find out as soon as possible, better to have a systematic way of finding out (e.g. a daily "edge build") than pick up new dependencies essentially at random.
> The requirements.txt approach does, of course, allow you to list transitive dependencies explicitly, and pin everything.
It allows you to, but it doesn't make it easy or natural. Especially if you're making a library, you probably don't want to list all your transitive dependencies or pin exact versions in your requirements.txt (at least not the one you're publishing). So you end up with something like two different requirements.txt where you use a frozen one for development and then switch to an unfrozen one for release or when you need to add or change dependencies, and regenerate the frozen one every so often. None of which is impossible, but it's all tedious and error-prone and there's no real standardisation (so e.g. even if you come up with a good workflow for your project, will your IDE understand it?).
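(For concreteness, that dance tends to look something like the following, with the filenames being whatever convention you've picked:)
$ pip install -r requirements-loose.txt   # direct deps, minimal constraints
$ pip freeze > requirements-frozen.txt    # everything in the env, exactly pinned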
> Fortunately, it looks like I'd be able to take advantage of the PEP 751 standard if and when I need that.
That's a standard written in response to the rise of uv, that still hasn't been agreed to, much less implemented, much less turned on by default (and unfortunately most of the time when you realise you need a lock file, you need the lock file that the first run of your tool would have generated when it was run, not the lock file it would generate now - so an optional lock file is of limited effectiveness). I don't think it justifies a "python packaging has never been a problem" stance - quite the opposite, it's an acknowledgement that pre-uv python packaging really was as broken as many of us were saying.
I mean, my "IDE" is Vim, and I'm not even a Vim power-user or anything.
People gravitate towards tools according to their needs and preferences. My own needs are simple, and my aesthetic sense is such that I strongly prefer to use many small tools instead of an opinionated, over-arching workflow tool. Getting into the details probably isn't productive any further from here.
>That's a standard written in response to the rise of uv
I know it looks this way given the timing, but I really don't think that's accurate. Python packaging discussion moves slowly and people have been talking about lock files for a long time. PEP 751 has seen multiple iterations, and it's not the first attempt, either. When uv first appeared, a lot of important people were taken completely by surprise; they hadn't heard of the project at all. My impression is that the Astral team liked it just fine that way, too. But it's not as if someone like Brett Cannon had an epiphany from seeing uv's approach. Poetry has been doing its own lock files for years.
>so an optional lock file is of limited effectiveness
The problem is that you aren't going to just get everyone to do everything "professionally". Python is where it is because of the low barrier to entry. A quite large fraction of Python programmers likely still don't even know what pyproject.toml is.
>I don't think it justifies a "python packaging has never been a problem" stance
That's certainly not my stance and I don't think it's the other guy's stance. I just shy away from heavyweight solutions on principle. Simple is better than complex, and all that. And I end up noticing problems that others don't, this way.
You can solve this with constraints, pip-tools etc., but the argument is that uv does this better.
(Of course, the alternative—"install this software you've never heard of"—isn't fantastic either. But once they do have it, it'd be pretty neat to be able to tell them to just "uvx <whatever>".)
Or you can make sure you have an entry point - probably a better UX for your coworkers anyway - and run them through a `pipx` install.
Or you could supply your own Bash script or whatever.
Or you could use a simple packager like pex (https://docs.pex-tool.org/). (That one even allows you to embed a Python executable, if you need to and if you don't have to worry about different platforms.) Maybe even the standard library `zipapp` works for your needs.
> All those commands update the lock file automatically and transparently.... It's all taken care of.
When is the python community going to realize that simple is the opposite of easy? I don't see how hiding these aspects is desirable at all; I want to know how my programming tools work!
With all due respect to the author, I don't like the assumption that all programmers want magic tools that hide everything under the rug. Some programmers still prefer simplicity, i.e. understanding exactly what every part of the system does.
Nothing against uv, it seems like a fine tool. And I'm sure one could make a case for it on other technical merits. But choosing it specifically to avoid critical thinking is self-defeating.
Keep in mind: there are huge numbers of people out there who will cargo-cult about how applying the sudo hammer to Pip fixed something or other (generally, because the root user has different environment variables). People even resent having to do user-level installations; they resent venvs even more. When the Python team collaborated with multiple Linux distros to add a system to protect against global user-level installs (because they were still interfering with system Python tools), a lot of people reacted by doing whatever they could to circumvent that protection, and advising each other on how to do so - thus the education effort described in https://discuss.python.org/t/the-most-popular-advice-on-the-... . People really would, apparently, rather add `--break-system-packages` to a command line so that they can keep installing everything in the same place, than attempt to understand even the basics of environment management. And we're talking about programmers here, mind.
And then there are the complaints about how the __pypackages__ proposal (https://peps.python.org/pep-0582/) failed - e.g. https://chriswarrick.com/blog/2023/01/15/how-to-improve-pyth... . There were serious issues with that idea, which only became clear as the discussion dragged on and on across literally years (https://discuss.python.org/t/pep-582-python-local-packages-d...). But people were quite upset about having to stick with the old venv model - including the guy who wrote the best explanation of venvs I know, which I frequently refer beginners to (https://chriswarrick.com/blog/2018/09/04/python-virtual-envi...).
A lot of programmers seem to love having the details not only hidden, but as inaccessible as possible, as long as the UI is nice enough. (Unless we're talking about their own code. Then, hundred-line functions are just hunky-dory.)
All that said, I'm pretty skeptical of using uv until their monetization strategy is clear. The current setup is making me think we're in for a Docker-like license change.
In other words, it is a nice frontend to hide the mess that is the Python packaging ecosystem, but the mess of an ecosystem is still there, and you still have to deal with it. You'll still have to go through hatchling's docs to figure out how to do x/y/z. You'll still have to switch from hatchling to flit/pdm/setuptools/... if you run into a limitation of hatchling. As a package author, you're never using uv, you're using uv+hatchling (or uv+something) and a big part of your pyproject.toml are not uv's configuration, it is hatchling configuration.
I'm sticking with Poetry for now, which has a more streamlined workflow. Things work together. Every Poetry project uses the same configuration syntax (there are no Poetry+X and Poetry+Y projects). Issues in Poetry can be fixed by Poetry rather than having to work with the backend.
I understand that uv is still young and I am sure this will improve. Maybe they'll even pick a specific backend and put a halt to this. But of course Poetry might catch up before then.
What limitations have you personally experienced?
Strange. The main reason I'm not using uv or its competitors (or going back to Poetry, where I was using only a tiny fraction of the functionality) is that they're all too opinionated and all-in-one for me. If I had to write down a top 10 of reasons the Python packaging ecosystem is a "mess", and give detailed reasoning, probably at least 6 of them would be problems with Pip specifically (including things that appear to be problems in other tools, but which are really Pip's fault). And at least one more would be "Setuptools is a massive pile of backwards-compatibility wrappers that are mostly useless with the modern packaging flow, that didn't even directly make the wheels (relying on a separate dependency instead) until 70.1".
(The other reason I wouldn't go back to Poetry is because Masonry was a terrible, non-standards-compliant experience for me, and its installation procedures changed repeatedly over time, and you'd end up not being able to uninstall old versions cleanly.)
(Hatchling and Setuptools are build backends, yes; where you say "flit" I assume you mean flit-core, and similarly pdm-backend for PDM.)
If uv provided its own backend, there would still be a risk of running into limitations with that backend; and regardless you'd have to go through its docs to figure out how to do x/y/z with it.
Being able to experiment with different build backends was part of the explicit rationale for `pyproject.toml` in the first place. The authors of PEP 517 and 518 consciously expected to see competing backends pop up (and explicitly designed a system that would allow that competition, and allow for Pip etc. to know how to invoke every backend), and did not (per my understanding of Python Discourse forum discussion) consciously expect to see competing workflow tools pop up.
I'm making a build backend because I have my own opinion. I don't have any interest in making a workflow tool. I want people who like uv to be able to use my build backend. That's how the system was designed to work.
I don't want someone to integrate the baseline and make all the decisions for me. I want a better-quality baseline. Which is why I'm also making an installer/environment manager. `build` is a perfectly fine build frontend. I don't need or want a replacement.
This is my only gripe with uv, despite how the author decided to depict it, this really turns into a headache fast as soon as you have ~4-5 in-house packages.
I don't think it's that bad that uv is so unforgiving in those cases, because it leads to better overall project quality/cohesion, but I wish there was a way to onboard more progressively and downgrade minor version mismatches to warnings.
[1] https://github.com/python-poetry/poetry/issues/697
[2] https://docs.astral.sh/uv/concepts/resolution/#dependency-ov...
[3] https://docs.astral.sh/uv/reference/settings/#override-depen...
Take for example the werkzeug package that released a breaking API regression in a patch release version. It didn't affect everyone, but notably did affect certain Flask(?) use cases that used werkzeug as a dependency. In a sane system, either werkzeug immediately removes the last released version as buggy (and optionally re-releases it as a non-backwards compatible SemVer change), or everyone starts looking for an alternative to non-compliant werkzeug. Pragmatically though, Python dependency specification syntax should have a way for Flask to specify, in a patch release of its own, that werkzeug up to the next minor version, _but excluding a specific range of patch versions_, is a dependency. Allowing them to monkey patch the problem in the short term.
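(For what it's worth, PEP 440 specifiers can already express that kind of carve-out; a quick check with the `packaging` library, version numbers made up:)
from packaging.specifiers import SpecifierSet

# "any 2.2.x patch release except the known-bad one, and nothing from 2.3 on"
spec = SpecifierSet(">=2.2,<2.3,!=2.2.2")
print("2.2.1" in spec)  # True
print("2.2.2" in spec)  # False - the excluded patch release
print("2.3.0" in spec)  # False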
It should never be on the end user to be specifying overrides of indirect dependency specifications at the top level though, which is what was requested from the poetry tool.
npm and yarn both let you do it. PDM and uv think about it differently, but both allow overrides.
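(In uv's case, per the settings link in [3] above, the override lives in pyproject.toml - roughly like this, with the pin made up:)
[tool.uv]
# force this version of a transitive dependency, regardless of what
# intermediate packages declare
override-dependencies = ["werkzeug==2.2.2"]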
> It should never be on the end user to be specifying overrides of indirect dependency specifications at the top level though, which is what was requested from the poetry tool.
I'm jealous of your upstreams. I just want to use Django package XYZ that says it's only compatible with Django 3.X on Django 4. Works just fine, but poetry won't let it happen. Upstream seems like they might literally be dead in some cases, with an unmerged unanswered PR open for years. In other cases a PR was merged but no new PyPI release was ever made because I allowed for more liberal requirements for a 0.7.X release last made in 2019 and they're on version 4.X or whatever these days.
On one decade old application I have a half dozen forks of old packages with only alterations to dependency specifications specifically to please poetry. It's really annoying as opposed to just being able to say "I know better than what this package says" like in npm and yarn.
This is exactly what half the comments in poetry's "please allow overrides" issue are saying.
Indeed. One of the biggest recurring themes I've seen in Python packaging discussion is that package metadata can't be updated after the fact. When you publish something that depends on the just-released foolib N, you don't know if it will be compatible with foolib N+1 (you might not even be completely sure it will work with N.0.1) because it doesn't even exist yet so you can't possibly test it. In other languages, where the environment can contain multiple versions of the same library, it's common to assume it won't. For Python, there are lots of good reasons to assume it will (https://iscinumpy.dev/post/bound-version-constraints/), but people will blame you when you're wrong and the fix isn't straightforward. (AIUI, the best you can do is make a .post1 release with the updated metadata, and "yank" the existing release.)
It's what I most miss about it.
Package com.foo.Something pins a dependency on crap.bollocks.SomethingElse v1.1.0?
But I want to use crap.bollocks.SomethingElse v1.1.5? And I know that they're compatible?
Then I can configure a dependency exclusion.
I really really miss this feature in every non-JVM build tool.
It's another one of those things that the JVM ecosystem did right that everyone else forgot to copy.
(The other massive one being packages having verifiable namespaces. Can't really typosquat Guava because its namespace is com.google and they can prove it.)
You can do it with [patch] in cargo (I think), or .exclude in SBT. In Maven you can use <dependencyManagement>. In fact I can't think of a package manager that doesn't support it, it's something I'd always expect to be possible.
> Point blank, that's a packaging failure and the solution is, and always has been, to immediately yank the offending package.
Be that as it may, PyPI won't.
> It should never be on the end user to be specifying overrides of indirect dependency specifications at the top level though
It "shouldn't", but sometimes the user will find themselves in that situation. The only choice is whether you give them the tools to work around it or you don't.
What’s a typical or better way of handling in-house packages?
> What’s a typical or better way of handling in-house packages?
Fixing your dependencies properly, but on some older codebases that also pull in old dependencies this can be a headache.
For example "Pillow" a Python image library is a dependency in just about everything that manipulates images. This means that one package might have >=9.6<9.7, some package will have ==9.8 and another will have >=10<11. In practice it never matters and any of those version would work but you have a "version deadlock" and now you need to bump the version in packages that you may not actually own. Having some override of "this project uses Pillow==10, if some package ask for something else, ignore it" is something that pip does that uv doesn't.
Astral is a great team, they built the ruff linter and are currently working on a static type checker called red-knot: https://x.com/charliermarsh/status/1884651482009477368
Python has come an immensely long way in the world of packaging, the modern era of PEP 517/518 and the tooling that has come along with it is a game changer. There are very few language communities as old as Python with packaging ecosystems this healthy.
I've had conversations with members of SG15, the C++ tooling subgroup, where Python's packaging ecosystem and interfaces are looked on enviously as systems to steal ideas from.
EDIT - 'uv run nvim' works also
I have installed miniconda system-wide. For any Python package that I use a lot, I install it in the base environment, and in other environments as well - ipython, for example.
For every new project, I create a conda environment, and install everything in it. Upon finishing/writing my patch, I remove that environment and clean the caches. For my own projects, I create an environment.yaml and move on.
Everything works just fine. Now, the solving with mamba is fast. I can just hand someone the code and environment.yaml, and it runs on other platforms.
Can someone say why using uv is a good idea? Has anyone written a migration guide for such use cases?
I am mightily impressed by one line dependency declaration in a file. But I don't know (yet) where the caches are stored, how to get rid of them later, etc.
I have yet to learn uv, but I intend to. Still, having to ".venv/bin/activate" to activate the virtualenv is a lot less ergonomic than "pipenv shell".
There is a request for `uv shell` or similar[0], but it's trickier than it looks, and even poetry gave up `poetry shell` in their recent 2.0 release.
layout python python3.12
pip install --upgrade pip
python -m pip install -r requirements.txt
Looks like direnv can be extended to use uv:
Also, `nvim` is started with an environment activated if you want all the LSP goodies.
`uv run` is good for some things, but I prefer to have my venv activated as well.
doit, poethepoet, just... They are simpler than build tools like make or Maven, and more convenient than aliases.
E.g.: I don't run ./manage.py runserver 0.0.0.0:7777, I run "just openserver".
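(The corresponding justfile recipe is about as small as it gets - roughly:)
openserver:
    ./manage.py runserver 0.0.0.0:7777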
Poetry, cargo and npm have support for this natively, and there is an open ticket for this in uv too.
So you would not do "uv run manage.py runserver" but "uv serve".
But still, it's not good enough for Django, as there are too many management commands and I don't want to configure them in the pyproject.toml file, especially since some of them take additional arguments... There is no point in using anything but the django-admin command (I do have a wrapper around it, but the point remains), and that requires an activated venv.
It does seem like people have use cases for running code in a different environment vs. the one being actively used to develop the package.
You can also force an env by passing a complete path to --python
Unfortunately, Hermit doesn't do Windows, although I'm pretty sure that's because the devs don't have Windows machines: PRs welcome.
I agree it isn’t the best use of Docker, but with the hell that is conda (and I say this as someone who likes conda more than most other options) and what can feel like insanity managing python environments, Docker isn’t the worst solution.
All that said, I moved to uv last year and have been loving it.
Of course, I migrated from it after I learned uv.
Yes (unless you use uv in your Dockerfile). I mean, a Docker container will freeze one set of dependencies, but as soon as you change one dependency you've got to run your Dockerfile again and will end up with completely different versions of all your transitive dependencies.
For running containers, pip is the best way to go, just to keep dependency requirements to a minimum.
Use both.
That being said, UV is great.
If your purpose is to denigrate Python as a language, then uv isn't solving problems for you anyway. But I will say that the kind of evangelism you're doing here is counterproductive, and is the exact sort of thing I'd point to when trying to explain why the project of integrating Rust code into the Linux kernel has been so tumultuous.
One immediate speed-up that requires no code changes: when uv creates a venv, it doesn't have to install Pip in that venv. You can trivially pass `--without-pip` to the standard library venv to do this manually. On my system:
$ time uv venv uv-test
Using CPython 3.12.3 interpreter at: /usr/bin/python
Creating virtual environment at: uv-test
Activate with: source uv-test/bin/activate
real 0m0.106s
user 0m0.046s
sys 0m0.021s
$ time python -m venv --without-pip venv-test
real 0m0.053s
user 0m0.044s
sys 0m0.009s
For comparison:
$ time python -m venv venv-test
real 0m3.308s
user 0m3.031s
sys 0m0.234s
(which is around twice as long as Pip actually takes to install itself; I plan to investigate this in more detail for a future blog post.)
To install in this environment, I use a globally installed pip (actually the one vendored by pipx), simply passing the `--python` argument to tell it which venv to install into. I have a few simple wrappers around this; see https://zahlman.github.io/posts/2025/01/07/python-packaging-... for details.
In my own project, Paper, I see the potential for many immediate wins. In particular, Pip's caching strategy is atrocious. It's only intended to avoid the cost of actually hitting the Internet, and basically simulates an Internet connection to its own file-database cache in order to reuse code paths. Every time it installs from this cache, it has to parse some saved HTTP-session artifacts to get the actual wheel file, unpack the wheel into the new environment, generate script wrappers etc. (It also eagerly pre-compiles everything to .pyc files in the install directory, which really isn't necessary a lot of the time.) Whereas it could just take an existing unpacked cache and hard-link everything into the new environment.
Feel like that's super-obvious but yet here we are.
Actix is just one of many web frameworks; minijinja is an implementation of jinja2, by the original author.
(And many of them are completely fake, anyway. You don't actually need to spend an extra 15MB of space, and however many seconds of creation time, on a separate copy of Pip in each venv just so that Pip can install into that venv. You just need the `--python` flag. Which is a hack, but an effective one.)
(Last I checked, the uv compiled binary is something like 35MB. So sticking with a properly maintained Pip cuts down on that. And Pip is horrendously bloated, as Python code goes, especially if you only have the common use cases.)
Things like PyPI sources per dependency are finally there.
I still find rough points (as many others pointed out, especially with non sandboxed installs), that are problematic, but on the whole it’s better than Mamba for my use.
It just hit its 12 month birthday a few days ago and has evolved a LOT on those past 12 months. One of the problems I ran into with it was patched out within days of me first hitting it. https://simonwillison.net/2024/Nov/8/uv/
The latest release is at 0.6.1, what is missing (roadmap/timeline wise) for uv to exist as a 1.0 release?
> If you're getting started with Rye, consider uv, the successor project from the same maintainers.
> While Rye is actively maintained, uv offers a more stable and feature-complete experience, and is the recommended choice for new projects.
It also links to https://github.com/astral-sh/rye/discussions/1342.
So it’ll get rolled into UV
If you mean creating and publishing packages to PyPI end users can't tell if you used uv or poetry or something else.
I'm extremely excited about the ruff type checker too.
Otherwise, it works pretty much like Poetry. Unfortunately, Poetry's pyproject.toml isn't standards-compliant, so you'll have to rewrite it. There are tools for this; I never bothered with them though.
For instance, from a personal project that uses a src layout, without being a package, I have this in my pyproject.toml:
[tool.poetry]
...
packages = [{ include = "*", from = "src", format = "sdist" }]
...
[tool.poetry.scripts]
botch = "launcher:run_bot('botch')"
beat = "launcher:run_bot('beat')"
I can't find any way to get that working in uv without some pretty major refactoring of my internal structure and import declarations. Maybe I've accidentally cornered myself in a terrible and ill-advised structure?
I don't know what you mean by "without being a package". I guess you mean that Poetry will also run your code without installing it anywhere, and use the marked entry points directly. Per the other replies, apparently uv will get an equivalent soon. But really, the point of having `pyproject.toml` in the first place is to explain how to build an installable wheel and/or sdist for your project, and you basically get it for free. (Even if you don't include a [build-system] table, the standards say to use Setuptools by default anyway.)
(Of course, uv can already `run` your entry point, but this involves installing the code in a temporary venv.)
> without some pretty major refactoring of my internal structure and import declarations. Maybe I've accidentally cornered myself in a terrible and ill-advised structure?
Possibly. Is the code up on GitHub? I could take a look.
I'm planning on asking on the Astral Discord server once I have some time to set aside to it.
By "without it being a package", I mean that I don't have `src/foo`, which has `src/foo/__main__.py`, but e.g. `src/main.py`.
This doesn't matter. What does matter, though, is that [project.scripts] doesn't support passing arguments - the entry point is only specified as path.to.module:function, which is expected not to take any arguments (although of course it can read `sys.argv`, which will be forwarded from the wrapper). (You can, however, specify any callable, not just a function; and it could be a nested attribute of some other object.)
Relevant documentation:
https://packaging.python.org/en/latest/guides/writing-pyproj...
As far as I'm aware, there's no formal name for this syntax, and only indirect documentation of what's supported - via Setuptools, which originated it:
https://setuptools.pypa.io/en/latest/userguide/entry_point.h...
You could fix this by just making functions that hard-code the name (or e.g. use `functools.partial` to get what you want). With that properly set up, any standards-compliant build backend will put the needed metadata into your wheel, and Pip and uv will both read that metadata and create the necessary wrapper when installing. (This is "recommended" behaviour for installers in the wheel standard, but not required IIRC.)
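So, a sketch of that workaround (assuming the bot-launching logic lives in main.py; names are borrowed from this example and otherwise hypothetical):
# main.py
from functools import partial

def run_bot(name):
    ...  # start the named bot

# Module-level callables that [project.scripts] entries can point at directly,
# e.g. botch = "main:run_botch_bot" and beat = "main:run_beat_bot".
run_botch_bot = partial(run_bot, "botch")
run_beat_bot = partial(run_bot, "beat")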
Then for example, on Linux, you end up with a wrapper script in the environment's bin folder, which looks like:
#!/path/to/.venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from main import run_botch_bot
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(run_botch_bot())
(On Windows, IIRC you get cookie-cutter compiled executables that read their own filename and use it to do the right thing, perhaps with a Python shim to do the import or something.)
(Perhaps you'd be interested in raising the documentation issue on https://github.com/pypa/packaging.python.org/issues ?)
[tool.hatch.build.targets.wheel]
force-include = {"src/" = "/"}
With those three things done, I'm in business. Thanks!
[project.scripts]
hello = "example:hello"
Assuming you have src/example.py with a function called hello, then "uv run hello" will call that function. I think you also need to have an (empty) src/__init__.py file.
I just reviewed uv for my team and there is one more reason against it, which isn't negligible for production-grade projects: GitHub Dependabot doesn't handle (yet) the uv lock file. Supply chain management and vulnerability detection is such an important thing that it prevents the use of uv until this is resolved (the open GitHub issue mentions the first quarter of 2025!).
With uv, there is now one more.
What is CS?
1. Computer Science
2. Customer Service
3. Clinical Services
4. Czech
5. Citrate synthase
6. Extension for C# files
It looks to me like every new minor Python release is a separate additional install, because realistically you cannot replace Python 3.11 with Python 3.12 and expect things to work. How did they put themselves in such a mess?
Maybe we need a Geographic Names Board to deconflict open source project names, or at least the ones that are only two or three characters long.