Create a generic wheel-building service #19
It's not clear to me what this is about. Can you do a short write-up here? |
I'll take a stab at it: The idea would be for PyPI or the PSF to provide a public-good service which allows users to upload a source distribution and have wheels automatically built for specific platforms and architectures. This would allow users to easily target a wide range of built distributions, and would also let us provide a 'canonical' way to build wheels, e.g. with the latest standards and best practices, without having to configure complicated CI pipelines for every individual project. The service could automatically publish what it builds to PyPI on behalf of the maintainers. This would be a fairly complex project because, on the surface, it's extremely similar to a CI service like GitHub Actions. It would require:
A 'fundable' version of this would probably be an MVP that likely wouldn't include more esoteric architectures, and might cover only one platform (i.e. the most popular one). It also might support little or no configuration of the build, so it wouldn't initially support projects with more specific needs in their build environments. I think, at a minimum, it would be able to turn a source distribution into a pure-Python wheel, and publish it. |
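To make the "sdist in, pure-Python wheel out" minimum a bit more concrete, here is a minimal sketch, assuming the service simply shells out to `pip wheel`. The helper names `wheel_build_command` and `is_pure_python_wheel` are hypothetical and not part of any proposal in this thread; the pure-Python check relies only on the standard wheel filename layout, where the last three dash-separated fields are the Python, ABI, and platform tags.

```python
import sys
from pathlib import Path


def wheel_build_command(sdist: Path, out_dir: Path) -> list[str]:
    """Build the pip invocation that turns one sdist into a wheel.

    Hypothetical helper: a real service would also need sandboxing,
    timeouts, and a publishing step, none of which are shown here.
    """
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-deps",                  # build only this project, not its dependencies
        "--wheel-dir", str(out_dir),  # where pip writes the resulting .whl
        str(sdist),
    ]


def is_pure_python_wheel(filename: str) -> bool:
    """Return True when the ABI tag is 'none' and the platform tag is 'any'."""
    stem = filename.removesuffix(".whl")
    # Wheel names end in {python tag}-{abi tag}-{platform tag}.
    _python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return abi_tag == "none" and platform_tag == "any"
```

The command list could be passed to `subprocess.run(..., check=True)`, and an MVP could refuse to publish anything for which `is_pure_python_wheel` returns False. Publishing itself is deliberately left out of the sketch.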
Excellent idea, @di! I have been proposing a similar service for a while. :) Instead of allowing users to upload a source distro, I would rather start a step earlier. My idea for an MVP looks like this
The steps assume that the project uses a standard layout with a well-configured pyproject.toml in its root directory. This MVP would simplify most build steps for pure Python projects and would not require many resources on our side. Binary wheels are typically more complicated to build because most projects need additional dependencies and custom build environments. |
Honest question: does this actually require a separate service? Wouldn't documenting the procedure with existing CI services suffice? Case in point: https://github.com/agronholm/cbor2/blob/master/.github/workflows/publish.yml |
IMO, extending this to build source distributions as well should absolutely be a goal here, but having to integrate against source repositories doesn't sound like an MVP to me 🙂 (also at that point it's not really a 'generic wheel-building service') I also think the 'build a wheel from a source distribution uploaded to PyPI' flow is going to be important for projects that have upstream repos we don't support, or that have no upstream repos at all.
It certainly doesn't require it, and I expect that if this existed there would still be strong reasons to use existing CI services, but running a separate service that the PSF/PyPI owns and operates has some advantages:
I think it's likely that such a service would leverage some or all of |
It would still require CI changes to leverage this service, yes?
I have a hard time believing this one. The number of possible platform/architecture combinations is pretty mind blowing, and building every possible combination for every release is going to be a massive resource drain.
I'm confused – are you thinking of closed source projects? Because both GitHub and GitLab are free for F/OSS projects. Closed source projects, on the other hand, are not likely to send their source code to third parties.
Aren't all builds isolated with cibuildwheel? I don't know about the other two features as I have no clue what they are. |
Another possibility is that PyPI provides a service that communicates with services run by various platform providers, which perform the actual builds. For example, if Red Hat wanted to provide CentOS builds, PyPI would call an API on a service run by Red Hat that took the sdist, built the wheel, and sent it back to PyPI to use and display to the user. That way PyPI acts as the integration point while the various platform providers handle the actual building. The drawback is the question of who does this for the platforms that choose not to participate. Who does the manylinux builds? What about platforms that lack the funding to pull this off (e.g. does the FreeBSD Foundation have what it would take to participate if we opened up FreeBSD wheels)? But it could allow more platforms to participate when they do have the funds for this sort of thing. It also means we don't have to be the experts in the building. Now having said all of that, I still like Dustin's idea more. 🙂 We could get the platform vendors involved to donate what is necessary to use their services for the builds, but we would still ultimately control the service and code. It would also help share the knowledge required to make this work and increase transparency, which is critical for the wheels to be trusted. I will also say that Dustin's idea leads to reproducibility by providing all the details used to build the wheels. Once again, it's a security thing. Lastly, it can act as a 🥕 for projects to use modern packaging practices, as that's going to be the easiest way to make sure there isn't a ton of custom code for every project's special way of being built. |
Hey @agronholm, sort of sensing that you're feeling strongly opposed to this idea but I don't really have a good sense of why based on your comments. If it's for a reason other than "it would take a large investment of time/money" (which I think is acknowledged by the fact that we're considering this to be a project we'd need to seek funding for) then I'd love to hear what your hesitation is.
I think ideally this would be configured entirely through PyPI or via the service itself, depending on their relationship. Either it would consume the source distribution from PyPI when it gets published there, or it would integrate directly with the upstream repo like @tiran is describing.
You're absolutely right, which means that such a service would likely be limited to the combinations that are most important or that people care the most about -- I don't think it would ever achieve complete coverage and that shouldn't be a goal. Which means that the average user can probably depend on the service to build for all combinations the service supports without really having to think about what those are. And users that do need to build for esoteric combinations that the service doesn't support would need to do some portion of their build elsewhere anyway, but they could still use the service for the combinations it supports.
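As a rough illustration of that split, here is a small sketch under made-up assumptions: the Python tags, platforms, architectures, and the `SERVICE_SUPPORTED` set are all hypothetical and were not proposed anywhere in this thread. It enumerates a naive full matrix (including combinations that may not exist in practice) and partitions it into what a limited service would build and what a project would still build elsewhere.

```python
from itertools import product

# Hypothetical axes, chosen only to show how quickly the matrix grows.
PYTHONS = ["cp311", "cp312", "cp313"]
PLATFORMS = ["linux", "macos", "windows", "freebsd"]
ARCHS = ["x86_64", "aarch64", "ppc64le", "s390x"]

# Hypothetical (platform, arch) pairs that the service chooses to support.
SERVICE_SUPPORTED = {("linux", "x86_64"), ("linux", "aarch64")}


def split_matrix(matrix):
    """Partition (python, platform, arch) targets into service vs. elsewhere."""
    by_service = [t for t in matrix if (t[1], t[2]) in SERVICE_SUPPORTED]
    elsewhere = [t for t in matrix if (t[1], t[2]) not in SERVICE_SUPPORTED]
    return by_service, elsewhere


full_matrix = list(product(PYTHONS, PLATFORMS, ARCHS))
by_service, elsewhere = split_matrix(full_matrix)
print(len(full_matrix), len(by_service), len(elsewhere))
```

With these particular lists the naive matrix has 3 × 4 × 4 = 48 entries, of which the service would cover 6, so the cost of the service scales with the supported set rather than with the full product.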
This assumes end-users are blindly upgrading to the latest
I'm talking about an admittedly hypothetical situation where GitHub/GitLab become non-free or ineffective to use at the free tier à la Travis CI. At the end of the day, these are services offered by for-profit companies and their ultimate goals are not necessarily aligned with offering free compute to OSS developers forever, whereas this is very much in line with the mission and goals of the PSF and PyPI.
They are isolated in the sense that the underlying CI job is isolated, but not in the sense that each individual build is guaranteed to be isolated -- a user might invoke cibuildwheel multiple times in a single CI job, e.g. in the case where a given source repo ships multiple distinct projects to PyPI. Not super common but it does exist. By 'build provenance' I'm referring to the build environment providing a non-forgeable and cryptographically signed accounting of exactly what went into the build and what commands were run as part of it -- it's fair though to say that some of the environment providers that users currently use are also working on providing this right now. By 'attestations' I'm talking about something like https://in-toto.io/, which, again, some (but not all) of the build environments are working to provide by default, and which users can also run manually themselves. |
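For anyone who wants a feel for what 'a signed accounting of what went into the build' could mean, here is a toy sketch using only the standard library. It is not the in-toto format and not what any existing build environment emits: the record fields and the function names `make_provenance` and `verify_provenance` are hypothetical, and HMAC with a shared key is a stand-in for the asymmetric signatures a real system would use.

```python
import hashlib
import hmac
import json


def make_provenance(sdist_bytes: bytes, commands: list[str], key: bytes) -> dict:
    """Record the input hash and the commands run, then sign the record."""
    record = {
        "sdist_sha256": hashlib.sha256(sdist_bytes).hexdigest(),
        "commands": commands,
    }
    # Canonical JSON so the same record always serializes to the same bytes.
    payload = json.dumps(record, sort_keys=True).encode()
    # Assumption: HMAC stands in for a real (asymmetric) signature scheme.
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_provenance(statement: dict, key: bytes) -> bool:
    """Recompute the signature over the record and compare in constant time."""
    payload = json.dumps(statement["record"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, statement["signature"])
```

Changing either the recorded hash or the command list after signing makes `verify_provenance` return False, which is the property the 'non-forgeable' wording above is after.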
Also, for anyone not familiar with this repo: this issue is about updating our list of projects which could use funding with an entry about the idea in question, not about actually implementing it. So all we're trying to decide here is if this is a) something we want b) something sufficiently large that it requires funding c) something sufficiently small that it's actually reasonable to fund and d) something that potential funders would be interested in, all with the assumption that much more design work would go into this at the point where someone has expressed interest in funding it. |
I'm just bringing up these questions to determine if this project is feasible, and what its actual goals are. I've had this same idea for a long time but then I figured it would never pan out due to the massive computing resource requirements and scalability challenges, so I'm curious to learn what the plan would be. |
Cool, thanks for kicking off the discussion and raising the points, they're totally valid. |
You all might look into the experience of the conda-forge community around this, which has automated builds of thousands of Python packages and their C/C++ binary dependencies (including whole compiler toolchains) for multiple platforms and architectures, relying on public CI services for the automation: |
Thanks @wesm! I'm aware of conda-forge, but not sure if other folks here are. I think one big difference between it and what's being proposed here is the reliance on public CI services, but otherwise we probably have something to learn from them if we go down this path. |
I think a generic wheel building service is a great idea that could help address the often-unintentional differences between source code repositories and the artefacts on PyPI. Furthermore, this will be an excellent basis for increased supply chain security in the Python ecosystem, i.e. providing additional/different TUF integration points than those proposed in PEP 458 and PEP 480. |
+1 on this idea! I also think it would be relatively easy to add all the bells and whistles (at their respective time):
Great idea, @di, long time coming, and excited to see traction building on this! Agree with both @joshuagl and @SantiagoTorres that the auto wheel-builder service will be an excellent security "chokepoint" for adding TUF/in-toto/SLSA/sigstore/etc metadata. |
Another benefit to this is moving the entire community to a new Python release very quickly compared to having to wait for wheels to percolate up your dependency tree from leaf nodes to your project. This also benefits Python itself as it ups the chances people will test e.g. Python betas to help find issues sooner and lead to a more stable Python release. |
Signed, reproducible builds from source on a trusted build farm are possible with conda-forge, emscripten-forge, Ubuntu PPA, Fedora COPR, and openSUSE OBS (Open Build Service). From https://news.ycombinator.com/item?id=36045057 :
Conda-Forge:
SLSA (TUF && Sigstore && Build Attestation)
Cost Estimates:
Background: pypa/packaging-problems#25
Create a generic wheel-building service to make releases faster and more robust.