
Mavros in its own build stage #241

Open · amarburg wants to merge 5 commits into main

Conversation

@amarburg (Collaborator) commented Aug 14, 2024

Changes Made

This is an experimental PR which isolates the Mavros/Mavlink build in its own (cacheable) build stage. It builds on #220.

Mavros is built in a standalone workspace (~/ws_mavros) in its own build stage. That workspace is then copied into the ci stage, and subsequent builds in ws_blue extend the mavros workspace rather than the base ROS environment.
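Roughly, the stage layout looks like the following. This is a simplified sketch rather than the actual Dockerfile: the stage names, base image tag, paths, and commands are illustrative.

```dockerfile
# --- mavros build stage: mavros/mavlink built from source in ~/ws_mavros ---
FROM ros:rolling-ros-base AS mavros
ARG USERNAME=ubuntu
# Workspace owned by the non-root user; creating that user earlier is point 1 below.
RUN mkdir -p /home/${USERNAME}/ws_mavros/src \
    && chown -R ${USERNAME}:${USERNAME} /home/${USERNAME}/ws_mavros
USER ${USERNAME}
WORKDIR /home/${USERNAME}/ws_mavros
# Sources are fetched into src/ (e.g. with vcs import) and their rosdep
# dependencies installed before this point.
RUN . /opt/ros/rolling/setup.sh && colcon build

# --- ci stage: copy the prebuilt workspace instead of rebuilding it --------
FROM ros:rolling-ros-base AS ci
ARG USERNAME=ubuntu
COPY --from=mavros --chown=${USERNAME}:${USERNAME} \
    /home/${USERNAME}/ws_mavros /home/${USERNAME}/ws_mavros
# Subsequent builds in ws_blue then source ws_mavros as an underlay, e.g.:
#   . /home/${USERNAME}/ws_mavros/install/setup.sh && colcon build
```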

Besides increasing the complexity of the build process, the major downside is that more functionality is pulled into the ci stage, specifically:

  1. The non-root "ubuntu" user is created earlier than before so ws_mavros can be built and owned by the non-root user.
  2. The built ws_mavros workspace is copied into the "ci" stage as a dependency of blue (given mavros would normally be installed by rosdep anyway if a binary were available).
  3. The rosdep dependencies of mavros are installed.

(On further reflection, these aren't strictly required, depending on how pedantic we want to be about what's in/out of the ci stage:

  1. Since most Noble-based images include the ubuntu user anyway, this isn't a big deal.
  2. The workspace could instead be copied in the robot stage. Appropriate --skip-keys arguments would need to be added to the call to rosdep in ci; see the sketch after this list.
  3. The mavros rosdep dependencies would then need to be installed in robot instead.)
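For reference, the kind of rosdep call point 2 is describing in the ci stage might look like this (a hypothetical snippet; the exact package keys to skip are an assumption):

```dockerfile
# Resolve blue's dependencies, but skip the mavros/mavlink keys since those
# are provided by the copied ws_mavros workspace (or deferred to robot).
RUN rosdep update \
    && rosdep install -y --from-paths src --ignore-src \
        --skip-keys "mavlink libmavconn mavros_msgs mavros mavros_extras"
```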

I've also committed my docker-bake.hcl to make the builds easier to reproduce. I have not updated the GitHub action to use buildx bake. This conflicts with #266, and if/when that's merged it will require some manual rebasing in this branch.

Cross-compile performance

All builds were run on an i7-11800H, using docker buildx bake mavros --no-cache with the appropriate platform, one platform at a time. Per-package times are reported by colcon; "Total" is the value reported by docker buildx for the colcon build step.

"amd64 bare metal" is with Rolling installed directly on the same laptop.

The Raspberry Pis run a clean installation of Bookworm 64-bit "Lite" (no GUI), with no active or passive cooling on the board (so thermal throttling may be a factor).
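For concreteness, each column's build was kicked off with an invocation of this form (exactly how the platform was selected per run is an assumption; --set is one way to override it):

```sh
# amd64 build
docker buildx bake mavros --no-cache --set mavros.platform=linux/amd64
# arm64 build (runs under QEMU on the x86 host)
docker buildx bake mavros --no-cache --set mavros.platform=linux/arm64
```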

| Package | amd64 bare metal | amd64 | arm64 (QEMU) | arm64 on Pi 5 | arm64 on Pi 4 4GB (-j2) |
|---|---|---|---|---|---|
| mavlink | 1.95s | 2.14s | 3min 46s | 7.44s | 31.1s |
| libmavconn | 8.73s | 9.00s | 2min 14s | 34.6s | 3min 20s |
| mavros_msgs | 1min 25s | 1min 27s | 2h 56min 31s | 5min 21s | 26min 28s |
| mavros | 1min 15s | 1min 17s | 16min 59s | 7min 45s | 35min 55s |
| mavros_extras | 1min 1s | 1min 2s | 13min 46s | 6min 27s | 25min 25s |
| Total | 3min 42s | 3min 46s | 3h 27min 25s | 19min 33s | 1h 27min 49s |

Yikes.

N.B. The Pi 4 would OOM (even with 8 GB of swap), so the job was run with -j2.
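(For context, one way such a -j2 cap can be passed through to the compile jobs, assuming Make-based CMake packages; this is a guess at the mechanism rather than the exact command used:)

```sh
# Limit each package to two compile jobs and build one package at a time.
MAKEFLAGS="-j2" colcon build --parallel-workers 1
```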

Testing

Please provide a clear and concise description of the testing performed.

@evan-palmer (Collaborator) commented Aug 15, 2024

Are you using QEMU for the arm64 build?

EDIT: I'm asking because I wonder if we can leverage Docker support for cross-compilation builds to speed up the build. Here is a blog post from the Docker blog regarding it.

Nevermind, it doesn't seem like cross-compilation is well-supported by colcon

@amarburg (Collaborator, Author) commented:

Are you using QEMU for the arm64 build?

Yes.

And yeah, I briefly considered something in that space. mavros_msgs spends quite a bit of time running the rosidl generator, which is (a) run one at a time and (b) written in Python, so I'm positing that's the major bottleneck.

I could imagine running that step in build-platform-native python3, and then either cross-compiling or QEMU-compiling the resulting autogenerated files. But I can't see a clean way to isolate that step within the build process.

.docker/Dockerfile (review thread, resolved)
&& apt-get -q -y upgrade \
&& apt-get -q install --no-install-recommends -y \
git \
gosu \
@evan-palmer (Collaborator) commented Aug 16, 2024

I'm open to leaving gosu in here if we split this into its own temporary Dockerfile, but if we add it to the original, we should consider a separate PR (or just add it with #226)

.docker/docker-bake.hcl (outdated review thread, resolved)
.github/workflows/docker.yaml (outdated review thread, resolved)
@@ -18,6 +18,47 @@ env:
PUSH: ${{ (github.event_name != 'pull_request') && (github.repository == 'Robotic-Decision-Making-Lab/blue') }}

jobs:
mavros:
Collaborator commented:

Do we need to host this image if we aren't going to use it as the base image for the rest of the Dockerfile?

@amarburg (Collaborator, Author) replied:

I do think the mavros build image needs to be built and cached ... somewhere. And I don't think all of those built layers are cached in ci ... it only caches the layer which includes the copied files?

But some poking is still needed to ensure that the mavros cache is actually loaded by the ci layers.
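For instance, one way to persist the mavros stage's layers between runs would be a registry-backed cache wired up through bake overrides (a sketch only; the cache ref is a placeholder, and whether ci actually picks this cache up is exactly the open question):

```sh
# Build the mavros target, pushing its layer cache to a registry and
# reusing that same cache on subsequent runs.
docker buildx bake mavros \
  --set "mavros.cache-to=type=registry,ref=ghcr.io/example/blue-cache:mavros,mode=max" \
  --set "mavros.cache-from=type=registry,ref=ghcr.io/example/blue-cache:mavros"
```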

@evan-palmer (Collaborator) commented:
Adding these links here for future reference:

@evan-palmer (Collaborator) commented Aug 16, 2024

Looks like you accidentally disabled amd64 instead of arm64

Sorry I thought I was looking at #220 😁

mergify bot commented Aug 17, 2024

This pull request is in conflict. Could you fix it @amarburg?

@amarburg (Collaborator, Author) commented:

Looks like you accidentally disabled amd64 instead of arm64

Sorry I thought I was looking at #220 😁

You're not entirely wrong ... to avoid just burning CI minutes, I'm disabling the arm builds on this branch. I can test performance on the desktop.

mergify bot commented Aug 22, 2024

This pull request is in conflict. Could you fix it @amarburg?

@amarburg (Collaborator, Author) commented:

Merged in docker-bake.hcl changes from #266

@evan-palmer (Collaborator) commented Sep 10, 2024

FYI, GitHub just added arm64 runners. We might be able to build without using BuildX now.

EDIT: Here is another announcement with a bit more info.

@evan-palmer (Collaborator) commented:

I've submitted a PR mavros!1988 with the changes needed to support Jazzy and Rolling. I'll see what other changes need to be made to get the binaries released.

@amarburg (Collaborator, Author) commented:

FYI, GitHub just added arm64 runners. We might be able to build without using BuildX now.

Out of interest, I wrote an MWE for building independent arm64 and amd64 Docker images and combining them into a single manifest: https://github.com/amarburg/multiplatform_ci_mwe

By my reading, arm64 runners are not yet available to the free tier ("generally available" to paying customers); the repo above does work now, but it builds the arm64 image with QEMU.
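For reference, the final "combine into a single manifest" step in that kind of workflow is typically a single imagetools call; the image names below are placeholders, and whether the MWE uses exactly this command is an assumption:

```sh
# Push per-architecture images first, then stitch them into one
# multi-arch manifest under a single tag.
docker buildx imagetools create \
  -t ghcr.io/example/image:latest \
  ghcr.io/example/image:latest-amd64 \
  ghcr.io/example/image:latest-arm64
```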
