-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
setup-gradle: GUH cache key collision when used in a reusable workflow #316
Comments
As far as I can see, this is working as expected.
Note that when you re-run a Job, it doesn't get a new git SHA and thus you don't get a new cache key for Gradle User Home. This is why you see the "Entry not saved: referencing 'Gradle User Home' cache entry not saved". To test this properly, you'll need to push a dummy commit for each run. Please check out https://github.com/gradle/actions/blob/main/docs/setup-gradle.md#how-gradle-user-home-caching-works for more details on how this works. ** There are definitely ways this could be made more efficient, if we either stored one big entry with all of the dependencies for all of the jobs, or by extracting the common dependencies into one entry and only storing the unique dependencies per-job. |
Thanks for the clarification; have re-run tests (with dummy commits) - still seeing cleanups happening but presumably these relate to unused items within that build (e.g. removing the unused Groovy DSLs as its a Kotlin DSL script). Subsequent runs don't redownload artifacts, so good there. Familiar with the documentation - what led me to believe that say dependencies were a shared cache across jobs:
If these aren't indeed "shared" caches the documentation should be clearer on that - perhaps a table indicating what items are shared across jobs (where all other attributes match) etc. |
The dependencies caches are shared between Jobs, if those Jobs have the exact same set of dependencies resolved. I might want to clarify the docs a little:
If the |
I did try to have a separate cache entry for each dependency, for maximal cache storage efficiency. This approach quickly led to 429 errors in the cache: GHA cache doesn't like too many small entries. |
lol, yea, can see that being a challenge. Lots of complexities here. Thanks for continuing to evolve this. |
Thank you both for investigating and explaining. I understand that the caches are saved 'per job', and I wonder how that works with re-usable workflows? Kotest uses a re-useable workflow to call Gradle. The Gradle workflow is called twice in the same job: once to generate a JVM API dump (using BCV), and then again to run all Kotlin Multiplatform tests. BCV only needs JVM dependencies, while the KMP tests require all KMP dependencies. How would the dependencies cache for the job be populated? Would a single one be re-used for both invocations of the re-usable workflow, or a separate one each? Could it cause the dependency cache to be unnecessarily discarded? |
Here's a Kotest run that shows the problem. Specifically, this workflow snippet:
...doesn't store cache information on the second job, many entries such as:
...perhaps due to Gradle User Home cache not changing (same SHA, same OS, ...):
|
Ahh.... This cache key for GUH:
So it would appear that |
(this is the issue I thought was a race condition in the initial part of this ticket) |
@cloudshiftchris Thanks for investigating further. This indeed looks like a bug, but I'm not sure how to fix it. The full identifier of the "Job" is
Unfortunately this doesn't differentiate between 2 Jobs with the same name in the same workflow, which is what you have. Although |
Oh wow, this (getting the full job name) is way more complicated than it should be. There are a couple of open issues on this:
Setup repo to reproduce this (one workflow, two jobs that call a reusable workflow) with lots of debugging - the only place the full job name is evident is in an early job entry: This snippet returns something that may be useful (it has
Log output from run here. |
Would using the concurrency group be possible, and useful? It's possible to encode all inputs of the workflow as a concurrency group using |
Good thoughts. Looked at using Tried with What we really need is the full job name, as shown in the UI - |
There would be a collision, but would it be a problem? Am I right in thinking that the resulting cache entry will unnecessarily be updated, but at least it should contain the same dependencies? Also, is it likely that a re-useable workflow with the same inputs will be called twice? |
What happens on a GUH collision is that the cache is not updated (for the second job) - it loads cache entries as they existed for the first run, but doesn't store anything on the premise that, because the job key is the same, everything else must be the same (which would work if the job was uniquely identified). Could imagine scenarios where reusable workflows are called multiple times with the same inputs (or no inputs, if all one desires is the side-effects). There's drawbacks to using |
The cache entry for a key is immutable once written. So there's no way to "update" the cache entry without creating a new cache key. This is why we try hard to create a unique cache key for each Job execution (using workflow-name + job-name + matrix-values + Git SHA). To avoid a massive explosion of cache usage, we first extract any content that we think might be shared between cache entries. These extracted entries are keyed only on their content, so you'll only get sharing if the content is exactly the same between to Job runs. This is commonly the case for subsequent runs of the same Gradle build (same Job), and less commonly occurs for different Jobs with similar content. @cloudshiftchris I'd be interested to see how things work if the calling Jobs ( |
It's worth reading How Gradle User Home caching works, as it describes the properties of the GitHub Actions cache that led to the current model of caching in |
:-)
Thanks so much for taking the time to research. This really helps me out.
I think we might be able to adapt something like this to run inside the action. While I'd prefer not to have to make an API call, if it can provide all of the context required to create a unique cache key, it could be worth it. Can you try adding some matrix parameters to your test and see if they are also included in the JSON? I'm currently accessing them like this, which works but ideally I wouldn't need this layer of indirection. |
Ah, now I see that I can only obtain the set of all Jobs running, and there's no way that I can tell to determine which one matches the currently executing Job! https://stackoverflow.com/questions/71240338/obtain-job-id-from-a-workflow-run-using-contexts |
Yea, that's where I ran into the brick wall as well, its unclear / imprecise how to match things up :( |
Tried two variations on matrix jobs here; findings:
|
After a bit more investigation, I'm convinced that we should be including the inputs for a reusable workflow in the Gradle User Home cache key, similar to how we currently include the matrix values. This would allow us to differentiate between different "instances" of the same Job in the workflow. As far as I can tell, the only way to differentiate between different calls to a reusable workflow are via inputs: environment variables aren't automatically passed in, so I think it's fair to differentiate based on inputs. I forked the reproducer to include more dimensions in the workflow:
Here's a run with the default configuration. Each of the Here's a workflow run where the workflow inputs are explicitly added to the workflow-job-context parameter. Unfortunately, I can't (yet) find a way to access the inputs directly from the action implementation, and this needs to be configured by the user. I can access the So we have a solution for reusable workflows, but it requires a user to know this is required and to supply the inputs specifically to the action. (Note that passing in the "complete job name" value would also work: just need something that uniquely identifies the job execution). |
Repro: https://github.com/cloudshiftinc/gradle-setup-action-cache-repro/actions/runs/10067337462
That repository consists of a small "Hello World" Gradle / Kotlin project, with two build files:
There's a GHA workflow that runs two steps, sequentially, using each of the above build files.
Each step checks out the code, calls
setup-gradle
(v4 beta 1), and executes the respective Gradle build file (build.gradle.kts or build-second.gradle.kts).The second build "cleans up" artifacts it didn't use from the first build:
This isn't desirable in a multi-step setup - we'd want to retain all the cached entries and only cleanup at the end.
For reasons I don't currently understand the cache isn't updated after the cleanup in the second build, even though the cleanup removed entries:
The issue starts here where
setup-action
stores the current timestamp, and then in the post-build cleanup removes everything older than that. This assumes a closed-loop where it's a self-contained build, which doesn't work for shared caches (dependencies, wrappers, etc) that other builds will use/update/cleanup.Not sure what the best solution may be - perhaps loosen the cleanup windows to "older than a day"; not thrilled about making it configurable, as one wouldn't know what to reasonably put (is 10m too long? what if the execution time varies? and so on).
If there's a way to get the "start of the workflow" timestamp that may be more deterministic, though that still leaves holes where there are different workflows running that share those caches - one may wish to save the shared caches for longer periods (e.g. workflows that use those caches that run weekly, for example).
The text was updated successfully, but these errors were encountered: