Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE]: GitVersion tool is slow / stalls #4159

Open
2 tasks done
christophwille opened this issue Aug 13, 2024 · 14 comments
Open
2 tasks done

[ISSUE]: GitVersion tool is slow / stalls #4159

christophwille opened this issue Aug 13, 2024 · 14 comments
Milestone

Comments

@christophwille
Copy link

christophwille commented Aug 13, 2024

Prerequisites

  • I have written a descriptive issue title
  • I have searched issues to ensure it has not already been reported

GitVersion package

GitVersion.Tool

GitVersion version

6.0.2

Operating system

Linux

What are you seeing?

This is for

      git status
      dotnet gitversion /version
      dotnet gitversion $(Build.SourcesDirectory) /output buildserver /nofetch

on Azure DevOps pipelines (git status only added for testing and because some old issue mentioned that the index might be the culprit).

I am seeing a "pause" in the log - and it always happens for the current PRs branch (pull/2271/merge) End and the Begin of the actual branch I created comparing to develop (GitFlow). Below are two runs, as you can see it seems to wait for 3 minutes for no reason at all.

2024-08-13T07:24:15.4623771Z     INFO [24-08-13 7:24:15:46] -< End: Finding merge base between 'pull/2271/merge' and 'origin/release/4.77.0'. (Took: 0.43ms) >-
2024-08-13T07:24:15.4667782Z   INFO [24-08-13 7:24:15:46] -< End: Finding branches source of 'pull/2271/merge' (Took: 368.42ms) >-
2024-08-13T07:27:26.9638056Z   INFO [24-08-13 7:27:26:96] -< Begin: Finding branches source of 'feature/gitversionperftest' >-
2024-08-13T07:27:26.9666509Z     INFO [24-08-13 7:27:26:96] -< Begin: Finding merge base between 'feature/gitversionperftest' and 'develop'. >-

2024-08-13T07:36:02.1232132Z     INFO [24-08-13 7:36:02:12] -< End: Finding merge base between 'pull/2271/merge' and 'origin/release/4.77.0'. (Took: 0.40ms) >-
2024-08-13T07:36:02.1277016Z   INFO [24-08-13 7:36:02:12] -< End: Finding branches source of 'pull/2271/merge' (Took: 376.28ms) >-
2024-08-13T07:39:15.4620032Z   INFO [24-08-13 7:39:15:46] -< Begin: Finding branches source of 'feature/gitversionperftest' >-
2024-08-13T07:39:15.4648616Z     INFO [24-08-13 7:39:15:46] -< Begin: Finding merge base between 'feature/gitversionperftest' and 'develop'. >-

This also already happened for 6.0.0, I was testing 6.0.2 to see if there would be a difference.

I scrolled all "Took: " and I found one sore thumb:

2024-08-13T07:39:19.2643763Z INFO [24-08-13 7:39:19:26] -< End: Fetching the base versions for version calculation... (Took: 197,524.02ms) >-

However, that would never add up because no other Took: ever exceeded low 100s of ms, most of them were single digit.

INFO [24-08-13 7:36:00:97] -< Begin: Normalizing git directory for branch 'refs/pull/2271/merge' >-
  INFO [24-08-13 7:36:01:00] One remote found (origin -> 'https://someazdourlredacted/myrepo').
  INFO [24-08-13 7:36:01:00] Skipping fetching, if GitVersion does not calculate your version as expected you might need to allow fetching or use dynamic repositories
  INFO [24-08-13 7:36:01:01] Creating local branch pull/2271/merge pointing at d4d958b
and then a lot more Creating local branch

What is expected?

Of couse not seeing the pause.

Steps to Reproduce

Not sure if that reproduces outside of our project.

RepositoryFixture Test

No response

Output log or link to your CI build (if appropriate).

No response

@arturcic
Copy link
Member

I think it's a logging issue, we need to add more logs for the Finding merge base between

@christophwille
Copy link
Author

It'd be really nice to know where those three minutes went (maybe it is some out of whack branch or something)

@david-driscoll
Copy link
Contributor

I'm also having this issue as well, it's very strange.

@david-driscoll
Copy link
Contributor

I've been able to narrow this down to this method below. When I run this against my repository it runs for ~1.5 minutes as long as I'm on a branch off of main. I did a quick profile run and found the at the various Where methods are called about 7 million times.

var returnedBranches = new HashSet<IBranch>();
var referenceLookup = this.repository.Refs.ToLookup(r => r.TargetIdentifier);
var commitBranches = FindCommitBranchesBranchedFrom(branch, configuration, excludedBranches).ToHashSet();
var ignore = new HashSet<BranchCommit>();
foreach (var commitBranch in commitBranches)
{
foreach (var commit in branch.Commits.Where(element => element.When > commitBranch.Commit.When))
{
var parents = commit.Parents.ToArray();
if (parents.Length > 1 && parents.Any(element => element.Equals(commitBranch.Commit)))
{
ignore.Add(commitBranch);
}
}
}
foreach (var item in commitBranches.Skip(1).Reverse())
{
if (ignore.Contains(item)) continue;
foreach (var commitBranch in commitBranches)
{
if (item.Commit.Equals(commitBranch.Commit)) break;
foreach (var commit in commitBranch.Branch.Commits.Where(element => element.When >= item.Commit.When))
{
if (commit.Equals(item.Commit))
{
commitBranches.Remove(item);
}
}
}
}
foreach (var branchGrouping in commitBranches.GroupBy(element => element.Commit, element => element.Branch))
{
bool referenceMatchFound = false;
var referenceNames = referenceLookup[branchGrouping.Key.Sha].Select(element => element.Name).ToHashSet();
foreach (var item in branchGrouping)
{
if (referenceNames.Contains(item.Name))
{
if (returnedBranches.Add(item)) yield return item;
referenceMatchFound = true;
}
}
if (!referenceMatchFound)
{
foreach (var item in branchGrouping)
{
if (returnedBranches.Add(item)) yield return item;
}
}
}

I see the nested loops, I was able to get that number down to 15 seconds but broke some unit tests, so clearly I'm a little out of my depth with what this method needs to do.

Hopefully else can it a step further and figure out a good solution!

@david-driscoll
Copy link
Contributor

I have some proposed changes over at #4208 it appears the culprit is at least partially related to iterating over all the branch commits several different times.

@HHobeck
Copy link
Contributor

HHobeck commented Sep 18, 2024

Hi there. Thank you very much for narrowing down this performance problem. I see your change as an improvement and think it has no downsides. Might it be an idea to do the same change for:

  • BranchCollection
  • ReferenceCollection
  • RefSpecCollection
  • RemoteCollection
  • TagCollection

@arturcic
Copy link
Member

There is one detail we need to take in consideration, when the normalization happens we meed to refresh the cache of these collections

@asbjornu
Copy link
Member

Indeed. Which is another argument for doing normalization as a separate command and step in the build.

@christophwille
Copy link
Author

Any sort of (interim) improvement is very welcome because it is getting worse for our builds by the day.

@elifry
Copy link

elifry commented Sep 18, 2024

Same issues here. The Azure step is taking 11 minutes to run, should only take seconds!

@HHobeck
Copy link
Contributor

HHobeck commented Sep 20, 2024

There is one detail we need to take in consideration, when the normalization happens we meed to refresh the cache of these collections

May I ask you: The actual change of the CommitCollection is not effected by the normalization process and can be merged as a short-term hotfix to improve the performance, right?

I would say that we merge this change and do the other improvements in an extra ticket.

@arturcic
Copy link
Member

I think we better spent the time now to improve the lazy loading/ reset the cached collection, for all git object collections we have than just the one that is in this PR just for the sake of having this merged. So if the author is willing to continue the improving the git object collections loading, then we can help with that, otherwise I would rather improve it myself when I find some time.

@christophwille
Copy link
Author

So an interim QFE is really not an option? Even shaving off a few percent would be really highly appreciated as a stopgap until a nicely architected all-encompassing change is done and dusted.

@HHobeck HHobeck added this to the 6.x milestone Sep 24, 2024
@arturcic
Copy link
Member

The thing is we had previously this experience when there was something half-done and the original author of the changes was not willing to continue, and then we had to take it over and redo. So my preference is to get this done first and then merge. I get it that this small change can improve someone's specific case, but let's spend a bit more time and get it done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants