Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another attempt to improve the search performance #5324

Closed

Conversation

r15ch13
Copy link
Member

@r15ch13 r15ch13 commented Jan 6, 2023

Description

This is another attempt to improve the local search performance.

  • Not using the manifest() function
  • Use only one loop over $apps
  • Push results to $list directly
  • Use [System.IO.File]::ReadAllText() instead of Get-Content

Motivation and Context

Closes #4239

Relates to #4818

How Has This Been Tested?

Measure-Command { scoop search telegram }
Measure-Command { .\bin\scoop.ps1 search telegram }

image

Checklist:

  • I have read the Contributing Guide.
  • I have ensured that I am targeting the develop branch.
  • I have updated the documentation accordingly.
  • I have updated the tests accordingly.
  • I have added an entry in the CHANGELOG.

@niheaven
Copy link
Member

niheaven commented Jan 6, 2023

WIP or Ready for review?

@rashil2000
Copy link
Member

Some more areas for optimization:

  • Since we are listing lots of files in function apps_in_bucket, using [System.IO.Directory]::EnumerateFiles() instead of Get-ChildItem should be much faster (ref)
  • Since we are deserializing a lot of JSON, using [System.Text.Json.Utf8JsonReader] should be faster than ConvertFrom-Json
  • While searching in remotes, since we are making multiple requests, a .NET method should be faster than Invoke-WebRequest
  • I'm not sure I understand this, but why do we use Select-Object -ExpandProperty instead of just directly accessing the field on lines 69, 90 and 100?

Copy link
Member

@niheaven niheaven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for nested manifests

libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
r15ch13 and others added 4 commits January 7, 2023 16:29
Co-authored-by: Hsiao-nan Cheung <[email protected]>
Co-authored-by: Hsiao-nan Cheung <[email protected]>
Co-authored-by: Hsiao-nan Cheung <[email protected]>
Co-authored-by: Hsiao-nan Cheung <[email protected]>
@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

Since we are listing lots of files in function apps_in_bucket, using [System.IO.Directory]::EnumerateFiles() instead of Get-ChildItem should be much faster (ref)

image

Sorry but I found Get-ChildItem is faster than [System.IO.Directory]::EnumerateFiles() here.

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 7, 2023

  • Since we are listing lots of files in function apps_in_bucket, using [System.IO.Directory]::EnumerateFiles() instead of Get-ChildItem should be much faster (ref)
    It's only run once for every bucket. Could be a problem if you have 1000 buckets. 😄
  • Since we are deserializing a lot of JSON, using [System.Text.Json.Utf8JsonReader] should be faster than ConvertFrom-Json
    Is it available in PowerShell?

@niheaven
We need to take the filesystem cache and the drive type into account here.
The expensive part was running Get-ChildItem and Find-BucketDirectory in manifest_path() for every app.

libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
libexec/scoop-search.ps1 Outdated Show resolved Hide resolved
r15ch13 and others added 2 commits January 7, 2023 16:53
Co-authored-by: Hsiao-nan Cheung <[email protected]>
Co-authored-by: Hsiao-nan Cheung <[email protected]>
@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

That's LGTM, and it needs an updated changelog :).

@rashil2000
Copy link
Member

Is it available in PowerShell?

Yes

image

@rashil2000
Copy link
Member

rashil2000 commented Jan 7, 2023

Sorry but I found Get-ChildItem is faster than [System.IO.Directory]::EnumerateFiles() here.

On multiple repeated runs (i.e. warm cache), EnumerateFiles is almost always twice as fast as Get-ChildItem in my testing. On cold cache, the difference is even more.

@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 7, 2023

Oh, I was searching for [System.Text.Json.Utf8JsonReader] explicitly... 😁

@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

On multiple repeated runs (i.e. warm cache), EnumerateFiles is almost always twice as fast as Get-ChildItem in my testing.

If we just run Get-ChildItem once for one bucket, that's not worthwhile to use EnumerateFiles. And the output of EnumerateFiles is string[] which doesn't have FullName or BaseName.

@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

Out of topic here, the minimal requirements of Scoop are WindowsPowerShell and .Net Framework 4.5 which really lacks some new features of modern PowerShell and .Net.

Could this be updated? (To newest PowerShell)

@rashil2000
Copy link
Member

  • I'm not sure I understand this, but why do we use Select-Object -ExpandProperty instead of just directly accessing the field on lines 69, 90 and 100?

Alright, what about this one?

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 7, 2023

It goes down to around 700ms from 1.3sec by using JsonDocument.Parse(). My C# implementation runs at around 300ms.

@rashil2000
Copy link
Member

Since [System.Text.Json] is unavailable in PS 5.1, there's another way: we could reuse the NewtonSoft.Json DLL. As mentioned in this SO post, it is also much faster than ConvertFrom-Json.

@niheaven
Copy link
Member

niheaven commented Jan 7, 2023

Too complicated for the JSON parsing function here... And [System.Text.Json] is not available for .Net Frameworks (4.5, 5 and all version before .Net Core)

The loop and generic list (this is not needed even, IMO. I've tried to replace all array '@()' to generic list but the time savings is 😭) are enough for scoop search, and if ConvertFrom-Json matters, let's create a new PR.

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 7, 2023

Added a fallback for when System.Text.Json is not available

@niheaven
Copy link
Member

niheaven commented Jan 8, 2023

I still don't think we need to speed up ConvertFrom-Json. From tens of seconds to seconds is important, but from seconds to milliseconds is not.

If this is needed indeed, tweak (or migrate to) the parse_json() function is better imo.

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 12, 2023

Performance right after starting the PC still sucks.

image

@niheaven
Copy link
Member

The two different searching time are all using this PR's solution? Wow...

@r15ch13
Copy link
Member Author

r15ch13 commented Jan 20, 2023

Yes, just ran the same code twice after restarting the PC.

@bytebone
Copy link

bytebone commented Jun 3, 2023

Hello. Whats the status of this? From reading the messages here, it seems to me all painpoints have been agreed upon, no? As someone coming back to Windows from Arch, I find it super painful to have to wait multiple seconds for every search I do, and I'd much appreciate any improvement to the search speed, no matter how big or small. Just wanted to add my two cents - thanks for all the hard work!

@JVimes
Copy link

JVimes commented Jul 13, 2023

Really excited about this one 🙂 Is it still on the radar?

@r15ch13 r15ch13 closed this Sep 13, 2023
@r15ch13 r15ch13 deleted the another-search-improvement branch September 13, 2023 19:04
@r15ch13 r15ch13 restored the another-search-improvement branch September 13, 2023 19:06
@r15ch13
Copy link
Member Author

r15ch13 commented Sep 13, 2023

Messed up the branch. Created a new PR #5644

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants