Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for StringSyntaxAttribute.CompositeFormat in CA2241 (#6… #6272

Merged
merged 5 commits into from
Jun 27, 2023

Conversation

manfred-brands
Copy link
Contributor

@manfred-brands manfred-brands commented Nov 16, 2022

Fixes #6012

@stephentoub The existing implementation uses an ImmutableDictionary for the pre-defined methods.

This means that for all other methods, if called multiple times, the check if the methods is a format method is re-done every invocation.

I added one extra commit to use a ConcurrentDictionary instead which is updated when a method is discovered so the next time this method is called it is found immediately.

I don't know if this is safe. The class is created as part of a RegisterCompilationStartAction so isolated per compilation.

One other optimization is to also add methods to the dictionary that have been checked, but not found to be format methods.
This way we only need to check every method definition once.

@manfred-brands manfred-brands requested a review from a team as a code owner November 16, 2022 08:28
@codecov
Copy link

codecov bot commented Nov 16, 2022

Codecov Report

Merging #6272 (4f0bcdd) into main (2f6e15a) will increase coverage by 0.00%.
The diff coverage is 90.81%.

❗ Current head 4f0bcdd differs from pull request most recent head c49c5a5. Consider uploading reports for the commit c49c5a5 to get more accurate results

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #6272    +/-   ##
========================================
  Coverage   96.38%   96.39%            
========================================
  Files        1379     1379            
  Lines      322290   322497   +207     
  Branches    10461    10465     +4     
========================================
+ Hits       310652   310859   +207     
+ Misses       9144     9141     -3     
- Partials     2494     2497     +3     

@stephentoub
Copy link
Member

Thanks for working on this!

@mavasani, does this approach make sense / is this something we do in other analyzers? My concern would be memory consumption in large projects; is that something we find is an issue?

@buyaa-n
Copy link
Member

buyaa-n commented May 18, 2023

I added one extra commit to use a ConcurrentDictionary instead which is updated when a method is discovered so the next time this method is called it is found immediately.

I don't know if this is safe. The class is created as part of a RegisterCompilationStartAction so isolated per compilation.

You might want to use https://github.com/dotnet/roslyn-analyzers/blob/main/src/Utilities/Compiler/PooledObjects/PooledConcurrentDictionary.cs instead.

One other optimization is to also add methods to the dictionary that have been checked, but not found to be format methods. This way we only need to check every method definition once.

Saving each invoked method doesn't feel right, but I'll defer to @mavasani

@buyaa-n
Copy link
Member

buyaa-n commented May 25, 2023

One other optimization is to also add methods to the dictionary that have been checked, but not found to be format methods. This way we only need to check every method definition once.

Saving each invoked method doesn't feel right, but I'll defer to @mavasani

We actually use this approach in other analyzers already, so looks it is OK. @manfred-brands please resolve merge conflict and other comments. Thank you!

@manfred-brands manfred-brands force-pushed the issue6012_StringSyntaxCodeFormat branch from 511d475 to ce87a5b Compare June 8, 2023 14:32
@manfred-brands
Copy link
Contributor Author

Branch rebased and code-review changes applied.

@manfred-brands manfred-brands force-pushed the issue6012_StringSyntaxCodeFormat branch from 57ada20 to 4f0bcdd Compare June 13, 2023 08:16
Copy link
Member

@buyaa-n buyaa-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @manfred-brands!

@buyaa-n buyaa-n enabled auto-merge (squash) June 27, 2023 18:49
@buyaa-n buyaa-n merged commit 03fd0f4 into dotnet:main Jun 27, 2023
10 of 11 checks passed
@tarekgh
Copy link
Member

tarekgh commented Aug 22, 2023

@stephentoub @manfred-brands what would be the expected behavior for methods like

public static CompositeFormat Parse([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format)
https://github.com/dotnet/runtime/blob/7cffd46ec956fbbb0c1e4b0fbe0a3a1b335ef006/src/libraries/System.Private.CoreLib/src/System/Text/CompositeFormat.cs#L73C39-L73C45

The change in this PR is what cause the problem mentioned in dotnet/runtime#90357

@manfred-brands
Copy link
Contributor Author

manfred-brands commented Aug 23, 2023

@tarekgh First, an analyzer should never throw exceptions.

What it should do with a method that not actually formats is another question.
Parse("{0}: {1}") where the argument is declared as a CompositeFormat but no actual arguments are supplied.
It probably should be ignored (because there is no params parameter to supply any arguments)
Maybe in the future there will be an analyzer that checks the actual formatting text to find errors like: {0: {1}.

The error is in pre-exiting code where the next parameter is checked.
The code doesn't account for the use case where there is a format string but no actual params parameter.

It is triggered by this PR because previously only a few methods were checked and now this method is added to be checked.
I can created a PR to account for this use case.

@stephentoub
Copy link
Member

The intent of the issue was this:
"For any method invocation, look at the parameters to see if any is attributed with [StringSyntax(StringSyntaxAttribute.CompositeFormat)], and if it is, validate that the argument being passed for that parameter is correctly formed (as it does today for the hardcoded list)."

In other words, validate that the attributed composite format string itself was formatted correctly (eg no dangling braces), not that the were a number of additional arguments passed to the same method to satisfy the format items in that string.

@manfred-brands
Copy link
Contributor Author

@stephentoub That is not what the analyzer does nor what is documented in CA2241. It clearly states it checks the additional parameters.

The format string argument passed to a method such as WriteLine, Write, or System.String.Format does not contain a format item that corresponds to each object argument, or vice versa.

@manfred-brands
Copy link
Contributor Author

@stephentoub If you create a bug issue to account for the no-params case, I will create a PR.

@buyaa-n
Copy link
Member

buyaa-n commented Aug 23, 2023

That is not what the analyzer does nor what is documented in CA2241. It clearly states it checks the additional parameters.

The analyzer flags string.Format("{0}"); I think it applies to or vice versa part of the doc, so for

public static CompositeFormat Parse([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format)

usage it could behave similarly for CompositeFormat.Parse("{0}");

If you create a bug issue to account for the no-params case, I will create a PR.

Could you reference the original issue dotnet/runtime#90357 for that?

@stephentoub
Copy link
Member

That is not what the analyzer does

It does. It also checks whether the right number of arguments is supplied to match the holes, which is why the issue calls out that part isn't relevant. But, for example, all of these trigger CA2241:

string.Format("{");
string.Format("{0}}", 42);
string.Format("{0,}", 42);

because the string is malformed.

@manfred-brands
Copy link
Contributor Author

string.Format("{0}}", 42);
string.Format("{0,}", 42);

Unfortunately, it doesn't actual say that the format string is malformed.
In the last two cases, the correct number of arguments is passed in.
But because the malformed string, the expected number of arguments is -1.
The user needs to look very carefully to find out that there is something wrong with the actual format.
A separate diagnostics message when GetFormattingArguments returns -1 would be beneficial.

The class is called ProvideCorrectArgumentsToFormattingMethodsAnalyzer.
as CompositeFormat.Parse is not a formatting method, checking it by this analyzer doesn't seem right.
We can never pass the correct number of parameters to this method, regardless of format.
I don't think the intention is to always fire.

I created #6884 to prevent the IndexOutOfRangeException.

To keep with the spirit of @stephentoub to only check the format, I also created #6885.
This to both prevent the IndexOutOfRangeException and raise an issue when the format specification is malformed.
This then raises 'Invalid number of Parameters; for the CompositeFormat.Parse with an invalid format.
It should probably be a different message or even a different code. I leave that up to you to decide.

@tarekgh
Copy link
Member

tarekgh commented Aug 23, 2023

I prefer going with #6885 and we can decide if we can introduce a new diagnostic id for checking the specification string case.

@buyaa-n
Copy link
Member

buyaa-n commented Aug 23, 2023

To keep with the spirit of @stephentoub to only check the format, I also created #6885.
This to both prevent the IndexOutOfRangeException and raise an issue when the format specification is malformed.
This then raises 'Invalid number of Parameters; for the CompositeFormat.Parse with an invalid format.
It should probably be a different message or even a different code. I leave that up to you to decide.

Thank you, I agree we need to determine if we should use different message and/or different ID. I can create a separate issue for that.

As #6885 fix the IndexOutOfRangeException and going to flag CompositeFormat.Parse("{0}") case I assume it could introduce new warnings in case user project was not warning for AD0001 and had code similar to CompositeFormat.Parse("{0}") (as @tarekgh mentioned to me offline it may be OK as it most likely was a bug in user code). Anyway, in order to not introduce new warnings, we might want to only do the simple fix #6884 for now and port it to 8.0, the complete fix like in #6885 could go into 9.0.

Please let us know what you think @stephentoub @jeffhandley

@tarekgh
Copy link
Member

tarekgh commented Aug 23, 2023

As #6885 fix the IndexOutOfRangeException and going to flag CompositeFormat.Parse("{0}")

Small clarification, CompositeFormat.Parse("{0}") will not be flagged at all with the fix. this will be flagged only if format string is malformed like CompositeFormat.Parse("{ ") or similar.

The change of behavior when having CompositeFormat.Parse("{0}") is users used to get AD0001 and now will not get any warning. I assume this is not a breaking change.
The change of behavior when having CompositeFormat.Parse("{ ") is users used to get AD0001 and now will get CA2241 (or other diagnostic Id if we decide to do that). In this case it is good to have users get this warning as they have a real problem.

@buyaa-n
Copy link
Member

buyaa-n commented Aug 23, 2023

Small clarification, CompositeFormat.Parse("{0}") will not be flagged at all with the fix. this will be flagged only if format string is malformed like CompositeFormat.Parse("{ ") or similar.

The change of behavior when having CompositeFormat.Parse("{0}") is users used to get AD0001 and now will not get any warning. I assume this is not a breaking change.

Thank you for explanation I haven't had checked the PR yet, now I see that.

The change of behavior when having CompositeFormat.Parse("{ ") is users used to get AD0001 and now will get CA2241 (or other diagnostic Id if we decide to do that). In this case it is good to have users get this warning as they have a real problem.

This change definitely needs a message change at least, showing Provide correct arguments to formatting methods in this case is definitely confusing. Second, I would prefer this diagnostic to: IDE0043 Format string contains invalid placeholder. I will create an issue for discussing appropriate solution.

With that being said now I am OK with accepting the fix in #6885 as it makes the behavior of CompositeFormat.Parse("{ ") same as string.Format("{") and port the fix into 8.0 and flagging it will be better than throwing at runtime

@buyaa-n
Copy link
Member

buyaa-n commented Aug 23, 2023

@manfred-brands is it easy to change the messages for the scenario like string.Format("{") without any side effects? If so could you change the message with something like Format string contains invalid placeholder in #6885? Thank you!

@manfred-brands
Copy link
Contributor Author

This change definitely needs a message change at least, showing Provide correct arguments to formatting methods in this case is definitely confusing. Second, I would prefer this diagnostic to: IDE0043 Format string contains invalid placeholder. I will create an issue for discussing appropriate solution.

The IDE0043 issue seems to have been resolved as to retire IDE0043.
The PR isn't merged because of a PR to move more cases from IDE0043 to CA2241. That PR is stuck since 2020. Lots of discussion about raising more useful separate messages depending on use case.

@manfred-brands
Copy link
Contributor Author

manfred-brands commented Aug 24, 2023

@manfred-brands is it easy to change the messages for the scenario like string.Format("{") without any side effects? If so could you change the message with something like Format string contains invalid placeholder in #6885? Thank you!

I have made the changes, but not sure who will make the translations for the .resx files.

I also found a "bug" in the rule where it only counted the number of unique references, not taking the actual index into account.

The below didn't raise an error:

string.Format("{1}", 42);
string.Format("{1} {2}", 42, 43);

But this one did:

string.Format("{1}", 3, 4)

Although the latter is debatable, it it valid.

@buyaa-n
Copy link
Member

buyaa-n commented Aug 24, 2023

The dotnet/roslyn#46001 seems to have been resolved as to retire IDE0043.
The dotnet/roslyn#62303 isn't merged because of a #4226 to move more cases from IDE0043 to CA2241. That PR is stuck since 2020. Lots of discussion about raising more useful separate messages depending on use case.

Good catch thanks!

I have made the changes, but not sure who will make the translations for the .resx files.

Thank you! No need to worry about translations, a language experts would do that later

I also found a "bug" in the rule where it only counted the number of unique references, not taking the actual index into account

Seems this rule needs a lot of fixes, FYI @mavasani

@buyaa-n buyaa-n mentioned this pull request Aug 24, 2023
@jeffhandley
Copy link
Member

I prefer going with #6885 here too. I'll review that PR with the expectation of getting it approved for merge into .NET 8 RC2.

@manfred-brands manfred-brands deleted the issue6012_StringSyntaxCodeFormat branch November 2, 2023 03:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CA2241 should respect new StringSyntaxAttribute.CompositeFormat
6 participants