Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Analyzer: Warn on local/parameter reassignment #7422

Open
HaloFour opened this issue Sep 25, 2024 · 20 comments
Open

New Analyzer: Warn on local/parameter reassignment #7422

HaloFour opened this issue Sep 25, 2024 · 20 comments

Comments

@HaloFour
Copy link

HaloFour commented Sep 25, 2024

It seems that proposals to add readonly local/parameter support to C# has stalled. This seems largely due to the feature being overly noisy given it requires an additional modifier of some kind and it would likely apply in the vast majority of cases. Given that C# cannot (or is extremely unlikely to) go readonly by default, I would like to suggest an analyzer that would give developers the option to enforce default readonly-ness of locals and/or parameters to achieve the same benefits.

Tuning the strictness policies

Different developers likely prefer different degrees of reassignment protection based on the scenario. Rather than having a single policy to issue a diagnostic on any reassignment, I propose categorizing reassignment based on the situation and allowing the developer to adjust their severity. For example, a developer might want to warn on reassignment of parameters, but doesn't care about reassignment of locals. Or a developer might want to be much more cautious about potential reassignment of captures in lambdas or local functions.

The following is a list of possible reassignment scenarios:

  1. Reassignment of non-ref/out parameter
    void fn(string p) {
        p = "something else"; // diagnostic
    }
  2. Reassignment of local
    var local = "foo";
    local = "bar"; // diagnostic
  3. Passing local/parameter as ref
    var local = "foo";
    fn(ref local); // diagnostic
  4. Reassignment of captured local/parameter within lambda
    var local = "foo";
    var fn = () => {
        local = "bar"; // diagnostic
    };
  5. Reassignment of captured local/parameter outside of lambda
    var local = "foo";
    var fn = () => {
        Console.WriteLine(local);
    };
    local = "bar"; // diagnostic
  6. Reassignment of primary constructor parameter
    class Foo(string p) {
        void fn() {
            p = "foo"; // diagnostic
        }
    }

This is not meant to be an exhaustive list. My intent is to drive conversation around the categorization of different scenarios of reassignment based on how "dangerous" or "undesirable" they may be considered by a developer. That would provide the policy options to make the analyzer as strict or lenient as desired.

A simpler alternative would be to mark all diagnostics the same, making this a one-size-fits-all analyzer.

Opting into mutability

I believe it would be desirable to provide a simple way to mark that a particular local/parameter is "mutable", suppressing any diagnostics on reassignment. Additionally, a diagnostic could be issued for a declaration marked as "mutable" which is never reassigned. For parameters, this could be implemented through an attribute that the analyzer recognizes by name:

void fn([Mutable] string p) {
    p = "foo"; // no diagnostic
}

For locals there are no existing ways to attach attributes or other custom metadata. In lieu of that, we have a couple of options we can explore to allow a developer to indicate that the local should allow reassignment:

Naming Convention (thanks @Richiban)

The name of the local could be used to indicate whether it should be mutable. Perhaps a prefix like m_ or mutable_ could be prepended and the analyzer would then not issue diagnostics on reassignment:

var mutable_local = "foo";

mutable_local = "bar"; // no diagnostic

Comments

A specially-formatted comment could be added to the declaration which the analyzer would interpret as declaring the local as mutable:

/*mut*/ var local = "foo";
// or
var local = "foo"; //mut

void fn(/*mut*/ string p) { }

Additionally, perhaps a reassignment could also be marked as mutable:

var local = "foo";
local = "bar"; // diagnostic
/*mut*/ local = "baz"; // no diagnostic

fn(/*mut*/ ref local); // no diagnostic

While a comment is a little messy and error prone, at least the developer would have to get it right in order to suppress the diagnostic in that case, and a fixer could probably be provided to automatically add the correct comment. In the future something like "source attributes" would definitely be a preferred metadata mechanism.

Alternatively, the diagnostic could be suppressed using standard C# pragmas.

/cc @RikkiGibson

@HaloFour
Copy link
Author

This is a bit of a moon-shot, for the sake of discussion.

@RikkiGibson
Copy link
Contributor

RikkiGibson commented Sep 25, 2024

I will try and look in more depth later, but, as a first impression, the // mut comments remind me of // language=regex comments over string literals to force regex highlighting. blog post link to when that was introduced. I personally find it a reasonable solution for the analyzer.

I think marking the declaration with // mut to permit reassignment makes sense. Marking the assignment doesn't really make as much sense to me, as you're just asking for permission to do the thing you're already doing at that exact spot. If user is insistent that only this one place be allowed to reassign then I'd go with a pragma.

@RikkiGibson
Copy link
Contributor

RikkiGibson commented Sep 25, 2024

If local variable is not definitely assigned, then assigning it should be fine (not produce a diagnostic even under strict settings).

string str;
if (cond)
    str = GetString(); // ok
else
    GetString(out str); // ok

str = "b"; // warning

Maybe something similar would be used to make out parameter handling more strict. Not sure about the value as much in that case though.

M(out int x)
{
    x = 42; // ok
    // ...
    x = 43; // warning?
}

Further feedback:
It feels simple enough to distinguish scenarios we want to be individually configurable by using different diagnostic IDs. I don't have a ton of input on the exact specifics of which to give their own IDs.

I would also say that it feels right to keep it oriented around readonly-by-default, and not build any kind of "mode" which would make mutable the default and opt-in with comments to readonly.

@HaloFour
Copy link
Author

Excellent ideas. I like limiting out parameters to a single assignment.

@HaloFour
Copy link
Author

HaloFour commented Sep 26, 2024

I do mention it in the proposal but I do want to call out any potential controversy in using comments as a mechanism to mark locals or parameters as mutable/reassignable. Apart from being potentially clumsy, I don't know that there's much precedent in an analyzer using comments that could potentially affect the build, particularly if they are removed. There's also potential interactions with other tools, such as "minifiers", although I cannot fathom why anyone would want to "minify" C# prior to compilation.

Obviously if there were a better choice, such as source attributes, they would be the preferable option. However, those don't exist, and even if they did that wouldn't help with older versions of the language where this analyzer would still be useful. As an optional opt-out mechanism within an optional analyzer to suppress optional diagnostics, I personally don't find this to be terribly problematic.

Since parameters can support attributes, I think it's worthwhile to also (or only) support a custom attribute being applied to that parameter.

@RikkiGibson
Copy link
Contributor

I would expect people to find the line-comment form of // mut to be more palatable than inline /* mut */. So, maybe attributes will be preferable for parameters, which are often declared multiple on the same line. e.g. void M([Mutable] int param, [Mutable] bool flag) { }.

This is the kind of thing which I would suggest prototyping prior to any commitments. We have to see how this actually feels when onboarding a big project to it. Are the ergonomics acceptable, and does it mostly meet the goals that a language-level readonly locals feature would meet.

@HaloFour
Copy link
Author

I would expect people to find the line-comment form of ... to be more palatable than inline ...

I agree. I don't know if we'd want to keep both forms as the inline form could be applicable in more places, e.g.

int.TryParse(str, /*mut*/ out var i);

Although that can be revisited based on demand for such a feature in the future.

This is the kind of thing which I would suggest prototyping prior to any commitments.

👍

@HaloFour
Copy link
Author

I'd also be fine if the mutability opt-in is left out out a first phase of such an analyzer. Perhaps it would be useful so infrequently that suppressing the diagnostic via pragma is sufficient? Feedback from people who would use the analyzer around the friction they experience with it could inform the shape such an opt-in feature might take.

@IS4Code
Copy link

IS4Code commented Sep 27, 2024

I appreciate the discussion on the appropriateness of using comments for this. I would much rather prefer sticking with the existing "language" of attributes, used in all other places for code analysis, to keep it consistent with parameters (and potentially other storage locations). I also believe there is no need to bring in a whole new system of source-only attributes just to support this "properly"; just extending the existing attribute facilities to local variables could be enough ‒ see dotnet/csharplang#8460.

@HaloFour
Copy link
Author

Even if C# gets source attributes (or some flavor of local attributes), this analyzer should be able to target previous versions of C# that wouldn't support either.

@Richiban
Copy link

Nice idea! Have you considered a naming convention for triggering mutability? E.g. num is immutable but m_num is mutable?

@HaloFour
Copy link
Author

@Richiban

Have you considered a naming convention for triggering mutability?

Excellent idea, I'll add it to the proposal.

@RikkiGibson
Copy link
Contributor

m_ prefix is used in some codebases for instance fields, so would want to consider some different options, but the general idea of a naming convention seems reasonable.

@HaloFour
Copy link
Author

Added a comment that the analyzer could issue a diagnostic for a variable marked as "mutable" which is never reassigned.

@DavidArno
Copy link

DavidArno commented Oct 3, 2024

@RikkiGibson,
I do not like the idea of it not warning on after-declaration assignment other than for out params (which have to be allowed do to their nature; in fact I'd treat them as mutable by default). I also do not want such an analyzer to need to do semantics/flow analysis to determine if a re-assignment is occuring. It wants to be fast. I'll live with imprecise warnings instead. So I'd want you code example to behave like this:

string str; // warning - readonly variable is not set when declared
if (cond)
    str = GetString(); // warning - (possible) reassignment  of readonly variable
else
    GetString(out str); // warning - (possible) reassignment  of readonly variable

str = "b"; // warning - reassignment of readonly variable

getting around those first warnings will be trivial when we get expression blocks:

var str = cond ? GetString() : { GetString(out var s); return s; };

@DavidArno
Copy link

Nice idea! Have you considered a naming convention for triggering mutability? E.g. num is immutable but m_num is mutable?

Please, no. Hungarian notation has been (almost completely) consigned to the dustbin of history for very good reasons.

@DavidArno
Copy link

I would expect people to find the line-comment form of // mut to be more palatable than inline /* mut */

That would surprise me greatly:

/* mut */ var x = something; // - it's clunky, but it makes sense and is unambiguous
var y = something; // mut      <- this just looks wrong. Why is control of mutability at the end of the line?
                   // And can that extra part of the comment work with it? Or did I make it readonly again by adding 
                   // stuff after <-
var z = something; // mutual something or other etc comment <- and what happens here if extra comments are allowed?

@HaloFour
Copy link
Author

HaloFour commented Oct 3, 2024

I also do not want such an analyzer to need to do semantics/flow analysis to determine if a re-assignment is occuring. It wants to be fast.

Why do you believe such analysis would be slow? I wouldn't rule it out unless it was demonstrated to be unusable, and I doubt it would even be noticeable.

Please, no. Hungarian notation has been (almost completely) consigned to the dustbin of history for very good reasons.
That would surprise me greatly:

I think Hungarian notation is fine here. It encodes semantic meaning that the type system itself doesn't convey, for the purpose of making the variable stand out. However, I wouldn't ascribe one particular way to encode mutability in the name, and several options should be considered.

However, I don't think an analyzer would be as constrained as a language feature as to how it would need to be designed. I think it would be fine if multiple mechanisms are offered by the analyzer, and the developer can choose the one that they like the best.

@RikkiGibson
Copy link
Contributor

The reassignment analysis would probably just use the Roslyn dataflow APIs, which are used all over the place in the IDE, for example, for greying out unnecessary assignments. They're certainly quite a bit less expensive than nullable analysis.

@DavidArno
Copy link

The reassignment analysis would probably just use the Roslyn dataflow APIs, which are used all over the place in the IDE, for example, for greying out unnecessary assignments. They're certainly quite a bit less expensive than nullable analysis.

Interesting. I didn't know that functionality existed as APIs. Are they documented anywhere, do you know?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants