Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert EMPTY_LINE_AFTER_OUTER_ATTR and EMPTY_LINE_AFTER_OUTER_ATTR lint into early lints #13658

Closed

Conversation

GuillaumeGomez
Copy link
Member

This is needed to help the compiler attributes API rewrite.

cc @jdonszelmann @xFrednet

changelog: Convert EMPTY_LINE_AFTER_OUTER_ATTR and EMPTY_LINE_AFTER_OUTER_ATTR lint into early lints

@rustbot
Copy link
Collaborator

rustbot commented Nov 5, 2024

r? @y21

rustbot has assigned @y21.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Nov 5, 2024
@jdonszelmann
Copy link
Contributor

yayyy

@@ -690,7 +611,7 @@ fn check_attrs(cx: &LateContext<'_>, valid_idents: &FxHashSet<String>, attrs: &[
Some(("fake".into(), "fake".into()))
}

if suspicious_doc_comments::check(cx, attrs) || empty_line_after::check(cx, attrs) || is_doc_hidden(attrs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It said this check was to avoid spurious warnings somewhere, but we don't do it anymore. Is that a problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran cargo uibless so I think it's ok if nothing new got picked up. To be confirmed by someone more knowledgeable though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intended for e.g. #12917

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what should we do here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may cause some confusion but if attrs are being reworked it's something we could live without unless we want to add another early -> late storage hack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those don't look so bad though, I'd say

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second look, they all look like true positives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean here with false and true positives? is this something that should be fixed or unralated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true positive doesn't need to be fixed, because that means the lint behaves as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, makes sense :)

impl Stop {
fn convert_to_inner(&self) -> (Span, String) {
let inner = match self.kind {
// #|[...]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, but this should be #! not #| I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah indeed!

@@ -5,7 +5,7 @@ LL | / /// for the crate
LL | |
| |_
LL | fn first_in_crate() {}
| ------------------- the comment documents this function
| ---------------------- the comment documents this function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this now points to the entire function, while previously it pointed to the signature of the function. That'd mean that if the function is longer than 0 lines it'd also highlight the block. Do we want that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I supposed it was fine but I can try to see if it can be shortened somehow.

Copy link
Contributor

@jdonszelmann jdonszelmann Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rustc_ast::FnSig has a span

image

@@ -71,8 +74,11 @@ error: empty line after outer attribute
LL | / #[repr(C)]
LL | |
| |_
LL | struct Foo {
| ---------- the attribute applies to this struct
LL | / struct Foo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here I guess, you can see it better here

@GuillaumeGomez GuillaumeGomez force-pushed the migrate-empty_after branch 2 times, most recently from 7089e72 to 1f9ccc2 Compare November 5, 2024 21:02
@jdonszelmann
Copy link
Contributor

much better

@Alexendoo
Copy link
Member

Is there a link to why this is needed?

@jdonszelmann
Copy link
Contributor

So as part of rust-lang/compiler-team#796, I bumped into a few issues with clippy where some lints are late, but use a lot of information that mostly relates to the AST, and have very little reason to interact with the hir. After lowering, clippy is the only part of the compiler using information such as the AttrId, AttrStyle and Span on attributes.

By moving these lints to be early:

  1. attributes, after lowering, can be only 8 bytes instead of their current 48
  2. the refactor to pre-parsed attributes becomes easier
  3. moving one of the lints to early (not this PR) we actually solved a bug with attributes being emitted twice

Hope that clears some things up :)

@jdonszelmann
Copy link
Contributor

for 2), that might mean that at some point in the future I will revisit these lints and make them use the new attribute parsing logic. By making them early, they can enjoy the old parsing logic for a little longer while the new parsing logic can be implemented without having to fix clippy at the same time as introducing a large refactoring in the compiler.

@flip1995
Copy link
Member

flip1995 commented Nov 6, 2024

Lintcheck diff is quite big and some of them look like FPs. Now I'm even more concerned that rust-lang/rust#132598 might've introduced FPs... We will see tomorrow, once I can do the sync.

@GuillaumeGomez
Copy link
Member Author

A lot of them are because of message changes, like:

 8 |   macro_rules! test_for_each_provider {
-  |   ----------------------------------- the comment documents this macro
+  |   ----------------------------------- the comment documents this macro definition

or of span changes like:

 51 |   struct FlatMapConsumer<'f, C, F> {
-   |   -------------------------------- the comment documents this struct
+   |   ---------------------- the comment documents this struct

For changes like:

Added clippy::doc_lazy_continuation at [url-2.5.2/src/lib.rs:379](https://docs.rs/url/2.5.2/src/url/lib.rs.html#379)

```
warning: doc list item without indentation
   --> target/lintcheck/sources/url-2.5.2/src/lib.rs:379:9
    |
379 |     /// Without it, the last path component is considered to be a “file” name
    |         ^
    |
    = help: if this is supposed to be its own paragraph, add a blank line
    = note: `--force-warn clippy::doc-lazy-continuation` implied by `--force-warn clippy::all`
help: indent this line
    |
379 |     ///   Without it, the last path component is considered to be a “file” name
    |         ++
```

are because of what was discussed here. In short: since it's not part of the doc lints anymore (one of the 3 that was triggering an early return), some more lints are emitted. It's not incorrect but will generate more warnings.

@flip1995
Copy link
Member

flip1995 commented Nov 6, 2024

Just went through all of them: There's actually only 1 FP and I'm not sure if it is related to this PR:

warning: backticks are unbalanced
   --> target/lintcheck/sources/chrono-0.4.38/src/month.rs:191:9
    |
191 |       /// `Month::from_i64(n: i64)`: | `1`                  | `2`                   | ... | `12`
    |  _________^
192 | |     /// ---------------------------| -------------------- | --------------------- | ... | -----
193 | |     /// ``:                        | Some(Month::January) | Some(Month::February) | ... | Some(Month::December)
    | |_______________________________________________________________________________________________________________^
    |
    = help: a backtick may be missing a pair

In chrono-0.4.38/src/month.rs:191. Might be because of the double backticks in the 3rd line. But the span is then wrong.

@GuillaumeGomez
Copy link
Member Author

That seems like a bug into another lint that might have been uncovered by this PR.

@bors
Copy link
Contributor

bors commented Nov 21, 2024

☔ The latest upstream changes (presumably 8298da7) made this pull request unmergeable. Please resolve the merge conflicts.

@jdonszelmann
Copy link
Contributor

hey all, rust-lang/rust#135726 currently can't be merged because of this. The last status seems to be that it's not this PR that introduces a bug. But, someone probably needs to take a decision whether the small regression is worth it or something needs to be done about it right now. (cc @flip1995, you might be the person or know who to defer such decisions to). However, I guess this is my gentle reminder to see if we can resolve and merge this PR :)

I don't mind doing the rebase, though I'm not sure that I can since this branch is by @GuillaumeGomez so I think I'll be unable to push to it.

@GuillaumeGomez
Copy link
Member Author

Didn't know it was blocking your PR. :o

Rebasing it then.

@jdonszelmann
Copy link
Contributor

jdonszelmann commented Jan 24, 2025

Well, you couldn't lol. I didn't know either. When we talked about this in november it seems I anticipated that some day it'd block the PR I made last week and that it should be fixed early hahaha, just seems we never got to it afterwards

xkcd future self, 1421

@GuillaumeGomez
Copy link
Member Author

Fixed merge conflicts.

@jdonszelmann
Copy link
Contributor

May I ask when the next merge to rust-lang/rust is? no need to rush it for me, the other PR might be open for a little longer, just curious :3

@GuillaumeGomez
Copy link
Member Author

Better ask the question on zulip. ;)

Copy link
Member

@y21 y21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I feel like this lint actually made some sense as a late lint seeing how it needs to do things like access the parent node which Just Works on the HIR, but now needs to be manually implemented with a stack of item info, but I guess that's fine

self.items.pop();
}

fn check_item(&mut self, cx: &EarlyContext<'_>, item: &Item) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably also implement check_impl_item and check_trait_item -- they are no longer linted with this change looking at the lintcheck output

self.items.push(ItemInfo {
kind: item.kind.descr(),
name: item.ident.name,
span: if item.span.contains(item.ident.span) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this false? Is this for dummy spans?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I"m trying to figure out but the original impl didn't do this. @GuillaumeGomez ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for proc-macro spans.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right

@jdonszelmann
Copy link
Contributor

@y21 there's a complicated trade-off here. You probably know that, but I guess I'll just document it anyway for future readers.


On the one hand, we want to simplify the way attributes are represented, just like we want to simplify how all elements of a programming language are represented throughout a compiler. for loops become while loops, async becomes state machines, conditionals become jumps and basic blocks, etc. All complexity is reduced until the complexity matches that of what hardware can execute. As the complexity reduces, it becomes simpler to extract information from it in a way in which we can guarantee, to some level, is correct, by being able to reason about it. At the level of the AST, we can't yet reason about types and flows of lifetimes, so we first reduce the complexity. We do what we can or have to at the AST level (name resolution, expansion) until we can safely discard some details and reduce the complexity. This isn't some magical insight of mine or something, I just like this way of thinking about it.

Attributes are interesting, because their complexity is (was) never reduced. On all levels of the compiler, we reason about them as if they're at the syntactical level. While, after expansion, they aren't really used as syntax anymore. They're used as a small set of annotations to guide processes such as code generation. At this point, we don't really care about how the user wrote these annotations, we just care about the list of annotations themselves. This item has C representation, should be used, is stable since 1.85, etc. Those properties are true whether you annotated the item from the inside with #![repr(C)] or from the outside with #[repr(C)]. Similarly, we don't really care anymore whether you said #[repr(C)] #[repr(packed)] struct Foo; or #[repr(C, packed)] struct Foo; because they mean the same. So, there is complexity we can reduce here to make later passes easier. I think this is in most cases a desirable change.

From that point of view, lints that tell users that the syntax of their attributes is wrong, is supposed to happen early. In a stage of the compiler where we still care about syntaxes of things. While lints that care about the structure of attributes,
let's say a hypothetical lint that says repr(C) and repr(Rust) are incompatible, is fine to go at the hir level of abstraction


On the other hand, and that's the downside we see here of course, we sometimes need information spanning these abstraction levels. I wanted to say that this is especially true for clippy. For example, we sometimes need information about the syntax of an attribute, and the type of an item.

However, the compiler has this situation very often too. if the compiler itself needed this kind of spanning information, it'd make some kind of table. Name resolution is an example of this. At the AST level we already analyse code for names, and store that in a bunch of maps. Then, later parts of the compiler can query what name refers to what definition. Type checking is a similar operation, and so are spans to later be used in diagnostics.

Clippy here has the problem that it might need information in a table like this, to span abstraction levels, that the compiler does not. So far, where this happened it lead to awkwardness. Here it's needing parent nodes (information that is only analysed after HIR creation) together with the purely syntactical representation of attributes. Another place you see this in is around format_args!() leading to what we now call "early -> late storage hacks".

I'm not necessarily advocating for those hacks, but in some ways, this is kind of how rustc works, and how compilers work all the time. It's just that clippy can usually benefit from the storage and analyses of the compiler, and only rarely needs to do these kinds of hacks for just itself. The few times that it does we call it a hack because it's so exceptional.


It seems that representing attributes at the AST-complexity level throughout the entire compiler – while lowering the complexity of all other constructs (to HIR, and then MIR, and further) – isn't maintainable. It's unfortunate that this, at least at the moment, gives clippy slightly less information from the compiler side. Still, I'm somewhat convinced that this is the right direction anyway. Clippy lints dealing with the syntactical representation of attributes should be early, since that's where we deal with syntactical constructs. But, I'm not maintaining clippy, so that might be easy for me to say.

I guess this long comment is the reasoning I have behind these changes, and how I like looking at it from, admittedly, a little idealized point of view of what compilers should look like. Maybe that reasoning is good to have documented somewhere, so I guess for now that's here. However, neither rustc nor clippy are even close to ideal, I think we can all mostly agree on that. So, I realise that if the overhead is exceedingly large everywhere except in the compiler, it's well possible I could be wrong. I'm happy to hear either alternative opinions or maybe ways to go forward with this and keep the situation better in clippy as well.

<3

@GuillaumeGomez
Copy link
Member Author

Cherry-picked @jdonszelmann's fix and fixed merge conflict.

@jdonszelmann
Copy link
Contributor

image

0 removed now, awesome

@GuillaumeGomez
Copy link
Member Author

\o/

@jdonszelmann
Copy link
Contributor

@y21 (sorry for the ping) any further objections?

kind: &'static str,
name: Symbol,
span: Span,
mod_items: Vec<NodeId>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only ever needs access to the first item, right? Could store this as first_mod_item: Option<NodeId>

Comment on lines +461 to +462
fn check_impl_item(&mut self, cx: &EarlyContext<'_>, item: &Item<AssocItemKind>) {
self.check_item_kind(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to pop() the inserted items like in check_item_post?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, apparently there's no check_impl_item_post for early lint passes so that's not really possible... Hm, thinking about it I guess it happens to not really matter looking at how this items vec is used. Maybe. It only accesses the last one for the current item or the second-to-last (assuming the parent node is a module) to see if the NodeId is the first item in that module. A missing pop() means that it won't see the module if the previous item in the same module was a trait with an associated item, but in that case it wouldn't do anything anyway because it's not the first module item.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, you're right. I can fix that but in that case I'm going to refile this PR on rust-lang/rust cause otherwise it will take forever

@y21
Copy link
Member

y21 commented Feb 6, 2025

To be clear, I do think the HIR attribute refactor makes sense and I've also noticed before just in how many places the compiler and tools try to parse attributes, so it's definitely a positive change. I didn't mean to imply with my previous comment that we shouldn't do this, it was just an observation that it's not obviously wrong that this was implemented as a late lint pass. I also do agree with most of your points.

The thing that I don't 100% agree with is that any lint that only looks at syntax should be an early lint, because early lints are so incompatible with the rest of the compiler and utils. It's not uncommon that something that looks syntactical at a high level actually does end up needing to do something that requires a TyCtxt later down the line when someone reports a false positive for some edge cases, and then the lint needs to be rewritten. I don't have anything against the format args "hack" and I actually think that could be a nicer solution for allowing access to AST attributes in late lints than converting late lint passes to early lint passes (though I have no idea how or if that would even work if things like AttrId and the span are being removed). Like I said before though it doesn't really seem all that bad or necessary to introduce a side map like that at this point since only two closely related lints are affected.

@jdonszelmann
Copy link
Contributor

@y21 your last comment could not be applied easily here because it needs changes in rustc_lint. I filed mostly the same PR over there which also means the sync back to rust-lang/rust is already done. Let me know if you'd like it done differently but I think it makes some sense to do it that way. rust-lang/rust#136657

Copy link
Member

@flip1995 flip1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitting my partial review here, before moving over to the rust repo PR.

@@ -690,7 +611,7 @@ fn check_attrs(cx: &LateContext<'_>, valid_idents: &FxHashSet<String>, attrs: &[
Some(("fake".into(), "fake".into()))
}

if suspicious_doc_comments::check(cx, attrs) || empty_line_after::check(cx, attrs) || is_doc_hidden(attrs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +140 to +146
/// Returns an `Option<Month>` from a i64, assuming a 1-index, January = 1.
///
/// `Month::from_i64(n: i64)`: | `1` | `2` | ... | `12`
/// ---------------------------| -------------------- | --------------------- | ... | -----
/// ``: | Some(Month::January) | Some(Month::February) | ... |
/// Some(Month::December)
fn bar() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't get linted here, but lintcheck still shows this FP 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second look, the doc_markdown lint doesn't even run on this test file.

@flip1995
Copy link
Member

flip1995 commented Feb 7, 2025

Closing in favor of rust-lang/rust#136657

@flip1995 flip1995 closed this Feb 7, 2025
@GuillaumeGomez GuillaumeGomez deleted the migrate-empty_after branch February 7, 2025 16:15
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 8, 2025
…21,flip1995,WaffleLapkin

Make empty-line-after an early clippy lint

r? `@y21`

95% a refiling of rust-lang/rust-clippy#13658 but for correctness it needed 2 extra methods in `rust_lint` which made it much easier to apply on `rust-lang/rust` than `rust-lang/rust-clippy`.

Commits have been thoroughly reviewed on `rust-lang/clippy already`. The last two review comments there (about using `Option` and popping for assoc items have been applied here.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 8, 2025
…21,flip1995,WaffleLapkin

Make empty-line-after an early clippy lint

r? ``@y21``

95% a refiling of rust-lang/rust-clippy#13658 but for correctness it needed 2 extra methods in `rust_lint` which made it much easier to apply on `rust-lang/rust` than `rust-lang/rust-clippy`.

Commits have been thoroughly reviewed on `rust-lang/clippy already`. The last two review comments there (about using `Option` and popping for assoc items have been applied here.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 8, 2025
…21,flip1995,WaffleLapkin

Make empty-line-after an early clippy lint

r? ```@y21```

95% a refiling of rust-lang/rust-clippy#13658 but for correctness it needed 2 extra methods in `rust_lint` which made it much easier to apply on `rust-lang/rust` than `rust-lang/rust-clippy`.

Commits have been thoroughly reviewed on `rust-lang/clippy already`. The last two review comments there (about using `Option` and popping for assoc items have been applied here.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 8, 2025
Rollup merge of rust-lang#136657 - jdonszelmann:empty-line-after, r=y21,flip1995,WaffleLapkin

Make empty-line-after an early clippy lint

r? ```@y21```

95% a refiling of rust-lang/rust-clippy#13658 but for correctness it needed 2 extra methods in `rust_lint` which made it much easier to apply on `rust-lang/rust` than `rust-lang/rust-clippy`.

Commits have been thoroughly reviewed on `rust-lang/clippy already`. The last two review comments there (about using `Option` and popping for assoc items have been applied here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants