Feature Request: Option to not check included files in clang-tidy #52959

Closed
cinderblock opened this issue Jan 2, 2022 · 8 comments · Fixed by #128150
Labels
clang-tidy enhancement Improving things as opposed to bug fixing, e.g. new or missing feature

Comments

@cinderblock

cinderblock commented Jan 2, 2022

When using clang-tidy to lint files, it needs to #include the dependencies. More often than not, however, dependencies are much larger and out of control of the consumer that is running clang-tidy.

clang-tidy, by default, currently does the correct thing and does not report errors/warnings from the #included files. However, from watching it run, it's clearly still actually checking those files for errors (52k+ warnings suppressed!). This dramatically increases the time it takes for clang-tidy to run from nearly instant to ~3 seconds for a large header file like <napi.h> (and this is a rather fast desktop). This really gets in the way of tools that integrate linters into code editors as every change requires that 3 second delay.

I see options about filtering headers or even lines but, at best, it looks like I'd need to list out every single range of my included header files which is actually a rather large number of files which could also change outside of my control. I have tried a variety of these options and not noticed a significant change in behavior in ways I care about.

Is there a way to do what I'd like? Or is there some fundamental reason why this wouldn't work? Frankly, it feels like this should even be the default.

@cinderblock cinderblock changed the title [clang-tidy] Option to not check included files Feature Request: Option to not check included files in clang-tidy Jan 3, 2022
@njames93
Member

clang-tidy needs to build the entire AST for a translation unit so it has to parse all the includes.
However, there is some work on a way to disable some checking in header files, but it is not quite ready.

For editor integrations, have you tried clangd? It builds a preamble, which saves reparsing all your header files, and when clang-tidy checks are run on the AST, they ignore all declarations that aren't in the main file.

@cinderblock
Author

clang-tidy needs to build the entire AST for a translation unit so it has to parse all the includes. However, there is some work on a way to disable some checking in header files, but it is not quite ready.

Yes, it's clear that the preprocessor directives need to all be evaluated, which includes #includes. However clang can traverse all of those way faster than clang-tidy does, so it's not an issue of loading the files/parsing the preprocessor. It's clearly the 50k+ warnings that are "suppressed" that are causing the delays.

For editor integrations, have you tried clangd? It builds a preamble, which saves reparsing all your header files, and when clang-tidy checks are run on the AST, they ignore all declarations that aren't in the main file.

I have not. I'm actually building with gcc and just using clang-tidy (and clang-format) for "linting" my C/C++. Specifically, I'm using the notskm.clang-tidy VS Code extension. Looks like llvm-vs-code-extensions.vscode-clangd provides this integration and supports clang-tidy. I wonder if this will work for me as I'm using some slightly custom clang-tidy binary wrappers to handle automatically finding some includes.

@njames93
Member

I wonder if this will work for me as I'm using some slightly custom clang-tidy binary wrappers to handle automatically finding some includes.

It should be configurable for your needs. You can set it up with additional include folders if there are any issues with them not being found.

@fbridault

I was checking whether this issue had been reported and I am glad to find it here. Of course, I guess this is not trivial to implement.

I just ran some tests: for a simple C++ file that only includes <memory>, clang-tidy takes 5 seconds to complete on my laptop, even with system headers excluded. If you add boost/thread/thread.hpp, it takes up to 15 seconds. A regular clang build of the same file takes less than a second of course, so as said above, the cost does not seem to be the preprocessing but rather the detection of warnings.

In the general case, I doubt most C++ developers care about reports of defects in the STL or any third-party library, so in my opinion that time is largely wasted. clang-tidy is so widely used today that I believe that if you address this, you would reduce a significant part of the carbon footprint of humanity. 🤣

That being said, it's always simpler to request something than to do it, so thank you for all your hard work. I just wanted to motivate the need a bit, if that was even necessary... If there is anything else I can do to help, please let me know.

@tonygould

Not parsing headers that aren't part of your project would be great. clang-tidy is a fantastic tool, but it is very problematic to run it in a script (e.g. on CI) and for it to take ~6x longer. I see there was an attempt to add an option to skip parsing header files, https://reviews.llvm.org/D98710, in 2021. Anyone know why that got abandoned -- looks like some non-trivial work went into it? And whether that work could reasonably be used as a basis for implementing the feature?

@mikael-s-persson

Just adding a +1 to the need for this feature. Here is the output from a clang-tidy run I'm staring at right now (very typical output):

132165 warnings generated.
Suppressed 132439 warnings (132155 in non-user code, 10 due to line filter, 274 NOLINT).

It would be nice for those 132155 checks to have been skipped entirely. On the bright side, it's nice that clang-tidy is at least honest about how much work it's doing for no reason. ;)
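
To put a number on a summary line like the one quoted above, here is a small C++ sketch that parses the `Suppressed ...` line and computes the share of suppressed warnings that came from non-user code. The struct and function names are made up for illustration, and the regular expression assumes the exact summary wording shown in the quoted output; other clang-tidy versions may format it differently.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical holder for the two counts we care about.
struct SuppressionSummary {
  long suppressed = 0;   // total number of suppressed warnings
  long nonUserCode = 0;  // of those, how many were in non-user code

  // Fraction of suppressed warnings that came from non-user code.
  double nonUserShare() const {
    return suppressed == 0
               ? 0.0
               : static_cast<double>(nonUserCode) / static_cast<double>(suppressed);
  }
};

// Parses a line such as
//   "Suppressed 132439 warnings (132155 in non-user code, ...)."
// Returns false if the line does not match that (assumed) format.
bool parseSuppressionSummary(const std::string &line, SuppressionSummary &out) {
  static const std::regex pattern(
      R"re(Suppressed (\d+) warnings \((\d+) in non-user code)re");
  std::smatch match;
  if (!std::regex_search(line, match, pattern))
    return false;
  out.suppressed = std::stol(match[1].str());
  out.nonUserCode = std::stol(match[2].str());
  return true;
}
```

For the line quoted above, this gives 132155 out of 132439 suppressed warnings in non-user code, which is roughly 99.8%.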

@thomas-seiler-bl

It would also be highly appreciated by us if this could be fixed. It feels very inefficient to take so much time for "skipping" 99.99% of all checks because they are not in the current scope.

@EugeneZelenko EugeneZelenko added the enhancement Improving things as opposed to bug fixing, e.g. new or missing feature label Feb 22, 2024
@carlosgalvezp
Contributor

I see that some checks use unless(isExpansionInSystemHeader()). Does this fix the problem? Or is there still significant runtime involved in traversing the full AST instead of a partial AST?

If this fixes the problem, can't we apply this to all/most existing checks?

carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Feb 21, 2025
Currently, clang-tidy processes the entire TranslationUnit, including
declarations in system headers. However, the work done in system
headers is discarded at the very end when presenting results, unless
the SystemHeaders option is active.

This is a lot of wasted work, and makes clang-tidy very slow.
In comparison, clangd only processes declarations in the main file,
and it's claimed to be 10x faster than clang-tidy:

https://github.com/lljbash/clangd-tidy

To solve this problem, we can apply a similar solution done in clangd
into clang-tidy. We do this by changing the traversal scope from the
default TranslationUnitDecl, to only contain the top-level declarations
that are _not_ part of system headers. We do this by prepending a new
ASTConsumer to the list of consumers: this new consumer sets the
traversal scope in the ASTContext, which is later used by the
MatchASTConsumer.

Note: this behavior is not active if the user requests warnings from
system headers via the SystemHeaders option.

Note2: out of all the unit tests, only one of them fails:

readability/identifier-naming-anon-record-fields.cpp

This is because the limited traversal scope no longer includes the
"IndirectFieldDecl" that appears in the AST when having a global
scope anonymous union.

I have not found a way to make this one work. However, it does seem
like a very niche use case, and the benefits of a 10x faster clang-tidy
largely outweigh the false negative now introduced by this patch. This
use case is therefore removed from the unit test to make it pass.

Note3: I have purposely decided to make this new feature enabled by
default, instead of adding a new "opt-in/opt-out" flag. Having a new
flag would mean duplicating all our tests to ensure they work in both
modes, which would be infeasible. Having it enabled by default allows
people to get the benefits immediately. Given that all unit tests pass,
the risk for regressions is low. Even if that's the case, the only
issue would be false negatives (fewer things are detected), which
are much more tolerable than false positives.

Credits: original implementation by @njames93, here:
https://reviews.llvm.org/D150126

This implementation is simpler in the sense that it does not consider
HeaderFilterRegex to filter even further. A follow-up patch could
include the functionality if wanted.

Fixes llvm#52959
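
The idea in the commit message above (restrict the traversal scope to top-level declarations that are not in system headers, unless the user asked for system-header diagnostics) can be illustrated with a toy model. This is only a sketch in plain C++: `ToyDecl` and `buildTraversalScope` are hypothetical names, and nothing here uses Clang's actual `ASTContext` or `ASTConsumer` types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a top-level declaration; not a Clang type.
struct ToyDecl {
  std::string Name;
  bool InSystemHeader;
};

// Builds the list of top-level declarations that checks would traverse.
// When system-header diagnostics are requested, nothing is filtered out,
// mirroring the "not active with SystemHeaders" note in the commit message.
std::vector<const ToyDecl *>
buildTraversalScope(const std::vector<ToyDecl> &TopLevel,
                    bool SystemHeadersRequested) {
  std::vector<const ToyDecl *> Scope;
  for (const ToyDecl &D : TopLevel) {
    if (SystemHeadersRequested || !D.InSystemHeader)
      Scope.push_back(&D);
  }
  return Scope;
}
```

In this model, a translation unit with thousands of system-header declarations and a handful of user declarations yields a scope containing only the user declarations, so checks never visit the rest instead of visiting them and discarding the results afterwards.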
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Feb 21, 2025
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Feb 28, 2025
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Mar 6, 2025
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Mar 10, 2025
Currently, clang-tidy processes the entire TranslationUnit, including
declarations in system headers. However, the work done in system
headers is discarded at the very end when presenting results, unless
the SystemHeaders option is active.

This is a lot of wasted work, and makes clang-tidy very slow.
In comparison, clangd only processes declarations in the main file,
and it's claimed to be 10x faster than clang-tidy:

https://github.com/lljbash/clangd-tidy

To solve this problem, we can apply a similar solution done in clangd
into clang-tidy. We do this by changing the traversal scope from the
default TranslationUnitDecl, to only contain the top-level declarations
that are _not_ part of system headers. We do this in the
MatchASTConsumer class, so the logic can be reused by other tools.
This behavior is currently off by default, and only clang-tidy
enables skipping system headers. If wanted, this behavior can be
activated by other tools in follow-up patches.

I had to move MatchFinderOptions out of the MatchFinder class,
because otherwise I could not set a default value for the
"bool SkipSystemHeaders" member. The compiler error message
was "default member initializer required before the end of its
enclosing class".

Note: this behavior is not active if the user requests warnings from
system headers via the SystemHeaders option.

Note2: out of all the unit tests, only one of them fails:

readability/identifier-naming-anon-record-fields.cpp

This is because the limited traversal scope no longer includes the
"CXXRecordDecl" of the global anonymous union, see:
llvm#130618

I have not found a way to make this work. For now, document the
technical debt introduced.

Note3: I have purposely decided to make this new feature enabled by
default, instead of adding a new "opt-in/opt-out" flag. Having a new
flag would mean duplicating all our tests to ensure they work in both
modes, which would be infeasible. Having it enabled by default allows
people to get the benefits immediately. Given that all unit tests pass,
the risk for regressions is low. Even if that's the case, the only
issue would be false negatives (fewer things are detected), which
are much more tolerable than false positives.

Credits: original implementation by @njames93, here:
https://reviews.llvm.org/D150126

This implementation is simpler in the sense that it does not consider
HeaderFilterRegex to filter even further. A follow-up patch could
include the functionality if wanted.

Fixes llvm#52959
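
The compiler error mentioned in the commit message above is a general C++ restriction, not something specific to Clang's code base. Here is a minimal sketch of the pattern, with hypothetical names (`Finder`, `FinderOptions`) standing in for the real classes. The nested form is kept as a comment because GCC and Clang typically reject it; the code below it shows the shape after moving the options struct out of the class.

```cpp
#include <cassert>

// Rejected form (left as a comment on purpose):
//
//   class Finder {
//   public:
//     struct Options { bool SkipSystemHeaders = false; };
//     explicit Finder(Options O = Options());  // error reported here
//   };
//
// The default argument needs Options' default member initializer while
// the enclosing class Finder is still incomplete.

// Accepted form: the options struct is complete before Finder uses it.
struct FinderOptions {
  bool SkipSystemHeaders = false;
};

class Finder {
public:
  explicit Finder(FinderOptions O = FinderOptions()) : Opts(O) {}
  bool skipsSystemHeaders() const { return Opts.SkipSystemHeaders; }

private:
  FinderOptions Opts;
};
```

With this layout the default stays off, and a caller that wants the behavior opts in by setting `SkipSystemHeaders = true` on a `FinderOptions` value before constructing the `Finder`, which matches the "off by default, enabled only by clang-tidy" description in the commit message.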
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Mar 11, 2025
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Mar 12, 2025
carlosgalvezp pushed a commit to carlosgalvezp/llvm-project that referenced this issue Mar 14, 2025
frederik-h pushed a commit to frederik-h/llvm-project that referenced this issue Mar 18, 2025
…8150)

[clang-tidy] Avoid processing declarations in system headers


Co-authored-by: Carlos Gálvez <[email protected]>