Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(actions): add pre-commit framework with codespell #191

Open
wants to merge 35 commits into
base: trunk
Choose a base branch
from

Conversation

jbampton
Copy link
Member

@jbampton jbampton commented Dec 1, 2023

Official -> "Git hook scripts are useful for identifying simple issues before submission to code review. We run our hooks on every commit to automatically point out issues in code such as missing semicolons, trailing whitespace, and debug statements. By pointing these issues out before code review, this allows a code reviewer to focus on the architecture of a change while not wasting time with trivial style nitpicks."

https://pre-commit.com/

Using a pre-commit framework speeds up development as a lot of tests can be run on the local machine giving instant feedback. So we don't have to wait for the CI / GitHub actions to run to get feedback. The pre-commit automatically fixes some of the issues when you do git commit and if there are any issues the tests are marked as red failed. Then you will need to commit again so that all the tests pass green.

When pre-commit runs with GitHub Actions on the GitHub website the hooks/tests either pass or fail.

There are many more pre-commit checks listed here -> https://pre-commit.com/hooks.html

Lets get this PR merged and then I will look at adding more pre-commit tests 👍

This PR adds codespell to our pre-commit hooks.

The words in codespell.txt are ignored and this file has basically been created by running:

codespell . | cut -f2 -d' ' | tr A-Z a-z | sort | uniq > codespell.txt

from the repo root.

https://github.com/codespell-project/codespell

codespell is one of the leading spell checkers on GitHub.

Going forwards we will need to fix a lot of the misspelled words that are in codespell.txt

Comment on lines +40 to +45
- name: Set PY
run: echo "PY=$(python -VV | sha256sum | cut -d' ' -f1)" >> $GITHUB_ENV
- uses: actions/cache@v3
with:
path: ~/.cache/pre-commit
key: pre-commit|${{ env.PY }}|${{ hashFiles('.pre-commit-config.yaml') }}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@dave2wave dave2wave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a huge number of misspellings

About the linter is it only checking comments?

@jbampton
Copy link
Member Author

jbampton commented Dec 2, 2023

https://github.com/codespell-project/codespell
https://pypi.org/project/codespell/

From the official repo:

"Fix common misspellings in text files. It's designed primarily for checking misspelled words in source code (backslash escapes are skipped), but it can be used with other files as well."

So it checks a lot more than just comments.
Some of the misspelled words maybe just code terms that we need to ignore.

@jbampton
Copy link
Member Author

jbampton commented Dec 2, 2023

That's a huge number of misspellings

About the linter is it only checking comments?

If needed you can also exclude files / folders from being spell checked.

Apache Airflow is using codespell with pre-commit as seen at the next link:

https://github.com/apache/airflow/blob/41f4766d5b4873ddbf8daa94d837398342aeaf98/.pre-commit-config.yaml#L274

And they have about 1,800 lines in their ignored words list seen here:

https://github.com/apache/airflow/blob/main/docs/spelling_wordlist.txt

@Pilot-Pirx
Copy link
Member

/extras would definitely need to be excluded.
It contains the translations for all languages. so I would expect a lot of "false positives".

Official -> "Git hook scripts are useful for identifying simple issues before submission to code review. We run our hooks on every commit to automatically point out issues in code such as missing semicolons, trailing whitespace, and debug statements. By pointing these issues out before code review, this allows a code reviewer to focus on the architecture of a change while not wasting time with trivial style nitpicks."

https://pre-commit.com/

Using a `pre-commit` framework speeds up development as a lot of tests can be run on the local machine giving instant feedback. So we don't have to wait for the CI / GitHub actions to run to get feedback. The pre-commit automatically fixes some of the issues when you do git commit and if there are any issues the tests are marked as red failed. Then you will need to commit again so that all the tests pass green.

When pre-commit runs with GitHub Actions on the GitHub website the hooks/tests either pass or fail.

There are many more pre-commit checks listed here -> https://pre-commit.com/hooks.html

Lets get this PR merged and then I will look at adding more pre-commit tests 👍

This PR adds `codespell` to our pre-commit hooks.

The words in `codespell.txt` are ignored and this file has basically been created by running:

`codespell . | cut -f2 -d' ' | tr A-Z a-z | sort | uniq > codespell.txt`

from the repo root.

https://github.com/codespell-project/codespell

`codespell` is one of the leading spell checkers on GitHub.

Going forwards we will need to fix a lot of the misspelled words that are in `codespell.txt`
@jbampton jbampton force-pushed the add-pre-commit-action-with-codespell branch from f9d6ac0 to 070411b Compare November 25, 2024 16:27
@dave2wave
Copy link
Member

@Pilot-Pirx Do you want to give this a try after merging on one of your ongoing change and we can see if the results are acceptable, or if some more exclusions are necessary.

@Pilot-Pirx
Copy link
Member

@dave2wave "I'm sorry, Dave. I'm afraid I can't do that." (HAL 9000)

Honestly, I have no idea what Github actions is about and where to test it. ;-)

@jbampton
Copy link
Member Author

Hey @Pilot-Pirx you can test pre-commit on your local machine:

Install pre-commit:

pip install pre-commit

Then from the repo root run:

pre-commit install

to install the hooks.

Then to test the codespell hook run:

pre-commit run --all-files

Also GitHub Actions don't run the first time they are added to a repo via a pull request. If you merge this PR the new action workflows will run on the new PRs going forwards.

@Pilot-Pirx
Copy link
Member

Hi @jbampton
Thank you for the explanation, but that still needs to be done by someone who understands it... ;-)

@jbampton
Copy link
Member Author

jbampton commented Jan 9, 2025

Currently I also have codespell installed locally and so now I am recreating codespell.txt with:

codespell --skip='./extras' . | cut -f2 -d' ' | tr A-Z a-z | sort | uniq > codespell.txt

Runs from the repo root.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants