Merged
39 changes: 39 additions & 0 deletions .github/workflows/codeflash.yml
@@ -0,0 +1,39 @@
name: Codeflash / Optimize new Python code

on:
  pull_request:
    paths:
      - 'unstructured_ingest/**'


concurrency:
  group: "${{ github.workflow }}-${{ github.ref }}"
  cancel-in-progress: true


jobs:
  optimize:
    name: Codeflash / Optimize new Python code
    if: ${{ github.actor != 'codeflash-ai[bot]' }}
    runs-on: ubuntu-latest
    env:
      CODEFLASH_API_KEY: ${{ secrets.CODEFLASH_API_KEY }}
      NLTK_DATA: ${{ github.workspace }}/nltk_data
    steps:
      - uses: 'actions/checkout@v4'
      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true
          python-version: 3.12

      - name: Set up Python
        run: uv python install

      - name: Codeflash
        run: |
          uv sync --all-groups --all-extras --upgrade
Contributor

This uv sync line may pull in a pretty big install. Let's keep an eye on it.

Contributor Author

We were deliberately conservative in the dependency installation stage, installing everything so that all unit tests pass when Codeflash runs. We'll monitor the GitHub Actions logs and trim the install going forward.

          uv pip install unstructured
          uv pip install codeflash
          uv run python -m nltk.downloader -d $NLTK_DATA punkt_tab averaged_perceptron_tagger_eng
Contributor

Assuming we need the nltk.downloader step as a requirement for unstructured?

Contributor Author

It's needed for the unit tests in test/unit/unstructured.
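
For context on the reply above, here is a minimal sketch of how one might check that both downloaded resources are present under NLTK_DATA before the tests run. It is not part of this PR: the helper name `missing_nltk_resources` is hypothetical, and the subdirectory layout (`tokenizers/` and `taggers/`) is an assumption about where the NLTK downloader usually unpacks these packages.

```python
import os
from pathlib import Path

# Assumed layout: resource name -> subdirectory the NLTK downloader unpacks into.
EXPECTED = {
    "punkt_tab": "tokenizers",
    "averaged_perceptron_tagger_eng": "taggers",
}


def missing_nltk_resources(data_dir=None):
    """Return the sorted names of expected resources not found under data_dir.

    Falls back to the NLTK_DATA environment variable, then to ./nltk_data.
    """
    root = Path(data_dir or os.environ.get("NLTK_DATA", "nltk_data"))
    return sorted(
        name for name, sub in EXPECTED.items() if not (root / sub / name).is_dir()
    )


if __name__ == "__main__":
    missing = missing_nltk_resources()
    if missing:
        print("Missing NLTK resources:", ", ".join(missing))
    else:
        print("All expected NLTK resources are present.")
```

A check like this could run right after the downloader step so that a failed download surfaces before the test run rather than inside it.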

          uv run codeflash
8 changes: 8 additions & 0 deletions pyproject.toml
@@ -202,3 +202,11 @@ fail_under = 0

[tool.hatch.build.targets.sdist]
packages = ["/unstructured_ingest"]

[tool.codeflash]
# All paths are relative to this pyproject.toml's directory.
module-root = "unstructured_ingest"
tests-root = "test"
test-framework = "pytest"
ignore-paths = []
formatter-cmds = ["ruff check --exit-zero --fix $file", "ruff format $file"]