
fix: raise Ignore() when issues are disabled to prevent chord cascade failure #3793

Open
mn-ram wants to merge 3 commits into augurlabs:main from mn-ram:bug/skip-issues-collection-when-disabled

@mn-ram mn-ram commented Mar 25, 2026

Summary

Fixes #3461

`ResourceGoneException` was introduced in commit `04ef1ef` and correctly excluded from Celery's retry policy in `_decide_retry_policy`. However, it was never given a graceful skip path at the task level, so any repo with GitHub Issues disabled would silently abort its entire core collection phase, not just issue collection.

This PR applies one fix — catch `ResourceGoneException` and raise `Ignore()` — across three tasks that all access Issues-dependent endpoints:

| Task | Endpoint | Problem before this fix |
| --- | --- | --- |
| `collect_issues` | `repos/{owner}/{repo}/issues` | Caught by bare `except Exception`, returned `-1` |
| `collect_events` | `repos/{owner}/{repo}/issues/events` | No exception handling at all |
| `collect_github_messages` | `repos/{owner}/{repo}/issues/comments` | No exception handling at all |

All three endpoints return HTTP 410 Gone when Issues are disabled on a repo — so all three need the same fix. They are grouped in one PR because the change is identical in each file and splitting them would just produce three single-line PRs for the same root cause.

Because these tasks run inside a Celery chord/chain, an unhandled exception in any one of them aborted the entire collection phase — commits, pull requests, contributors, and releases were all skipped for the affected repo. Raising `Ignore()` lets Celery treat the task as intentionally skipped so the rest of the chord continues normally.
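For reviewers who want to see the shape of the change without opening the diff, here is a minimal sketch of the catch-and-`Ignore()` pattern. It is not the code in this PR: `fetch_issues` and `collect_issues_sketch` are hypothetical names, `ResourceGoneException` is a local stand-in for Augur's class, and the fallback `Ignore` exists only so the sketch runs without Celery installed. The sketch only shows which exception leaves the task body; it does not exercise chord behaviour.

```python
import logging

try:
    from celery.exceptions import Ignore
except ImportError:
    class Ignore(Exception):
        """Stand-in used only when Celery is not installed."""

logger = logging.getLogger(__name__)


class ResourceGoneException(Exception):
    """Local stand-in for Augur's exception raised on HTTP 410 Gone."""


def fetch_issues(repo_git):
    """Hypothetical fetcher that pretends Issues are disabled on docs mirrors."""
    if repo_git.endswith("/docs-mirror"):
        raise ResourceGoneException(f"410 Gone: {repo_git}")
    return [{"number": 1}]


def collect_issues_sketch(repo_git):
    """Return the number of issues, -1 on unexpected errors, or raise Ignore."""
    try:
        issues = fetch_issues(repo_git)
    except ResourceGoneException:
        # Must come before the generic handler, otherwise it is swallowed as -1.
        logger.warning("Issues are disabled for %s; skipping", repo_git)
        raise Ignore()
    except Exception:
        logger.exception("Unexpected failure collecting issues for %s", repo_git)
        return -1
    return len(issues)
```

The ordering of the two `except` clauses is the important part: the specific handler has to appear first so the 410 case never reaches the generic one.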

Changes:

  • Catch `ResourceGoneException` in `collect_issues`, `collect_events`, and `collect_github_messages` and raise `celery.exceptions.Ignore()`, following the same pattern already used for `RepoGoneException` → `Reject()` in `detect_move/tasks.py`
  • Add unit tests for `_decide_retry_policy` to explicitly verify that `ResourceGoneException` is excluded from retries
  • Add unit tests for all three tasks confirming `Ignore()` is raised when the GitHub API returns 410
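The retry-policy check in the second bullet can be pictured with a small sketch. The real signature of `_decide_retry_policy` is not shown in this PR, so `should_retry` below is a hypothetical predicate and both exception classes are local stand-ins; it only illustrates the idea of excluding permanent failures from retries.

```python
class ResourceGoneException(Exception):
    """Local stand-in: the resource is permanently unavailable (HTTP 410)."""


class UrlNotFoundException(Exception):
    """Local stand-in: the URL does not exist (HTTP 404)."""


# Retrying these cannot succeed, so they are excluded from the policy.
NON_RETRYABLE = (ResourceGoneException, UrlNotFoundException)


def should_retry(exc):
    """Hypothetical predicate: retry everything except permanent failures."""
    return not isinstance(exc, NON_RETRYABLE)
```

A test for such a predicate would assert `False` for the two permanent failures and `True` for a transient error such as a timeout.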

Test Plan

  • `pytest tests/test_tasks/test_task_utilities/test_paginators/test_github_data_access.py -v`
  • `pytest tests/test_tasks/test_github_tasks/test_issues.py -v`
  • Manually add a repo with GitHub Issues disabled (e.g. a docs mirror) and confirm commits, PRs, and contributors still collect while the three tasks log a `WARNING` and no `ERROR` status is set on the repo

… failure

When a repository has GitHub Issues intentionally disabled, the GitHub API
returns HTTP 410 Gone, causing ResourceGoneException to be raised inside
collect_issues. Previously this was swallowed by the bare except block and
returned -1, but Celery still treated the task result in a way that could
propagate through the enclosing chord and abort all collection for that repo
(commits, PRs, contributors, releases — not just issues).

Catch ResourceGoneException before the general except clause and raise
celery.exceptions.Ignore() instead. Celery marks the task as IGNORED (a
graceful skip), so the chord continues and the rest of the repo's data is
collected normally.

Add three unit tests that cover:
- ResourceGoneException on get_resource_page_count (HEAD request phase)
- ResourceGoneException on paginate_resource (pagination phase)
- Unrelated exceptions still return -1 as before
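As an illustration of the three cases listed above, the sketch below drives a simplified task through `unittest.mock`. The method names `get_resource_page_count` and `paginate_resource` come from this commit message, but their signatures, `FakeDataAccess`, `collect_issues_sketch`, and `run_case` are assumptions made for the example, and the exception classes are local stand-ins.

```python
from unittest import mock

try:
    from celery.exceptions import Ignore
except ImportError:
    class Ignore(Exception):
        """Stand-in used only when Celery is not installed."""


class ResourceGoneException(Exception):
    """Local stand-in for Augur's exception raised on HTTP 410 Gone."""


class FakeDataAccess:
    """Hypothetical data-access object with the two calls named above."""

    def get_resource_page_count(self, url):
        return 1

    def paginate_resource(self, url):
        return iter([{"number": 1}])


def collect_issues_sketch(data_access, url):
    try:
        data_access.get_resource_page_count(url)
        return len(list(data_access.paginate_resource(url)))
    except ResourceGoneException:
        raise Ignore()
    except Exception:
        return -1


def run_case(method_name, error):
    """Make one data-access call fail and report how the task reacted."""
    access = FakeDataAccess()
    with mock.patch.object(access, method_name, side_effect=error):
        try:
            return collect_issues_sketch(access, "repos/owner/repo/issues")
        except Ignore:
            return "ignored"
```

Calling `run_case` once per failing method covers both phases with the same helper, and passing an unrelated exception covers the `-1` path.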

Fixes augurlabs#3461

Signed-off-by: mn-ram <152869502+prakash-kalwaniya@users.noreply.github.com>
mn-ram added 2 commits March 26, 2026 04:26
The issues/events and issues/comments GitHub API endpoints also return
HTTP 410 Gone when Issues are disabled on a repository.  collect_events and
collect_github_messages had no ResourceGoneException handling at all, so they
would raise unhandled exceptions in the secondary collection group that runs
after collect_issues.

Apply the same Ignore()-based fix to both tasks so that the full secondary
group (events, messages, clones) skips gracefully rather than failing when
Issues are disabled.

Also extends the unit-test suite with coverage for collect_events and
collect_github_messages under the same condition.

Refs augurlabs#3461

Signed-off-by: mn-ram <152869502+prakash-kalwaniya@users.noreply.github.com>
The previous fix wrapped the entire GithubTaskManifest context manager in a
try/except, which unnecessarily indented the full function body.  Move the
try/except inside the with-block so it only wraps the two calls that can
raise ResourceGoneException (fast_retrieve_all_pr_and_issue_messages and
process_large_issue_and_pr_message_collection).  No behaviour change.
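The restructuring described above can be sketched as follows. `task_manifest` is a hypothetical stand-in for `GithubTaskManifest`, `fetch_messages` stands in for the two calls that can raise, and the `events` list exists only to make the control flow observable; none of this is the actual Augur code.

```python
from contextlib import contextmanager

try:
    from celery.exceptions import Ignore
except ImportError:
    class Ignore(Exception):
        """Stand-in used only when Celery is not installed."""


class ResourceGoneException(Exception):
    """Local stand-in for Augur's exception raised on HTTP 410 Gone."""


@contextmanager
def task_manifest(events):
    """Hypothetical manifest that records when it is opened and closed."""
    events.append("open")
    try:
        yield
    finally:
        events.append("close")


def fetch_messages(issues_enabled):
    """Hypothetical call that fails with 410 when Issues are disabled."""
    if not issues_enabled:
        raise ResourceGoneException("410 Gone")
    return ["comment-1", "comment-2"]


def collect_messages_sketch(issues_enabled, events):
    with task_manifest(events):
        # Setup that cannot raise ResourceGoneException stays outside the try.
        events.append("setup")
        try:
            messages = fetch_messages(issues_enabled)
        except ResourceGoneException:
            raise Ignore()
        events.append("process")
        return len(messages)
```

Because the `try` sits inside the `with`, the manifest is still closed on the skip path, and the rest of the function body keeps its original indentation.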

Also add unit tests for GithubDataAccess._decide_retry_policy (introduced in
commit 04ef1ef) to verify that ResourceGoneException and UrlNotFoundException
are excluded from retries while transient errors are still retried.

Refs augurlabs#3461

Signed-off-by: mn-ram <152869502+prakash-kalwaniya@users.noreply.github.com>
@MoralCode
Collaborator

This seems like a PR that does a lot of different things.

Let's chat about this in more detail over on the CHAOSS Slack in the #wg-augur-8knot channel.


Development

Successfully merging this pull request may close these issues.

re-check handling of ResourceGoneException when issues are disabled

2 participants