
Batch upsert SEO page audits, progress reporting, and idempotency for onboarding full-site analysis #428

Open
AJaySi wants to merge 1 commit into main from codex/refactor-onboarding_full_website_analysis_executor


AJaySi commented Mar 12, 2026

Motivation

  • Avoid frequent DB commits from concurrent URL workers by decoupling network audits from persistence, which reduces contention during full-site onboarding audits.
  • Improve observability and robustness by tracking per-run progress, preserving per-URL failure details, and producing a structured execution summary.
  • Ensure idempotent re-runs: running the audit again updates existing SEOPageAudit rows instead of creating duplicate records.

Description

  • Introduced a configurable persist_batch_size (default 50) and refactored _audit_urls to collect per-page audit records in memory and flush them in batches through a new _bulk_upsert_page_audits function.
  • Changed _audit_single_url so that it no longer writes to the DB and instead returns a structured result containing audit_record and failure_reason, so concurrent tasks no longer write through a shared session.
  • Added _build_audit_record to create per-page payloads and _update_progress to periodically persist progress into task.payload and task_log.result_data. The final result now also carries aggregated failure analytics (top_fail_reasons) and idempotency metadata.
  • A failed URL no longer stops the run: its failure record is kept, the remaining pages are still audited, and failures are persisted alongside successes during batch upserts. Task results now include failure_details and an execution_summary (with success_rate and duration_ms).
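
The batching and idempotency described above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: it uses the standard-library sqlite3 module as a stand-in for the project's real persistence layer, and the table name seo_page_audits, its columns, the page_url conflict key, and the helper names (make_connection, bulk_upsert_page_audits, persist_in_batches, summarize) are all assumptions made for illustration. Only persist_batch_size = 50, top_fail_reasons, and success_rate come from the description.

```python
import sqlite3
from collections import Counter

PERSIST_BATCH_SIZE = 50  # mirrors the PR's default persist_batch_size


def make_connection():
    """Hypothetical schema: one row per audited page, keyed by URL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seo_page_audits ("
        "page_url TEXT PRIMARY KEY, status TEXT NOT NULL, failure_reason TEXT)"
    )
    return conn


def bulk_upsert_page_audits(conn, records):
    """Write one batch in a single transaction.

    ON CONFLICT ... DO UPDATE makes re-runs update the existing row
    for a URL instead of inserting a duplicate.
    """
    with conn:  # one commit per batch; rolls back if the batch fails
        conn.executemany(
            """
            INSERT INTO seo_page_audits (page_url, status, failure_reason)
            VALUES (:page_url, :status, :failure_reason)
            ON CONFLICT(page_url) DO UPDATE SET
                status = excluded.status,
                failure_reason = excluded.failure_reason
            """,
            records,
        )


def persist_in_batches(conn, records, batch_size=PERSIST_BATCH_SIZE):
    """Flush in-memory audit records batch by batch; return the flush count."""
    flushes = 0
    for start in range(0, len(records), batch_size):
        bulk_upsert_page_audits(conn, records[start:start + batch_size])
        flushes += 1
    return flushes


def summarize(records):
    """Aggregate failures into a small execution summary."""
    total = len(records)
    failures = [r for r in records if r["status"] == "failed"]
    reasons = Counter(r["failure_reason"] for r in failures)
    return {
        "total": total,
        "failed": len(failures),
        "success_rate": (total - len(failures)) / total if total else 0.0,
        "top_fail_reasons": reasons.most_common(3),
    }
```

With 120 records and a batch size of 50, persist_in_batches commits three times rather than once per URL, and calling it a second time with the same records leaves the row count unchanged. Failed pages go through the same upsert path as successful ones, so their failure_reason is stored rather than dropped.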

Testing

  • Compiled the modified module with python -m compileall backend/services/scheduler/executors/onboarding_full_website_analysis_executor.py and it succeeded.

