PRAteek-singHWY (Contributor) commented Jan 3, 2026

⚠️ This PR depends on

Please review this PR after the above PR has been merged.


📝 Note for reviewers

This PR intentionally limits its scope to backend-only validation improvements in the following file:

  • application/web/web_main.py
  • application/frontend/src/pages/MyOpenCRE/MyOpenCRE.tsx

No frontend files, APIs, import logic, or data models were modified.

Any additional file diffs shown by GitHub are inherited from stacked branch history and are not part of this change.


Summary

This PR strengthens backend CSV import validation for MyOpenCRE by making validation
errors precise, structured, and actionable.

Earlier backend work (e.g. #683 / #684) introduced basic validation checks, but several
important failure modes still resulted in ambiguous, late, or unstructured errors.
This PR does not replace those validations — it extends and hardens them so that
common CSV mistakes are caught earlier and reported clearly.

The result is safer imports and significantly better feedback when a CSV is invalid,
without changing import behavior.


What is different from previous backend validation

Previous backend validation focused on whether an import could proceed.

This PR focuses on how failures are reported when an import cannot proceed.

Specifically, this PR adds:

  • Explicit validation of CRE value format
  • Structured handling of invalid CRE IDs (instead of uncaught exceptions)
  • Column-level context for row validation errors
  • Early detection of malformed CSV rows (extra columns)
  • Concrete examples included directly in error payloads

All existing validation logic remains intact — this PR makes failures easier to
understand and fix.


Background

Before this PR, CSV validation had several gaps:

  • Invalid CRE values could fail deep in parsing with unclear messages
  • Row-level errors did not consistently identify the column involved
  • CRE format expectations were implicit and undocumented
  • Invalid CRE IDs could raise exceptions instead of structured API responses
  • Users were not shown an example of a valid CRE value
  • Spreadsheet issues like extra columns were not detected early

These issues made it difficult for users to correct CSV files even when they were
close to valid.


What changed

✅ Row-level validation with column context

  • Row-level validation errors now include:
    • Row number
    • Column name (e.g. CRE 0)
    • Stable error code
    • Clear human-readable message
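
As a rough illustration of the error shape listed above, here is a minimal sketch. The class name `RowValidationError` and its `to_payload` helper are hypothetical and are not taken from `web_main.py`; only the field names and sample values mirror the response shown under "Test evidence".

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RowValidationError:
    """Hypothetical container for one row-level validation error."""

    row: int                       # 1-based CSV row number (header is row 1)
    column: str                    # column name, e.g. "CRE 0"
    code: str                      # stable, machine-readable error code
    message: str                   # human-readable explanation
    example: Optional[str] = None  # optional example of a valid value

    def to_payload(self) -> dict:
        # Omit unset optional fields so the JSON payload stays compact.
        return {k: v for k, v in asdict(self).items() if v is not None}


error = RowValidationError(
    row=2,
    column="CRE 0",
    code="INVALID_CRE_FORMAT",
    message="Invalid CRE entry format",
    example="616-305|Development processes for security",
)
payload = error.to_payload()
```

Keeping the code separate from the message lets a client branch on `code` while still showing `message` to the user.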

✅ Explicit CRE format validation

  • CRE values are validated to ensure they follow the expected entry format, illustrated by the example below

  • Errors include a concrete example:
    616-305|Development processes for security
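
A format check along these lines could look like the sketch below. It assumes, based only on the example value, that an entry is an ID and a name separated by `|`; the function name `check_cre_format` is hypothetical and the real check in `web_main.py` may differ.

```python
from typing import Optional

EXAMPLE = "616-305|Development processes for security"


def check_cre_format(value: str, row: int, column: str) -> Optional[dict]:
    """Return an error payload if value is not '<id>|<name>', else None.

    Assumption: the expected shape is inferred from the example value only.
    """
    cre_id, separator, name = value.partition("|")
    if separator and cre_id.strip() and name.strip():
        return None
    return {
        "row": row,
        "column": column,
        "code": "INVALID_CRE_FORMAT",
        "message": "Invalid CRE entry format",
        "example": EXAMPLE,
    }
```

This sketch only checks the overall shape; whether the ID part itself is well formed is a separate concern.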

✅ Graceful handling of invalid CRE IDs

  • Catches InvalidCREIDException
  • Returns structured validation errors instead of server failures
  • Prevents parsing crashes caused by malformed CRE identifiers
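
The pattern described here, catching the exception and converting it into a structured error, might be sketched as follows. The `InvalidCREIDException` class below is a local stand-in so the snippet runs on its own, and `parse_cre_id` and `validate_cre_id` are hypothetical helpers rather than the project's actual functions.

```python
import re
from typing import List


class InvalidCREIDException(Exception):
    """Local stand-in for the project's exception of the same name."""


CRE_ID_PATTERN = r"\d\d\d-\d\d\d"


def parse_cre_id(raw: str) -> str:
    # Hypothetical parser that raises on malformed IDs.
    if not re.fullmatch(CRE_ID_PATTERN, raw):
        raise InvalidCREIDException(
            f"CRE ID '{raw}' does not fit pattern '{CRE_ID_PATTERN}'"
        )
    return raw


def validate_cre_id(raw: str) -> List[dict]:
    """Return a list of structured errors instead of letting the exception escape."""
    try:
        parse_cre_id(raw)
    except InvalidCREIDException as exc:
        return [
            {
                "code": "INVALID_CRE_ID",
                "message": str(exc),
                "example": "616-305|Development processes for security",
            }
        ]
    return []
```

With this shape, the request handler can return the collected errors as a normal validation response rather than a server error.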

✅ Schema robustness

  • Detects rows with extra columns (a common CSV editor issue)
  • Returns a clear schema error pointing to the exact row
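
One way to detect such rows is to compare each row's cell count with the header's, as in this sketch. The function name `find_extra_column_error` is hypothetical, and the row numbering (header counted as row 1) is an assumption.

```python
import csv
import io
from typing import Optional


def find_extra_column_error(csv_text: str) -> Optional[dict]:
    """Return a schema error for the first row wider than the header, else None."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader, None)
    if header is None:
        return None  # empty file: nothing to compare against
    # Assumption: the header is row 1, so the first data row is row 2.
    for row_number, cells in enumerate(reader, start=2):
        if len(cells) > len(header):
            return {
                "type": "SCHEMA_ERROR",
                "message": (
                    f"Row {row_number} has more columns than header. "
                    "Please ensure the CSV matches the exported template."
                ),
            }
    return None
```

Note that in this sketch a trailing comma counts as an extra (empty) column, so a stricter or looser rule may be appropriate depending on what spreadsheet editors produce.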

✅ Frontend compatibility

  • Error payloads align with the existing frontend error renderer
  • Improved messages automatically surface in the UI
  • No frontend changes required

Scope

  • Backend only
  • No frontend changes
  • No API changes
  • No dependency changes
  • No change to import behavior

Test evidence

Invalid CRE format

Response

{
  "row": 2,
  "column": "CRE 0",
  "code": "INVALID_CRE_FORMAT",
  "message": "Invalid CRE entry format",
  "example": "616-305|Development processes for security"
}

Invalid CRE ID

Response

{
  "code": "INVALID_CRE_ID",
  "message": "CRE ID 'CRE-999' does not fit pattern '\\d\\d\\d-\\d\\d\\d'",
  "example": "616-305|Development processes for security"
}

Extra columns detected

Response

{
  "type": "SCHEMA_ERROR",
  "message": "Row 2 has more columns than header. Please ensure the CSV matches the exported template."
}

Valid import (unchanged behavior)

Response

{
  "status": "success",
  "import_type": "created",
  "new_cres": ["616-305"],
  "new_standards": 1
}

PRAteek-singHWY and others added 20 commits December 15, 2025 12:06
- Validate file type, encoding, and required headers
- Accept CSVs generated from CRE catalogue export
- Skip empty and padding rows present in exported templates
- Validate CRE format only when CRE references exist
- Guard against misaligned rows with extra columns
- Return structured validation errors before import

This keeps the importer aligned with the exporter while
preventing malformed inputs from causing server errors.
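
For illustration, the encoding, header, and padding-row checks mentioned in the commit message could be combined roughly as below. This is a sketch under assumptions: `load_data_rows` is a hypothetical name, the required header set is passed in by the caller, and `ValueError` stands in for whatever structured error the backend actually returns.

```python
import csv
import io
from typing import Dict, List, Set, Tuple


def load_data_rows(
    raw: bytes, required_headers: Set[str]
) -> List[Tuple[int, Dict[str, str]]]:
    """Decode the upload, check headers, and skip empty or padding rows."""
    try:
        text = raw.decode("utf-8-sig")  # also tolerates a UTF-8 byte order mark
    except UnicodeDecodeError as exc:
        raise ValueError("File is not valid UTF-8") from exc

    reader = csv.reader(io.StringIO(text, newline=""))
    header = [name.strip() for name in next(reader, [])]
    missing = sorted(required_headers - set(header))
    if missing:
        raise ValueError(f"Missing required headers: {', '.join(missing)}")

    rows = []
    # Assumption: the header is row 1, so data rows start at row 2.
    for row_number, cells in enumerate(reader, start=2):
        if not any(cell.strip() for cell in cells):
            continue  # blank line or a row made only of empty cells
        rows.append((row_number, dict(zip(header, cells))))
    return rows
```

Original row numbers are kept alongside each row so that later error messages can still point at the right line of the uploaded file.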
PRAteek-singHWY deleted the myopencre-csv-validation-errors branch January 4, 2026 07:49