feat: upload validation report — row-level errors, coercions, and diff vs. last upload #110

@William-Hill

Context

From the AASCU Intermediary feedback session (see docs/aascu_intermediary_feedback_summary.md, pain points A and E):

"Data is getting incorrectly processed when submitted to the PDP where it is causing data from the institutions to have inaccurate values."

"[At one institution] one of the main person retired, and that was the only person who submitted it. No one else knew how to do it. And apparently, this person did it in some way that was unique to how they wanted to do it."

PDP's submission feedback is opaque enough that errors slip through, and submission knowledge tends to live with one person. A clear, human-readable validation report after every upload addresses both problems: it catches errors immediately and serves as the de facto runbook for whoever inherits the role.

Goal

After every upload, the user sees a clear validation report showing what was accepted, what was rejected, what was transformed, and how this upload compares to the previous one.

Scope

  • Builds on existing self-service upload (feat: self-service data upload for PDP, AR files, student and course data #86)
  • Post-upload report includes:
    • Row counts: accepted / rejected / coerced
    • Row-level errors with line numbers and human-readable messages ("Row 412: 'enrollment_intensity' value 'pt' is not in allowed set [FT, PT, HT]")
    • Field-level coercions applied ("converted 'Yes'/'No' → boolean for column is_pell_eligible")
    • Deduplication decisions ("3 duplicate student_ids — kept latest by submission_date")
    • Diff vs. previous upload: students added, students dropped, columns added/removed
    • Anomaly flags: campuses dropped, cohort sizes that changed by more than X% (threshold TBD)
  • Downloadable as PDF / CSV for record-keeping
  • Optionally email the report to the uploader
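
As a rough illustration of the first four report items above (row counts, row-level errors, coercions, deduplication), here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the names `ValidationReport` and `validate_rows`, the rule tables, and the choices that rows arrive as dicts, that `submission_date` is an ISO-8601 string, that every row carries `student_id` and `submission_date`, and that "accepted" means rows kept after deduplication (with coerced rows counted as a subset of accepted). The real rules would come from the PDP file specification and the existing upload code in #86.

```python
from dataclasses import dataclass, field

# Hypothetical rules; real allowed sets and column names would come from
# the PDP file specification, not from this sketch.
ALLOWED = {"enrollment_intensity": ("FT", "PT", "HT")}
BOOLEAN_COLUMNS = ("is_pell_eligible",)
BOOL_MAP = {"yes": True, "no": False}


@dataclass
class ValidationReport:
    accepted: int = 0
    rejected: int = 0
    coerced: int = 0
    errors: list = field(default_factory=list)
    coercions: list = field(default_factory=list)
    dedup_notes: list = field(default_factory=list)


def validate_rows(rows, first_row_number=1):
    """Validate, coerce, and deduplicate rows; return (clean_rows, report)."""
    report = ValidationReport()
    valid = []  # (row, was_coerced) pairs that passed validation
    coerced_columns = set()

    for number, original in enumerate(rows, start=first_row_number):
        row = dict(original)  # copy so the caller's data is not mutated
        row_errors = []
        row_coercions = set()

        # Allowed-set checks produce line-numbered, human-readable errors.
        for column, allowed in ALLOWED.items():
            value = row.get(column)
            if value not in allowed:
                row_errors.append(
                    f"Row {number}: '{column}' value '{value}' is not in "
                    f"allowed set [{', '.join(allowed)}]"
                )

        # 'Yes'/'No' strings are coerced to booleans; other values are errors.
        for column in BOOLEAN_COLUMNS:
            value = row.get(column)
            key = str(value).strip().lower()
            if key in BOOL_MAP:
                row[column] = BOOL_MAP[key]
                row_coercions.add(column)
            elif not isinstance(value, bool):
                row_errors.append(
                    f"Row {number}: '{column}' value '{value}' is not Yes/No"
                )

        if row_errors:
            report.errors.extend(row_errors)
            report.rejected += 1
            continue
        coerced_columns |= row_coercions
        valid.append((row, bool(row_coercions)))

    # Deduplicate by student_id, keeping the latest submission_date.
    latest = {}
    duplicate_ids = set()
    for row, was_coerced in valid:
        sid = row["student_id"]
        if sid in latest:
            duplicate_ids.add(sid)
            if row["submission_date"] <= latest[sid][0]["submission_date"]:
                continue  # existing row is at least as recent; keep it
        latest[sid] = (row, was_coerced)

    report.accepted = len(latest)
    report.coerced = sum(1 for _, was_coerced in latest.values() if was_coerced)
    report.coercions = [
        f"converted 'Yes'/'No' → boolean for column {column}"
        for column in sorted(coerced_columns)
    ]
    if duplicate_ids:
        report.dedup_notes.append(
            f"{len(duplicate_ids)} duplicate student_ids; "
            "kept latest by submission_date"
        )
    return [row for row, _ in latest.values()], report
```

`first_row_number` lets the caller line error messages up with the file the user actually sees, for example passing 2 when the CSV has a header row. A clean upload still returns a `ValidationReport` with empty lists, which fits the audit-trail requirement.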

Acceptance criteria

  • Report shown immediately after upload completes
  • All row-level errors are line-numbered and human-readable
  • Diff vs. last upload is computed and displayed
  • Report can be downloaded
  • Empty reports (clean upload) are still generated for the audit trail

Related
