
fix(csv): function parse option fieldsPerRecord when negative records may have less fields #7153

Open
PengjuXu wants to merge 1 commit into
denoland:mainfrom
PengjuXu:fix_csv_parse

Conversation

PengjuXu commented May 26, 2026

problem

When parse() is asked to return objects rather than arrays (that is, when columns or skipFirstRow is set), it does not honor fieldsPerRecord, whose documentation states: "If negative, no check is made and records may have a variable number of fields."
Instead, it throws an error such as "The record has 5 fields, but the header has 6 fields".

fix

Allow a row to have fewer fields than the header. (See the test change in this small commit.)
(A row with more fields than the header is still rejected, because it is not clear what to do in that case.)

I believe #7141 is the better solution (it drops the extra fields when a row has more fields than the header).


CLAassistant commented May 26, 2026

CLA assistant check
All committers have signed the CLA.

github-actions Bot added the csv label May 26, 2026

codecov Bot commented May 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.61%. Comparing base (95a1e2e) to head (d957140).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7153   +/-   ##
=======================================
  Coverage   94.61%   94.61%           
=======================================
  Files         634      634           
  Lines       51843    51855   +12     
  Branches     9346     9350    +4     
=======================================
+ Hits        49050    49062   +12     
  Misses       2218     2218           
  Partials      575      575           


bartlomieju (Member) left a comment

Thanks for catching this — the bug is real: with columns/skipFirstRow + fieldsPerRecord: -1, parse() throws despite the docs saying "no check is made and records may have a variable number of fields."

A few things to address before this can land:

1. CsvParseStream is not fixed. It calls the same convertRowToObject (csv/parse_stream.ts:460) and is not updated by this PR, so the bug persists for the streaming API. The fix + a parallel test should be added to parse_stream.ts / parse_stream_test.ts, otherwise the two APIs diverge.

2. Asymmetry contradicts the docs. fieldsPerRecord < 0 should allow any row length — fewer and more. This PR only permits fewer; more still throws (with a new message). You acknowledge this and point at #7141 as a better solution — agreed. It probably makes sense to resolve the design question on #7141 first rather than ship a half-fix here that needs immediate follow-up.

3. JSDoc update. With this change, a short row produces an object where missing columns are simply absent keys (not undefined, not ""). That's a reasonable choice but it's new public-visible behavior — please document it on ParseOptions.fieldsPerRecord in both parse.ts and parse_stream.ts.

Inline comments below for the smaller issues.
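
To make point 3 concrete, here is a minimal standalone sketch of what "absent keys" means for a short row. It is not the code in csv/_io.ts: `rowToObject` is a hypothetical name, and the body only mirrors the behavior described above.

```typescript
// Hypothetical stand-in for the row-to-object conversion (assumption, not the real helper).
function rowToObject(
  row: readonly string[],
  headers: readonly string[],
): Record<string, string> {
  const out: Record<string, string> = {};
  // Only pair up as many headers as the row has fields.
  const limit = Math.min(row.length, headers.length);
  for (let i = 0; i < limit; i++) {
    // i < limit guarantees both indexes are in range.
    out[headers[i]!] = row[i]!;
  }
  return out;
}

const short = rowToObject(["1", "2"], ["a", "b", "c"]);
// The key "c" is absent, which is different from being present with undefined or "".
console.log("c" in short); // false
console.log(Object.keys(short)); // ["a", "b"]
```

Callers that distinguish `"c" in record` from `record.c === undefined` would observe this difference, which is why it seems worth stating in the JSDoc.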

Comment thread csv/_io.ts
headers: readonly string[],
zeroBasedLine: number,
/* allow less fields in a row than headers */
allowLessRowFields = false,

Two small nits:

  • Use // for one-line comments — that's the convention in this file (see e.g. parse.ts:229).
  • allowLessRowFields → allowFewerRowFields ("fewer" for countables).

Also, threading a boolean flag through an internal helper for one call-site is a bit awkward. Consider passing fieldsPerRecord (or { variable: boolean }) so the policy lives inside the helper rather than at every caller — this would also make it easier to fix CsvParseStream symmetrically.
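
One possible shape for that refactor, as a standalone sketch. The name `convertRowToObjectSketch`, the parameter order, and the choice to silently drop fields beyond the header when fieldsPerRecord is negative are all assumptions (the last one depends on how #7141 is resolved); the strict-mode message reuses the existing wording quoted in the PR description.

```typescript
// Sketch only: the real helper in csv/_io.ts may differ in name, signature, and policy.
function convertRowToObjectSketch(
  row: readonly string[],
  headers: readonly string[],
  zeroBasedLine: number,
  fieldsPerRecord = 0,
): Record<string, string> {
  // The policy lives inside the helper: a negative value disables the length check.
  const variableLength = fieldsPerRecord < 0;
  if (!variableLength && row.length !== headers.length) {
    throw new SyntaxError(
      `Syntax error on line ${zeroBasedLine + 1}: The record has ${row.length} fields, but the header has ${headers.length} fields`,
    );
  }
  const out: Record<string, string> = {};
  // Assumption: extra fields beyond the header are dropped in variable-length mode.
  const limit = Math.min(row.length, headers.length);
  for (let i = 0; i < limit; i++) {
    out[headers[i]!] = row[i]!;
  }
  return out;
}

const headers = ["a", "b", "c"];
// Negative fieldsPerRecord: the short row is accepted and "c" is simply absent.
const relaxed = convertRowToObjectSketch(["1", "2"], headers, 1, -1);
console.log(relaxed); // { a: "1", b: "2" }
```

With this shape, both parse.ts and parse_stream.ts could forward `options.fieldsPerRecord` unchanged, so neither caller needs its own `< 0` comparison.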

Comment thread csv/_io.ts
`Syntax error on line ${
zeroBasedLine + 1
}: The record has ${row.length} fields, but the header allows maximum ${headers.length} fields`,
);

Two issues here:

  1. Message wording: the header doesn't "allow" anything. Please mirror the existing expected N fields but got M style used elsewhere (see parse_stream.ts:453): e.g. expected at most ${headers.length} fields but got ${row.length}.
  2. This throws plain Error, but the rest of the parser throws SyntaxError (see parse_stream.ts:450 and the doc example at parse.ts:486). The pre-existing throw above is also Error — worth fixing both while you're here for consistency.
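
A short sketch of both suggestions combined, assuming the PR keeps rejecting rows that are longer than the header. `assertAtMostHeaderFields` is a hypothetical helper name, and the "Syntax error on line N:" prefix is carried over from the PR's current message rather than taken from parse_stream.ts.

```typescript
// Hypothetical helper: rejects rows longer than the header with a SyntaxError.
function assertAtMostHeaderFields(
  row: readonly string[],
  headers: readonly string[],
  zeroBasedLine: number,
): void {
  if (row.length > headers.length) {
    // SyntaxError instead of plain Error, with "expected ... but got ..." wording.
    throw new SyntaxError(
      `Syntax error on line ${zeroBasedLine + 1}: expected at most ${headers.length} fields but got ${row.length}`,
    );
  }
}

try {
  assertAtMostHeaderFields(["1", "2", "3", "4"], ["a", "b", "c"], 1);
} catch (error) {
  console.log(error instanceof SyntaxError); // true
  console.log((error as Error).message);
  // Syntax error on line 2: expected at most 3 fields but got 4
}
```

Because SyntaxError extends Error, existing callers that catch Error would keep working, while tests that assert on the error class would need updating.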

Comment thread csv/_io.ts
}
const out: Record<string, unknown> = {};
for (const [index, header] of headers.entries()) {
if (index === row.length) break;

Minor: a mid-loop break reads a bit awkwardly. Could be expressed more directly as:

const limit = Math.min(row.length, headers.length);
for (let i = 0; i < limit; i++) {
  out[headers[i]] = row[i];
}

or headers.slice(0, row.length).forEach(...). Not blocking.

Comment thread csv/parse.ts
row,
headers,
zeroBasedFirstLineIndex + i,
(options?.fieldsPerRecord ?? 0) < 0,

If you adopt the suggestion to push the fieldsPerRecord < 0 check inside convertRowToObject, this call site becomes a plain convertRowToObject(row, headers, zeroBasedFirstLineIndex + i, options.fieldsPerRecord) and the same change can be applied verbatim in parse_stream.ts:460.

Comment thread csv/parse_test.ts
}),
Error,
"Syntax error on line 2: The record has 4 fields, but the header allows maximum 3 fields",
);

Test name "mismatching number of headers and fields 3" is opaque — please name it after the case being asserted, e.g. "throws when row has more fields than columns with fieldsPerRecord: -1". Same suggestion for the new step at line ~313 — "Allow less row fields than columns" could be "allows rows with fewer fields than columns when fieldsPerRecord is negative".

@PengjuXu
Author

@bartlomieju Thanks for the detailed comments and for pointing out my blind spots.
I see that you also commented on #7141, so I can wait for that one to be resolved. I believe #7141 is the right fix, but if it falls through, I will make all the changes needed in this PR.
