CASSANDRA-19985: Enhance CQLSH to support machine-readable output formatting #4750
arvindKandpal-ksolves wants to merge 1 commit into apache:trunk from
| formatted_values = [list(map(self.myformat_value, [row[c] for c in column_names], cql_types)) for row in result.current_rows] | ||
|
|
||
| if self.expand_enabled: | ||
| if self.mode == 'csv': |
Output formatting should probably move to displaying.py instead of cqlshmain. Rather than if-then-else, it would probably make sense to create a class like TablePrinter with subclasses for TabularTablePrinter, JsonTablePrinter, CsvTablePrinter.
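One way the suggested layout could look is sketched below. The method names (`print_header`, `print_row`, `finish`), the `out` parameter, and the `render` helper are assumptions made for illustration, not the patch's actual API, and cqlsh's own value formatting is left out.

```python
import csv
import io
import json
from abc import ABC, abstractmethod


class TablePrinter(ABC):
    """Hypothetical base class: one subclass per output format."""

    def __init__(self, out):
        self.out = out  # any object with a write() method

    @abstractmethod
    def print_header(self, column_names): ...

    @abstractmethod
    def print_row(self, values): ...

    def finish(self):
        """Called once after the last row; a no-op by default."""


class TabularTablePrinter(TablePrinter):
    def print_header(self, column_names):
        self.out.write(" | ".join(column_names) + "\n")

    def print_row(self, values):
        self.out.write(" | ".join(str(v) for v in values) + "\n")


class CsvTablePrinter(TablePrinter):
    def __init__(self, out):
        super().__init__(out)
        # The csv module handles quoting and escaping of each field.
        self._writer = csv.writer(out, lineterminator="\n")

    def print_header(self, column_names):
        self._writer.writerow(column_names)

    def print_row(self, values):
        self._writer.writerow(values)


class JsonTablePrinter(TablePrinter):
    def print_header(self, column_names):
        self._names = list(column_names)
        self._rows = []

    def print_row(self, values):
        self._rows.append(dict(zip(self._names, values)))

    def finish(self):
        # Emit a single JSON array once all rows have been collected.
        json.dump(self._rows, self.out)
        self.out.write("\n")


def render(printer, column_names, rows):
    """Drive any printer through the same header/rows/finish sequence."""
    printer.print_header(column_names)
    for row in rows:
        printer.print_row(row)
    printer.finish()
```

With this shape the caller never branches on the mode; it only talks to the `TablePrinter` interface.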
| self.assertEqual(0, result) | ||
| self.assertEqual(output.splitlines()[3].strip(), "{data: 'I''m newb'}") | ||
|
|
||
| def test_csv_output(self): |
Unit tests should include all data types and include corner cases.
- How is the number 10,000 formatted in CSV? The name "Smith, Joe" in CSV?
- The set "{a, b, c}" in JSON?
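The corner cases listed above can be probed with Python's standard `csv` and `json` modules. This is a minimal sketch of the behaviour such tests would pin down; the helper names are hypothetical, and emitting a set as a sorted JSON array is an assumption rather than what the patch does.

```python
import csv
import io
import json


def to_csv_line(values):
    """Serialize one row with the csv module's default (minimal) quoting."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(values)
    return buf.getvalue()


def set_to_json(values):
    """JSON has no set type; emit a sorted array so output is deterministic."""
    return json.dumps(sorted(values))


# A field containing the delimiter is wrapped in double quotes,
# while an integer is written without a thousands separator.
name_line = to_csv_line(["Smith, Joe", 10000])
# A pre-formatted string such as "10,000" contains the delimiter and is quoted.
grouped_line = to_csv_line(["10,000"])
# Embedded double quotes are doubled inside a quoted field.
quote_line = to_csv_line(['say "hi"'])
tags_json = set_to_json({"a", "b", "c"})
```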
f33ca19 to c895e00
| return | ||
| for row in formatted_values: | ||
| row_dict = {self._colnames[i]: col.strval for i, col in enumerate(row)} | ||
| serialized = json.dumps(row_dict) |
Check Cassandra types UUID, Decimal, or LocalDate in JSON and ensure they have corresponding unit tests. Strings in formatted values may require escaping.
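If the JSON printer ever serializes native values rather than pre-formatted strings, `json.dumps` needs a fallback for types it does not know. The sketch below assumes string encodings for UUID, Decimal, and dates are acceptable, and uses `datetime.date` as a stand-in for the driver's date type; `cql_json_default` is a hypothetical name. String escaping itself is handled by `json.dumps`.

```python
import datetime
import json
import uuid
from decimal import Decimal


def cql_json_default(value):
    """Fallback for values json.dumps cannot serialize natively.

    Assumption: string encoding for UUID, Decimal and date is acceptable;
    the patch may choose a different representation.
    """
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, Decimal):
        return str(value)  # keeps exact digits, avoids float rounding
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


row = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "price": Decimal("10.50"),
    "day": datetime.date(2024, 1, 31),
    "note": 'I\'m "new"',  # quotes are escaped by json.dumps
}
serialized = json.dumps(row, default=cql_json_default)
```

A unit test per type can then round-trip `serialized` through `json.loads` and compare against the expected strings.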
| self.decoding_errors = [] | ||
|
|
||
| self.writeresult("") | ||
| if self.mode == 'csv': |
See the comment below on TablePrinter; this should be moved into a factory method.
| if not result.current_rows: | ||
| # print header only | ||
| self.print_formatted_result(formatted_names, None, with_header=True, tty=tty) | ||
| if printer is None or isinstance(printer, TabularTablePrinter): |
Probably shouldn't allow printer to be undefined. Couldn't this be delegated to TabularTablePrinter.print_header()?
…matting (csv, json)
c895e00 to f400588
Pull request overview
Adds a --mode option to cqlsh to support machine-readable query output formats (CSV/JSON) alongside the existing tabular format, aiming to make cqlsh easier to integrate into automated pipelines.
Changes:
- Introduces a TablePrinter abstraction with tabular, csv, and json printer implementations.
- Wires --mode through option parsing and query result rendering, suppressing some human-oriented output in machine modes.
- Updates docs/config samples and adds unit tests covering CSV/JSON output formatting.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pylib/cqlshlib/displaying.py | Adds printer classes for tabular/CSV/JSON result rendering. |
| pylib/cqlshlib/cqlshmain.py | Adds --mode plumbing and routes result printing through TablePrinter. |
| pylib/cqlshlib/test/test_cqlsh_output.py | Adds unit tests validating CSV and JSON output. |
| doc/modules/cassandra/pages/managing/tools/cqlsh.adoc | Documents the new --mode option. |
| conf/cqlshrc.sample | Adds sample config entry for ui.mode. |
| self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty, | ||
| printer=TablePrinter.factory(self.mode, self)) | ||
| if self.elapsed_enabled: | ||
| self.writeresult("(%dms elapsed)" % elapsed) |
For csv/json modes, perform_simple_statement still prints elapsed timing to stdout when --elapsed is enabled. This will corrupt the machine-readable stream (e.g., JSON array followed by "(12ms elapsed)").
Consider suppressing elapsed output in csv/json mode, or redirecting it to stderr (consistent with keeping stdout strictly CSV/JSON).
| self.writeresult("(%dms elapsed)" % elapsed) | |
| elapsed_msg = "(%dms elapsed)" % elapsed | |
| if self.mode in ('csv', 'json'): | |
| self.printerr(elapsed_msg) | |
| else: | |
| self.writeresult(elapsed_msg) |
| self.mode = mode | ||
|
|
||
| if self.mode in ('csv', 'json'): | ||
| self.color = False |
mode handling is case-sensitive here, but TablePrinter.factory() lowercases the mode. If cqlshrc sets mode = CSV/JSON (or any non-lowercase), the printer will switch formats while the if self.mode in ('csv','json') gates won’t trigger, reintroducing row-count/footer output and colors. Also, self.color = False here is later overwritten by self.color = color further down in __init__, so ANSI colors won’t actually be disabled for csv/json.
Normalize once (e.g., self.mode = mode.lower()) and apply the self.color = False override after self.color = color (or incorporate it into option parsing).
| self.mode = mode | |
| if self.mode in ('csv', 'json'): | |
| self.color = False | |
| self.mode = mode.lower() | |
| if self.mode in ('csv', 'json'): | |
| color = False |
| # CAS INSERT/UPDATE | ||
| self.writeresult("") | ||
| self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty) | ||
| self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty, | ||
| printer=TablePrinter.factory(self.mode, self)) |
In the CAS INSERT/UPDATE path, csv/json mode still emits a leading blank line (writeresult("")) and never calls printer.finish(). For JSON output this means the opening [ is printed but the closing ] is never written, producing invalid JSON (and for empty results, CSV may never flush the header).
Use the same gating as print_result (no extra blank line for csv/json) and ensure the printer is finished in this path as well.
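The failure mode described above (an opening `[` with no closing `]`) can be avoided by pairing every printer start with a guaranteed `finish()`. The following is a sketch under assumptions: `JsonArrayPrinter` and `print_rows` are hypothetical names, and the real printer in the patch may buffer or stream differently.

```python
import io
import json


class JsonArrayPrinter:
    """Hypothetical streaming printer: '[' on header, ']' on finish."""

    def __init__(self, out):
        self.out = out
        self._started = False
        self._first_row = True

    def print_header(self):
        if not self._started:  # idempotent: repeated calls write nothing
            self.out.write("[")
            self._started = True

    def print_row(self, row_dict):
        self.print_header()
        if not self._first_row:
            self.out.write(",")
        self.out.write(json.dumps(row_dict))
        self._first_row = False

    def finish(self):
        self.print_header()  # an empty result still yields "[]"
        self.out.write("]\n")


def print_rows(printer, rows):
    """Every code path that starts a printer must also finish it."""
    try:
        for row in rows:
            printer.print_row(row)
    finally:
        printer.finish()
```

Routing the CAS INSERT/UPDATE path through a helper like `print_rows` would give it the same start/finish guarantee as the regular result path.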
| if result.current_rows or is_first: | ||
| with_header = is_first or tty | ||
| self.print_static_result(result, table_meta, with_header, tty, num_rows) | ||
| self.print_static_result(result, table_meta, with_header, tty, num_rows, printer) | ||
| num_rows += len(result.current_rows) | ||
| if result.has_more_pages: | ||
| if self.shunted_query_out is None and tty: | ||
| # Only pause when not capturing. | ||
| input("---MORE---") |
In print_all, with_header = is_first or tty is correct for tabular paging, but it breaks machine modes when tty is true (or --tty is forced):
- JSON mode will call printer.print_header() on every page, emitting multiple [ and invalidating the JSON.
- The input("---MORE---") prompt will also intermix with CSV/JSON output.
For csv/json, force tty=False for printing/paging behavior (e.g., pass an effective_tty that is only true in tabular mode), so headers/prompts don’t corrupt the machine-readable stream.
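A small sketch of the effective_tty idea follows. The function names and the `MACHINE_MODES` constant are hypothetical, and the `shunted_query_out` check from the real paging code is omitted for brevity.

```python
MACHINE_MODES = ("csv", "json")


def effective_tty(mode, tty):
    """Treat the session as non-interactive whenever output is machine-readable."""
    return tty and mode not in MACHINE_MODES


def page_plan(mode, tty, page_count):
    """Return (with_header, pause_after) per page, mirroring print_all's logic."""
    use_tty = effective_tty(mode, tty)
    plan = []
    for index in range(page_count):
        is_first = index == 0
        has_more = index < page_count - 1
        # Header on the first page only, unless an interactive tabular session
        # repeats it per page; the ---MORE--- pause only applies interactively.
        plan.append((is_first or use_tty, has_more and use_tty))
    return plan
```

For example, `page_plan("json", True, 3)` requests a header only on the first page and never pauses, while `page_plan("tabular", True, 2)` keeps today's interactive behaviour.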
Enhance CQLSH to support machine-readable output formatting (csv, json)
Currently, cqlsh provides only tabular output designed for human readability, which complicates machine processing in automated pipelines. This patch introduces a --mode argument to switch between output formats.
Changes included in this patch:
- Added a --mode argument supporting tabular, csv, and json formats.
- Kept tabular as the default mode to ensure no breaking changes for existing users.
- Suppressed footer output (e.g., (X rows)) when csv or json mode is selected to ensure purely machine-readable output.
- Updated the cqlsh.adoc documentation and the cqlshrc.sample configuration file.
- Added tests in test_cqlsh_output.py to verify both CSV and JSON formats.
patch by Arvind Kandpal; reviewed by for CASSANDRA-19985
The Cassandra Jira