CASSANDRA-19985: Enhance CQLSH to support machine-readable output formatting #4750

Open
arvindKandpal-ksolves wants to merge 1 commit into apache:trunk from arvindKandpal-ksolves:CASSANDRA-19985

Conversation

@arvindKandpal-ksolves
Contributor

Enhance CQLSH to support machine-readable output formatting (csv, json)

Currently, cqlsh produces tabular output designed for human readability, which complicates machine processing in automated pipelines. This patch introduces a --mode argument to switch between output formats.

Changes included in this patch:

  • Added --mode argument supporting tabular, csv, and json formats.
  • Retained tabular as the default mode to ensure no breaking changes for existing users.
  • Disabled ANSI color codes and extra footer text (like (X rows)) when csv or json mode is selected to ensure purely machine-readable output.
  • Updated cqlsh.adoc documentation and cqlshrc.sample configuration file.
  • Added unit tests in test_cqlsh_output.py to verify both CSV and JSON formats.

patch by Arvind Kandpal; reviewed by for CASSANDRA-19985

The Cassandra Jira

Comment thread pylib/cqlshlib/cqlshmain.py Outdated
formatted_values = [list(map(self.myformat_value, [row[c] for c in column_names], cql_types)) for row in result.current_rows]

if self.expand_enabled:
if self.mode == 'csv':
Contributor

Output formatting should probably move to displaying.py instead of cqlshmain. Rather than if-then-else, it would probably make sense to create a class like TablePrinter with subclasses for TabularTablePrinter, JsonTablePrinter, CsvTablePrinter.
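
One possible shape for the hierarchy suggested above, as a minimal sketch. The class names come from the comment; the method names `print_header`, `print_row`, and `finish`, the `out` stream parameter, and the `render` helper are assumptions for illustration and not the patch's actual interface. The tabular variant is deliberately simplified and does not reproduce cqlsh's column alignment.

```python
import csv
import io
import json


class TablePrinter:
    """Base class: subclasses decide how header, rows, and footer are rendered."""

    def __init__(self, out):
        self.out = out  # any text stream with a write() method

    def print_header(self, column_names):
        raise NotImplementedError

    def print_row(self, column_names, values):
        raise NotImplementedError

    def finish(self):
        """Emit any trailing output; a no-op by default."""


class TabularTablePrinter(TablePrinter):
    def print_header(self, column_names):
        self.out.write(" | ".join(column_names) + "\n")

    def print_row(self, column_names, values):
        self.out.write(" | ".join(values) + "\n")


class CsvTablePrinter(TablePrinter):
    def __init__(self, out):
        super().__init__(out)
        # The csv module handles quoting of delimiters and embedded quotes.
        self.writer = csv.writer(out, lineterminator="\n")

    def print_header(self, column_names):
        self.writer.writerow(column_names)

    def print_row(self, column_names, values):
        self.writer.writerow(values)


class JsonTablePrinter(TablePrinter):
    def __init__(self, out):
        super().__init__(out)
        self.rows = []  # buffered so finish() can emit one valid array

    def print_header(self, column_names):
        pass  # column names are carried in each JSON object

    def print_row(self, column_names, values):
        self.rows.append(dict(zip(column_names, values)))

    def finish(self):
        json.dump(self.rows, self.out)
        self.out.write("\n")


def render(printer, column_names, rows):
    """Drive any printer through the same header/rows/finish sequence."""
    printer.print_header(column_names)
    for row in rows:
        printer.print_row(column_names, row)
    printer.finish()
```

With this split, the caller in cqlshmain.py would only need to pick a printer and call the same three methods, so the per-mode branching lives in one place.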

self.assertEqual(0, result)
self.assertEqual(output.splitlines()[3].strip(), "{data: 'I''m newb'}")

def test_csv_output(self):
Contributor

Unit tests should include all data types and include corner cases.

  • How is the number 10,000 formatted in CSV? How is the name "Smith, Joe" formatted in CSV?
  • The set "{a, b, c}" in JSON?
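
The corner cases listed above can be checked against Python's standard `csv` and `json` modules. The sketch below shows what those modules do by default; it does not show what the patch emits, since cqlsh's own value formatting may change a value before it reaches the writer. Emitting a set as a sorted array is an assumption made here for illustration.

```python
import csv
import io
import json

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")  # default quoting is QUOTE_MINIMAL
writer.writerow(["10,000", "Smith, Joe", 'say "hi"'])
csv_line = buf.getvalue()
# Fields containing the delimiter or a quote character are wrapped in quotes,
# and embedded quotes are doubled.

# Round-trip to confirm a reader recovers the original fields.
parsed = next(csv.reader(io.StringIO(csv_line)))

# json.dumps cannot serialize a Python set directly; it raises TypeError.
try:
    json.dumps({"tags": {"a", "b", "c"}})
    set_is_serializable = True
except TypeError:
    set_is_serializable = False

# One option (an assumption, not the patch's behavior): emit sets as sorted arrays.
json_line = json.dumps({"tags": sorted({"a", "b", "c"})})
```

Tests built on round-trips like `parsed` tend to be more robust than comparing raw output lines, because they assert that a consumer can recover the original values.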

return
for row in formatted_values:
row_dict = {self._colnames[i]: col.strval for i, col in enumerate(row)}
serialized = json.dumps(row_dict)
Contributor

Check Cassandra types UUID, Decimal, or LocalDate in JSON and ensure they have corresponding unit tests. Strings in formatted values may require escaping.
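
A sketch of how raw values of these types could be serialized with the standard library, should the printer ever receive them instead of preformatted strings. The helper name `to_jsonable` is hypothetical, and `datetime.date` is a stand-in: the driver may return its own wrapper type for CQL date columns, which would need its own branch.

```python
import datetime
import decimal
import json
import uuid


def to_jsonable(value):
    """Fallback for json.dumps: convert types it cannot serialize natively."""
    if isinstance(value, (uuid.UUID, decimal.Decimal)):
        return str(value)  # a string keeps the decimal's exact digits
    if isinstance(value, datetime.date):
        return value.isoformat()
    raise TypeError("Cannot serialize %r" % type(value))


row = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "price": decimal.Decimal("19.990"),
    "day": datetime.date(2024, 2, 29),
    "note": 'He said "hi"\n',  # json.dumps escapes quotes and newlines itself
}
serialized = json.dumps(row, default=to_jsonable)
roundtrip = json.loads(serialized)
```

Whether decimals should be emitted as JSON strings or numbers is a design choice; strings avoid precision loss in consumers that parse numbers as floats.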

Comment thread pylib/cqlshlib/displaying.py
Comment thread pylib/cqlshlib/cqlshmain.py Outdated
self.decoding_errors = []

self.writeresult("")
if self.mode == 'csv':
Contributor

See the comment below on TablePrinter; this should be moved into a factory method.
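
A minimal sketch of such a factory, assuming a registry keyed by mode name. The empty printer classes are placeholders, and `PRINTERS` and `printer_factory` are hypothetical names; the patch exposes its own `TablePrinter.factory`, whose signature is not reproduced here.

```python
class TabularTablePrinter:
    pass


class CsvTablePrinter:
    pass


class JsonTablePrinter:
    pass


# Registry mapping a mode name to the printer class that handles it.
PRINTERS = {
    "tabular": TabularTablePrinter,
    "csv": CsvTablePrinter,
    "json": JsonTablePrinter,
}


def printer_factory(mode):
    """Return a printer instance for `mode`, rejecting unknown names."""
    try:
        printer_cls = PRINTERS[mode.lower()]
    except KeyError:
        raise ValueError("Unknown output mode: %r" % mode) from None
    return printer_cls()
```

Raising on an unknown mode, rather than silently falling back to tabular, means a typo in a pipeline configuration fails loudly instead of producing output in the wrong format.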

Comment thread pylib/cqlshlib/cqlshmain.py Outdated
if not result.current_rows:
# print header only
self.print_formatted_result(formatted_names, None, with_header=True, tty=tty)
if printer is None or isinstance(printer, TabularTablePrinter):
Contributor

The printer probably shouldn't be allowed to be undefined. Couldn't this be delegated to TabularTablePrinter.print_header()?

Copilot AI left a comment

Pull request overview

Adds a --mode option to cqlsh to support machine-readable query output formats (CSV/JSON) alongside the existing tabular format, aiming to make cqlsh easier to integrate into automated pipelines.

Changes:

  • Introduces a TablePrinter abstraction with tabular, csv, and json printer implementations.
  • Wires --mode through option parsing and query result rendering, suppressing some human-oriented output in machine modes.
  • Updates docs/config samples and adds unit tests covering CSV/JSON output formatting.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pylib/cqlshlib/displaying.py | Adds printer classes for tabular/CSV/JSON result rendering. |
| pylib/cqlshlib/cqlshmain.py | Adds --mode plumbing and routes result printing through TablePrinter. |
| pylib/cqlshlib/test/test_cqlsh_output.py | Adds unit tests validating CSV and JSON output. |
| doc/modules/cassandra/pages/managing/tools/cqlsh.adoc | Documents the new --mode option. |
| conf/cqlshrc.sample | Adds sample config entry for ui.mode. |

self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty,
printer=TablePrinter.factory(self.mode, self))
if self.elapsed_enabled:
self.writeresult("(%dms elapsed)" % elapsed)
Copilot AI Apr 29, 2026

For csv/json modes, perform_simple_statement still prints elapsed timing to stdout when --elapsed is enabled. This will corrupt the machine-readable stream (e.g., JSON array followed by "(12ms elapsed)").

Consider suppressing elapsed output in csv/json mode, or redirecting it to stderr (consistent with keeping stdout strictly CSV/JSON).

Suggested change
- self.writeresult("(%dms elapsed)" % elapsed)
+ elapsed_msg = "(%dms elapsed)" % elapsed
+ if self.mode in ('csv', 'json'):
+     self.printerr(elapsed_msg)
+ else:
+     self.writeresult(elapsed_msg)
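
The effect of this routing can be illustrated in isolation. The sketch below uses in-memory streams in place of stdout and stderr; the `emit` function is hypothetical and stands in for the shell's own `writeresult` and `printerr` methods.

```python
import io
import json


def emit(rows, elapsed_ms, mode, out, err):
    """Write rows to `out`; route the timing line by mode."""
    out.write(json.dumps(rows) + "\n")
    elapsed_msg = "(%dms elapsed)" % elapsed_ms
    # In machine modes the timing goes to the error stream so `out` stays parseable.
    target = err if mode in ("csv", "json") else out
    target.write(elapsed_msg + "\n")


out, err = io.StringIO(), io.StringIO()
emit([{"id": 1}], 12, "json", out, err)
parsed = json.loads(out.getvalue())  # succeeds: the output stream holds only JSON
```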

Copilot uses AI. Check for mistakes.
Comment on lines +296 to +299
self.mode = mode

if self.mode in ('csv', 'json'):
self.color = False
Copilot AI Apr 29, 2026

mode handling is case-sensitive here, but TablePrinter.factory() lowercases the mode. If cqlshrc sets mode = CSV/JSON (or any non-lowercase), the printer will switch formats while the if self.mode in ('csv','json') gates won’t trigger, reintroducing row-count/footer output and colors. Also, self.color = False here is later overwritten by self.color = color further down in __init__, so ANSI colors won’t actually be disabled for csv/json.

Normalize once (e.g., self.mode = mode.lower()) and apply the self.color = False override after self.color = color (or incorporate it into option parsing).

Suggested change
- self.mode = mode
- if self.mode in ('csv', 'json'):
-     self.color = False
+ self.mode = mode.lower()
+ if self.mode in ('csv', 'json'):
+     color = False
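
The ordering issue can be shown with a reduced constructor. `ShellSketch` is a hypothetical stand-in with only two options; it assumes, as the comment describes, that `self.color = color` is assigned later in `__init__`.

```python
class ShellSketch:
    """Hypothetical stand-in for the shell's __init__, reduced to two options."""

    def __init__(self, mode="tabular", color=True):
        self.mode = mode.lower()  # normalize once so later checks agree
        if self.mode in ("csv", "json"):
            color = False  # override the local before it is stored
        # ... other initialization would run here ...
        self.color = color  # the later assignment now keeps the override


shell = ShellSketch(mode="CSV", color=True)
```

Overriding the local variable rather than the attribute means the later assignment cannot undo the change, whichever order the two statements end up in.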

Comment on lines 954 to +957
# CAS INSERT/UPDATE
self.writeresult("")
self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty)
self.print_static_result(result, self.parse_for_update_meta(statement.query_string), with_header=True, tty=self.tty,
printer=TablePrinter.factory(self.mode, self))
Copilot AI Apr 29, 2026

In the CAS INSERT/UPDATE path, csv/json mode still emits a leading blank line (writeresult("")) and never calls printer.finish(). For JSON output this means the opening [ is printed but the closing ] is never written, producing invalid JSON (and for empty results, CSV may never flush the header).

Use the same gating as print_result (no extra blank line for csv/json) and ensure the printer is finished in this path as well.
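
One way to guarantee the closing bracket is to pair the header with `finish()` in a `try`/`finally`. This is a sketch under assumptions: `StreamingJsonPrinter` and `print_rows` are hypothetical, and the patch's printer may track state differently.

```python
import io
import json


class StreamingJsonPrinter:
    """Hypothetical printer that streams a JSON array one row at a time."""

    def __init__(self, out):
        self.out = out
        self.first = True

    def print_header(self):
        self.out.write("[")

    def print_row(self, row):
        if not self.first:
            self.out.write(",")  # separator between array elements
        self.out.write(json.dumps(row))
        self.first = False

    def finish(self):
        self.out.write("]\n")


def print_rows(printer, rows):
    printer.print_header()
    try:
        for row in rows:
            printer.print_row(row)
    finally:
        printer.finish()  # always close the array, even for empty results


buf = io.StringIO()
print_rows(StreamingJsonPrinter(buf), [])
empty_output = buf.getvalue()
```

Routing every code path, including the CAS one, through a single helper like `print_rows` avoids having to remember the `finish()` call at each call site.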

Comment on lines 974 to 980
if result.current_rows or is_first:
with_header = is_first or tty
self.print_static_result(result, table_meta, with_header, tty, num_rows)
self.print_static_result(result, table_meta, with_header, tty, num_rows, printer)
num_rows += len(result.current_rows)
if result.has_more_pages:
if self.shunted_query_out is None and tty:
# Only pause when not capturing.
input("---MORE---")
Copilot AI Apr 29, 2026

In print_all, with_header = is_first or tty is correct for tabular paging, but it breaks machine modes when tty is true (or --tty is forced):

  • JSON mode will call printer.print_header() on every page, emitting multiple [ and invalidating the JSON.
  • The input("---MORE---") prompt will also intermix with CSV/JSON output.

For csv/json, force tty=False for printing/paging behavior (e.g., pass an effective_tty that is only true in tabular mode), so headers/prompts don’t corrupt the machine-readable stream.
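
The `effective_tty` idea can be sketched as a small pure function. `paging_flags` is a hypothetical helper; the real prompt condition in `print_all` also checks `shunted_query_out` and `has_more_pages`, which are left out here.

```python
def paging_flags(mode, tty, is_first):
    """Return (with_header, should_prompt) for one page of results."""
    # Only tabular mode treats the terminal as interactive.
    effective_tty = tty and mode == "tabular"
    with_header = is_first or effective_tty
    should_prompt = effective_tty
    return with_header, should_prompt
```

For a second page in JSON mode on a terminal, this yields no repeated header and no `---MORE---` prompt, while tabular paging keeps its current behavior.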

3 participants