
fix(formatter): strip terminal escape sequences from non-JSON output #708

Open
nuthalapativarun wants to merge 3 commits into googleworkspace:main from nuthalapativarun:fix/formatter-sanitize-terminal-escapes

@nuthalapativarun

Problem

value_to_cell() returns raw strings from API responses without sanitizing terminal escape sequences. Any response field containing ANSI escape codes (e.g. \x1b]0;...) renders them directly when using --format table, --format yaml, or --format csv.

JSON output (--format json) is safe because serde's JSON serializer escapes control characters such as ESC as \uXXXX (for example \u001b). Non-JSON formats pass untrusted content straight to the terminal, so an attacker who controls an API response value can inject terminal escape sequences, for example to set the window title, move the cursor, or trigger other VT sequences.

Raised in #635.

Fix

Adds strip_control_chars() — a zero-dependency function that removes:

  • CSI sequences (ESC [ ... <final byte>) — SGR colours, cursor movement, etc.
  • OSC sequences (ESC ] ... BEL or ESC ] ... ESC \) — window title injection, hyperlinks, etc.
  • Other Fe two-char escape sequences (ESC <0x40–0x5F>)
  • Bare C0/C1 control characters — NUL, BEL, BS, CR, etc. (tab and newline are preserved)

value_to_cell() now calls strip_control_chars() for every Value::String, so all string fields rendered through format_value() in non-JSON modes are sanitized before output.
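
For readers who want a concrete picture, here is a minimal sketch of a sanitizer with the four behaviours listed above. It is not the code from this PR: the body is an assumption reconstructed from the description, and only the name strip_control_chars comes from the PR. It relies on char::is_control(), which covers C0, DEL, and C1.

```rust
/// Sketch of a sanitizer with the behaviour described in the PR text.
/// Tab and newline are kept; other control characters are removed.
fn strip_control_chars(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c != '\u{1b}' {
            // Keep tab, newline, and anything that is not a C0/DEL/C1 control.
            if c == '\t' || c == '\n' || !c.is_control() {
                out.push(c);
            }
            continue;
        }
        // c is ESC: look at the next character to classify the sequence.
        match chars.peek().copied() {
            // CSI: ESC [ parameters... final byte in 0x40..=0x7E.
            Some('[') => {
                chars.next();
                while let Some(n) = chars.next() {
                    if ('\u{40}'..='\u{7e}').contains(&n) {
                        break;
                    }
                }
            }
            // OSC: ESC ] ... terminated by BEL or by ST (ESC \).
            Some(']') => {
                chars.next();
                while let Some(n) = chars.next() {
                    if n == '\u{07}' {
                        break;
                    }
                    if n == '\u{1b}' {
                        if chars.peek() == Some(&'\\') {
                            chars.next();
                        }
                        break;
                    }
                }
            }
            // Other Fe two-character sequences: ESC followed by 0x40..=0x5F.
            // Note: this arm drops only the introducer, so the body of a
            // control string such as DCS (ESC P) would remain in the output.
            Some(n) if ('\u{40}'..='\u{5f}').contains(&n) => {
                chars.next();
            }
            // Lone ESC, or ESC before anything else: drop only the ESC.
            _ => {}
        }
    }
    out
}
```

With this sketch, a value such as "\x1b]0;title\x07name" comes out as "name", and "\x1b[31mred\x1b[0m" comes out as "red".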

Tests

  • test_strip_control_chars_clean_string — passthrough for safe strings
  • test_strip_control_chars_csi_sequence — SGR colour codes stripped
  • test_strip_control_chars_osc_sequence — BEL- and ST-terminated OSC stripped
  • test_strip_control_chars_c0_control — NUL, BEL, BS, CR stripped; tab/newline kept
  • test_value_to_cell_sanitizes_escape_sequences — end-to-end through value_to_cell

Fixes #635

API responses may contain user-generated content with embedded ANSI escape
codes or C0/C1 control characters. JSON output is safe because serde escapes
them as \uXXXX, but table/CSV/YAML formats passed strings through verbatim,
allowing a malicious API value to inject terminal sequences into the user's
terminal.

Adds strip_control_chars() which removes CSI sequences (ESC [ ... final),
OSC sequences (ESC ] ... BEL/ST), other Fe two-char sequences, and bare
control characters (except tab and newline). Called from value_to_cell() so
every string field rendered by format_value() in non-JSON modes is sanitized.

Fixes googleworkspace#635
@changeset-bot

changeset-bot bot commented Apr 12, 2026

🦋 Changeset detected

Latest commit: 27bd7b6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@googleworkspace/cli Patch


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a security vulnerability where terminal escape sequences in API responses could be rendered directly in non-JSON output formats, potentially allowing for terminal injection attacks. By implementing a robust sanitization layer, the CLI now ensures that all string values are cleaned of malicious control characters before being displayed to the user.

Highlights

  • Security Improvement: Introduced a new utility function, strip_control_chars, to sanitize strings by removing ANSI escape sequences and control characters.
  • Formatter Update: Updated value_to_cell to automatically apply the sanitization logic to all string fields, ensuring safe output for non-JSON formats like table, CSV, and YAML.
  • Comprehensive Testing: Added a suite of unit tests covering various escape sequence types, including CSI, OSC, and C0/C1 control characters, to verify the sanitization logic.

@googleworkspace-bot added the "area: core" label (Core CLI parsing, commands, error handling, utilities) on Apr 12, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a strip_control_chars function to sanitize terminal escape sequences and control characters in non-JSON outputs, mitigating terminal injection risks. The feedback identifies that the current implementation misses several control string sequences (DCS, SOS, PM, APC) and that the YAML formatter remains unsanitized as it bypasses the updated logic. It is also suggested to expand character filtering to include the Unicode 'Format' category for better security.
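
To illustrate the last suggestion in this review: the Rust standard library has no general-category lookup, so a zero-dependency filter for Unicode Format (Cf) characters would have to list code points by hand. The sketch below covers only a small, hand-picked subset (zero-width and bidirectional controls). Both function names are hypothetical, and nothing in this thread confirms that the PR adopted such a filter.

```rust
/// Hypothetical helper: true for a hand-picked subset of Unicode `Cf`
/// (Format) characters that can hide or visually reorder text.
/// This is an illustrative list, not the full Cf category.
fn is_risky_format_char(c: char) -> bool {
    matches!(
        c,
        '\u{200b}'..='\u{200f}'     // zero-width space/joiners, LRM, RLM
            | '\u{202a}'..='\u{202e}' // bidi embeddings and overrides
            | '\u{2066}'..='\u{2069}' // bidi isolates
            | '\u{feff}'              // zero-width no-break space (BOM)
    )
}

/// Hypothetical filter that drops the characters matched above.
fn strip_format_chars(input: &str) -> String {
    input.chars().filter(|c| !is_risky_format_char(*c)).collect()
}
```

One trade-off to keep in mind: U+200D (zero-width joiner) is used in emoji sequences, so removing it changes how some legitimate strings render.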

…ings

Two issues from review:

1. DCS (ESC P), SOS (ESC X), PM (ESC ^), and APC (ESC _) sequences were not
   handled. The previous 'Other Fe sequences' arm consumed only the introducer
   byte, leaving the sequence body and ST terminator in the output. Each now
   calls consume_until_st() (extracted helper shared with OSC) which drains
   chars until BEL or ESC-backslash ST.

2. The YAML formatter (json_to_yaml) built strings directly from the raw API
   value without going through value_to_cell, so escape sequences survived in
   YAML output. Apply strip_control_chars() at the top of the String branch in
   json_to_yaml so all three non-JSON formats are covered.

Adds tests for DCS/SOS/PM/APC stripping and YAML sanitization.
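
A reduced sketch of the first point, assuming a shared helper shaped like the consume_until_st() named above. The helper body and the function strip_control_strings are illustrative guesses, not the PR's code. The sketch handles only the control-string introducers (OSC, DCS, SOS, PM, APC) so that the dispatch is easy to follow.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Drain characters up to and including BEL or ST (ESC \).
fn consume_until_st(chars: &mut Peekable<Chars<'_>>) {
    while let Some(c) = chars.next() {
        if c == '\u{07}' {
            return;
        }
        if c == '\u{1b}' {
            // Only a following backslash completes the String Terminator.
            if chars.peek() == Some(&'\\') {
                chars.next();
            }
            return;
        }
    }
}

/// Hypothetical reduced sanitizer: removes control strings, body included.
fn strip_control_strings(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' {
            // OSC (]), DCS (P), SOS (X), PM (^), APC (_) all carry a body
            // that runs until the terminator, so they share one helper.
            if let Some(']' | 'P' | 'X' | '^' | '_') = chars.peek().copied() {
                chars.next();
                consume_until_st(&mut chars);
            }
            // Any other ESC is simply dropped in this sketch.
            continue;
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, "a\x1bPpayload\x1b\\b" becomes "ab". A handler that consumed only the two-character introducer would instead leave the payload and the terminator's backslash in the output.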
@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a strip_control_chars function to sanitize terminal escape sequences and C0/C1 control characters from non-JSON outputs, such as Table, CSV, and YAML, to prevent terminal injection vulnerabilities. The sanitizer is applied within json_to_yaml and value_to_cell. A review comment suggests refining the consume_until_st logic so that it correctly identifies the String Terminator sequence (ESC \): it should check for the backslash before consuming the next character, which prevents over-consumption of characters.

Previously, any ESC byte inside a control string caused the next
character to be consumed unconditionally. If the ESC was not followed
by a backslash (e.g., a malformed or nested sequence), that character
would be silently dropped. Now peek() checks for '\' before consuming,
so only the valid ESC \ String Terminator is consumed; other ESC bytes
cause an immediate break without over-consuming.
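
The difference can be sketched side by side. Both functions below are illustrative reconstructions from the commit message, with hypothetical names. In particular, the earlier variant is assumed to stop right after swallowing the character that follows ESC, which the commit message does not state explicitly.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Earlier behaviour as described: any ESC swallows the next character.
/// (Assumption: the loop stops after that.)
fn consume_until_st_eager(chars: &mut Peekable<Chars<'_>>) {
    while let Some(c) = chars.next() {
        match c {
            '\u{07}' => break,
            '\u{1b}' => {
                chars.next(); // dropped even when it is not '\'
                break;
            }
            _ => {}
        }
    }
}

/// Fixed behaviour: peek first, consume only a real ESC \ terminator.
fn consume_until_st_peek(chars: &mut Peekable<Chars<'_>>) {
    while let Some(c) = chars.next() {
        match c {
            '\u{07}' => break,
            '\u{1b}' => {
                if chars.peek() == Some(&'\\') {
                    chars.next();
                }
                break;
            }
            _ => {}
        }
    }
}

/// Helper for comparison: run a consumer over `body`, return what is left.
fn rest_after(consume: fn(&mut Peekable<Chars<'_>>), body: &str) -> String {
    let mut chars = body.chars().peekable();
    consume(&mut chars);
    chars.collect()
}
```

For the malformed body "title\x1bXrest", the eager variant leaves "rest" (the X is lost), while the peek variant leaves "Xrest". For a well-formed terminator both leave the same remainder.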
@googleworkspace-bot
Collaborator

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a security fix to strip terminal escape sequences and control characters from non-JSON outputs, such as Table, CSV, and YAML. This prevents malicious API responses from injecting terminal commands or manipulating the user's terminal display. The implementation includes a new strip_control_chars function that handles various ANSI/VT sequences while preserving tabs and newlines, along with comprehensive tests to verify the sanitization logic. I have no feedback to provide.


Development

Successfully merging this pull request may close these issues.

formatter: multiple issues with non-JSON output formats (table, CSV, YAML)
