PARQUET-3364: Allow hyphens in column names in AvroSchemaConverter #3524

Open
yadavay-amzn wants to merge 1 commit into apache:master from yadavay-amzn:fix/3364-cli-hyphen-column

@yadavay-amzn (Contributor) commented Apr 21, 2026

What changes were proposed?

Widen the catch in CatCommand.run() from SchemaParseException to RuntimeException so that Avro-related failures (including name validation errors during record reading) fall back to the GroupReader path, which handles non-standard field names correctly.
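The effect of widening the catch can be sketched in isolation. This is a minimal illustration, not the actual CatCommand code: the class CatFallbackSketch, the exception SchemaParseFailure, and the two Supplier arguments are hypothetical stand-ins for the Avro path and the GroupReader path.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins: none of these names come from parquet-cli or Avro.
class CatFallbackSketch {

    /** Simulates a failure raised while converting the schema. */
    static class SchemaParseFailure extends RuntimeException {
        SchemaParseFailure(String message) {
            super(message);
        }
    }

    /** Narrow catch: only schema-parse failures trigger the fallback. */
    static List<String> catNarrow(Supplier<List<String>> avroPath,
                                  Supplier<List<String>> groupPath) {
        try {
            return avroPath.get();
        } catch (SchemaParseFailure e) {
            return groupPath.get();
        }
    }

    /** Wide catch: any RuntimeException from the first path triggers the fallback. */
    static List<String> catWide(Supplier<List<String>> avroPath,
                                Supplier<List<String>> groupPath) {
        try {
            return avroPath.get();
        } catch (RuntimeException e) {
            return groupPath.get();
        }
    }
}
```

With a first path that throws some other RuntimeException during reading, catNarrow propagates the exception while catWide returns the fallback result. The trade-off of the wider catch is that unrelated runtime failures in the first path are also routed to the fallback rather than surfaced directly.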

Why are the changes needed?

The parquet cat CLI command fails on valid Parquet files with hyphenated column names (e.g., Creation-Time). The Avro path throws because Avro rejects hyphens in field names. The GroupReader fallback already handles these files correctly, but the catch was too narrow (SchemaParseException only) to trigger the fallback when the failure occurs during record reading rather than schema conversion.
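For reference, the Avro specification restricts names to a leading letter or underscore followed by letters, digits, or underscores. The sketch below restates that rule as a regular expression to show why Creation-Time is rejected. The class and method names are hypothetical, and this is not Avro's own validation code; the Java library's actual check may differ in details such as its handling of non-ASCII letters.

```java
import java.util.regex.Pattern;

// Hypothetical helper restating the Avro specification's name rule.
class AvroNameRuleSketch {

    // Spec rule: start with [A-Za-z_], then only [A-Za-z0-9_].
    private static final Pattern AVRO_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Returns true if the whole name matches the spec's name pattern. */
    static boolean isValidAvroName(String name) {
        return name != null && AVRO_NAME.matcher(name).matches();
    }
}
```

Under this rule, isValidAvroName("Creation-Time") is false because of the hyphen, while isValidAvroName("Creation_Time") is true.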

How was this tested?

The existing CatCommandTest.testCatCommandWithHyphenatedFieldNames test validates this fix: it creates a Parquet file with hyphenated field names and verifies that parquet cat reads it successfully.

Closes #3364

…olumn names

Widen the catch in CatCommand.run() from SchemaParseException to
RuntimeException so that Avro-related failures (including name validation
errors during record reading) fall back to the GroupReader path, which
handles non-standard field names correctly.

The existing test testCatCommandWithHyphenatedFieldNames validates this fix.

@yadavay-amzn force-pushed the fix/3364-cli-hyphen-column branch from e9f5257 to fa9682b on May 6, 2026 21:40
Development

Successfully merging this pull request may close these issues.

[parquet-cli] Illegal character in column name
