[SPARK-55372][SQL] Fix SHOW CREATE TABLE for tables / views with default collation #54159
Closed
ilicmarkodb wants to merge 1 commit into apache:master
JIRA Issue Information: Bug SPARK-55372. This comment was automatically generated by GitHub Actions.
761a801 to
fad7a52
Compare
Contributor
Author
@dongjoon-hyun can you please take a look?
e720672 to
cc1d409
Compare
Contributor
Author
@cloud-fan can you please take a look?
402e3c7 to
ea8cd91
Compare
ilicmarkodb commented Feb 11, 2026
Comment on lines +217 to +225
+ val stringBuilder = proto.DataType.String.newBuilder()
+ // Send collation only for explicit collations (including explicit UTF8_BINARY).
+ // Default STRING (case object) has no explicit collation and should omit it.
+ if (!s.eq(StringType)) {
+   stringBuilder.setCollation(CollationFactory.fetchCollation(s.collationId).collationName)
+ }
  proto.DataType
    .newBuilder()
-   .setString(
-     proto.DataType.String
-       .newBuilder()
-       .setCollation(CollationFactory.fetchCollation(s.collationId).collationName)
-       .build())
+   .setString(stringBuilder.build())
Contributor
Author
There was a problem hiding this comment.
Without this change, a user who uses JDBC, for example, and doesn't care about collations would suddenly get COLLATE UTF8_BINARY, as in this test case:
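The test case referred to above is not reproduced here. Separately, as a rough illustration of why the diff uses reference equality (`s.eq(StringType)`) rather than `==`, here is a minimal sketch. `StrType`, `DefaultStr`, `collationNames`, and `collationToSend` are hypothetical stand-ins invented for this example; they are not Spark's `StringType`, `CollationFactory`, or protobuf builders, and the numeric collation IDs are assumptions.

```scala
object ProtoCollationSketch {
  // Stand-in for a string type carrying a collation ID (hypothetical, not Spark's class).
  class StrType(val collationId: Int) {
    // Value equality: two types are equal when their collation IDs match.
    override def equals(other: Any): Boolean = other match {
      case s: StrType => s.collationId == collationId
      case _ => false
    }
    override def hashCode(): Int = collationId
  }

  // Singleton standing in for the default, non-collated STRING.
  object DefaultStr extends StrType(0)

  // Illustrative ID-to-name table; the IDs are assumptions made for this sketch.
  private val collationNames = Map(0 -> "UTF8_BINARY", 1 -> "UTF8_LCASE")

  // Returns the collation name to serialize, or None when it should be omitted.
  // Reference equality separates the singleton from an explicit UTF8_BINARY
  // instance, even though the two compare equal with ==.
  def collationToSend(s: StrType): Option[String] =
    if (s eq DefaultStr) None else Some(collationNames(s.collationId))
}
```

In this model an explicitly constructed `new StrType(0)` is `==` to `DefaultStr` but not `eq` to it, so only the explicit instance yields a collation name. That is the distinction the check in the diff appears to rely on.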
618815f to
bef4d19
Compare
bef4d19 to
d0da1e4
Compare
cloud-fan approved these changes Feb 16, 2026
Contributor
thanks, merging to master!
Contributor
@ilicmarkodb can you open a backport PR for branch 4.1?
rpnkv pushed a commit to rpnkv/spark that referenced this pull request Feb 18, 2026
…fault collation

### What changes were proposed in this pull request?

Fixed `SHOW CREATE TABLE` for tables / views to correctly print `DEFAULT COLLATION collationName`. For example: `CREATE TABLE t (c1 STRING) DEFAULT COLLATION UTF8_LCASE`. Previously, it was printing `COLLATE 'UTF8_LCASE'`, which produces a parsing error.

For `UTF8_BINARY` collated / non collated columns (for example, `c1`), the output of `SHOW CREATE TABLE` should print `c1 STRING COLLATE UTF8_BINARY`, so that we don’t inherit the collation from the table or schema, if defined.

To achieve this, I changed `typeName` in `StringType` to print `COLLATE UTF8_BINARY` for explicitly collated `UTF8_BINARY` columns. For non-collated `StringType` (case object), `typeName` does not print `COLLATE UTF8_BINARY`, which matches the old behaviour.

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

Yes, corrects `SHOW CREATE TABLE` command.

### How was this patch tested?

`show-create-table.sql` golden file.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#54159 from ilicmarkodb/fix_show_create_table.

Authored-by: ilicmarkodb <marko.ilic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Member
@cloud-fan do you plan to backport this to 4.1? if so, I will
### What changes were proposed in this pull request?

Fixed `SHOW CREATE TABLE` for tables / views to correctly print `DEFAULT COLLATION collationName`. For example: `CREATE TABLE t (c1 STRING) DEFAULT COLLATION UTF8_LCASE`. Previously, it was printing `COLLATE 'UTF8_LCASE'`, which produces a parsing error.

For `UTF8_BINARY` collated / non collated columns (for example, `c1`), the output of `SHOW CREATE TABLE` should print `c1 STRING COLLATE UTF8_BINARY`, so that we don’t inherit the collation from the table or schema, if defined.

To achieve this, I changed `typeName` in `StringType` to print `COLLATE UTF8_BINARY` for explicitly collated `UTF8_BINARY` columns. For non-collated `StringType` (case object), `typeName` does not print `COLLATE UTF8_BINARY`, which matches the old behaviour.

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

Yes, corrects `SHOW CREATE TABLE` command.

### How was this patch tested?

`show-create-table.sql` golden file.

### Was this patch authored or co-authored using generative AI tooling?

No.
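
The `typeName` behaviour described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Spark implementation: `StrType`, `DefaultStr`, `Column`, and `showCreateTable` are hypothetical names made up for illustration, and the real `SHOW CREATE TABLE` output contains more clauses and different formatting.

```scala
object ShowCreateTableSketch {
  // Stand-in string type (hypothetical). `collation` holds the collation name.
  class StrType(val collation: String) {
    // Explicitly collated types print their collation, including UTF8_BINARY.
    def typeName: String = s"STRING COLLATE $collation"
  }

  // Default, non-collated STRING singleton: prints no COLLATE clause.
  object DefaultStr extends StrType("UTF8_BINARY") {
    override def typeName: String = "STRING"
  }

  final case class Column(name: String, dataType: StrType)

  // Renders a simplified, single-line CREATE TABLE statement.
  // The table-level default is emitted as DEFAULT COLLATION <name>,
  // not as a quoted COLLATE '<name>' clause.
  def showCreateTable(
      table: String,
      cols: Seq[Column],
      defaultCollation: Option[String]): String = {
    val colDefs = cols.map(c => s"${c.name} ${c.dataType.typeName}").mkString(", ")
    val suffix = defaultCollation.map(c => s" DEFAULT COLLATION $c").getOrElse("")
    s"CREATE TABLE $table ($colDefs)$suffix"
  }
}
```

With a column explicitly typed as `UTF8_BINARY` in a table whose default collation is `UTF8_LCASE`, this model renders the column as `c1 STRING COLLATE UTF8_BINARY`, so re-running the generated statement would not pick up the table default for that column.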