-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
Fix omission of fields in deeply nested joins with long identifiers (#7513) #7519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix omission of fields in deeply nested joins with long identifiers (#7513) #7519
Conversation
d8589e6 to
19746ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking a crack at this bug!
I left a couple comments, but am not a maintainer (or even very familiar with Gorm internals)
schema/naming.go
Outdated
| ns.IdentifierMaxLength = 64 | ||
| } | ||
|
|
||
| if utf8.RuneCountInString(formattedName) > ns.IdentifierMaxLength { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres (can't speak for others) does not limit names to 63 characters, but instead limits it to 63 bytes. Counting runes instead of bytes can mean an incorrect truncation limit.
For example, using the query:
SELECT id AS "aręęęľlly_long_name_that_isnt_64_runes_long_is_still_truncated" FROM forms LIMIT 1;Yields the results:
| aręęęľlly_long_name_that_isnt_64_runes_long_is_still_trunca |
|---|
| 1 |
Which you'll note is still being truncated, despite it being only 62 runes long.
Here's a go playground with the differences.
It appears within the playground linked above that the truncation that yields similar results to Postgres is the simple 63 byte truncation, not 63 rune truncation.
As a "fun" edge case, if a multi-byte utf8 character is the final character in the string and being truncated, Postgres appears to drop the invalid character (though doing a simple identifer[:63] would result in the last byte being invalid).
e.g.:
SELECT id AS "a_multi_byte_character_that_is_64_runes__long__is__truncated__漢" FROM forms LIMIT 1;Results
| a_multi_byte_character_that_is_64_runes__long__is__truncated__ |
|---|
| 1 |
Mutli byte ending - Go playground
(All my testing was done with Postgres 13.20)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Thanks. I'll fix it today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
callbacks/query.go
Outdated
| if parentTableName != clause.CurrentTable { | ||
| curAliasName = utils.NestedRelationName(parentTableName, curAliasName) | ||
| aliasName := db.NamingStrategy.JoinNestedRelationNames([]string{parentTableName, curAliasName}) | ||
| db.Statement.TruncatedFields[aliasName] = utils.NestedRelationName(parentTableName, curAliasName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be clearer to name this TruncatedAliases since not all aliases being added to the map are field aliases, as seen here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
bc99639 to
a4270cc
Compare
|
Would it be possible to reimplement this feature using ColumnMapping in combination with NamingStrategy? Also, the current truncateName implementation seems to share a lot of duplicated logic with formatName. |
a3054ec to
654aecb
Compare
|
Hello. Sorry for the delay. Looks like current linter issue is not affected by me. |
|
@jinzhu any comments? |
654aecb to
9a48c44
Compare
|
Fix: Truncation Handling for Long Column Identifiers and Aliases in Nested Joins This pull request addresses an issue where long column and table aliases resulting from deeply nested joins in GORM exceed PostgreSQL's identifier byte limit (63 bytes), causing fields to be omitted or SQL errors. The patch introduces systematic truncation and mapping for long identifiers using a byte-aware approach, improving the reliability of queries that generate lengthy aliases in complex join scenarios. The changes affect the naming strategies, query building, and associated tests to ensure compatibility and coverage of edge cases (including multibyte UTF-8 characters). Key Changes• Introduced truncation logic in Affected Areas• This summary was automatically generated by @propel-code-bot |
What did this pull request do?
Fixes #7513
User Case Description
PostgreSQL has 63-character limit for identifiers (including select field aliases).
In a long chain of joins we may have a troubles:
This code truncates long aliases similar
Namer.UniqueNameand stores a truncated to original map indb.Statementthat allow to not touch old fileld mapping strategy.